Customizing Extract, Transform and Load (ETL) processes

WebSphere Commerce Analyzer extracts, transforms and loads (ETL) data from a default set of tables and columns. If you add new tables or change existing ones, customize the ETL process in order to include the information from the new or changed tables in the data mart. Changing the ETL process involves writing new ETL scripts and editing existing ETL scripts.

ETL command tags and pre-processor

WebSphere Commerce Analyzer performs many ETL steps in the process of preparing the data mart. Most of these steps are processed by ETL implementation Java classes which point to SQL input files with XML tags describing how processing should occur. These XML tags are referred to as ETL command tags. To customize ETL to process new or modified tables, you customize the ETL implementation java classes or SQL input files that contain the ETL command tags.

The optional SQL input file is used as input to the ETL implementation java classes. The implementation Java classes analyze the command tags and attempts to ensure that the commands are passed to DB2 in an optimal way. For example, if the command tags insert several thousand records into the data mart, the implementation Java classes may decide to perform a commit every thousand records to improve performance. The implementation Java classes then passes the commands to DB2 as SQL statements to be executed against the data mart.

The ETL implementation Java classes:

 

Supported command tags

The ETL implementation Java classes are controlled through a set of command tags. These command tags are embedded in DB2 batch file comments. There are two types of ETL command tags; global and local.

You can use any of the WebSphere Commerce Analyzer supported global commands or local commands in your own customized .SQL file. However, it is important that you do not change these commands. In general, do not change SQL files predefined in WebSphere Commerce Analyzer. If you do, you are changing the logic of ETL functions.

Global commands

Global command tags affect every SQL statement in the batch file, and can be located on any line in the batch file. The global command applies to all SQL statements that are defined in the specified .SQL file.

Global command lines are prefixed with:

 --G--

Local commands

Local command tags affect only the first statement following the local tag. Insert the local command before the statement that it applies to. The SQL statement always ends with a semicolon (;) delimiter character. If you decide to comment out a locally tagged SQL statement, you should also comment out that statement's local ETL Driver tags or delete the tags altogether. Otherwise those tags are applied to the next uncommented SQL statement and can lead to unexpected results.

You can perform these types of operations using the following local command tags:
  • CursorInsert
  • CursorUpdate
  • LoadInsert

Local command lines are prefixed with:

--L--

For more information about ETL command tags refer to:

 

Error handling

The ETL implementation java classes pre-processors stops when an error occurs. See the ETL logs, located in WCA_installdir/log. for more information. These logs contain information generated by the extract, transform, and load processes during normal operation.