Customizing ETL processes

In this process, the WebSphere Commerce Analyzer extracts, transforms, and loads (ETL) data from the staging tables. This process transforms the data into a star schema that can be used for reporting and stores it in the WebSphere Commerce Analyzer's database, which is known as the data mart.

If you add new tables or change existing ones, customize the ETL process in order to include the information from the new or changed tables in the data mart.

In ETL customization, you insert the unprocessed data from the IWH.ORDERITEMGIFT_R to the data mart table, WCA.ORDERITEMGIFT. The process_row function checks if the LOAD_STATUS in the _R tables is -1 and checks some timing conditions. Then, using the same process_row function, you can update the LOAD_STATUS to the EXTRACTION_COUNT, which is a positive integer, so this row will not get processed again when you run the ETL script. In this ETL script, you will be only moving data from the replication target tables to some new data mart tables (WCA.ORDERITEMGIFT), which you have created in the previous step.

To customize the ETL process...

  1. Create an ETL script file, for example, pop_orderitemgift.sql.

  2. Store the script file in the WCA_installdir\bin\db2\60 directory.

  3. In the newly created script file, enter the following SQL statement:

    1. To insert new records:
      insert into wca.orderitemgift
      (
      ORDERITEMS_ID     ,
             EXTERNAL_ID     ,
             GIFTMESSAGE ,
             SHIPTORGSTRNT ,
             LAST_UPDATED    ,
             LAST_UPDATED_ID   
      )
      (
      SELECT 
             ORDERITEMS_ID ,
             EXTERNAL_ID  ,
             GIFTMESSAGE,
             SHIPTORGSTRNT,
             CURRENT TIMESTAMP,
             1 AS LAST_UPDATED_ID
      FROM
       IWH.orderitemgift_R ordgft
      WHERE
              iwh.process_row('I', ordgft.IBMSNAP_OPERATION,ordgft.LOAD_STATUS,ordgft.IBMSNAP_LOGMARKER )=1
      );
      commit;
      

    2. To update the LOAD_STATUS to the EXTRACTION_COUNT
      UPDATE IWH.orderitemgift_R SET LOAD_STATUS = INT(WCA.PARAM_VALUE('EXTRACTION_COUNT')) 
      WHERE iwh.process_row('I', ibmsnap_operation, load_status, ibmsnap_logmarker)=1;
      
      COMMIT;
      
      UPDATE WCA.orderitemgift OG SET 
      (
             EXTERNAL_ID  ,
             GIFTMESSAGE,
             SHIPTORGSTRNT,
             LAST_UPDATED    ,
             LAST_UPDATED_ID  
      )
      =
      (
      SELECT 
             EXTERNAL_ID  ,
             GIFTMESSAGE,
             SHIPTORGSTRNT,
             CURRENT TIMESTAMP,
             1 AS LAST_UPDATED_ID
      FROM
       IWH.orderitemgift_R ordgft
      WHERE
       og.orderitems_id=ordgft.ORDERITEMS_ID  AND
              IWH.PROCESS_ROW('U', ordgft.IBMSNAP_OPERATION, ordgft.LOAD_STATUS, ordgft.IBMSNAP_LOGMARKER )=1
      )
      where
      
       exists (select 1 from IWH.orderitemgift_R  o_g where
        iwh.process_row('U', ibmsnap_operation,load_status, IBMSNAP_LOGMARKER)=1
        and o_g.orderitems_id = OG.orderitems_id );
      
      
      UPDATE IWH.orderitemgift_R SET LOAD_STATUS = INT(WCA.PARAM_VALUE('EXTRACTION_COUNT')) 
      WHERE iwh.process_row('U', ibmsnap_operation, load_status, ibmsnap_logmarker)=1;
      
      COMMIT;
      

  4. To automate the ETL process for the customer-defined extensions you should add the lines of code in the build.xml file. The ETL process is executed by ANT. The ANT script is located in WCA_installdir\bin\build.xml. Add the following lines of code within <target name="WC ETL"> ----- </target>

    Make sure that you take a backup before modifying the file. In the build.xml file, update all the names of the renamed files.

    <target name="WC_ETL">
    - - -  - 
    <etl stepName="orderitemgift" fileName="pop_orderitemgift.sql" />
    </target>
    

  5. Run the Configuration Manager to create the customized tables and configure WebSphere Commerce Analyzer to WebSphere Commerce. (For more information on how to run the Configuration Manager, refer to the WebSphere Commerce Additional Software Guide.)

  6. Run the WebSphere Commerce Analyzer for replication and extraction of the data. During replication the data will be copied from the source table, ORDERITEMGIFT to staging tables, IWH.ORDERITEMGIFT_r. During ETL the data will be extracted, transformed and loaded from IWH.ORDERITEMGIFT_r table to WCA.ORDERITEMGIFT. For more information, refer to Run replication and ETL.

Related tasks