Configure the data load order

The data load order configuration file controls the order in which the data load utility loads data. The file points to the environment settings file, the business object configuration files, and the input files. In this file, you can also define the mode that the data load utility uses to load data.


Procedure

  1. Create a file that specifies the data load order.

    The data load order file uses tags such as the following:

    <_config:DataLoadEnvironment>: Pointer to the environment settings file.
    <_config:LoadOrder>: Specifies the data load mode.
    <_config:LoadItem>: Its businessObjectConfigFile attribute specifies the business object configuration file.
    <_config:DataSourceLocation>: Pointer to the input file.

    The following sample shows a configuration file for catalog data.

    <?xml version="1.0" encoding="UTF-8" ?> 
    
    <_config:DataLoadConfiguration 
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
        xsi:schemaLocation="http://www.ibm.com/xmlns/prod/commerce/foundation/config ../../../../xml/config/xsd/wc-dataload.xsd" 
        xmlns:_config="http://www.ibm.com/xmlns/prod/commerce/foundation/config">         
    
         <_config:DataLoadEnvironment configFile="wc-dataload-env.xml" /> 
        
         <_config:LoadOrder commitCount="100" 
                            batchSize="1" 
                            dataLoadMode="Replace">             
         
             <_config:LoadItem name="CatalogGroup" 
                               businessObjectConfigFile="wc-loader-catalog-group.xml">                    
    
                 <_config:DataSourceLocation location="CatalogGroups.csv" /> 
    
             </_config:LoadItem>            
    
             <_config:LoadItem name="CatalogEntry" 
                               businessObjectConfigFile="wc-loader-catalog-entry.xml">                    
    
                 <_config:DataSourceLocation location="CatalogEntries.csv" /> 
    
             </_config:LoadItem>        
         </_config:LoadOrder>
        
    </_config:DataLoadConfiguration>
    

    Set appropriate values for the following options:

    Option and description

    commitCount
    0: Nothing is committed until the load item finishes processing all of its input data.
    N: A positive integer that specifies how many records are processed before the utility calls a database commit. The default value is 1.

    batchSize
    0: Uses JDBC batch update. The batch covers all input data for the entire load item.
    N: A positive integer that indicates how many lines of data are processed before the JDBC batch is executed. JDBC batching is enabled if and only if the batch size is greater than 1. The batchSize value should be less than or equal to the commitCount value. The default batch size is 1, which means that JDBC batching is not enabled and the SQL statements are executed directly, one at a time.
    dataLoadMode
    Insert: All data is inserted into the database. The utility generates insert SQL statements. This mode is recommended for the initial loading of data. If the CSV data file contains delete flags, the flags are ignored.

    In insert mode, you can specify a primary key range to use when an object does not exist in the database and requires a newly generated key. Specify the range within the <_config:BusinessObjectMediator> element. For example:

    startKey="100001" endKey="200001"
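
    In context, the key range above might look like the following sketch. It assumes that startKey and endKey are written as attributes of the <_config:BusinessObjectMediator> element; the className and componentId values are borrowed from the catalog entry mediator shown elsewhere in this topic purely for illustration. Check the schema for your release before relying on this form.

    ```xml
    <!-- Sketch only: assumes startKey and endKey are attributes of the mediator element. -->
    <_config:BusinessObjectMediator
        className="com.ibm.commerce.catalog.dataload.mediator.CatalogEntryMediator"
        componentId="com.ibm.commerce.catalog"
        startKey="100001"
        endKey="200001">
    </_config:BusinessObjectMediator>
    ```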
    

    The data writers supported in the insert mode:

    • JDBC data writer

    • native load data writer

    Replace (default): All data is replaced in the database. The utility generates insert, update, or delete SQL statements depending on the data. Replace mode overwrites the existing data in the database with the input data. If the input data omits a column, that column is set to null or to its default value, if one exists. For example:

    • If one record (line) in the CSV file represents a new object, it is inserted.

    • If the object is in the database already, it is replaced.

    • If there is a flag in the data to indicate this object should be deleted, it is deleted.

    In replace mode, specifying a primary key range value is not recommended as it can result in key conflicts within the database.

    The data writers supported in the replace mode:

    • JDBC data writer

    • native load data writer

    To avoid accidentally overwriting database values with null data when you replace only a subset of the original data, start from the input data that was used in the initial load and modify it. Do not leave fields empty in the source file unless you want those fields to contain null data in the database.

    Delete: All data in the input file is deleted from the database. The utility generates delete SQL statements. Delete flags in the CSV data are ignored.

    The CatalogEntryMediator mediator supports the mark for delete operation.

    To set the mark for delete flag, type the following information in the <_config:BusinessObjectMediator> element of the business object configuration file:

    <_config:BusinessObjectMediator 
        className="com.ibm.commerce.catalog.dataload.mediator.CatalogEntryMediator" 
        componentId="com.ibm.commerce.catalog">     
    
            <_config:property name="markForDelete" value="true" />
    
    </_config:BusinessObjectMediator>
    

    Only the JDBC data writer is supported in this mode.
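
    As a rough illustration, a load order that runs in delete mode could be declared as in the following sketch. The element and attribute names are taken from the catalog sample earlier in this topic; the input file name CatalogEntriesToDelete.csv is hypothetical.

    ```xml
    <!-- Sketch only: the input file name is a placeholder, not a shipped sample. -->
    <_config:LoadOrder commitCount="100"
                       batchSize="1"
                       dataLoadMode="Delete">
        <_config:LoadItem name="CatalogEntry"
                          businessObjectConfigFile="wc-loader-catalog-entry.xml">
            <_config:DataSourceLocation location="CatalogEntriesToDelete.csv" />
        </_config:LoadItem>
    </_config:LoadOrder>
    ```

    Because delete flags in the CSV data are ignored in this mode, the input file should list only the objects that you intend to remove.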

    maxError
    0: The data load utility continues to run, regardless of how many errors occur.
    N: A positive integer that specifies the error tolerance level during the data load process for a load item. The default value is 1.
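
    The options above can be combined as in the following sketch. The values are illustrative rather than tuning recommendations, and placing maxError on the <_config:LoadOrder> element alongside the other options is an assumption made for this example.

    ```xml
    <!-- Sketch only: illustrative values; maxError placement on LoadOrder is assumed. -->
    <_config:LoadOrder commitCount="1000"
                       batchSize="500"
                       dataLoadMode="Insert"
                       maxError="10">
        <_config:LoadItem name="CatalogEntry"
                          businessObjectConfigFile="wc-loader-catalog-entry.xml">
            <_config:DataSourceLocation location="CatalogEntries.csv" />
        </_config:LoadItem>
    </_config:LoadOrder>
    ```

    Here the batch size is greater than 1, so JDBC batching is enabled, and it stays less than or equal to the commit count, as the batchSize description advises.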

  2. Save and close the file.



Related tasks

Configure the data load environment settings
Substitute attribute values with variables in data load configuration files

Related reference

Data load best practices


Related information

Data load order configuration file

