

Data load best practices

  1. Configuration for the initial loads
  2. Configuration for the delta loads
  3. Run the data load script
  4. Data load configuration files
  5. Configure wc-dataload.xml
  6. Configure wc-dataload-env.xml
  7. Configure data load business object configuration
  8. CSV files
  9. Load by unique ID
  10. Load data into a workspace


Configuration for the initial loads

See Initial load scenario for more information about recommended configurations during initial loads.


Configuration for the delta loads

See Delta load scenario for more information about recommended configuration during delta loads.


Run the data load script

Optionally, turn off XML validation if you use variable substitution for integer attributes...

-DXmlValidation=false

Turn on more tracing in...

WC_INSTALL/logs/wc-dataload.log

...by specifying the following option to turn on the FINEST tracing for all packages:

-D.level=FINEST

For large loads, the FINEST trace level writes too much output to the log file; turn on tracing for a single package instead. If there is an SQL error, turn on database-related tracing by specifying the FINE trace level:

-Dcom.ibm.commerce.foundation.dataload.database.level=FINE
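The options above can be combined in a single run. The following Python sketch only assembles the option strings shown in this section; the helper name build_trace_options and its parameters are hypothetical, and the sketch does not run the data load utility or show how the script is invoked.

```python
def build_trace_options(xml_validation=True, global_level=None, package_levels=None):
    """Return the -D options described above as a list of strings.

    Hypothetical helper: it only formats option strings.
    """
    options = []
    if not xml_validation:
        # Turn off XML validation, for example when variables
        # are substituted for integer attributes.
        options.append("-DXmlValidation=false")
    if global_level is not None:
        # Trace level for all packages.
        options.append(f"-D.level={global_level}")
    for package, level in (package_levels or {}).items():
        # Trace level for one package, which keeps the log small on large loads.
        options.append(f"-D{package}.level={level}")
    return options


# Database tracing only, as suggested for diagnosing SQL errors.
opts = build_trace_options(
    xml_validation=False,
    package_levels={"com.ibm.commerce.foundation.dataload.database": "FINE"},
)
print(" ".join(opts))
```

Passing global_level="FINEST" instead produces the all-packages option, which is better suited to small loads.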

Customize the Java logging configuration file...

WC_INSTALL/wc.ear/xml/config/dataload/logging.properties

You can...

By default, there is only one log file, and it is overwritten every time you run the data load utility.


Data load configuration files

There are three types of data load configuration files:

  1. Load order configuration file (wc-dataload.xml). You can have multiple load order configuration files, or one file with all items.

     To load just a few load items, run the data load utility with options...

     -DLoadOrder="loadItemName1, loadItemName2, loadItemName3"

  2. Environment configuration file (wc-dataload-env.xml). You only need one copy of this configuration file.

  3. Business object configuration files (wc-loader-<business object>.xml). One business object configuration file generally corresponds to one type of input data, which loads one type of business object.

Defines...

...used for the data load.

Keep all the data load configuration files in locations relative to wc-dataload.xml.

Ensure that the configuration files specified in wc-dataload.xml use relative paths, which makes it easy to move the configuration files from one workstation to another.
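One way to check this recommendation before moving a configuration set is to scan the referenced paths for absolute entries. The sketch below is illustrative only: it takes the paths as a plain list of strings rather than parsing wc-dataload.xml, and the function name is_portable_path and the sample file names other than wc-dataload-env.xml are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_portable_path(path_text):
    """Return True if path_text is relative on both POSIX and Windows."""
    windows = PureWindowsPath(path_text)
    posix = PurePosixPath(path_text)
    # A drive letter, a UNC share, or a leading slash ties the path to one machine.
    return not (windows.drive or windows.root or posix.is_absolute())


# Sample values; only wc-dataload-env.xml is a file name taken from this page.
referenced = [
    "wc-dataload-env.xml",
    "wc-loader-catalog.xml",
    "C:\\WC\\wc-loader-price.xml",
    "/opt/WC/data.csv",
]
non_portable = [p for p in referenced if not is_portable_path(p)]
print(non_portable)
```

Any path that the check reports would need to be rewritten relative to wc-dataload.xml before the files are copied to another workstation.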


Configure wc-dataload.xml

Specify commitCount, batchSize, and dataLoaderMode at the LoadOrder level so that you do not need to specify them for each LoadItem.

Specify a commitCount that is greater than or equal to the batchSize. The commitCount should be a multiple of the batchSize.

To minimize the impact on the production server, set the commitCount and batchSize to 1. Specifying a large commitCount and batchSize improves data load performance, but might affect the database, because more database tables and rows are locked for a longer time.
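The relationship between the two values can be illustrated with a small model. The Python sketch below assumes simplified semantics that this page does not spell out: rows are sent to the database in batches of batchSize, and a commit is issued after every commitCount rows. The function names are hypothetical, only positive values are modeled, and the actual utility may behave differently.

```python
def check_counts(commit_count, batch_size):
    """Validate the two rules above: commitCount >= batchSize, and a multiple of it."""
    if commit_count < 1 or batch_size < 1:
        raise ValueError("this sketch only models positive values")
    if commit_count < batch_size:
        raise ValueError("commitCount must be greater than or equal to batchSize")
    if commit_count % batch_size != 0:
        raise ValueError("commitCount must be a multiple of batchSize")


def simulate_load(row_count, commit_count, batch_size):
    """Return (batch flushes, commits) for row_count rows under the assumed semantics."""
    check_counts(commit_count, batch_size)
    flushes = commits = 0
    pending_batch = pending_commit = 0
    for _ in range(row_count):
        pending_batch += 1
        pending_commit += 1
        if pending_batch == batch_size:
            flushes += 1          # one batch sent to the database
            pending_batch = 0
        if pending_commit == commit_count:
            commits += 1          # locks held since the last commit are released
            pending_commit = 0
    # Flush and commit whatever is left at the end of the input.
    if pending_batch:
        flushes += 1
    if pending_commit:
        commits += 1
    return flushes, commits


print(simulate_load(1000, commit_count=100, batch_size=50))
print(simulate_load(1000, commit_count=1, batch_size=1))
```

In this model, larger values mean fewer round trips and commits, while setting both values to 1 commits after every row, so each transaction holds its locks for the shortest time.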

To make SQL errors easier to debug, set the batchSize to 1 and turn on database tracing. This lets you find out which SQL statement or input line caused the SQL error. If the batchSize is greater than 1, JDBC batch update is enabled, and it is hard to relate the SQL error to the input line or SQL statement that caused it.


Configure wc-dataload-env.xml

Consider the following configurations:


Configure the data load business object configuration

Consider the following configurations:


CSV files

Consider the following tips when editing or maintaining the CSV files:


Load by unique ID

Specifying the unique ID is optional when using the data load utility. However, if you specify the unique ID, you avoid the overhead of resolving the ID, and performance improves.


Load data into a workspace

It is recommended that you load the data into a workspace when using the data load utility. Benefits of loading data into a workspace:


See

  1. Catalog: data load best practices
  2. Inventory: data load best practices
  3. Price: data load best practices


Related concepts


Workspaces
Workspaces locking policies


Related tasks

Substitute attribute values with variables in data load configuration files
Load data into workspaces using the Data load utility
Configure the data load order

Related reference

Examples: Mapping data
Initial load scenario
Delta load scenario
Data Load utility

