massload utility (Server environment)
This utility loads an XML input file into a target database. Loading the XML file updates the WebSphere Commerce database. The massload utility allows column-level updates to a table. It also allows you to delete data from a database.
For information about the massload utility in the WebSphere Commerce development environment, see, massload utility (Development environment).
The massload utility only loads one file at a time.
The massload utility requires valid and well-formed XML as input to the database. In this input, element names correspond to table names and attributes of the element correspond to column names in the table.
The massload utility includes the following features:
- Error reporter
- The massload utility includes an error reporter that generates an exception document if there is an error.
- Product Advisor search-space synchronization
- If you enable the loading utilities' Product Advisor search-space synchronization feature, you can maintain near real-time synchronicity of Product Advisor search spaces and WebSphere Commerce catalog tables being updated by the massload utility.
Before running this utility, ensure that you have completed the required configuration tasks:
- Configure the loading utilities.
- Configure tracing and logging for the loading utilities.
- Configure the massload utility.
- Optional: Configure the MassLoadCustomizer.properties file.
In addition to the trace log and message log for the loading utilities, this utility produces the following log file:
- WC_INSTALL/logs/massload.db2.log
- WC_INSTALL/logs/massload.oracle.log
Run this utility as the non-root WebSphere Commerce user ID. Do not run this command as root.
Utility command
The massload utility has the following file name:
- massload.sh
- massload.cmd
Parameter values
- -dbname
- Name of the target database.
- Depending on the JDBC driver you use to connect to the database, specify one of the following:
- Native i5/OS JDBC driver
- The database name as displayed in the relational database directory (WRKRDBDIRE).
- IBM Toolbox for Java JDBC driver
- The fully-qualified host name of the database system. If you are using the IBM Toolkit for Java JDBC driver, specify the -schemaname parameter.
If the database is on a remote IASP and the database name is different than the hostname, the value of dbname that is passed to a utility is:
-dbname "hostname/schemaName;database name=db_Name;cursor hold=false"For example,
-dbname "TORASCAT.yourcompany.com/demo;database name=CATDB;cursor hold=false"
- Required: The Oracle TNS name for the database.
- For DB2 UDB databases, the Type 2 database name is deprecated, where the database names do not contain a prefix.
That is, the DB2 Type 4 JDBC driver is used instead, where the Type 4 database name is prefixed with the database server and port. For example, db_server:db_port/db_name.
However, if you use the massload native load methods such as load, import, loadonly, and createonly, the Type 2 database name must be used because the generated command file needs the Type 2 database to connect to the database.
See How JDBC applications connect to a data source in the DB2 Information Center for more information.
- -dbuser
- Name of the user connecting to the database. Verify the user has database-level load authority and table-level INSERT privileges for native load methods.
- The ID of the instance user.
- Oracle user ID connecting to the database.
- -dbpwd
- Password for the user connecting to the database
- -method
- Mode of operation for the massload utility to use when modifying the database using input data. The default method is import.
Specify one of the following methods:
- import
- The import method uses the native database import or update functions, if they are available from the database vendor. If the import or update functions are not available, the import method uses SQL statements using JDBC update the database.
If the data being loaded exists in the database, the data is updated with new values from the XML file.
Restriction:
- This method does not support column-level updates.
- This method can only insert or update tables that have primary keys defined on them. Tables with keys that consist entirely of foreign keys cannot be loaded using this method.
- This massload method cannot be used to load data if you are using Product Advisor search-space synchronization. Use the sqlimport method.
- The import method cannot insert or update data in bit data fields.
- The import method only inserts or updates tables that have primary keys defined on them. The import method cannot insert or update data in tables that do not have a primary key. If the input record only has values for primary columns, the record is rejected.
- You cannot use the import method for remote databases.
- The import method does not support bit data or DBCLOB fields.
- This method is not available for Oracle database.
- load
- The load method uses the native database loading functions from the database vendor (DB2 Load or SQLLoad).
The DB2 load command requires the user to have database-level load authority and table-level INSERT privileges.
The load method expects the data and the target database to have the following properties:
- The data is clean. The data contains no conflicts or foreign reference problems.
- The target database tables do not contain any of the data being loaded. If the data exists in the database, the massload utility fails with a duplicate key error.
Restriction:
- This method does not support column-level updates.
- This massload method cannot be used to load data if you are using Product Advisor search-space synchronization. Use the sqlimport method.
- When you use this method, the massload utility does not check data integrity, including foreign references.
- You cannot update existing data using the load method.
- You cannot use the load method for remote databases.
- The load method does not support bit data or DBCLOB fields.
The massload utility sets the permissions of the WCS/temp directory to 755. Revert the permissions back to 750 after finishing the load, as these are the default permissions of the WCS/temp directory.
- sqlimport
The sqlimport method uses JDBC calls to insert and update data for local and remote databases. This method allows column-level updates and allows you to update existing data.
The sqlimport method also ensures that records meet the database schema constraints which makes the sqlimport method safer than the import or load methods.
- delete
- The delete method is used to delete data that is in the input XML document from the database. The element must contain the values for the primary key or the unique index for the table. If the data being deleted has dependencies to data in another table with "cascade on delete" enabled, the dependent data is also deleted.
If you are using Product Advisor search-space synchronization, use either the sqlimport or the delete method.
- createonly
- To improve performance during instance creation, use the createonly method. Use the createonly method to create mass load data (MLD) files without loading the data into the database.
The files that are created when you use this method (.mld and .cmd files) are placed in a directory named "MassLoadOutputFiles". This directory is created as a subdirectory under the WCS/temp directory.
You can load the MLD files that you created into a WebSphere Commerce database by running the massload utility using the loadonly method.
When creating an instance, the MassLoadOutputFiles directory is created in the temp directory, under the instance's root directory. The default location of the directory is WC_USER/instances/instance/temp/MassLoadOutputFiles.
- loadonly
- Use the loadonly method to load MLD files that were created using the createonly method. When you use the loadonly method, also use the -directory parameter.
The -directory parameter replaces the -infile parameter that you would specify if you were using any method other than loadonly.
For the value of the -directory parameter, specify the fully qualified path of the MassLoadOutputFiles directory that was created using the createonly method.
Here is an example of running the massload utility using the loadonly method (and the required -directory parameter):
./massload.sh -dbname mall -dbuser db2admin -dbpwd db2admin -method loadonly -directory WC_INSTALL\temp\MassLoadOutputFiles -schemaname wcsadmin
Always specify the name of the target database schema using the -schemaname parameter when you run the massload utility using this method. Otherwise, the program uses the name of the database schema obtained when the MassLoadOutputFiles directory and its files were originally created.
When you use the loadonly method, errors and other messages are saved in files that have a .log extension. These log files are written to the MassLoadOutputFiles directory specified for the -directory parameter.
The massload utility sets the permissions of the WCS/temp directory to 755. Revert the permissions back to 750 after finishing the load, as these are the default permissions of the WCS/temp directory.
Use the loadonly method only for instance creation. If you use it at any other time, the result may not be desirable.
- -infile
- Name of the input XML file.
Do not specify this parameter when using the loadonly method.
- -directory
- Full path of the MassLoadOutputFiles directory that was created using the createonly method.
You can only use this parameter with the loadonly method.
- Deprecated feature:
- -noprimary
- Optional: Action the massload utility must take when the primary key is missing for a record in the input file. The following are valid values for the -noprimary parameter:
- error
- The error option indicates that it should report the missing primary key as an error and terminate.
- insert
- The insert option tries to process (insert or delete) the data.
If you do not specify this parameter, the -noprimary parameter is set to error.
- -contentcontext
This parameter does not apply to WebSphere Commerce - Express. Optional: Tells the utility to use the base schema (production-ready data). This parameter cannot be specified with the -schemaname or the -workspaces parameters.
- -schemaname
- Name of the target database schema.
This parameter is required if there are multiple schemas in the database into which you are loading data.
If this parameter is not specified when running the utility, the utility looks for a name=value pair in the customizer property file that specifies the value of SchemaName. If this pair is present in the property file, the utility uses the value specified.
If neither a command-line, nor a property-file specification for the -schemaname parameter exists, the utility defaults to the value of the -dbuser parameter.
- -workspcname
- This parameter can only be used when loading data into a workspace on an authoring server. This parameter cannot be used when loading data on a staging server or a production server.
Optional: The workgroup code which is the system generated identifier for the workspace, not the name assigned to the workspace by the Workspace Manager. Specify this parameter if you want the massload utility to load a workspace. When loading into a workspace, the massload utility respects the locking policy set in the workspace. For more information about workspaces locking policies, see Workspaces locking policies If you specify the workspace parameter, you must specify the following parameters:
- -taskgrp
- This parameter does not apply to WebSphere Commerce - Express.
This parameter can only be used when loading data into a workspace on an authoring server. This parameter cannot be used when loading data on a staging server or a production server.
- The task group code which is the system generated identifier for the task groups, not the name assigned to the task group by the Workspace Manager.
- -task
- This parameter does not apply to WebSphere Commerce - Express.
This parameter can only be used when loading data into a workspace on an authoring server. This parameter cannot be used when loading data on a staging server or a production server.
- The task code which the system generated identifier for the task, not the name assigned to the task by the Workspace Manager.
- This parameter does not apply to WebSphere Commerce - Express.
- -commitcount
- Optional: Number of records processed before the database commit occurs when using the SQL update method of operation.
If you do not specify this parameter, the -commitcount parameter is set to 1. Transactions are committed for every update or insert into the database.
To improve the performance of the massload utility for large input files, the commit count should be increased. After considering the size of the input XML file, you might use a commit count larger than the number of records in the file. This enables rollback of the entire input XML file if an error occurs.
Do not set the -commitcount value as large as the number of elements in the input file for the following reasons:
- A high -commitcount value causes high memory consumption.
- When the -commitcount value is smaller than the number of elements in the input file, some data is written to the database. Depending on the value of -maxerror, a smaller value for -commitcount ensures that some data is written to the database before the maximum number of errors is exceeded and the tool terminates. The default value for -maxerror is 1.
- -maxerror
- Optional: Number of errors after which the massload utility will terminate in the SQL update method of operation.
If you do not specify this parameter, the -maxerror parameter is set to 1.
- -customizer
- Name of the customizer property file to be used for the WebSphere Commerce database. When specifying the customizer property file with this parameter, omit the ".properties" file extension.
Specify one of the following customizer values:
- (Not required) Do not specify this parameter.
- Required: Specify one of the following customizer files:
Required: DB2390ConnectionCustomizer
- ISeries_LODWCSDTA_Customizer
- Specify this customizer value if you are using the native system i JDBC driver.
When you specify this value, the massload utility uses the values specified in the following file:
WC_INSTALL/properties/ISeries_LODWCSDTA_Customizer.properties
- Toolbox_LODWCSDTA_Customizer
- Specify this customizer value if you are using the IBM Toolkit for Java JDBC driver.
When you specify this value, the massload utility uses the values specified in the following file:
WC_INSTALL/properties/Toolbox_LODWCSDTA_Customizer.properties
If you specify this customizer value, specify the hostname as the -dbname parameter. The following is an example of invoking the massload.sh script:
./massload.sh -dbname MY.HOSTNAME.COM -dbuser instance -dbpwd mypass -method sqlimport -customizer Toolbox_LODWCSDTA_Customizer -infile /path/file.xml -schemaname instance
This customizer properties file is located in the WC_INSTALL/properties directory.
When you specify this value, the massload utility uses the values specified in the following file:
WC_INSTALL/properties/DB2390ConnectionCustomizer.properties
- (Not required) Do not specify this parameter if you are using Oracle Database. By default, the massload utility uses the Oracle thick JDBC driver.
To use the thin JDBC driver, see Configure the massload utility.
If you do not specify this parameter, the -customizer parameter is set to MassLoadCustomizer.properties.
Related concepts
Overview of the mass load utilities
Transforming, loading, and extracting data using the WebSphere Commerce loading utilities
Related tasks
Configure the environment variable setting scripts
Configure the massload utility
Load data using the mass load utility
Configure timestamp and date data handling
Related reference
Resolve identifiers for records that share identifiers with existing data
Example: Loading current timestamps
massload utility (Development environment)