Data extraction utility for dynamic recommendations in IBM Product Recommendations
The IBM Product Recommendations data extraction utility is a command-line utility that you can use to create the Enterprise Product Report (EPR) data that IBM Product Recommendations requires for dynamic recommendations. The utility extracts catalog data from the database and generates ECDF and EPCMF files in the correct format to load into IBM Product Recommendations. You can provide these two files to IBM Product Recommendations regularly for processing dynamic recommendations.
The utility generates two types of CSV (comma-separated values) files that contain the catalog data for each IBM Digital Analytics client ID:
- EPCMF (Enterprise Product Content Mapping File)
- This file contains data that represents catalog entries, that is, products that can be bought, pre-built kits, and dynamic kits for a store. This file also specifies the master catalog category to which the catalog entry belongs.
- ECDF (Enterprise Category Definition File)
- This file contains data that represents the master catalog category hierarchies for a store.
Sample of the generated EPCMF file
This sample shows the catalog entry data that the utility extracts for the EPCMF file:
This file contains up to 55 columns:
- The first five columns contain mandatory data that IBM Product Recommendations requires:
- File date
- The date that the utility created the CSV file, in YYYYMMDD format.
- Client ID
- The IBM Digital Analytics client ID.
- Item ID
- The part number of the catalog entry.
- Item
- The name of the catalog entry.
- Items Primary Category ID
- The master catalog category to which the catalog entry belongs.
- The remaining 50 columns are for customer-defined static attributes for catalog entries. The data mappings for the first six of these static attribute columns are predefined to contain specific catalog entry data, but you can change the predefined contents. See the data mapping descriptions in ../refs/rmtepcmfsample.htm.
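The column layout described above can be sketched in Python as follows. This is only an illustration of the file shape, not the utility's implementation: the function name `epcmf_row`, the client ID, the part number, and the attribute values are all hypothetical. The sketch also assumes that unused attribute columns can be omitted rather than padded.

```python
import csv
import io
from datetime import date

MAX_STATIC_ATTRIBUTES = 50  # columns 6 to 55, per the description above


def epcmf_row(file_date, client_id, item_id, item_name, category_id, attributes=()):
    """Build one EPCMF row: five mandatory columns, then static attributes."""
    if len(attributes) > MAX_STATIC_ATTRIBUTES:
        raise ValueError("EPCMF supports at most 50 static attribute columns")
    # The file date is written in YYYYMMDD format.
    mandatory = [file_date.strftime("%Y%m%d"), client_id, item_id, item_name, category_id]
    if any(value == "" for value in mandatory):
        raise ValueError("all five mandatory columns must be non-empty")
    return mandatory + list(attributes)


# Hypothetical values, for illustration only.
row = epcmf_row(
    date(2024, 1, 15), "90000000", "SKU-1001", "Espresso Maker", "10001",
    attributes=["19.99", "Kitchen"],
)

# Write the row as one CSV line.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerow(row)
print(buffer.getvalue(), end="")
# prints: 20240115,90000000,SKU-1001,Espresso Maker,10001,19.99,Kitchen
```

The `csv` module is used instead of joining strings by hand so that names containing commas or quotation marks are quoted correctly.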
Sample of the generated ECDF file
This sample shows the catalog hierarchy data that the utility extracts for the ECDF file:
The five columns in this file contain mandatory data that IBM Product Recommendations requires:
- File date
- The date that the utility created the CSV file, in YYYYMMDD format.
- Client ID
- The IBM Digital Analytics client ID.
- Category ID
- The category identifier.
- Category Name
- The name of the category.
- Parent Category ID
- The category identifier of the parent category.
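As a rough sketch of how the five ECDF columns relate to a category hierarchy, the following Python example builds one row per category and checks that every parent category ID refers to a category in the same file. The function name `ecdf_rows`, the client ID, and the category data are hypothetical. Writing an empty parent column for a top-level category is an assumption, because the description above does not say how the utility represents it.

```python
from datetime import date


def ecdf_rows(file_date, client_id, categories):
    """Yield one five-column ECDF row per category.

    `categories` maps a category ID to a (name, parent ID or None) pair.
    """
    stamp = file_date.strftime("%Y%m%d")  # file date in YYYYMMDD format
    for category_id, (name, parent_id) in categories.items():
        # A parent must itself be a category in the same hierarchy.
        if parent_id is not None and parent_id not in categories:
            raise ValueError(
                f"unknown parent {parent_id!r} for category {category_id!r}"
            )
        # Assumption: a top-level category gets an empty parent column.
        yield [stamp, client_id, category_id, name, parent_id or ""]


# Hypothetical hierarchy, for illustration only.
categories = {
    "10001": ("Kitchen", None),
    "10002": ("Coffee Makers", "10001"),
}
rows = list(ecdf_rows(date(2024, 1, 15), "90000000", categories))
for row in rows:
    print(",".join(row))
```

A check of this kind can be useful before loading the files, because each Items Primary Category ID in the EPCMF file is expected to match a Category ID in the ECDF file.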
Configuration files for the IBM Product Recommendations data extraction utility
The IBM Product Recommendations data extraction utility uses three types of configuration files. Samples are provided, but you must update them with configuration information that is specific to your site and environment. These configuration files are based on the Data Load utility configuration files, but they include some extensions.
- wc-dataextract.xml
- This file is the main configuration file that you must point to when you run the utility. This file specifies the paths to the environment configuration file and to the business object configuration file.
- wc-dataextract-env.xml
- This file is the environment configuration file. Configure the language of the store and the currency for the price data before you run the utility.
- wc-dataextract-business_object.xml
- This file is the business object configuration file. For this utility, you need two versions of this file:
- wc-dataextract-catalog-entry.xml: This business object configuration file is used to extract catalog entry data for the EPCMF file.
- wc-dataextract-catalog-group.xml: This business object configuration file is used to extract category data for the ECDF file.
The business object configuration files contain:
- Business context information.
- Data mappings required to transform WebSphere Commerce business objects to the data that is written to columns in the EPCMF or ECDF file. The EPCMF file supports up to 50 customer-defined static attributes for catalog entries.
- Definitions of the order in which the utility writes the data to the columns in the file.
- Pointers to interfaces and implementation classes that the utility uses.
Use the utility in different environments
The data extraction utility for IBM Product Recommendations can be run in the staging and production environments. However, it is recommended that you run the utility in an environment that has all of the required information. For example, if the staging environment does not have inventory or pricing information, run the utility in the production environment.
You can generate the CSV files in your staging environment to load into your IBM Product Recommendations test environment. You can also generate the CSV files in your production environment to load into your IBM Product Recommendations production environment. The utility is not intended to be run in the development environment. Support is provided in the development environment with a Derby database for customization purposes only, for example, when you are testing changes to the business object configuration file to include custom catalog entry attributes for the EPCMF file.