Generate a file that captures the catalog hierarchy

Classifying products and content requires a Category Definition File (CDF). This file must be sent to IBM Digital Analytics before pages are tagged. IBM Digital Analytics provides a command-line utility that an administrator can run to generate the product category data as a text file.

A CDF is a text file that defines the category tree of the product catalog and page content. It stores this data in CSV (comma-separated values) format with four columns: Coremetrics Client ID, Category ID, Category Name, and Parent Category ID.
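To make the four-column layout concrete, the following minimal Python sketch reads CDF text into named records. The `CdfRecord` type, its field names, and the `parse_cdf` function are hypothetical and only for illustration; they are not part of IBM Digital Analytics or the CDFGenerator utility. The sketch assumes that a top-level category has an empty Parent Category ID column, as the sample rows suggest.

```python
import csv
import io
from typing import List, NamedTuple


class CdfRecord(NamedTuple):
    """One CDF row. Field names are illustrative, not an official schema."""
    client_id: str            # Coremetrics Client ID
    category_id: str          # Category ID
    category_name: str        # Category Name
    parent_category_id: str   # Parent Category ID ("" assumed for top level)


def parse_cdf(text: str) -> List[CdfRecord]:
    """Parse CDF text and check that every non-blank row has four columns."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != 4:
            raise ValueError(f"expected 4 columns, got {len(row)}: {row!r}")
        # Strip stray whitespace around each field.
        records.append(CdfRecord(*(field.strip() for field in row)))
    return records


sample = "99999999,101,MENS,\n99999999,123,MENS SALE,101\n"
records = parse_cdf(sample)
for record in records:
    print(record)
```

A check like this can be run on the generated output file before it is sent, to catch rows with a missing or extra column.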


Procedure

  1. Navigate to the appropriate folder:

  2. Run the CDFGenerator utility.

  3. Locate the output file. This file is either in the /bin/ directory from which you launched the utility, or in a directory specified as part of the utility's command.

    Note: If you need to capture page content information, you must manually append this data to the generated output file. See the implementation guide for IBM Digital Analytics.

  4. Send the output file to IBM Digital Analytics. For more information, see the implementation guide for IBM Digital Analytics.

    Note: If the catalogid value specified is for a sales catalog, there might be more than one record in the CDF file with the same category ID. In a sales catalog, a unique category ID can have multiple parent categories, as shown in the last two lines of this CDF file excerpt:

      99999999,101,MENS,
      99999999,102,SALE,
      99999999,123,MENS SALE,101
      99999999,123,MENS SALE,102

    In a CDF file, however, each category ID can have only one parent category. When you upload a CDF file that contains records with duplicate category IDs, IBM Digital Analytics (formerly known as Coremetrics Analytics) issues warnings and rejects the additional duplicate records. You have three options:

    • Ignore the warnings; IBM Digital Analytics automatically rejects the additional duplicate records.

    • Remove the additional duplicate records manually from the CDF file before uploading.

    • Consult IBM Digital Analytics for other implementation alternatives.
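If you choose to remove the duplicate records before uploading, a short script can do the work. The following Python sketch is one possible approach, not an IBM-provided tool: the function name `drop_duplicate_category_ids` is hypothetical, and the sketch assumes that keeping the first record for each category ID is acceptable. Which parent category a duplicated category should keep is a business decision, so review the dropped rows before you send the file.

```python
import csv
import io
from typing import List, Tuple


def drop_duplicate_category_ids(cdf_text: str) -> Tuple[str, List[List[str]]]:
    """Keep the first record per Category ID (column 2); return the cleaned
    CDF text and the list of dropped rows for review."""
    seen = set()
    kept, dropped = [], []
    for row in csv.reader(io.StringIO(cdf_text)):
        if not row:
            continue  # skip blank lines
        if len(row) != 4:
            raise ValueError(f"expected 4 columns, got {len(row)}: {row!r}")
        row = [field.strip() for field in row]
        category_id = row[1]
        if category_id in seen:
            dropped.append(row)   # a later record with an ID already seen
        else:
            seen.add(category_id)
            kept.append(row)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(kept)
    return out.getvalue(), dropped


# The excerpt from the note above, with category 123 listed under two parents.
excerpt = (
    "99999999,101,MENS,\n"
    "99999999,102,SALE,\n"
    "99999999,123,MENS SALE,101\n"
    "99999999,123,MENS SALE,102\n"
)
cleaned, dropped = drop_duplicate_category_ids(excerpt)
print(cleaned, end="")
print("dropped:", dropped)
```

With the excerpt as input, the second MENS SALE record (parent 102) is the one set aside, because it appears after the record with parent 101.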



Related concepts
Integrate a store with IBM Digital Analytics for WebSphere Commerce


Related reference
Track IBM Product Recommendations for IBM Digital Analytics