WebSphere Commerce Search index schema definition

The WebSphere Commerce Search index schema definition contains information for processing data in WebSphere Commerce.

For more detailed information, check the solrhome/MC_masterCatalogId/locale/CatalogEntry/conf/schema.xml file for the search index schema definition that matches your version of WebSphere Commerce search.


Catalog Entry extended sites override

When an extended site overrides any of the following catalog entry properties: Name, short description, and keywords, both the asset store, and the extended site properties are being indexed. If a hit matches either the asset store, or the extended site store, the product is returned and updated by the run time with the extended site override display text.


Category Name

The category name is being indexed under the CatalogEntry index. This is a searchable field, which is used to enhance the relevancy of the product that is being returned. Runtime components boost this field. See Search relevancy and merchandising.


Master catalogs and multiple languages

A WebSphere Commerce instance might have multiple master catalogs, with each master catalog that belong to one store, or in extended sites, might belong to a cluster of stores. The master catalog is not reflected in the index schema. That is, it is reflected at the level of index cores. In addition, each master catalog contains its supported languages through its attached stores. A separate index core is created for each language and master catalog. The following are the main benefits of maintaining separate index cores for each language:

Where the WebSphere Commerce instance contains two master catalogs, with each master catalog possibly containing different languages. Separate catalog entry index cores are created in each language, under each master catalog. When new WebSphere Commerce objects are indexed, new cores are created using the same structure. Each index then provides its own set of search functionality in the storefront, such as its own automatic keyword suggestions, spelling corrections, and dictionaries.


WebSphere Commerce store models and catalog entry types

Search indexing supports various store models by indexing the top-level Catentry store_id. This information is stored in the STORECENT table. In extended site, the store path information is built in query conditions, including both the shopper's extended site store_id, and parent asset store_id.

WebSphere Commerce catentry types such as Product, Item, Bundle, and Package are supported by default for WebSphere Commerce Search indexing. They are in the same index, with their data extracted separately from the WebSphere Commerce database. For example, the query to extract attribute values for products differs for items. The CATENTTYPE_ID is indexed as a field in the search index to distinguish each catentry's type.


Prices in WebSphere Commerce Search

WebSphere Commerce Search indexes the default offer price. The default offer price is typically considered as the offer price under the catentry's owning store's master catalog's default trading position container. The default offer price can also be complex by considering the following variables:


Contract entitlement

The runtime contract entitlement filtering is performed through the product set. Based on the shopper's eligible contract in the session, where the contracts are represented as a list of included and excluded product sets, the PRSETCEREL table is checked against each catentry to ensure that the shopper is eligible to access the catentry. Therefore, the PRSETCEREL.PRODUCTSET_ID is indexed and queried with the shopper's list of included and excluded product sets. When the includeEntireCatalog flag is true the entitlement logic ignores productset checking. That is, the entire catalog is selected and productset_id is not evaluated against the search index.


Spell check

A dedicated indexed multivalue field is being used to maintain the source and scope of the spell check suggestions. The content of the spellCheck field is populated by copying other column's content. The following are all the fields that are copied into the spellCheck field:

The spell check index field is the source of the keyword suggestions under the Keyword auto-suggest section, and also the source of the suggestions of misspelled keywords in the storefront.


Parent catalog groups and catentries

There are two parent catalog group fields in the index:

The parentCatgroup_id_search field is used for search filtering. For example, searching for a catentry under a specific parent catalog group that can be in master catalog or sales catalog, where the parent catalog group is not necessarily the direct parent of the catentries. To improve the runtime search performance, parentCatgroup_id_search includes both the catentry's direct parent group and all its parents in the catalog hierarchical graph. In addition, since a catentry can contain different parent groups under different catalogs, the parent catgroup must be prefixed by the catalog_id: CatalogId_ParentCatgroupId. This multivalued field improves the search performance, but reduces the catgroup faceting performance. This is because the search engine calculates the faceting count for every parent catgroup, even if most of the calculation results are not useful for faceting.

Therefore, the parentCatgroup_id_facet field is used for search faceting, and only indexes the catentry's direct parent catalog groups in one specific catalog. That is, catgroup faceting needs only the faceting numbers on leaf catalog groups to build the navigation tree. The format of the field is parentCatgroupId.

For parent catentries in WebSphere Commerce, an item can belong to one product, many packages, many bundles, and many kits. These parent catentry's IDs are indexes in a field in the index, so that the parent catentry can be easily identified.


Attribute Dictionary

Attributes in the Attribute Dictionary are indexed as catentry properties. Not only are the attributes indexed, but business users can dynamically add attributes from the index.

The mapping is stored in the ATTRDICTSRCHCONF table. ADS_Fn columns are for string type of attributes, ADI_Fn columns are for integer type of attributes, and ADF_Fn columns are for float type of attributes.


Attribute Dictionary attributes

Some of the catalog entry's associated Attribute Dictionary Attributes properties are being index under the multivalue ad_attribute catalogEntry index field. The properties are tokenized into a string, which is separated by a delimiter. The following snippet is an example of the raw Attribute Dictionary attributes data as it appears in the catalog entry index:

The following properties are being indexed and mapped to the following internal index field name:

Attributes properties
Index Database source table/column name Internal field name as defined in the wc-component.xml Description
0 CATENTRYATTR.ATTR_ID attr_id The attribute unique ID.
1 ATTR.IDENTIFIER identifier The attribute identifier.
2 ATTRDESC.NAME name The attribute display name.
3 CATENTRYATTR.SEQUENCE sequence The attribute sequence with its product.
4 CATENTRYATTR.ATTRVAL_ID attrval_id The attribute value unique ID.
5 ATTRVAL.IDENTIFIER identifier The attribute value identifier.
6 ATTRVALDESC.STRINGVALUE,ATTRVALDESC.VALUE value The actual attribute value display value (if string type, uses the STRINGVALUE field; otherwise, uses the VALUE field.
7 ATTRVALDESC.SEQUENCE sequence The attribute value sequence with its attribute.
8 ATTR.DISPLAYABLE displayable The attribute is marked searchable.
9 ATTR.SEARCHABLE searchable The attribute is marked displayable.
10 ATTR.COMPARABLE comparable The attribute is marked comparable.
11 ATTR.STOREDISPLAY storeDisplay The attribute is being used for ribbon ad.
12 ATTR.FACETABLE facetable The attribute is marked facetable.
13 CATENTRYATTR.USAGE usage The attribute is assigned to a product as either descriptive or defining attribute.
14 ATTRVALDESC.IMAGE1 image1 The attribute value image 1.


Bundles and Packages components

Some of the catalog entry's components properties are indexed under the CatalogEntry component's multivalue index field. The properties are tokenized into a string, which is separated by a delimiter, and sequenced according to their sequence value in the CATENTREL.SEQUENCE, DKPDCCOMPLIST.SEQUENCE database columns.

The following snippet is an example of the raw components data as it appears in the catalog entry index: Bundle, and Package component index field:

The following Bundle and Package properties are being indexed and mapped to the following internal index field name:

Bundle and Package properties
Index Database source table/column name Internal field name as defined in the wc-component.xml Description
0 CATENTREL.CATENTRY_ID_CHILD, and DKPDCCOMPLIST.CATENTRY_ID catentry_id The component unique ID.
1 CATENTREL.QUANTITY, DKPDCCOMPLIST.QUANTITY quantity The component quantity.


Related concepts
Workspaces in WebSphere Commerce Search