Tutorial: Search Performance Improvement: 11-step program for iSeries

 

Introduction

This tutorial will consider creating a unique summary table to enable a single Commerce type of search. The example used in this tutorial cannot be directly reused in any given Commerce solution; nevertheless, the basic elements of this design and implementation can be followed for customizing for any given solution. The tutorial is provided in an 11-step program format. This format can provide for you a high level format on how to approach future search implementations and customizations.

11-step program:

The following steps are the basic elements of putting together a customized solution in order to greatly enhance search performance:

  1. Acknowledge search performance improvement areas.
  2. Acknowledge search limitations
  3. Examine the catalog search features to see how they can fit the requirements.
  4. Perform a search comparison against all out-of-the-box features provided in WebSphere Commerce.
  5. Consider and decide which parameters are to be used against a specific search.
  6. Run tests and output the generated SQL. Dissect this SQL to design the summary tables.
  7. Create the summary tables based on the expected fields to be searched against
  8. Develop and implement the customization of the search bean to work against each summary tables.
  9. Run tests and output the generated SQL.
  10. Measure performance improvements
  11. Refresh the summary tables

 

Example 1: Using MQTs in an iSeries environment

  1. Acknowledge search performance improvement areas.
    For this particular example, DB2 UDB summary tables will be created that match common queries employed by the search bean on the catalog subsystem. A basic set of summary tables is available and can be created using the Configuration Manager during database schema creation. To have a committed enhanced performance improvement, a new summary table will be created instead of using the WebSphere Commerce out-of-the-box summary tables. This new summary table will be based on materialized query table (MQT) design specific for the iSeries platform. The MQT is similar to the DB2 UDB summary table for other platforms except that the refresh of the table is called manually using REFRESH TABLE statement.
  2. Acknowledge search limitations.
    The new summary table will not have LONGVARCHAR data types. Any pre-existing LONGVARCHAR fields will be broken up into substring pieces of summary table fields. This part of the design will allow us to better customize a performance-friendly solution.
  3. Examine the catalog search features to see how they can fit the requirements.
    For this particular example, we would like to minimize the features to a controlled set of parameters to search on. The Boolean search will impose an OR expression across all fields since there is only one summary table to search against.
  4. Perform a search comparison against all out-of-the-box features provided in WebSphere Commerce.
    For this particular example, the simple search method will be used since we are illustrating a basic example of improving search performance.
  5. Consider and decide which parameters are to be used against a specific search.
    For this particular example, the following parameters will be used:

    resultCatEntryType (String)

    For this example, only products are to be returned in the search result - sending a value of 2 through form values.

    Note: the code will be customized such that this parameter will not be needed.

    manufacturer (String)

    The value of this variable is used in a search on a manufacturer's name (table columns searched: Catentry.Mfname). This parameter is only passed from search to locate products within a manufacturer.

    beginIndex (String)

    This variable is used for paging the result set. The value must be the index of the first result row in a page. This parameter has not been changed from out-of-the-box representation.

    pageSize (String)

    The value of this variable specifies the number of search result rows to be displayed per page. The value was changed to represent 18 (that is, 18 displayed items per page).

    resultType (String)

    Merchant can specify if they want to show Products or Items or both Products and Items in a search result. The value must be 1 (Items only), 2 (Products only) or 3 (both products and items). The value was restricted to 3 (both products and items).

    Note: the code will be customized such that this parameter will not be needed.

    searchTerm (String)

    The value of this variable is used in a search on a word (table columns searched: Catentdesc.name, Catentdesc.Shortdescription, Catentdesc.Longdescription). This parameter has not been changed from out-of-the-box representation. Note: the code will be customized such that this parameter will search the summary table fields instead.

    sku (String)

    The value of this variable is used in a search on SKU (table columns searched: Catentry.Sku). This parameter has not been changed from out-of-the-box representation.

    Note: the code will be customized such that this parameter will search the summary table fields instead.

    orderBy1 (String)

    Result set will be sorted by this attribute. The value was changed to point to a new external RuleQuery class. Base code must be checked to find the extended RuleQuery to point to the summary table field declared in the class reference: 'ENTRYSHORTDESCAttributeInfo'

    coSearchenabled (Boolean)

    Search input terms (one contigious term only) for descriptions will be also co-search with SKU column. This parameter has not been changed from the out-of-the-box representation. This has to be tested using two or more search term value inputs because the out-of-the-box representation will not include SKU search if more than one search term is entered in the search term parameter.

    isProduct (Boolean)

    Returns results of 'product data bean' catentry type, which is false by default. This parameter has not been changed from out-of-the-box representation.

    Note: the code will be customized such that this parameter will not be needed.

    isItem (Boolean)

    Returns results of 'item data bean' catentry type, which is false by default. This parameter has not been changed from the out-of-the-box representation.

    Note: the code will be customized such that this parameter will not be needed.

    narrowBy (Boolean)

    Returns results of a narrowed search for Categories and Vendors. This is a new parameter for customized search.

    categoryList (String)

    Passes the user selected category to filter search results. This is a new parameter for customized search.

    vendorList (String)

    Passes the user selected vendor to filter search results. This is a new parameter for customized search.

    pageIndex (String)

    Passes an index for page size in order to structure the 10-pages listing per view. When pages exceed 10, only a given 10 pages are shown while the other remaining pages are only provided with a forwarding hypertext link. This is a new parameter for customized search.
  6. Run tests and output the generated SQL. Dissect this SQL to design the Summary Table.
    An initial test of the search bean will be run. Since Commerce catalog search is readily available out-of-the-box on the starter stores, the Consumer Direct starter store can be used to run a simple search. The WC_SEARCH component must be enabled to write the appropriate logs. A test search term that is used for this particular tutorial is 'component'. There is a search text input available on the Advanced Search page. From the logs produced as a result of the submission of the search term 'component', they will show output as follows:
    XX:XX:XX.XXX * com.ibm.commerce.search.rulequery.RuleQuery.execute(Cursor aCursor) Servlet.Engine.Transports :
    0 com.ibm.websphere.commerce.WC_SEARCH
      Query = select distinct CATENTRY.CATENTRY_ID, CATENTDESC.SHORTDESCRIPTION
      from CATENTRY, CATENTDESC, CATGPENREL
      where ((((((upper(CATENTRY.PARTNUMBER) like '%COMPONENT%'
        or  upper(CATENTDESC.SHORTDESCRIPTION) like '%COMPONENT%'
        or  upper(CATENTDESC.LONGDESCRIPTION) like '%COMPONENT%'
        or  upper(CATENTDESC.NAME) like '%COMPONENT%')
       and  CATGPENREL.CATALOG_ID = 10001))
       and  (((CATENTDESC.PUBLISHED = 1
       and  CATENTDESC.LANGUAGE_ID = -1)))))
       and  (CATENTRY.CATENTRY_ID = (CATGPENREL.CATENTRY_ID)
       and  CATENTRY.CATENTRY_ID = (CATENTDESC.CATENTRY_ID)))
       order by CATENTDESC.SHORTDESCRIPTION asc
    
    Note: This SQL is a result of some small customizations already completed from a previous development cycle to prepare this tutorial. The previous development cycle removed the buyable constraint.

    The SQL above shows that expected fields to be searched will be refreshed into a summary table. This SQL above is just one example of a number of possible fields. Other fields to include would be the CATGRPDESC table as part of a requirement to search parents of products for a search term match (particular to a business requirement).

  7. Create the summary tables based on the expected fields to be searched against.
    From the SQL that was extracted from the previous step, a summary table can be created based on a prediction of catalog fields that would be searched. The summary table, ITEMSFORPERF, follows. It was created for this tutorial based on the MQT design for iSeries:
    CREATE SUMMARY TABLE ITEMSFORPERF
    AS (SELECT STORECENT.STOREENT_ID, CATENTRY.CATENTRY_ID,
     UPPER(CATENTDESC.SHORTDESCRIPTION) AS ENTRYSHORTDESC,
     UPPER(CATENTRY.PARTNUMBER) AS PARTNUMBER,
     UPPER(CATENTRY.MFPARTNUMBER) AS MFPARTNUMBER,
     UPPER(CATENTRY.MFNAME) AS MFNAME,
     UPPER(CATENTDESC.NAME) AS ENTRYNAME,
     UPPER(CATENTDESC.KEYWORD) AS KEYWORD,
     UPPER(CATGRPDESC.NAME) AS GROUPNAME,
     UPPER(CATGRPDESC.SHORTDESCRIPTION) AS GROUPSHORTDESC
    FROM CATGPENREL CATGPENREL
    LEFT OUTER JOIN CATENTRY CATENTRY ON CATGPENREL.CATENTRY_ID = CATENTRY.CATENTRY_ID
    LEFT OUTER JOIN CATENTDESC CATENTDESC ON CATENTRY.CATENTRY_ID = CATENTDESC.CATENTRY_ID
    LEFT OUTER JOIN CATGRPDESC CATGRPDESC ON CATGPENREL.CATGROUP_ID = CATGRPDESC.CATGROUP_ID
    RIGHT OUTER JOIN STORECENT STORECENT ON CATENTRY.CATENTRY_ID = STORECENT.CATENTRY_ID
    WHERE CATENTDESC.PUBLISHED = 1 AND CATENTDESC.LANGUAGE_ID = -1 AND CATGRPDESC.LANGUAGE_ID = -1
    AND CATENTRY.CATENTTYPE_ID = 'ProductBean'
    AND CATGPENREL.CATALOG_ID =
    (SELECT CATALOG_ID FROM CATALOG WHERE IDENTIFIER = 'ConsumerDirect'))
    DATA INITIALLY IMMEDIATE REFRESH DEFERRED MAINTAINED BY USER ENABLE QUERY OPTIMIZATION;
    
    Some background information about MQTs: With the release V5R3M0, DB2 UDB for iSeries(TM) announced support for creating and maintaining materialized query tables (MQTs) which are also referred to as automatic summary tables or materialized views. On April 29, 2005, query optimizer support became available, which allows the optimizer to recognize MQTs implicitly and use them when optimizing and running queries, with the goal of improved performance. MQTs are created using the SQL CREATE TABLE statement. Alternatively, the ALTER TABLE statement may be used to convert an existing table into a materialized query table. The REFRESH TABLE statement updates the results stored in the MQT. For user-maintained MQTs, the MQTs may also be maintained by the user via INSERT, UPDATE, and DELETE statements. The MQT contains the refreshes of the results of a query in the materialized query table. The database engine can use these refreshes of the results instead of computing them for a user-specified query. The query optimizer will analyze any applicable MQTs and can choose to implement the query using a given MQT, providing a faster implementation than accessing the base tables.
  8. Develop and implement the customization of the search bean to work against each summary table.

    Since we have refined the catalog search fields into one single table, then only one TableInfo class will be created for ITEMSFORPERF. There are several new fields searchable within the ITEMSFORPERF table, and therefore, multiple AttributeInfo classes will be created for each field with the following naming convention: CATENTRY_IDAttributeInfo, ENTRYSHORTDESCAttributeInfo, PARTNUMBERAttributeInfo, MFPARTNUMBERAttributeInfo, MFNAMEAttributeInfo, ENTRYNAMEAttributeInfo, KEYWORDAttributeInfo, GROUPNAMEAttributeInfo, GROUPSHORTDESCAttributeInfo. The RuleQuery class must be extended in order to map these new attributeInfo class references. Also, some extra methods must override in order to control the orderBy clause. An entry is not required in search.xml file since we are only searching within one table for optimal performance improvement.

    The class CatEntrySearchListDataBean must be extended in order to revamp the SQL in order to map to the MQT table. The execute method will override in order to remove references to product type constraints. The summary table will already have the product types embedded within. This tutorial must override the method setPredefinedAttributes. We will only construct the MQT SQL reference for this search bean for the DB2 UDB database. Therefore, if the database type is DB2J (Cloudscape), then the method setPredefinedAttributes of the class CatEntrySearchListDataBean will be used instead to call the out-of-the-box tables. Otherwise, the MQT summary table ITEMSFORPERF is referenced and the setPredefinedAttributes needs to override all table and field references. To do this, all references to q.addFilterAttribute method will include the new AttributeInfo classes instead of the out-of-the-box AttributeInfo class references.

  9. Run tests and output the generated SQL.
    An initial test of the search bean will be run. Since Commerce catalog search is readily available out-of-the-box on the starter stores, the Consumer Direct starter store can be used to run a simple search. The WC_SEARCH component must be enabled to write the appropriate logs. A test search term that is used for this particular tutorial is 'component'. There is a search text input available on the Advanced Search page. From the logs produced as a result of the submission of the search term 'component', you should see output as follows:
    XX:XX:XX.XXX * com.ibm.commerce.search.rulequery.RuleQuery.execute(Cursor aCursor) Servlet.Engine.Transports :
    0 com.ibm.websphere.commerce.WC_SEARCH
    Query =
     select distinct ITEMSFORPERF.Catentry_id, ITEMSFORPERF.EntryShortDesc
     from ITEMSFORPERF
     where ((((((
      ITEMSFORPERF.PartNumber like '%COMPONENT%'
      or ITEMSFORPERF.EntryShortDesc like '%COMPONENT%'
      or  ITEMSFORPERF.EntryName like '%COMPONENT%')
      or  ITEMSFORPERF.Keyword like '%COMPONENT%')
      or (ITEMSFORPERF.GroupShortDesc like '%COMPONENT%'
      or  ITEMSFORPERF.GroupName like '%COMPONENT%')))))
     order by ITEMSFORPERF.EntryShortDesc asc
    
    The SQL above does have some performance improvements over the earlier out-of-the-box catalog search bean SQL. There is only one table being searched for this customized instance. As well, the upper function has been removed. On an iSeries platform, the upper function does impose constraints on performance. Since iSeries does not have RUNSTATS and REORG and the database is actually part of the iSeries operating system itself, then a materialized query table has been the best option for this particular example because it is specific for the iSeries platform. For a Linux or AIX solution, the out-of-the-box search bean may be fine to meet performance requirements because the indexing and statistics can be updated regularly. Therefore, this example is best suited for an iSeries platform. Nevertheless, the high-level components of this solution can be reused for any other platform.

    Note: The LIKE clause as shown in the preceding SQL statement does impose some performance constraints on the iSeries server. Another option for this customization to get optimized results is to combine all the search fields into one single field. That way, the like clause would be minimized for one field only.

  10. Measure performance improvements. The performance improvements can be measured by executing the SQL statements directly on the database apart from the application. This would provide true performance results at the database level without confounding variables of application constraints. Some tools are available that could stress test the server and could provide feedback on application response to multiple users. With stress feedback in place, the searching could be verified against the customer performance requirements.
  11. Refresh the summary tables.
    The summary tables that are in place must be refreshed. The DB2 UDB summary table IMMEDIATE clause will allow the summary tables to be updated automatically. On the iSeries server, the REFRESH TABLE statement must be run manually and can be called from a scheduled command. The REFRESH TABLE command itself may impose memory usage on the database (e.g., tempspace). Therefore, running the REFRESH TABLE command during offline periods outside of peak traffic times is best.