Develop > Business logic layer > Work with the data service layer
Techniques for improving the performance of SQL queries under workspaces in the Data Service Layer
SQL queries under workspaces can suffer performance issues. Performance degradation often occurs if the database is not being properly maintained or if queries can be rewritten more efficiently. In addition to making sure that the database is properly maintained, there are several techniques that can help improve the performance of SQL queries under workspaces. While no single technique yields significant results, a combination of several techniques can help achieve considerable performance improvements in many applications.
See Maintain a WebSphere Commerce DB2 Database for tips on how to maintain a WebSphere Commerce DB2 database. For Oracle databases, see the Oracle documentation on how to update the table and index statistics.
The performance techniques include:
- General READ SQL optimization for DB2 and Oracle
- Optimize queries based on the query optimization guidelines
- Push predicates into the OUTER JOIN clause whenever possible
- Duplicate constant condition for different tables whenever possible
- Use nested table definitions to replace workspaces views
- Splitting queries
- Use common expression syntax in Oracle
General READ SQL optimization for DB2 and Oracle
These techniques apply to both DB2 and Oracle, however, the rules appear to yield a better response time on DB2.
Optimize queries based on the query optimization guidelinesFollow the SQL best practices to ensure query optimization:
- Index all the predicates in JOIN, WHERE, ORDER BY and GROUP BY clauses.
WebSphere Commerce typically depends heavily on indexes to improve SQL performance and scalability. Without proper indexes, SQL queries can cause table scans, which causes either performance or locking problems. IBM recommends that all predicate columns be indexed. The exception being where column data has very low cardinality.
- Avoid using functions in predicates.
The index is not used by the database if there is a function on the column. For example:
SELECT * FROM TABLE1 WHERE UPPER(COL1)='ABC'As a result of the function UPPER(), the index on COL1 is not used by database optimizers. If the function cannot be avoided in the SQL, create a function-based index in Oracle or generated columns in DB2 to improve performance.
- Avoid using wildcard (%) at the beginning of a predicate.
The predicate LIKE '%abc' causes full table scan. For example:
SELECT * FROM TABLE1 WHERE COL1 LIKE '%ABC'
This is a known performance limitation in all databases.
- Avoid unnecessary columns in SELECT clause.
Specify the columns in the SELECT clause instead of using SELECT *. The unnecessary columns places extra loads on the database, which slows down not just the single SQL, but the whole system.
- Use inner join, instead of outer join if possible.
The outer join should only be used if it is necessary. Using outer join limits the database optimization options which typically results in slower SQL execution.
- DISTINCT and UNION should be used only if it is necessary.
DISTINCT and UNION operators cause sorting, which slows down the SQL execution. Use UNION ALL instead of UNION, if possible, as it is much more efficient.
- Oracle 10g and 11g requires that the CLOB/BLOB columns must be put at the end of the statements.
Otherwise, it causes failure when the input value size is larger than 1000 characters.
- The ORDER BY clause is mandatory in SQL if the sorted result set is expected.
The ORDER BY keyword is used to sort the result-set by specified columns. Without the ORDER BY clause, the result set is returned directly without any sorting. The order is not guaranteed. Be aware of the performance impact of adding the ORDER BY clause, as the database needs to sort the result set, resulting in one of the most expensive operations in SQL execution.
Push predicates into the OUTER JOIN clause whenever possibleFor SQL queries with the LEFT OUTER JOIN, pushing predicates of the right table from the WHERE clause into the ON condition helps the database optimizer generate a more efficient query. Predicates of the left table can stay in the WHERE clause.
Similarly, for the SQL queries with the RIGHT OUTER JOIN, predicates for the right table should be moved from the WHERE clause into the ON condition.
For example, the suboptimal query is rewritten by pushing predicates applicable to the table TAB_B into the ON clause. The TAB_A specific predicates in the WHERE clause can either stay, or be pushed into the ON clause:
Suboptimal SQL statement:
SELECT TAB_A.COL1, TAB_B.COL1 FROM TAB_A LEFT OUTER JOIN TAB_B ON TAB_A.COL3 = TAB_B.COL3 WHERE TAB_A.COL1=123 AND TAB_B.COL2=456;Optimized SQL statement:
SELECT TAB_A.COL1, TAB_B.COL1 FROM TAB_A LEFT OUTER JOIN TAB_B ON TAB_A.COL3 = TAB_B.COL3 AND TAB_B.COL2=456 WHERE TAB_A.COL1=123;
Predicates for any INNER joins can stay in the WHERE clause. If tables TAB_A and TAB_B are defined as views, the optimizer can push these predicates into the views.
Duplicate constant condition for different tables whenever possibleWhen two tables, A and B, are joined and there is a constant predicate on one of the joined columns, for example, A.id=B.id and A.id in (10, 12), the constant predicate should be duplicated for the joined column of the second table. That is, A.id=B.id and A.id in (10, 12) and B.id in (10, 12).
For example, TAB_A has a LEFT OUTER JOIN relationship with TAB_B. If there is a TAB_A specific conditions and a cross table condition with TAB_B, create an extra TAB_B specific condition based on TAB_A requirement and keep cross table conditions in the ON clause:
Suboptimal SQL statement:
SELECT TAB_A.COL1, TAB_B.COL1 FROM TAB_A LEFT OUTER JOIN TAB_B ON TAB_A.COL3 = TAB_B.COL3 WHERE TAB_A.COL1 IN (123, 456) AND TAB_B.COL2=TAB_A.COL1;Optimized SQL statement:
SELECT TAB_A.COL1, TAB_B.COL1 FROM TAB_A LEFT OUTER JOIN TAB_B ON TAB_A.COL3 = TAB_B.COL3 AND TAB_B.COL2 IN (123, 456) AND TAB_B.COL2=TAB_A.COL1 WHERE TAB_A.COL1 IN (123, 456);
In particular, if the constant predicate has only 1 value (e.g COL1=123), the second predicate should be converted to a constant predicate too.
For example:
Suboptimal SQL statement:
SELECT TAB_A.COL1, TAB_B.COL1 FROM TAB_A LEFT OUTER JOIN TAB_B ON TAB_A.COL3 = TAB_B.COL3 WHERE TAB_A.COL1=123 AND TAB_B.COL2=TAB_A.COL1;Optimized SQL statement:
SELECT TAB_A.COL1, TAB_B.COL1 FROM TAB_A LEFT OUTER JOIN TAB_B ON TAB_A.COL3 = TAB_B.COL3 AND TAB_B.COL2=123 WHERE TAB_A.COL1=123;
Use nested table definitions to replace workspaces viewsTo improve performance in the workspaces environment you can use nested table definitions to replace the standard views defined for the content-managed tables. The following example query uses the default views for tables CATENTRY, ATTRIBUTE, and ATTRVALUE:
SELECT CATENTRY.CATENTRY_ID, ATTRIBUTE.ATTRIBUTE_ID, ATTRIBUTE.NAME, ATTRIBUTE.LANGUAGE_ID, ATTRIBUTE.CATENTRY_ID, ATTRVALUE.ATTRVALUE_ID, ATTRVALUE.ATTRIBUTE_ID, ATTRVALUE.CATENTRY_ID, ATTRVALUE.STRINGVALUE FROM CATENTRY, ATTRVALUE, ATTRIBUTE WHERE ATTRVALUE.CATENTRY_ID = CATENTRY.CATENTRY_ID AND CATENTRY.CATENTRY_ID IN (10683) AND ATTRVALUE.LANGUAGE_ID IN ( -1,-2,-3,-4,-5 ) AND ATTRIBUTE.LANGUAGE_ID=ATTRVALUE.LANGUAGE_ID AND ATTRIBUTE.ATTRIBUTE_ID=ATTRVALUE.ATTRIBUTE_ID ORDER BY ATTRIBUTE.SEQUENCE
This query can be rewritten to use nested tables definitions for tables CATENTRY, ATTRVALUE, and ATTRIBUTE:
SELECT CATENTRY.CATENTRY_ID, ATTRIBUTE.ATTRIBUTE_ID, ATTRIBUTE.NAME, ATTRIBUTE.LANGUAGE_ID, ATTRIBUTE.CATENTRY_ID, ATTRVALUE.ATTRVALUE_ID, ATTRVALUE.ATTRIBUTE_ID, ATTRVALUE.CATENTRY_ID, ATTRVALUE.STRINGVALUE FROM ( SELECT CATENTRY_ID FROM DB2INST1.CATENTRY WHERE NOT EXISTS ( SELECT '1' FROM WCW101.CATENTRY WHERE DB2INST1.CATENTRY.CATENTRY_ID = WCW101.CATENTRY.CATENTRY_ID) UNION ALL SELECT CATENTRY_ID FROM WCW101.CATENTRY WHERE WCW101.CATENTRY.CONTENT_STATUS <> 'D' ) CATENTRY, ( SELECT ATTRVALUE_ID,LANGUAGE_ID,CATENTRY_ID,ATTRIBUTE_ID,STRINGVALUE FROM DB2INST1.ATTRVALUE WHERE NOT EXISTS ( SELECT '1' FROM WCW101.ATTRVALUE WHERE DB2INST1.ATTRVALUE.ATTRVALUE_ID = WCW101.ATTRVALUE.ATTRVALUE_ID) UNION ALL SELECT ATTRVALUE_ID,LANGUAGE_ID,CATENTRY_ID,ATTRIBUTE_ID,STRINGVALUE FROM WCW101.ATTRVALUE WHERE WCW101.ATTRVALUE.CONTENT_STATUS <> 'D' ) ATTRVALUE, ( SELECT ATTRIBUTE_ID,LANGUAGE_ID,CATENTRY_ID,SEQUENCE,NAME FROM DB2INST1.ATTRIBUTE WHERE NOT EXISTS ( SELECT '1' FROM WCW101.ATTRIBUTE WHERE DB2INST1.ATTRIBUTE.ATTRIBUTE_ID = WCW101.ATTRIBUTE.ATTRIBUTE_ID) UNION ALL SELECT ATTRIBUTE_ID,LANGUAGE_ID,CATENTRY_ID,SEQUENCE,NAME FROM WCW101.ATTRIBUTE WHERE WCW101.ATTRIBUTE.CONTENT_STATUS <> 'D' ) ATTRIBUTE WHERE ATTRVALUE.CATENTRY_ID = CATENTRY.CATENTRY_ID AND CATENTRY.CATENTRY_ID IN (10683) AND ATTRVALUE.LANGUAGE_ID IN ( -1,-2,-3,-4,-5 ) AND ATTRIBUTE.LANGUAGE_ID=ATTRVALUE.LANGUAGE_ID AND ATTRIBUTE.ATTRIBUTE_ID=ATTRVALUE.ATTRIBUTE_ID ORDER BY ATTRIBUTE.SEQUENCE
Occasionally, further performance gains can be achieved by pushing in predicates into nested table definitions. This might reduce the amount of data that needs to be fetched for nested tables. Continuing with the preceding example, the predicate LANGUAGE_ID IN ( -1,-2,-3,-4,-5 ) is injected into the definition of the ATTRVALUE nested table in the following query:
SELECT CATENTRY.CATENTRY_ID, ATTRIBUTE.ATTRIBUTE_ID, ATTRIBUTE.NAME, ATTRIBUTE.LANGUAGE_ID, ATTRIBUTE.CATENTRY_ID, ATTRVALUE.ATTRVALUE_ID, ATTRVALUE.ATTRIBUTE_ID, ATTRVALUE.CATENTRY_ID, ATTRVALUE.STRINGVALUE FROM ( SELECT CATENTRY_ID FROM DB2INST1.CATENTRY WHERE NOT EXISTS ( SELECT '1' FROM WCW101.CATENTRY WHERE DB2INST1.CATENTRY.CATENTRY_ID = WCW101.CATENTRY.CATENTRY_ID) UNION ALL SELECT CATENTRY_ID FROM WCW101.CATENTRY WHERE WCW101.CATENTRY.CONTENT_STATUS <> 'D' ) CATENTRY, ( SELECT ATTRVALUE_ID,LANGUAGE_ID,CATENTRY_ID,ATTRIBUTE_ID,STRINGVALUE FROM DB2INST1.ATTRVALUE WHERE NOT EXISTS ( SELECT '1' FROM WCW101.ATTRVALUE WHERE DB2INST1.ATTRVALUE.ATTRVALUE_ID = WCW101.ATTRVALUE.ATTRVALUE_ID) AND DB2INST1.ATTRVALUE.LANGUAGE_ID IN ( -1,-2,-3,-4,-5 ) UNION ALL SELECT ATTRVALUE_ID,LANGUAGE_ID,CATENTRY_ID,ATTRIBUTE_ID,STRINGVALUE FROM WCW101.ATTRVALUE WHERE WCW101.ATTRVALUE.CONTENT_STATUS <> 'D' AND WCW101.ATTRVALUE.LANGUAGE_ID IN ( -1,-2,-3,-4,-5 ) ) ATTRVALUE, ( SELECT ATTRIBUTE_ID,LANGUAGE_ID,CATENTRY_ID,SEQUENCE,NAME FROM DB2INST1.ATTRIBUTE WHERE NOT EXISTS ( SELECT '1' FROM WCW101.ATTRIBUTE WHERE DB2INST1.ATTRIBUTE.ATTRIBUTE_ID = WCW101.ATTRIBUTE.ATTRIBUTE_ID) UNION ALL SELECT ATTRIBUTE_ID,LANGUAGE_ID,CATENTRY_ID,SEQUENCE,NAME FROM WCW101.ATTRIBUTE WHERE WCW101.ATTRIBUTE.CONTENT_STATUS <> 'D' ) ATTRIBUTE WHERE ATTRVALUE.CATENTRY_ID = CATENTRY.CATENTRY_ID AND CATENTRY.CATENTRY_ID IN (10683) AND ATTRIBUTE.LANGUAGE_ID=ATTRVALUE.LANGUAGE_ID AND ATTRIBUTE.ATTRIBUTE_ID=ATTRVALUE.ATTRIBUTE_ID ORDER BY ATTRIBUTE.SEQUENCE
Queries using nested table definitions instead of views need to explicitly reference the base and the write schemas. These queries are only applicable to the workspaces environment, and, therefore, need to be defined in a query template under a special content management section ('cm'). The following is an example of the preceding query, defined as a query template for both the runtime and the workspaces environments:
The following snippet uses $CM:READ$, $CM:WRITE$, and $CM:BASE$ tags. They will be replaced at runtime by the READ, WRITE and BASE schema names when the SQL is executed by the Data Service Layer.
See Workspaces data model for more information.
BEGIN_XPATH_TO_SQL_STATEMENT name=/CatalogEntry[CatalogEntryIdentifier[(UniqueID=)]]+IBM_Attributes base_table=CATENTRY sql= SELECT CATENTRY.$COLS:CATENTRY_ID$, ATTRIBUTE.$COLS:ATTRIBUTE$,ATTRVALUE.$COLS:ATTRVALUE$ FROM CATENTRY, ATTRVALUE, ATTRIBUTE WHERE ATTRVALUE.CATENTRY_ID = CATENTRY.CATENTRY_ID AND CATENTRY.CATENTRY_ID IN (?UniqueID?) AND ATTRVALUE.LANGUAGE_ID IN ($CONTROL:LANGUAGES$) AND ATTRIBUTE.LANGUAGE_ID=ATTRVALUE.LANGUAGE_ID AND ATTRIBUTE.ATTRIBUTE_ID=ATTRVALUE.ATTRIBUTE_ID ORDER BY ATTRIBUTE.SEQUENCE <!-- the following query is optimized for workspaces --> cm sql= SELECT CATENTRY.$COLS:CATENTRY_ID$, ATTRIBUTE.$COLS:ATTRIBUTE$,ATTRVALUE.$COLS:ATTRVALUE$ FROM ( SELECT CATENTRY_ID FROM $CM:BASE$.CATENTRY WHERE NOT EXISTS ( SELECT '1' FROM $CM:WRITE$.CATENTRY WHERE $CM:BASE$.CATENTRY.CATENTRY_ID = $CM:WRITE$.CATENTRY.CATENTRY_ID) UNION ALL SELECT CATENTRY_ID FROM $CM:WRITE$.CATENTRY WHERE $CM:WRITE$.CATENTRY.CONTENT_STATUS <> 'D' ) CATENTRY, ( SELECT ATTRVALUE_ID,LANGUAGE_ID,CATENTRY_ID,ATTRIBUTE_ID,STRINGVALUE FROM $CM:BASE$.ATTRVALUE WHERE NOT EXISTS ( SELECT '1' FROM $CM:WRITE$.ATTRVALUE WHERE $CM:BASE$.ATTRVALUE.ATTRVALUE_ID = $CM:WRITE$.ATTRVALUE.ATTRVALUE_ID) AND $CM:BASE$.ATTRVALUE.LANGUAGE_ID IN ($CONTROL:LANGUAGES$) UNION ALL SELECT ATTRVALUE_ID,LANGUAGE_ID,CATENTRY_ID,ATTRIBUTE_ID,STRINGVALUE FROM $CM:WRITE$.ATTRVALUE WHERE $CM:WRITE$.ATTRVALUE.CONTENT_STATUS <> 'D' AND $CM:WRITE$.ATTRVALUE.LANGUAGE_ID IN ($CONTROL:LANGUAGES$) ) ATTRVALUE, ( SELECT ATTRIBUTE_ID,LANGUAGE_ID,CATENTRY_ID,SEQUENCE,NAME FROM $CM:BASE$.ATTRIBUTE WHERE NOT EXISTS ( SELECT '1' FROM $CM:WRITE$.ATTRIBUTE WHERE $CM:BASE$.ATTRIBUTE.ATTRIBUTE_ID = $CM:WRITE$.ATTRIBUTE.ATTRIBUTE_ID) UNION ALL SELECT ATTRIBUTE_ID,LANGUAGE_ID,CATENTRY_ID,SEQUENCE,NAME FROM $CM:WRITE$.ATTRIBUTE WHERE $CM:WRITE$.ATTRIBUTE.CONTENT_STATUS <> 'D' ) ATTRIBUTE WHERE ATTRVALUE.CATENTRY_ID = CATENTRY.CATENTRY_ID AND CATENTRY.CATENTRY_ID IN (?UniqueID?) AND ATTRIBUTE.LANGUAGE_ID=ATTRVALUE.LANGUAGE_ID AND ATTRIBUTE.ATTRIBUTE_ID=ATTRVALUE.ATTRIBUTE_ID ORDER BY ATTRIBUTE.SEQUENCE END_XPATH_TO_SQL_STATEMENT
Splitting queries
If applying these techniques did not provide the necessary performance improvements, you might consider splitting queries joining many tables into multiple queries. The objective of splitting the query is to minimize the number of views used in the join. If you are splitting a single-step query, we will need to convert it to a two-step query with one or more association queries. When splitting one of the association SQL statements for a two-step query, add additional association SQL statements to the access profile definition. When queries are split, you should use inner joins instead of outer joins whenever possible.
Be cautious when splitting queries. If you take this approach too extensively, it might result in many queries, each selecting from one table only. In this case, we will need to perform table joins in memory by writing Java code, for example, using a graph composer. In some cases, you will need to feed the result of one query into another. These types of queries should be considered as a last resort, as maintainability, customization and migration challenges can arise.
Use common expression syntax in Oracle
The Oracle optimizer typically generates very efficient queries, where a temporary table is created based on the result set of the original query joining views. A temporary table can be created using the common table expression syntax.
For example, the following query fetching product attribute values can run significantly faster after rewriting it using the common table expression syntax:
Standard query fetching product attributes:
BEGIN_ASSOCIATION_SQL_STATEMENT name=IBM_CatalogEntryAttributeValue base_table=CATENTRY sql = SELECT CATENTRY.$COLS:CATENTRY$, ATTRVALUE.$COLS:ATTRVALUE$, ATTRVALUE2.$COLS:ATTRVALUE$ FROM CATENTRY, ATTRVALUE JOIN ATTRIBUTE ON ATTRIBUTE.LANGUAGE_ID = ATTRVALUE.LANGUAGE_ID AND ATTRIBUTE.ATTRIBUTE_ID = ATTRVALUE.ATTRIBUTE_ID LEFT OUTER JOIN ATTRVALUE ATTRVALUE2 ON ATTRIBUTE.ATTRIBUTE_ID = ATTRVALUE2.ATTRIBUTE_ID AND ATTRVALUE2.CATENTRY_ID = 0 AND ATTRIBUTE.LANGUAGE_ID = ATTRVALUE2.LANGUAGE_ID WHERE CATENTRY.CATENTRY_ID IN ( $ENTITY_PKS$) AND ATTRVALUE.CATENTRY_ID = CATENTRY.CATENTRY_ID AND ATTRVALUE.LANGUAGE_ID IN ($CTX:LANG_ID$) END_ASSOCIATION_SQL_STATEMENTUsing common expression syntax:
BEGIN_ASSOCIATION_SQL_STATEMENT name=IBM_CatalogEntryAttributeValue base_table=CATENTRY sql = WITH TEMP_TABLE AS ( SELECT CATENTRY.CE_$COLS:CATENTRY$, ATTRVALUE.ATTR_$COLS:ATTRVALUE$, ATTRVALUE2.ATTR2_$COLS:ATTRVALUE$ FROM CATENTRY, ATTRVALUE JOIN ATTRIBUTE ON ATTRIBUTE.LANGUAGE_ID = ATTRVALUE.LANGUAGE_ID AND ATTRIBUTE.ATTRIBUTE_ID = ATTRVALUE.ATTRIBUTE_ID LEFT OUTER JOIN ATTRVALUE ATTRVALUE2 ON ATTRIBUTE.ATTRIBUTE_ID = ATTRVALUE2.ATTRIBUTE_ID AND ATTRVALUE2.CATENTRY_ID = 0 AND ATTRIBUTE.LANGUAGE_ID = ATTRVALUE2.LANGUAGE_ID WHERE CATENTRY.CATENTRY_ID IN ( $ENTITY_PKS$) AND ATTRVALUE.CATENTRY_ID = CATENTRY.CATENTRY_ID AND ATTRVALUE.LANGUAGE_ID IN ($CTX:LANG_ID$) ) SELECT * FROM TEMP_TABLE END_ASSOCIATION_SQL_STATEMENT
This technique enables the Oracle optimizer to push the necessary predicates into the views. As more data is filtered out in the early stages of query processing, subsequent joins will need to be applied to a much smaller data set. This often results in significant performance improvements.
Related concepts
Related tasks