Partial column updates for container managed persistence
Partial column updates for container managed persistence
Previously, the WebSphere Application Server implementation of the Container Managed Persistence (CMP) bean method ejbStore always stored all of the persistent attributes of the CMP bean to the database, even if only a subset of persistent attribute fields were changed. This needless performance degradation is eliminated in this release of the product.
For EJB 2.x CMP entity beans, you can use the partial update feature to specify how you want to update the persistent attributes of the CMP bean to the database. This feature is provided as a bean level persistence option, called PartialOperation, in the access intent policy configured for the bean. PartialOperation has two possible values:
NONE
Partial update is turned off. All of the persistent attributes of the CMP bean are stored to the database. This is the default value.
UPDATE_ONLY
That updates to the database occur only for the persistent attributes of the CMP bean that are changed.
Affects on performance
Performing partial updates increases performance in several ways:
by reducing query execution time, since only a subset of the columns are in the query. Improvement is higher for tables with many columns and indexes.
When the table has many indexes only the indexes affected by the updated columns need to be updated by the backend database.
by reducing network i/o since there is less data to be transmitted.
by saving any processing time for non-trivially mapped columns (if a column uses converters/composers/transformations), by partially injecting the input record.
by eliminating unnecessary firing of update triggers. If a CMP bean field is not changed, any trigger depending only on the corresponding column is not fired.
Although partial update improves performance in general, it can adversely affect performance too.
If you enable partial update for a bean for which your application modifies several different combinations of columns during the same time span, then the prepared statement cache maximum for the connection is reached very quickly.
As a result, statement handles are evicted from the cache based on least recent usage. This results in statements being prepared again and again, decreasing performance for all CMP functions (not just limited to ejbStore()).
Partial update query templates cached in the function set increase memory use. The increase is linear relative to the number of fields in the CMP bean for which the partial update access intent option is turned on.
The PartialOperation persistent option, when used in combination with the Batch Update persistent option, affects the performance of the batch update because now each partial query is different. There is an execution time cost incurred for generating a partial update query string dynamically. Since query fragments are stored for each column, the execution cost to assemble the query fragments is linear, based on the number of CMP bean fields dirtied.
There are condition checks for each CMP field (for example to inspect the dirty flags, to execute the preparedStatement setXXX() calls, and so on).
Considerations for using partial update
The performance gains you hope to achieve should be weighed against the possible instances where degradation can occur. You can use the following guide lines to help you make the decision.
Partial update might not benefit an application that only involves a small table (few columns) with simple data types and no update triggers.
The cost to assemble the partial query dynamically would probably outweigh the performance gain.
Partial update is a benefit if there is a complex data type that is not updated often. An example of a complex data type might be an employee bean with a “photo” CMP attribute mapped to a BLOB OR VARGRAPHIC or similar complex backend type that is typically stored in a different location in the database manager implementation.
Partial Update might benefit if there are several VARCHAR type columns and only a very few of them are updated typically.
It is better not to use the partial operation if the application can randomly be updating different combinations of columns and the number of assignable columns (non-key) is higher than five. This generates many different partial queries and fills up the prepared statement cache quickly. But if the bean does not have too many columns (four or less) and it has complex data types,
then you might consider turning partial update on, with the option of increasing the statement cache size to allow for the increased number of queries. For information on increasing the statement cache size, refer to Data source settings.
Partial Update is beneficial when there are update triggers needed on a subset of columns.
Partial Update is beneficial when the table has many columns and indexes and only a few indexes are touched by a typical update.
Restrictions
By default, batch update of update queries is disabled for all CMP
beans for which partial update is enabled. In other words, partial update takes precedence over batch update. Batch update of delete and insert queries is not affected.
Batch update performance is affected when both batch update and partial update persistence options are used on the same bean, because each partial query is different. You can use the JVM property -Dcom.ibm.ws.pm.grouppartialupdate=true
to group the similar partial update queries into a batch update. Grouping of partial updates only helps when there are several partial queries with the same shape in a transaction. Otherwise, grouping partial updates has the opposite affect on performance. Because this setting is not on a bean level basis, you should be careful when turning it on. Because this affects all beans that have both partial update and batch update on, make sure that batch update of partial queries does indeed increase performance when viewed across all the beans for which both updates are on.
So you should determine which situation gives the best performance for your application: batch update only or partial update only or both (with grouppartialupdate flag set to true).
To set the JVM property:
1. Open the server.xml file.
2. Change the value of -Dcom.ibm.ws.pm.grouppartialupdate=true to
-Dcom.ibm.ws.pm.grouppartialupdate=false.