Loader configuration
Implementing a loader requires configuring several attributes.
Preload considerations
Loaders are backing map plug-ins that are invoked when changes are made to the backing map or when the backing map is unable to satisfy a data request (a cache miss). For an overview of how eXtreme Scale interacts with a loader, see In-line caching scenarios.
Each backing map has a boolean preloadMode attribute that is set to indicate whether preload of a map executes asynchronously. By default, the preloadMode attribute is set to false, which indicates that backing map initialization does not complete until the preload of the map is complete; that is, backing map initialization is not complete until the preloadMap method returns. If the preloadMap method reads a large amount of data from its back end and loads it into the map, it might take a relatively long time to complete. In this case, you can configure a backing map to use asynchronous preload of the map by setting the preloadMode attribute to true. This setting causes the backing map initialization code to start a thread that invokes the preloadMap method, allowing initialization of a backing map to complete while the preload of the map is still in progress.
In a distributed eXtreme Scale scenario, one of the preload patterns is client preload. In the client preload pattern, an eXtreme Scale client retrieves data from the back end and then inserts the data into the distributed eXtreme Scale server using DataGrid agents. Furthermore, client preload can be executed in the Loader.preloadMap method of one and only one specific partition. In this case, loading the data into the grid asynchronously becomes very important. If the client preload were executed in the same thread, the backing map would never finish initializing, so the partition that hosts it would never become ONLINE. The eXtreme Scale client could then never send requests to the partition, and the preload would eventually fail with an exception.
If an eXtreme Scale client is used in the preloadMap method, you should set the preloadMode attribute to true. The alternative is to start a thread in the client preload code.
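For illustration, a minimal sketch of the thread-based alternative follows. The class name EmployeePreloadLoader and the loadFromBackEnd helper are hypothetical, not part of the product API; only the Loader interface methods are real:

import java.util.List;
import com.ibm.websphere.objectgrid.BackingMap;
import com.ibm.websphere.objectgrid.Session;
import com.ibm.websphere.objectgrid.TxID;
import com.ibm.websphere.objectgrid.plugins.Loader;
import com.ibm.websphere.objectgrid.plugins.LoaderException;
import com.ibm.websphere.objectgrid.plugins.LogSequence;

public class EmployeePreloadLoader implements Loader {

    // preloadMap runs during backing map initialization. Starting a separate
    // thread here has the same effect as preloadMode="true": initialization
    // can finish while the data is still being loaded.
    public void preloadMap(final Session session, final BackingMap backingMap)
            throws LoaderException {
        Thread preloader = new Thread(new Runnable() {
            public void run() {
                // loadFromBackEnd is a hypothetical helper that reads data
                // from the back end and inserts it into the grid, for example
                // with an eXtreme Scale client and DataGrid agents.
                loadFromBackEnd(session, backingMap);
            }
        }, "preload-" + backingMap.getName());
        preloader.setDaemon(true);
        preloader.start();
    }

    private void loadFromBackEnd(Session session, BackingMap backingMap) {
        // hypothetical preload logic omitted from this sketch
    }

    public List get(TxID txid, List keyList, boolean forUpdate) throws LoaderException {
        return null; // omitted from this sketch
    }

    public void batchUpdate(TxID txid, LogSequence sequence) throws LoaderException {
        // omitted from this sketch
    }
}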
The following snippet of code illustrates how the preloadMode attribute is set to enable asynchronous preload:
BackingMap bm = og.defineMap("map1");
bm.setPreloadMode(true);
The preloadMode attribute can also be set by using an XML file, as illustrated in the following example:
<backingMap name="map1" preloadMode="true" pluginCollectionRef="map1" lockStrategy="OPTIMISTIC" />
TxID and use of the TransactionCallback interface
Both the get and batchUpdate methods on the Loader interface are passed a TxID object that represents the Session transaction that requires the get or batchUpdate operation to be performed. It is possible that the get and batchUpdate methods are called more than once per transaction. Therefore, transaction-scoped objects that the Loader needs are typically kept in a slot of the TxID object. A Java™ database connectivity (JDBC) Loader is used to illustrate how a Loader uses the TxID and TransactionCallback interfaces.
Several ObjectGrid maps might also be stored in the same database. Each map has its own Loader, and each Loader might need to connect to the same database. When connecting to the same database, each Loader should use the same JDBC connection so that the changes to each table are committed as part of the same database transaction. Typically, the same person who writes the Loader implementation also writes the TransactionCallback implementation. The best approach is to extend the TransactionCallback interface with the methods that the Loader needs for getting a database connection and for caching prepared statements. The reason for this approach becomes apparent as you see how the TransactionCallback and TxID interfaces are used by the Loader.
As an example, the loader might need the TransactionCallback interface to be extended as follows:
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collection;
import com.ibm.websphere.objectgrid.TxID;
import com.ibm.websphere.objectgrid.plugins.TransactionCallback;

public interface MyTransactionCallback extends TransactionCallback {
    Connection getAutoCommitConnection(TxID tx, String databaseName) throws SQLException;
    Connection getConnection(TxID tx, String databaseName, int isolationLevel) throws SQLException;
    PreparedStatement getPreparedStatement(TxID tx, Connection conn, String tableName, String sql) throws SQLException;
    Collection getPreparedStatementCollection(TxID tx, Connection conn, String tableName);
}
Using these new methods, the Loader get and batchUpdate methods can get a connection as follows:
import java.sql.Connection;
import java.sql.SQLException;
import com.ibm.websphere.objectgrid.TxID;

private Connection getConnection(TxID tx, int isolationLevel) throws SQLException {
    Connection conn = ivTcb.getConnection(tx, databaseName, isolationLevel);
    return conn;
}
In the previous example and in the examples that follow, ivTcb and ivOcb are Loader instance variables that were initialized as described in the Preload considerations section. The ivTcb variable is a reference to the MyTransactionCallback instance and the ivOcb is a reference to the MyOptimisticCallback instance. The databaseName variable is an instance variable of the Loader that was set as a Loader property during the initialization of the backing map. The isolationLevel argument is one of the JDBC Connection constants that are defined for the various isolation levels that JDBC supports. If the Loader is using an optimistic implementation, the get method typically uses a JDBC auto-commit connection to fetch the data from the database. In that case, the Loader might have a getAutoCommitConnection method that is implemented as follows:
import java.sql.Connection;
import java.sql.SQLException;
import com.ibm.websphere.objectgrid.TxID;

private Connection getAutoCommitConnection(TxID tx) throws SQLException {
    Connection conn = ivTcb.getAutoCommitConnection(tx, databaseName);
    return conn;
}
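How ivTcb and ivOcb come to be initialized is implementation-specific. The following sketch assumes that the configured plug-ins can be retrieved through getTransactionCallback on the ObjectGrid and getOptimisticCallback on the BackingMap; if your version does not expose these getters, the instances can instead be supplied to the Loader as properties:

public void preloadMap(Session session, BackingMap backingMap) throws LoaderException {
    // Cache the TransactionCallback plugged into the grid and the
    // OptimisticCallback plugged into this map for later use by the
    // get and batchUpdate methods (getter availability is an assumption).
    ivTcb = (MyTransactionCallback) session.getObjectGrid().getTransactionCallback();
    ivOcb = (MyOptimisticCallback) backingMap.getOptimisticCallback();
}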
Recall that the batchUpdate method has the following switch statement:
switch (logElement.getType().getCode()) {
    case LogElement.CODE_INSERT:
        buildBatchSQLInsert(tx, key, value, conn);
        break;
    case LogElement.CODE_UPDATE:
        buildBatchSQLUpdate(tx, key, value, conn);
        break;
    case LogElement.CODE_DELETE:
        buildBatchSQLDelete(tx, key, conn);
        break;
}
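For context, a sketch of the batchUpdate loop that surrounds this switch statement might look like the following; the getConnection helper is the one shown earlier, and the isolation level is an assumption:

import java.sql.Connection;
import java.sql.SQLException;
import java.util.Iterator;
import com.ibm.websphere.objectgrid.TxID;
import com.ibm.websphere.objectgrid.plugins.LoaderException;
import com.ibm.websphere.objectgrid.plugins.LogElement;
import com.ibm.websphere.objectgrid.plugins.LogSequence;

public void batchUpdate(TxID tx, LogSequence sequence) throws LoaderException {
    try {
        Connection conn = getConnection(tx, Connection.TRANSACTION_READ_COMMITTED);
        // Iterate over every change that the transaction made to this map.
        Iterator iter = sequence.getPendingChanges();
        while (iter.hasNext()) {
            LogElement logElement = (LogElement) iter.next();
            Object key = logElement.getKey();
            Object value = logElement.getCurrentValue();
            // The switch statement shown above dispatches on
            // logElement.getType().getCode() to build each batched statement
            // from key and value.
        }
        // After the loop, execute the batched statements; see the
        // getPreparedStatementCollection method that follows.
    } catch (SQLException e) {
        throw new LoaderException("batchUpdate failed", e);
    }
}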
Each of the buildBatchSQL methods uses the MyTransactionCallback interface to get a prepared statement. The following snippet of code shows the buildBatchSQLUpdate method building an SQL update statement for updating an EmployeeRecord entry and adding it to the batch update:
private void buildBatchSQLUpdate(TxID tx, Object key, Object value, Connection conn)
        throws SQLException, LoaderException {
    String sql = "update EMPLOYEE set LASTNAME = ?, FIRSTNAME = ?, DEPTNO = ?, SEQNO = ?, MGRNO = ? where EMPNO = ?";
    PreparedStatement sqlUpdate = ivTcb.getPreparedStatement(tx, conn, "employee", sql);
    EmployeeRecord emp = (EmployeeRecord) value;
    sqlUpdate.setString(1, emp.getLastName());
    sqlUpdate.setString(2, emp.getFirstName());
    sqlUpdate.setString(3, emp.getDepartmentName());
    sqlUpdate.setLong(4, emp.getSequenceNumber());
    sqlUpdate.setInt(5, emp.getManagerNumber());
    sqlUpdate.setInt(6, ((Integer) key).intValue());
    sqlUpdate.addBatch();
}
After the batchUpdate loop has built all of the prepared statements, it calls the getPreparedStatementCollection method. This method is implemented as follows:
private Collection getPreparedStatementCollection(TxID tx, Connection conn) {
    return ivTcb.getPreparedStatementCollection(tx, conn, "employee");
}
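Hypothetically, the tail of the batchUpdate method might then execute each statement in the returned collection, along these lines:

// Execute every prepared statement that the build loop populated.
Collection statements = getPreparedStatementCollection(tx, conn);
Iterator stmtIter = statements.iterator();
while (stmtIter.hasNext()) {
    PreparedStatement stmt = (PreparedStatement) stmtIter.next();
    // Each entry of the returned array reports how many rows one batched
    // operation affected; see the optimistic checking discussion below.
    int[] counts = stmt.executeBatch();
}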
When the application invokes the commit method on the Session, the Session code calls the commit method on the TransactionCallback object after it has pushed all the changes made by the transaction out to the Loader of each map that the transaction changed. Because all of the Loaders used the MyTransactionCallback interface to get any connections and prepared statements they needed, the TransactionCallback object knows which connection to use to request that the back end commit the changes. Extending the TransactionCallback interface with methods that are needed by each of the Loaders has the following advantages:
- The TransactionCallback object encapsulates the use of TxID slots for transaction-scoped data, so the Loader does not require information about the TxID slots. The Loader only needs to know about the methods that the MyTransactionCallback interface adds to TransactionCallback for the supporting functions that the Loader needs (see the sketch after this list).
- The TransactionCallback object can ensure that connection sharing occurs between each Loader that connects to the same back end so that a two-phase commit protocol can be avoided.
- The TransactionCallback object can ensure that work on the back end connection is driven to completion through a commit or rollback invoked on the connection when appropriate.
- TransactionCallback ensures that the cleanup of database resources occurs when a transaction completes.
- TransactionCallback hides whether it is obtaining a managed connection from a managed environment such as WebSphere Application Server or some other Java 2 Platform, Enterprise Edition (J2EE) compliant application server. This advantage allows the same Loader code to be used in both managed and unmanaged environments. Only the TransactionCallback plug-in must be changed.
- For detailed information about how the TransactionCallback implementation uses the TxID slots for transaction-scoped data, see TransactionCallback plug-in.
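To illustrate the first advantage, the following abridged sketch shows how a MyTransactionCallback implementation might cache the JDBC connection in a TxID slot. The class name, the ivUrl value, and the omitted begin/commit/rollback details are assumptions; reserveSlot, getSlot, and putSlot are the documented TxID slot mechanism:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import com.ibm.websphere.objectgrid.ObjectGrid;
import com.ibm.websphere.objectgrid.TxID;
import com.ibm.websphere.objectgrid.plugins.TransactionCallbackException;

public abstract class MyTransactionCallbackImpl implements MyTransactionCallback {
    private int ivSlot; // index of the TxID slot that caches the connection
    private String ivUrl = "jdbc:db2:SAMPLE"; // hypothetical database URL

    public void initialize(ObjectGrid objectGrid) throws TransactionCallbackException {
        // Reserve a slot in every TxID for this plug-in's transaction-scoped data.
        ivSlot = objectGrid.reserveSlot(TxID.SLOT_NAME);
    }

    public Connection getConnection(TxID tx, String databaseName, int isolationLevel)
            throws SQLException {
        // Reuse the connection already cached for this transaction, if any, so
        // that every Loader in the transaction shares one database transaction.
        Connection conn = (Connection) tx.getSlot(ivSlot);
        if (conn == null) {
            conn = DriverManager.getConnection(ivUrl);
            conn.setAutoCommit(false);
            conn.setTransactionIsolation(isolationLevel);
            tx.putSlot(ivSlot, conn);
        }
        return conn;
    }

    // begin, commit, rollback, and the remaining MyTransactionCallback methods
    // are omitted from this sketch; commit and rollback would retrieve the
    // connection from the slot, complete the transaction, and clean up.
}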
OptimisticCallback
As mentioned earlier, the Loader might use an optimistic approach for concurrency control. In this case, the buildBatchSQLUpdate method example must be modified slightly to implement an optimistic approach. Several ways exist to implement an optimistic approach. A typical way is to have either a timestamp column or a sequence number counter column that versions each update of the row. Assume that the employee table has a sequence number column that increments each time the row is updated. You then modify the signature of the buildBatchSQLUpdate method so that it is passed the LogElement object instead of the key and value pair. The method also needs to use the OptimisticCallback object that is plugged into the backing map both for getting the initial version object and for updating the version object. The following is an example of a modified buildBatchSQLUpdate method that uses the ivOcb instance variable that was initialized as described in the preloadMap section:
private void buildBatchSQLUpdate(TxID tx, LogElement le, Connection conn)
        throws SQLException, LoaderException {
    // Get the initial version object from when this map entry was last read
    // or updated in the database.
    Employee emp = (Employee) le.getCurrentValue();
    long initialVersion = ((Long) le.getVersionedValue()).longValue();

    // Get the version object from the updated Employee for the SQL update
    // operation.
    Long currentVersion = (Long) ivOcb.getVersionedObjectForValue(emp);
    long nextVersion = currentVersion.longValue();

    // Now build an SQL update that includes the version object in the where
    // clause for optimistic checking.
    String sql = "update EMPLOYEE set LASTNAME = ?, FIRSTNAME = ?, DEPTNO = ?, SEQNO = ?, MGRNO = ? where EMPNO = ? and SEQNO = ?";
    PreparedStatement sqlUpdate = ivTcb.getPreparedStatement(tx, conn, "employee", sql);
    sqlUpdate.setString(1, emp.getLastName());
    sqlUpdate.setString(2, emp.getFirstName());
    sqlUpdate.setString(3, emp.getDepartmentName());
    sqlUpdate.setLong(4, nextVersion);
    sqlUpdate.setInt(5, emp.getManagerNumber());
    sqlUpdate.setInt(6, ((Integer) le.getKey()).intValue());
    sqlUpdate.setLong(7, initialVersion);
    sqlUpdate.addBatch();
}
The example shows that the LogElement is used to obtain the initial version value. When the transaction first accesses the map entry, a LogElement is created with the initial Employee object that is obtained from the map. The initial Employee object is also passed to the getVersionedObjectForValue method on the OptimisticCallback interface and the result is saved in the LogElement. This processing occurs before an application is given a reference to the initial Employee object and has a chance to call some method that changes the state of the initial Employee object.
The example shows that the Loader uses the getVersionedObjectForValue method to obtain the version object for the current updated Employee object. Before calling the batchUpdate method on the Loader interface, eXtreme Scale calls the updateVersionedObjectForValue method on the OptimisticCallback interface to cause a new version object to be generated for the updated Employee object. After the batchUpdate method returns to the ObjectGrid, the LogElement is updated with the current version object, which becomes the new initial version object. This step is necessary because the application might have called the flush method on the map instead of the commit method on the Session. It is possible for the Loader to be called multiple times by a single transaction for the same key. For that reason, eXtreme Scale ensures that the LogElement is updated with the new version object each time the row is updated in the employee table.
Now that the Loader has both the initial version object and the next version object, it can run an SQL update statement that sets the SEQNO column to the next version object value and uses the initial version object value in the where clause. This approach is sometimes referred to as an overqualified update statement. The overqualified update statement allows the relational database to verify that the row was not changed by some other transaction between the time that this transaction read the data from the database and the time that this transaction updates the database. If another transaction modified the row, the count array that is returned by the batch update indicates that zero rows were updated for this key. The Loader is responsible for verifying that the SQL update operation did update the row. If it did not, the Loader throws a com.ibm.websphere.objectgrid.plugins.OptimisticCollisionException exception to inform the Session that the batchUpdate method failed because more than one concurrent transaction tried to update the same row in the database table. This exception causes the Session to roll back, and the application must retry the entire transaction. The rationale is that the retry is likely to be successful, which is why this approach is called optimistic. The optimistic approach performs better when data is infrequently changed or concurrent transactions rarely try to update the same row.
It is important for the Loader to use the key parameter of the OptimisticCollisionException constructor to identify which key or set of keys caused the optimistic batchUpdate method to fail. The key parameter can be either the key object itself or an array of key objects if more than one key resulted in optimistic update failure. eXtreme Scale uses the getKey method of the OptimisticCollisionException instance to determine which map entries contain stale data and caused the exception. Part of the rollback processing is to evict each stale map entry from the map. Evicting stale entries is necessary so that any subsequent transaction that accesses the same key or keys results in the get method of the Loader interface being called to refresh the map entries with the current data from the database.
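As a hedged illustration, the stale-row check might look like the following fragment. The batchedKeys list (keys remembered in batch order) is a hypothetical helper, and the constructor arguments other than the key parameter are assumptions to be verified against the OptimisticCollisionException javadoc:

// counts[i] == 0 means another transaction changed row i since it was read.
int[] counts = sqlUpdate.executeBatch();
List failedKeys = new ArrayList();
for (int i = 0; i < counts.length; i++) {
    if (counts[i] == 0) {
        failedKeys.add(batchedKeys.get(i)); // batchedKeys is a hypothetical list
    }
}
if (!failedKeys.isEmpty()) {
    // The key parameter can be a single key or an array of keys.
    throw new OptimisticCollisionException("stale data detected",
            objectGridName, "employee", failedKeys.toArray());
}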
Other ways for a Loader to implement an optimistic approach include:
- No timestamp or sequence number column exists. In this case, the getVersionedObjectForValue method on the OptimisticCallback interface simply returns the value object itself as the version. With this approach, the Loader needs to build a where clause that includes each of the fields of the initial version object. This approach is not efficient, and not all column types are eligible to be used in the where clause of an overqualified SQL update statement. This approach is typically not used.
- No timestamp or sequence number column exists. However, unlike the prior approach, the where clause contains only the value fields that were modified by the transaction. One way to detect which fields are modified is to set the copy mode on the backing map to CopyMode.COPY_ON_WRITE. This copy mode requires that a value interface be passed to the setCopyMode method on the BackingMap interface. The BackingMap creates dynamic proxy objects that implement the provided value interface. With this copy mode, the Loader can cast each value to a com.ibm.websphere.objectgrid.plugins.ValueProxyInfo object. The ValueProxyInfo interface has a method that allows the Loader to obtain the List of attribute names that were changed by the transaction. This method enables the Loader to call the get methods on the value interface for the attribute names to obtain the changed data and to build an SQL update statement that sets only the changed attributes. The where clause can then be built from the primary key column plus each of the changed attribute columns. This approach is more efficient than the prior approach, but it requires more code in the Loader and can require a larger prepared statement cache to handle the different permutations. However, if transactions typically modify only a few of the attributes, this limitation might not be a problem. A sketch of this approach follows the list.
- Some relational databases might have an API to assist in automatically maintaining column data that is useful for optimistic versioning. Consult the database documentation to determine if this possibility exists.
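The following fragment sketches the second approach in the preceding list. EmployeeIF is a hypothetical value interface, and the ibmGetDirtyAttributes method name should be verified against the ValueProxyInfo javadoc for your release:

// Configure the map for COPY_ON_WRITE; the value interface is required so
// that the BackingMap can create dynamic proxies for map values.
BackingMap bm = og.defineMap("map1");
bm.setCopyMode(CopyMode.COPY_ON_WRITE, EmployeeIF.class);

// Inside the Loader, each value can be cast to ValueProxyInfo to discover
// which attributes the transaction actually changed.
ValueProxyInfo proxyInfo = (ValueProxyInfo) logElement.getCurrentValue();
List dirtyAttributes = proxyInfo.ibmGetDirtyAttributes();
// Build an SQL update statement that sets only the columns that correspond
// to the attribute names in dirtyAttributes.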