Write-behind caching support

Use write-behind caching to reduce the overhead that occurs when updating a back-end database. Write-behind caching queues updates to the Loader plug-in.


Introduction

Write-behind caching asynchronously queues updates to the Loader plug-in. You can improve performance by decoupling updates, inserts, and removes for a map from the overhead of updating the back-end database. The asynchronous update is performed after a time-based delay (for example, five minutes) or an entry-based delay (for example, 1000 entries).

When you configure the write-behind setting on a backing map, a write-behind thread is created and wraps the configured loader. When an eXtreme Scale transaction inserts, updates, or removes an entry from an eXtreme Scale map, a LogElement object is created for each of these records. These elements are sent to the write-behind loader and queued in a special ObjectMap called a queue map. Each backing map with the write-behind setting enabled has its own queue maps. A write-behind thread periodically removes the queued data from the queue maps and pushes it to the real back-end loader.
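The queuing flow described above can be sketched in plain Java. This is a simplified illustration and not eXtreme Scale code: the names WriteBehindSketch, Op, and BackEnd are hypothetical, a LinkedHashMap stands in for the queue map, and the way two changes to the same key are merged is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the operation type carried by a LogElement.
enum Op { INSERT, UPDATE, DELETE, EVICT }

// Hypothetical stand-in for the real back-end loader.
interface BackEnd {
    void write(String key, Op op, String value);
}

// Simplified model of a write-behind loader: changes are queued per key
// and pushed to the back end later, in one batch.
class WriteBehindSketch {
    private static final class Pending {
        final Op op;
        final String value;
        Pending(Op op, String value) { this.op = op; this.value = value; }
    }

    private final Map<String, Pending> queueMap = new LinkedHashMap<>();
    private final BackEnd backEnd;

    WriteBehindSketch(BackEnd backEnd) { this.backEnd = backEnd; }

    // Called at commit time: queue the change instead of writing it through.
    synchronized void enqueue(String key, Op op, String value) {
        if (op == Op.EVICT) {
            return; // only insert, update, and delete reach the real loader
        }
        Pending previous = queueMap.get(key);
        if (previous != null && previous.op == Op.INSERT) {
            if (op == Op.DELETE) {
                queueMap.remove(key); // the back end never saw this key
                return;
            }
            op = Op.INSERT; // still new to the back end: insert the latest value
        }
        queueMap.put(key, new Pending(op, value));
    }

    synchronized int pendingCount() { return queueMap.size(); }

    // Called by the write-behind thread: drain the queue, then push the batch.
    void flush() {
        Map<String, Pending> batch;
        synchronized (this) {
            batch = new LinkedHashMap<>(queueMap);
            queueMap.clear();
        }
        for (Map.Entry<String, Pending> e : batch.entrySet()) {
            backEnd.write(e.getKey(), e.getValue().op, e.getValue().value);
        }
    }

    public static void main(String[] args) {
        List<String> written = new ArrayList<>();
        WriteBehindSketch wb =
                new WriteBehindSketch((k, op, v) -> written.add(op + ":" + k + "=" + v));
        wb.enqueue("book1", Op.INSERT, "draft");
        wb.enqueue("book1", Op.UPDATE, "final");
        wb.enqueue("book2", Op.EVICT, null);
        wb.flush();
        System.out.println(written);
    }
}
```

In this sketch the two changes to book1 reach the back end as a single insert of the latest value, and the EVICT entry is never forwarded.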

The write-behind loader will only send insert, update, and delete types of LogElement objects to the real loader. All other types of LogElement objects, for example, EVICT type, are ignored.

Write-behind support is an extension of the Loader plug-in, which you use to integrate eXtreme Scale with the database. For example, see the Configure JPA loaders information for how to configure a JPA loader.


Benefits

Enabling write-behind support has the following benefits:


ObjectGrid descriptor XML

When you configure eXtreme Scale with an ObjectGrid descriptor XML file, you enable the write-behind loader by setting the writeBehind attribute on the backingMap tag. An example follows:

<objectGrid name="library">
    <backingMap name="book" writeBehind="T300;C900" pluginCollectionRef="bookPlugins"/>
</objectGrid>

In the previous example, write-behind support of the "book" backing map is enabled with parameter "T300;C900".

The write-behind attribute specifies the maximum update time and/or a maximum key update count. The format of the write-behind parameter is:

write-behind attribute ::= <defaults> | <update time> | <update key count> | <update time> ";" <update key count>
update time ::= "T" <positive integer>
update key count ::= "C" <positive integer>
defaults ::= ""

Updates to the loader occur when one of the following events occurs:

  1. The maximum update time in seconds has elapsed since the last update.

  2. The number of updated keys in the queue map has reached the update key count.

These parameters are only hints. The actual update count and update time fall within a close range of the parameters, but they are not guaranteed to match the configured values exactly. Also, the first write-behind update could happen after up to twice the update time, because eXtreme Scale randomizes the update starting time so that all partitions do not hit the database simultaneously.

In the previous example T300;C900, the loader writes the data to the back end when 300 seconds have passed since the last update or when 900 keys are pending to be updated.

The default update time is 300 seconds and the default update key count is 1000.
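The attribute format and defaults described in this section can be made concrete with a small parser. The following is a minimal sketch, not part of the eXtreme Scale API: the class WriteBehindParam and its fields are hypothetical names, and the strict rejection of malformed values is an assumption about how a validator might behave.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that parses a writeBehind attribute value.
final class WriteBehindParam {
    static final int DEFAULT_UPDATE_TIME_SECONDS = 300;
    static final int DEFAULT_UPDATE_KEY_COUNT = 1000;

    // <update time> ";" <update key count> | <update time> | <update key count>
    private static final Pattern FORMAT =
            Pattern.compile("T(\\d+);C(\\d+)|T(\\d+)|C(\\d+)");

    final int updateTimeSeconds;
    final int updateKeyCount;

    private WriteBehindParam(int updateTimeSeconds, int updateKeyCount) {
        this.updateTimeSeconds = updateTimeSeconds;
        this.updateKeyCount = updateKeyCount;
    }

    static WriteBehindParam parse(String value) {
        if (value.isEmpty()) {
            // An empty string enables write-behind with both defaults.
            return new WriteBehindParam(DEFAULT_UPDATE_TIME_SECONDS, DEFAULT_UPDATE_KEY_COUNT);
        }
        Matcher m = FORMAT.matcher(value);
        if (!m.matches()) {
            throw new IllegalArgumentException("Invalid writeBehind value: " + value);
        }
        String time = m.group(1) != null ? m.group(1) : m.group(3);
        String count = m.group(2) != null ? m.group(2) : m.group(4);
        return new WriteBehindParam(
                time == null ? DEFAULT_UPDATE_TIME_SECONDS : positive(time),
                count == null ? DEFAULT_UPDATE_KEY_COUNT : positive(count));
    }

    private static int positive(String digits) {
        int n = Integer.parseInt(digits); // throws NumberFormatException on overflow
        if (n <= 0) {
            throw new IllegalArgumentException("Expected a positive integer: " + digits);
        }
        return n;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"T100", "C2000", "T300;C900", ""}) {
            WriteBehindParam p = parse(v);
            System.out.println("\"" + v + "\" -> time=" + p.updateTimeSeconds
                    + "s, count=" + p.updateKeyCount);
        }
    }
}
```

The sketch accepts only the orderings that the grammar allows, so a value such as C900;T300 is rejected instead of being silently reinterpreted.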

The table below lists some write-behind attribute examples.

If you configure the write-behind loader as an empty string: writeBehind="", the write-behind loader is enabled using the default values. Therefore, do not specify the writeBehind attribute if you do not want write-behind support enabled.

Table 1. Some write-behind options

Attribute value   Description
T100              The update time is 100 seconds, and the update key count is 1000 (the default value).
C2000             The update time is 300 seconds (the default value), and the update key count is 2000.
T300;C900         The update time is 300 seconds, and the update key count is 900.
""                The update time is 300 seconds (the default value), and the update key count is 1000 (the default value).


Programmatically enabling write-behind support

When you are creating a backing map programmatically for a local, in-memory eXtreme Scale, you can use the following method on the BackingMap interface to enable and disable write-behind support.

public void setWriteBehind(String writeBehindParam);

For more details about how to use the setWriteBehind method, see the BackingMap interface.


Application design considerations

Enabling write-behind support is simple, but designing an application to work with write-behind support needs careful consideration. Without write-behind support, the eXtreme Scale transaction encloses the back-end transaction. The eXtreme Scale transaction starts before the back-end transaction starts, and it ends after the back-end transaction ends.

With write-behind support enabled, the eXtreme Scale transaction finishes before the back-end transaction starts. The eXtreme Scale transaction and back-end transaction are decoupled.


Referential integrity constraints

Each backing map that is configured with write-behind support has its own write-behind thread to push the data to the back-end. Therefore, data that is updated in different maps in one eXtreme Scale transaction is written to the back-end in different back-end transactions. For example, transaction T1 updates key key1 in map Map1 and key key2 in map Map2. The key1 update to map Map1 is written to the back-end in one back-end transaction, and the key2 update to map Map2 is written to the back-end in another back-end transaction by a different write-behind thread. If the data stored in Map1 and Map2 is related, for example through foreign key constraints in the back-end, the updates might fail.

When designing the referential integrity constraints in the back-end database, ensure that out-of-order updates are allowed.


Failed updates

Because the eXtreme Scale transaction finishes before the back-end transaction starts, a transaction can appear to succeed even though its back-end update later fails. For example, suppose that an eXtreme Scale transaction inserts an entry that does not exist in the backing map but does exist in the back-end. The eXtreme Scale transaction succeeds. However, the transaction in which the write-behind thread inserts that object into the back-end fails with a duplicate key exception.
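The following plain Java sketch imitates this false success. It does not use eXtreme Scale: the class FalseSuccessSketch and its maps are hypothetical stand-ins for the backing map, the queue map, and a back-end table with a primary key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: the grid commit checks only the cache, and the
// back end enforces its primary key later, when the queue is flushed.
class FalseSuccessSketch {
    final Map<String, String> cache = new HashMap<>();
    final Map<String, String> queued = new LinkedHashMap<>();
    final Map<String, String> backEnd = new HashMap<>();

    // Grid transaction: succeeds whenever the key is not in the cache.
    boolean insert(String key, String value) {
        if (cache.containsKey(key)) {
            return false;
        }
        cache.put(key, value);
        queued.put(key, value);
        return true;
    }

    // Write-behind flush: returns the keys that the back end rejected
    // as duplicates.
    List<String> flush() {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, String> e : queued.entrySet()) {
            if (backEnd.containsKey(e.getKey())) {
                failed.add(e.getKey()); // duplicate key in the back end
            } else {
                backEnd.put(e.getKey(), e.getValue());
            }
        }
        queued.clear();
        return failed;
    }

    public static void main(String[] args) {
        FalseSuccessSketch s = new FalseSuccessSketch();
        s.backEnd.put("isbn-1", "row already in the database");
        System.out.println("grid commit succeeded: " + s.insert("isbn-1", "new book"));
        System.out.println("failed back-end inserts: " + s.flush());
    }
}
```

The grid-side insert reports success, and the failure only becomes visible when the queued change is flushed, which is why the application needs a separate path for handling failed updates.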

Refer to Handle failed write-behind updates for how to handle such failures.


Queue map locking behavior

Another major difference in transaction behavior is the locking behavior. eXtreme Scale supports three different locking strategies: PESSIMISTIC, OPTIMISTIC, and NONE. The write-behind queue maps use the pessimistic locking strategy no matter which lock strategy is configured for the backing map. Two different types of operations exist that acquire a lock on the queue map:

Because of this extra queue map behavior, you can see some locking behavior differences.


Loader transaction retries

WebSphere eXtreme Scale does not support 2-phase or XA transactions. The write-behind thread removes records from the queue map and updates the records to the back-end. If the server fails in the middle of the transaction, some back-end updates can be lost.

The write-behind loader automatically retries failed transactions and sends an in-doubt LogSequence to the back end to avoid data loss. This action requires the loader to be idempotent, which means that when the Loader.batchUpdate(TxId, LogSequence) method is called twice with the same value, it gives the same result as if it were applied one time. Loader implementations must implement the RetryableLoader interface to enable this feature. See the RetryableLoader interface.
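One way to think about the idempotency requirement is to write the batch so that replaying it leaves the back end unchanged. The sketch below shows that idea in plain Java with a HashMap as the table. It does not implement the Loader or RetryableLoader interfaces, and the names IdempotentBatchSketch, Type, and Change are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of an idempotent batch update against a table.
class IdempotentBatchSketch {
    enum Type { INSERT, UPDATE, DELETE }

    static final class Change {
        final Type type;
        final String key;
        final String value;
        Change(Type type, String key, String value) {
            this.type = type;
            this.key = key;
            this.value = value;
        }
    }

    private final Map<String, String> table = new HashMap<>();

    // Applies a batch so that a replay produces the same table contents.
    void applyBatch(List<Change> batch) {
        for (Change c : batch) {
            switch (c.type) {
                case INSERT:
                case UPDATE:
                    table.put(c.key, c.value); // upsert: a replayed insert does not fail
                    break;
                case DELETE:
                    table.remove(c.key); // removing a missing row is a no-op
                    break;
            }
        }
    }

    Map<String, String> snapshot() { return new HashMap<>(table); }

    public static void main(String[] args) {
        IdempotentBatchSketch store = new IdempotentBatchSketch();
        List<Change> batch = Arrays.asList(
                new Change(Type.INSERT, "k1", "v1"),
                new Change(Type.UPDATE, "k1", "v2"),
                new Change(Type.DELETE, "k2", null));
        store.applyBatch(batch);
        Map<String, String> once = store.snapshot();
        store.applyBatch(batch); // simulated retry of an in-doubt batch
        System.out.println("same result after retry: " + once.equals(store.snapshot()));
    }
}
```

With a relational back end, the same effect usually requires the equivalent of an upsert for inserts and a delete that tolerates a missing row. How that is expressed depends on the database.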


Loader failures

The loader plug-in can fail when it is unable to communicate to the database back end. This can happen if the database server or the network connection is down. The write-behind loader will queue the updates and try to push the data changes to the loader periodically. The loader must notify the WebSphere eXtreme Scale run time that a database connectivity problem exists by throwing a LoaderNotAvailableException exception.

Therefore, the Loader implementation should be able to distinguish between a data failure and a physical loader failure. A data failure should be thrown or re-thrown as a LoaderException or an OptimisticCollisionException exception, but a physical loader failure should be thrown or re-thrown as a LoaderNotAvailableException exception. WebSphere eXtreme Scale handles these two exceptions differently:

A common mistake is to throw a LoaderException when a LoaderNotAvailableException should be thrown. In that case, all the records queued in the write-behind loader become failed update records, which defeats the purpose of back-end failure isolation. This mistake is likely to happen if you write a generic loader to talk to databases.

The JPALoader that eXtreme Scale provides is one example. The JPALoader uses the JPA API to interact with database back ends. When the network fails, the JPALoader gets a javax.persistence.PersistenceException, but it does not know the cause of the failure unless the SQL state and SQL error code of the chained SQLException are checked. The fact that the JPALoader is designed to work with all types of databases further complicates the problem, because the SQL states and error codes for a network failure differ from one database to another.

To solve this, WebSphere eXtreme Scale provides an ExceptionMapper API that allows users to plug in an implementation that maps an exception to a more consumable exception. For example, users can map a generic javax.persistence.PersistenceException to a LoaderNotAvailableException if the SQL state or error code indicates that the network is down.
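The mapping logic itself can be sketched without the product libraries. The example below walks the cause chain for a java.sql.SQLException and treats SQLSTATE class 08 (connection exception in the SQL standard) as a connectivity failure. The class ExceptionMappingSketch and the BackEndUnavailable exception are hypothetical stand-ins; a real mapper would implement the ExceptionMapper API and return a LoaderNotAvailableException, and it would likely need vendor-specific states or error codes in addition to class 08.

```java
import java.sql.SQLException;

// Hypothetical sketch of mapping a generic persistence failure to an
// "unavailable" failure based on the chained SQLException.
class ExceptionMappingSketch {
    // Stand-in for LoaderNotAvailableException in this self-contained sketch.
    static class BackEndUnavailable extends RuntimeException {
        BackEndUnavailable(Throwable cause) { super(cause); }
    }

    // Returns a BackEndUnavailable when the cause chain contains a
    // SQLException with SQLSTATE class 08; otherwise returns the original.
    static Throwable map(Throwable original) {
        for (Throwable t = original; t != null; t = t.getCause()) {
            if (t instanceof SQLException) {
                String state = ((SQLException) t).getSQLState();
                if (state != null && state.startsWith("08")) {
                    return new BackEndUnavailable(original);
                }
            }
        }
        return original; // data failures are left untouched
    }

    public static void main(String[] args) {
        Throwable networkDown = new RuntimeException("persist failed",
                new SQLException("connection lost", "08001"));
        Throwable duplicate = new RuntimeException("persist failed",
                new SQLException("duplicate key", "23505"));
        System.out.println(map(networkDown).getClass().getSimpleName());
        System.out.println(map(duplicate) == duplicate);
    }
}
```

Keeping the original exception as the cause preserves the SQL state and error code for diagnostics after the mapping.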


Performance considerations

Write-behind caching support reduces response time by removing the loader update from the transaction. It also increases database throughput because database updates are combined. It is important to understand the overhead introduced by the write-behind thread, which pulls the data out of the queue map and pushes it to the loader.

The maximum update count and the maximum update time need to be adjusted based on the expected usage patterns and environment. If the value of the maximum update count or the maximum update time is too small, the overhead of the write-behind thread might exceed the benefits. Conversely, setting large values for these two parameters can increase the memory used for queuing the data and the length of time that the database records are stale.

For best performance, tune the write-behind parameters based on the following factors:

