Write-behind caching support

Use write-behind caching to reduce the overhead that occurs when updating a back-end database. Write-behind caching queues updates to the Loader plug-in.

Introduction

Write-behind caching asynchronously queues updates to the Loader plug-in. You can improve performance by decoupling updates, inserts, and removes for a map from the overhead of updating the back-end database. The asynchronous update is performed after a time-based delay (for example, five minutes) or an entry-based delay (for example, 1000 entries).
When you configure the write-behind setting on a backing map, a write-behind thread is created and wraps the configured loader. When an eXtreme Scale transaction inserts, updates, or removes an entry from an eXtreme Scale map, a LogElement object is created for each of these records. These elements are sent to the write-behind loader and queued in a special ObjectMap called a queue map. Each backing map with the write-behind setting enabled has its own queue maps. A write-behind thread periodically removes the queued data from the queue maps and pushes it to the real back-end loader.
The write-behind loader will only send insert, update, and delete types of LogElement objects to the real loader. All other types of LogElement objects, for example, EVICT type, are ignored.
Write-behind support is an extension of the Loader plug-in, which you use to integrate eXtreme Scale with the database. For example, see Configure JPA loaders for information about configuring a JPA loader.

Benefits
Enabling write-behind support has the following benefits:

Back-end failure isolation: Write-behind caching provides an isolation layer from back-end failures. When the back-end database fails, updates are queued in the queue map. The applications can continue driving transactions to eXtreme Scale. When the back-end recovers, the data in the queue map is pushed to the back-end.

Reduced back-end load: The write-behind loader merges the updates on a key basis so only one merged update per key exists in the queue map. This merge decreases the number of updates to the back-end.

Improved transaction performance: Individual eXtreme Scale transaction times are reduced because the transaction does not need to wait for the data to be synchronized with the back-end.
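
The key-based merge described under "Reduced back-end load" can be pictured with a minimal sketch. The class and method names below (PendingUpdateQueue, enqueue, drain) are hypothetical and are not part of the eXtreme Scale API. The merge rule is also simplified to "the latest pending change for a key replaces the earlier one"; the actual merge rules of the product might differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not an eXtreme Scale class.
class PendingUpdateQueue {
    // At most one pending value per key; insertion order is kept for draining.
    private final Map<String, String> pending = new LinkedHashMap<>();

    // Assumption: a later change to the same key replaces the earlier one.
    void enqueue(String key, String value) {
        pending.put(key, value);
    }

    // Number of distinct keys that are waiting to be written.
    int size() {
        return pending.size();
    }

    // Returns a snapshot of the merged updates and clears the queue.
    Map<String, String> drain() {
        Map<String, String> batch = new LinkedHashMap<>(pending);
        pending.clear();
        return batch;
    }
}
```

In this sketch, three changes that touch two keys leave two pending entries, so only two writes would reach the back-end.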

ObjectGrid descriptor XML
When you configure an ObjectGrid by using an ObjectGrid descriptor XML file, the write-behind loader is enabled by setting the writeBehind attribute on the backingMap tag. An example follows:
<objectGrid name="library">
    <backingMap name="book" writeBehind="T300;C900" pluginCollectionRef="bookPlugins"/>
</objectGrid>
In the previous example, write-behind support of the "book" backing map is enabled with parameter "T300;C900".
The write-behind attribute specifies the maximum update time, the maximum update key count, or both. The format of the write-behind parameter is:
write-behind attribute ::= <defaults> | <update time> | <update key count> | <update time> ";" <update key count>
update time ::= "T" <positive integer>
update key count ::= "C" <positive integer>
defaults ::= ""
Updates to the loader occur when one of the following events occurs:

The maximum update time in seconds has elapsed since the last update.

The number of updated keys in the queue map has reached the update key count.

These parameters are only hints. The real update count and update time will be within close range of the parameters. However, there is no guarantee that the actual update count or update time is the same as defined in the parameters. Also, the first write-behind update could happen after up to twice as long as the update time. This is because eXtreme Scale randomizes the update starting time so that all partitions do not hit the database simultaneously.
In the previous example T300;C900, the loader writes the data to the back end when 300 seconds have passed since the last update or when 900 keys are pending to be updated.
The default update time is 300 seconds and the default update key count is 1000.
If you configure the write-behind loader as an empty string: writeBehind="", the write-behind loader is enabled using the default values. Therefore, do not specify the writeBehind attribute if you do not want write-behind support enabled.
The table below lists some write-behind attribute examples.

Table 1. Some write-behind options
Attribute value    Description
T100               The update time is 100 seconds, and the update key count is 1000 (the default value).
C2000              The update time is 300 seconds (the default value), and the update key count is 2000.
T300;C900          The update time is 300 seconds, and the update key count is 900.
""                 The update time is 300 seconds (the default value), and the update key count is 1000 (the default value).
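
To make the attribute format and its defaults concrete, the following minimal sketch parses a writeBehind value and evaluates the two trigger conditions. The class WriteBehindSetting and its methods are hypothetical helpers written for this illustration; they are not part of the eXtreme Scale API, and the product treats both values as hints rather than exact thresholds.

```java
// Hypothetical helper for illustration; not part of the eXtreme Scale API.
final class WriteBehindSetting {
    static final int DEFAULT_UPDATE_TIME_SECONDS = 300;
    static final int DEFAULT_UPDATE_KEY_COUNT = 1000;

    final int updateTimeSeconds;
    final int updateKeyCount;

    private WriteBehindSetting(int updateTimeSeconds, int updateKeyCount) {
        this.updateTimeSeconds = updateTimeSeconds;
        this.updateKeyCount = updateKeyCount;
    }

    // Accepts "", "T<n>", "C<n>", or "T<n>;C<n>", following the grammar in this topic.
    static WriteBehindSetting parse(String attribute) {
        int time = DEFAULT_UPDATE_TIME_SECONDS;
        int count = DEFAULT_UPDATE_KEY_COUNT;
        if (attribute.isEmpty()) {
            return new WriteBehindSetting(time, count);
        }
        String[] parts = attribute.split(";", -1);
        if (parts.length == 1 && parts[0].startsWith("T")) {
            time = positive(parts[0]);
        } else if (parts.length == 1 && parts[0].startsWith("C")) {
            count = positive(parts[0]);
        } else if (parts.length == 2 && parts[0].startsWith("T") && parts[1].startsWith("C")) {
            time = positive(parts[0]);
            count = positive(parts[1]);
        } else {
            throw new IllegalArgumentException("Invalid write-behind attribute: " + attribute);
        }
        return new WriteBehindSetting(time, count);
    }

    // Reads the positive integer that follows the one-letter prefix.
    private static int positive(String part) {
        String digits = part.substring(1);
        if (!digits.matches("[0-9]+")) {
            throw new IllegalArgumentException("Expected a positive integer in: " + part);
        }
        int value = Integer.parseInt(digits);
        if (value <= 0) {
            throw new IllegalArgumentException("Expected a positive integer in: " + part);
        }
        return value;
    }

    // True when either trigger is met (simplified; the real thresholds are only hints).
    boolean shouldFlush(long secondsSinceLastUpdate, int pendingKeys) {
        return secondsSinceLastUpdate >= updateTimeSeconds || pendingKeys >= updateKeyCount;
    }
}
```

For example, WriteBehindSetting.parse("C2000") keeps the default update time of 300 seconds and sets the update key count to 2000.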

Programmatically enabling write-behind support

When you are creating a backing map programmatically for a local, in-memory eXtreme Scale, you can use the following method on the BackingMap interface to enable and disable write-behind support.
public void setWriteBehind(String writeBehindParam);
For more details about how to use the setWriteBehind method, see BackingMap interface.

Application design considerations

Enabling write-behind support is simple, but designing an application to work with write-behind support needs careful consideration. Without write-behind support, the eXtreme Scale transaction encloses the back-end transaction. The eXtreme Scale transaction starts before the back-end transaction starts, and it ends after the back-end transaction ends.
With write-behind support enabled, the eXtreme Scale transaction finishes before the back-end transaction starts. The eXtreme Scale transaction and back-end transaction are decoupled.

Referential integrity constraints

Each backing map that is configured with write-behind support has its own write-behind thread to push the data to the back-end. Therefore, data that is updated in different maps in one eXtreme Scale transaction is written to the back-end in different back-end transactions. For example, transaction T1 updates key key1 in map Map1 and key key2 in map Map2. The key1 update to map Map1 is written to the back-end in one back-end transaction, and the key2 update to map Map2 is written to the back-end in another back-end transaction by a different write-behind thread. If the data stored in Map1 and Map2 are related, for example by foreign key constraints in the back-end, the updates might fail.
When designing the referential integrity constraints in the back-end database, ensure that out-of-order updates are allowed.

Failed updates

Because the eXtreme Scale transaction finishes before the back-end transaction starts, a transaction can appear to succeed even though its back-end update later fails. For example, if an eXtreme Scale transaction inserts an entry that does not exist in the backing map but does exist in the back-end, the eXtreme Scale transaction succeeds. However, the back-end transaction in which the write-behind thread inserts that object fails with a duplicate key exception.
Refer to Handle failed write-behind updates for how to handle such failures.

Queue map locking behavior

Another major transaction behavior difference is the locking behavior. eXtreme Scale supports three different locking strategies: PESSIMISTIC, OPTIMISTIC, and NONE. The write-behind queue maps use the pessimistic locking strategy no matter which lock strategy is configured for the backing map. Two different types of operations exist that acquire a lock on the queue map:

When an eXtreme Scale transaction commits, or a flush (map flush or session flush) happens, the transaction reads the key in the queue map and places an S lock on the key.

When an eXtreme Scale transaction commits, the transaction tries to upgrade the S lock to X lock on the key.

Because of this extra queue map behavior, you can see some locking behavior differences.

If the user map is configured with the PESSIMISTIC locking strategy, there is little difference in locking behavior. Every time a flush or commit is called, an S lock is placed on the same key in the queue map. During the commit time, an X lock is acquired not only for the key in the user map, but also for the key in the queue map.

If the user map is configured with the OPTIMISTIC or NONE locking strategy, the user transaction follows the PESSIMISTIC locking strategy pattern for the queue map. Every time a flush or commit is called, an S lock is acquired for the same key in the queue map. During the commit time, an X lock is acquired for the key in the queue map using the same transaction.

Loader transaction retries

WebSphere eXtreme Scale does not support two-phase commit or XA transactions. The write-behind thread removes records from the queue map and updates the records to the back-end. If the server fails in the middle of the transaction, some back-end updates can be lost.
The write-behind loader automatically retries writing failed transactions and sends an in-doubt LogSequence to the back end to avoid data loss. This action requires the loader to be idempotent, which means that when the Loader.batchUpdate(TxId, LogSequence) method is called twice with the same value, it gives the same result as if it were applied one time. Loader implementations must implement the RetryableLoader interface to enable this feature. See the RetryableLoader interface.
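
The idempotency requirement can be illustrated with a small, self-contained sketch. The types below (IdempotentStoreSketch, Change, Type) are hypothetical stand-ins for illustration and are not the eXtreme Scale Loader, LogSequence, or LogElement types. The sketch assumes that an in-memory map plays the role of the back-end table, that inserts and updates are applied as an "upsert", and that deletes ignore missing keys, so replaying the same batch leaves the table unchanged.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration; not eXtreme Scale types.
class IdempotentStoreSketch {
    enum Type { INSERT, UPDATE, DELETE }

    static final class Change {
        final Type type;
        final String key;
        final String value;

        Change(Type type, String key, String value) {
            this.type = type;
            this.key = key;
            this.value = value;
        }
    }

    // Stand-in for the back-end table.
    final Map<String, String> table = new HashMap<>();

    // Applies a batch so that applying it a second time changes nothing:
    // inserts and updates overwrite the row, deletes ignore missing rows.
    void applyBatch(List<Change> batch) {
        for (Change change : batch) {
            if (change.type == Type.DELETE) {
                table.remove(change.key);
            } else {
                table.put(change.key, change.value);
            }
        }
    }
}
```

With a plain SQL INSERT in place of the upsert, the second application of the same batch would fail with a duplicate key, which is the kind of behavior an idempotent loader has to avoid.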

Loader failures

The loader plug-in can fail when it is unable to communicate with the database back end. This can happen if the database server or the network connection is down. The write-behind loader queues the updates and periodically tries to push the data changes to the loader. The loader must notify the WebSphere eXtreme Scale run time that a database connectivity problem exists by throwing a LoaderNotAvailableException exception.
Therefore, the Loader implementation must be able to distinguish between a data failure and a physical loader failure. A data failure should be thrown or re-thrown as a LoaderException or an OptimisticCollisionException exception, but a physical loader failure should be thrown or re-thrown as a LoaderNotAvailableException exception. WebSphere eXtreme Scale handles these two exceptions differently:

If a LoaderException is caught by the write-behind loader, the write-behind loader treats it as a data failure, such as a duplicate key failure. The write-behind loader unbatches the update and tries the update one record at a time to isolate the data failure. If a LoaderException exception is caught again during the one-record update, a failed update record is created and logged in the failed update map.

If a LoaderNotAvailableException is caught by the write-behind loader, the write-behind loader treats it as a failure to connect to the database back end, for example, because the database back-end is down, a database connection is not available, or the network is down. The write-behind loader waits for 15 seconds and then retries the batch update to the database.
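
The two handling paths above can be sketched as follows. Everything in this block is a hypothetical stand-in written for illustration: DataFailure plays the role of LoaderException, BackendUnavailable plays the role of LoaderNotAvailableException, and BatchWriter stands for the real loader. The maxAttempts limit is an assumption of the sketch so that it terminates, and the retry delay is a parameter rather than the fixed 15 seconds.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for illustration; not the eXtreme Scale classes.
class WriteBehindRetrySketch {
    static class DataFailure extends RuntimeException {}        // role of LoaderException
    static class BackendUnavailable extends RuntimeException {} // role of LoaderNotAvailableException

    interface BatchWriter {
        void write(List<String> records);
    }

    // Records that failed on their own after the batch was split.
    final List<String> failedUpdates = new ArrayList<>();

    // Returns false if the back end stayed unavailable, so the caller keeps the data queued.
    boolean push(List<String> batch, BatchWriter writer, long retryDelayMillis, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                writer.write(batch);
                return true;
            } catch (BackendUnavailable e) {
                // Connectivity problem: wait, then retry the whole batch.
                if (!pause(retryDelayMillis)) {
                    return false;
                }
            } catch (DataFailure e) {
                if (batch.size() == 1) {
                    // A single record failed: log it as a failed update.
                    failedUpdates.add(batch.get(0));
                    return true;
                }
                // Data problem in a batch: retry one record at a time to isolate it.
                boolean allHandled = true;
                for (String record : batch) {
                    allHandled &= push(Collections.singletonList(record), writer,
                            retryDelayMillis, maxAttempts);
                }
                return allHandled;
            }
        }
        return false;
    }

    // Sleeps for the delay; returns false if the thread was interrupted.
    private static boolean pause(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Passing 15000 as retryDelayMillis would mimic the 15-second wait described above. Note how a writer that reports every problem as DataFailure would push all records into failedUpdates during an outage.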

A common mistake is to throw a LoaderException when a LoaderNotAvailableException should be thrown. All the records queued in the write-behind loader then become failed update records, which defeats the purpose of back-end failure isolation. This mistake is likely to happen if you write a generic loader to talk to databases.
The JPALoader that eXtreme Scale provides is one example. The JPALoader uses the JPA API to interact with database back ends. When the network fails, the JPALoader gets a javax.persistence.PersistenceException, but it cannot determine the cause of the failure unless the SQL state and SQL error code of the chained SQLException are checked. The fact that the JPALoader is designed to work with all types of databases further complicates the problem, because the SQL states and error codes for a network failure differ between databases.
To solve this, WebSphere eXtreme Scale provides an ExceptionMapper API that allows users to plug in an implementation that maps an exception to a more consumable exception. For example, users can map a generic javax.persistence.PersistenceException to a LoaderNotAvailableException if the SQL state or error code indicates that the network is down.
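
The mapping idea can be sketched with standard Java only. The class below is hypothetical and does not use the eXtreme Scale ExceptionMapper interface; BackendUnavailable is a stand-in for LoaderNotAvailableException. The sketch assumes that the driver reports connectivity problems with a SQLSTATE in class "08" (connection exception in the SQL standard); a real mapper would likely also need vendor-specific states and error codes.

```java
import java.sql.SQLException;

// Hypothetical sketch; the names are not the eXtreme Scale ExceptionMapper API.
class ConnectivityExceptionMapperSketch {
    // Stand-in for LoaderNotAvailableException.
    static class BackendUnavailable extends RuntimeException {
        BackendUnavailable(Throwable cause) {
            super(cause);
        }
    }

    // Walks the cause chain looking for a SQLException whose SQLSTATE starts with "08".
    // Returns a BackendUnavailable wrapper in that case, otherwise the original exception.
    static Throwable map(Throwable original) {
        for (Throwable t = original; t != null; t = t.getCause()) {
            if (t instanceof SQLException) {
                String state = ((SQLException) t).getSQLState();
                if (state != null && state.startsWith("08")) {
                    return new BackendUnavailable(original);
                }
            }
        }
        return original;
    }
}
```

In this sketch, a wrapped SQLException with a state such as "08S01" is mapped to the availability exception, while a duplicate key error (for example, state "23505") is returned unchanged and would still be treated as a data failure.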

Performance considerations

Write-behind caching support improves response time by removing the loader update from the transaction. It also increases database throughput because database updates are combined. It is important to understand the overhead introduced by the write-behind thread, which pulls the data out of the queue map and pushes it to the loader.
The maximum update count and the maximum update time need to be adjusted based on the expected usage patterns and environment. If the value of the maximum update count or the maximum update time is too small, the overhead of the write-behind thread might exceed the benefits. Setting a large value for these two parameters can also increase the memory used for queuing the data and increase the length of time that the database records are stale.
For best performance, tune the write-behind parameters based on the following factors:

Ratio of read and write transactions

Update frequency of the same record

Database update latency

Related concepts

Write-behind caching
Handle failed write-behind updates

Related reference

Example: Writing a write-behind dumper class
