Replication architecture


With eXtreme Scale, an in-memory database or shard can be replicated from one Java™ virtual machine (JVM) to another. A shard represents a partition that is placed on a container. Multiple shards that represent different partitions can exist on a single container. Each partition has an instance that is a primary shard and a configurable number of replica shards. The replica shards are either synchronous or asynchronous. The types and placement of replica shards are determined by eXtreme Scale using a deployment policy, which specifies the minimum and maximum number of synchronous and asynchronous shards.
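
For example, a deployment policy is an XML descriptor similar to the following sketch. The grid name, map set name, map name, and partition count here are illustrative placeholders, and the exact schema declaration can vary by release; the minSyncReplicas, maxSyncReplicas, and maxAsyncReplicas attributes control how many replica shards of each type are placed for every partition.

    <?xml version="1.0" encoding="UTF-8"?>
    <deploymentPolicy xmlns="http://ibm.com/ws/objectgrid/deploymentPolicy"
                      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <objectgridDeployment objectgridName="Grid">
        <!-- 13 partitions; each gets 1 primary shard, exactly 1 synchronous
             replica, and at most 1 asynchronous replica -->
        <mapSet name="mapSet" numberOfPartitions="13"
                minSyncReplicas="1" maxSyncReplicas="1" maxAsyncReplicas="1">
          <map ref="Map1"/>
        </mapSet>
      </objectgridDeployment>
    </deploymentPolicy>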


Shard types

Replication uses three types of shards:

The primary shard receives all insert, update, and remove operations. The primary shard adds and removes replicas, replicates data to the replicas, and manages commits and rollbacks of transactions.

Synchronous replicas maintain the same state as the primary. When a primary replicates data to a synchronous replica, the transaction is not committed until it commits on the synchronous replica.

Asynchronous replicas might or might not be at the same state as the primary. When a primary replicates data to an asynchronous replica, the primary does not wait for the asynchronous replica to commit. An asynchronous replica still maintains the order of the transactions that are sent from the primary.
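
The essential difference between the two replica types is whether the primary waits at commit time. The following Java sketch uses hypothetical Replica and Transaction types, not the eXtreme Scale API, to illustrate that behavior:

    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    // Hypothetical types for illustration only; not the eXtreme Scale API.
    record Transaction(long id, String change) {}

    interface Replica {
        boolean apply(Transaction tx);   // true if the replica commits
    }

    class PrimaryShard {
        private final List<Replica> syncReplicas;
        private final List<Replica> asyncReplicas;
        // A single-threaded sender preserves the order of transactions
        // delivered to asynchronous replicas.
        private final ExecutorService sender = Executors.newSingleThreadExecutor();

        PrimaryShard(List<Replica> syncReplicas, List<Replica> asyncReplicas) {
            this.syncReplicas = syncReplicas;
            this.asyncReplicas = asyncReplicas;
        }

        void commit(Transaction tx) {
            // Synchronous replicas: the commit does not complete until
            // each one has applied the transaction.
            for (Replica r : syncReplicas) {
                r.apply(tx);
            }
            // Asynchronous replicas: queued in order, but not awaited,
            // so they may lag behind the primary's state.
            for (Replica r : asyncReplicas) {
                sender.submit(() -> r.apply(tx));
            }
        }
    }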

Figure 1. Communication path between a primary shard and replica shards

A transaction communicates with a primary shard. The primary shard communicates with replica shards, which can exist in one or more Java Virtual Machines.


The memory cost of replication

To bring a replica online, a checkpoint map must be created to copy the existing data from the primary. The main memory cost on the primary, however, is the memory for each entry that is modified after the checkpoint is taken: if the data changes heavily during the copy, the cost is proportional to the number of modified entries. After the checkpoint is copied to the replica, the changes are merged back into the checkpoint. The changed entries are not freed until they have been sent to all required replicas.
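
In effect, the checkpoint behaves like a copy-on-write structure: only entries modified while the copy is in flight consume extra memory. A rough Java illustration with hypothetical classes, not product code:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Rough illustration of the checkpoint cost; not product code.
    class CheckpointingMap {
        private final Map<String, String> live = new ConcurrentHashMap<>();
        // Entries modified after the checkpoint is taken. Extra memory is
        // proportional to the size of this buffer, that is, to how much the
        // data changes while the checkpoint is copied to the replica.
        private final Map<String, String> changedSinceCheckpoint = new ConcurrentHashMap<>();
        private volatile boolean checkpointActive;

        void put(String key, String value) {
            live.put(key, value);
            if (checkpointActive) {
                changedSinceCheckpoint.put(key, value);
            }
        }

        void beginCheckpoint() { checkpointActive = true; }

        // Called after the checkpoint copy reaches the replica: the buffered
        // changes are merged back in and sent on. The buffer can be freed
        // only after every required replica has received the changes.
        Map<String, String> drainChangesForReplicas() {
            checkpointActive = false;
            Map<String, String> changes = Map.copyOf(changedSinceCheckpoint);
            changedSinceCheckpoint.clear();
            return changes;
        }
    }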


Minimum synchronous replica shards

When a primary prepares to commit data, it checks how many synchronous replica shards voted to commit the transaction. If the transaction processes normally on a replica, the replica votes to commit. If something goes wrong on a synchronous replica, it votes not to commit. Before a primary commits, the number of synchronous replica shards that vote to commit must meet the minSyncReplicas setting from the deployment policy. When the number of synchronous replica shards that vote to commit is too low, the primary does not commit the transaction and an error results. This action ensures that the required number of synchronous replicas are available with the correct data. Synchronous replicas that encounter errors reregister to fix their state. For more information about reregistering, see Replica shard recovery.
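
In outline, the vote check behaves like the following Java sketch. The exception name matches the error described below, but it is redefined locally so the sketch is self-contained; the classes and method are otherwise illustrative, not product code.

    import java.util.List;

    // Redefined here for a self-contained sketch; eXtreme Scale ships its own.
    class ReplicationVotedToRollbackTransactionException extends RuntimeException {
        ReplicationVotedToRollbackTransactionException(String msg) { super(msg); }
    }

    class CommitVoter {
        private final int minSyncReplicas;  // from the deployment policy

        CommitVoter(int minSyncReplicas) { this.minSyncReplicas = minSyncReplicas; }

        void commit(List<Boolean> syncReplicaVotes) {
            long yes = syncReplicaVotes.stream().filter(v -> v).count();
            if (yes < minSyncReplicas) {
                // Too few healthy synchronous replicas: roll back rather than
                // commit without the required number of correct copies.
                throw new ReplicationVotedToRollbackTransactionException(
                    yes + " of " + minSyncReplicas
                        + " required synchronous replicas voted to commit");
            }
            // ... otherwise commit the transaction ...
        }
    }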

The primary throws a ReplicationVotedToRollbackTransactionException error when too few synchronous replicas vote to commit.


Replication and Loaders

Normally, a primary shard writes changes synchronously through the Loader to a database, so the Loader and database are always in sync. When the primary fails over to a replica shard, however, the database and Loader might not be in sync. For example:

The primary can fail after committing a transaction to the database but before replicating it to the replica.

The primary can fail after replicating a transaction to the replica but before committing it to the database.

Either case leaves the replica either one transaction behind or one transaction ahead of the database. This situation is not acceptable. eXtreme Scale uses a special protocol and a contract with the Loader implementation to solve this issue without two-phase commit. The protocol follows:


Primary side


Replica side


Replica side on failover

This protocol ensures that the database is at the same level as the new primary state.
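
The individual steps of the protocol are product-specific, but the general shape of such a contract can be sketched: the replica buffers replicated transactions instead of writing them to the database, and on promotion it reconciles with the database so that in-doubt transactions are written exactly once. The following Java sketch is an assumption-based illustration with a hypothetical Database interface and lastCommittedTxId() method; it is not the eXtreme Scale Loader interface.

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Assumption-based sketch of a replica-side Loader contract; the real
    // eXtreme Scale protocol and Loader interface differ in detail.
    interface Database {
        void write(Transaction tx);
        long lastCommittedTxId();   // the contract: ask what reached the database
    }

    record Transaction(long id, String change) {}

    class ReplicaShard {
        private final Deque<Transaction> buffered = new ArrayDeque<>();
        private final Database db;

        ReplicaShard(Database db) { this.db = db; }

        // Normal operation: apply to the in-memory map and buffer the
        // transaction, but do NOT write to the database; only the primary
        // writes through its Loader.
        void onReplicated(Transaction tx) {
            buffered.add(tx);
        }

        // Failover: before acting as the new primary, reconcile so the
        // database reaches the same level as the new primary's state.
        // Transactions the old primary already committed to the database are
        // not written twice; transactions that were replicated but never
        // reached the database are written now.
        void promoteToPrimary() {
            long last = db.lastCommittedTxId();
            for (Transaction tx : buffered) {
                if (tx.id() > last) {
                    db.write(tx);
                }
            }
            buffered.clear();
        }
    }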


