Replicas and shards

Product overview > Availability overview

Replicas and shards

With eXtreme Scale, an in-memory database or shard can be replicated from one JVM to another. A shard represents a partition that is placed on a container. Multiple shards that represent different partitions can exist on a single container. Each partition has an instance that is a primary shard and a configurable number of replica shards. The replica shards are either synchronous or asynchronous. The types and placement of replica shards are determined by eXtreme Scale using a deployment policy, which specifies the minimum and maximum number of synchronous and asynchronous shards.

Shard types

Replication uses three types of shards:

Primary
Synchronous replica
Asynchronous replica

The primary shard receives all insert, update and remove operations. The primary shard adds and removes replicas, replicates data to the replicas, and manages commits and rollbacks of transactions.
Synchronous replicas maintain the same state as the primary. When a primary replicates data to a synchronous replica, the transaction is not committed until it commits on the synchronous replica.
Asynchronous replicas might or might not be at the same state as the primary. When a primary replicates data to an asynchronous replica, the primary does not wait for the asynchronous replica to commit.
Figure 1. Communication path between a primary shard and replica shards
primary shard. The primary shard communicates with replica shards, which can exist in one or more JVMs." />

Minimum synchronous replica shards

When a primary prepares to commit data, it checks how many synchronous replica shards voted to commit the transaction. If the transaction processes normally on the replica, it votes to commit. If something went wrong on the synchronous replica, it votes not to commit. Before a primary commits, the number of synchronous replica shards that are voting to commit must meet the minSyncReplica setting from the deployment policy. When the number of synchronous replica shards that are voting to commit is too low, the primary does not commit the transaction and an error results. This action ensures that the required number of synchronous replicas are available with the correct data.
Synchronous replicas that encountered errors reregister to fix their state.
If too few synchronous replicas voted to commit the primary throws errror...

ReplicationVotedToRollbackTransactionException

Replication and Loaders

Normally, a primary shard writes changes synchronously through the Loader to a database. The Loader and database are always in sync. When the primary fails over to a replica shard, the database and Loader might not be in synch. For example:

The primary can send the transaction to the replica and then fail before committing to the database.

The primary can commit to the database and then fail before sending to the replica.

Either approach leads to either the replica being one transaction in front of or behind the database. This situation is not acceptable. eXtreme Scale uses a special protocol and a contract with the Loader implementation to solve this issue without two phase commit. The protocol follows:

Primary side

Send the transaction along with the previous transaction outcomes.

Write to the database and try to commit the transaction.

If the database commits, then commit on eXtreme Scale. If the database does not commit, then roll back the transaction.

Record the outcome.

Replica side

Receive a transaction and buffer it.

For all outcomes, send with the transaction, commit any buffered transactions and discard any rolled back transactions.

Replica side on failover

For all buffered transactions, provide the transactions to the Loader and the Loader attempts to commit the transactions.

The Loader needs to be written to make each transaction is idempotent.

If the transaction is already in the database, then the Loader performs no operation.

If the transaction is not in the database, then the Loader applies the transaction.

After all transactions are processed, then the new primary can begin to serve requests.

This protocol ensures that the database is at the same level as the new primary state.

Shard placement
The catalog service is responsible for placing shards. Each ObjectGrid has a number of partitions, each of which has a primary shard and an optional set of replica shards. The catalog service does not place replica and primary shards for the same partition on the same container. It also does not place replica and primary shards on containers that have the same IP address (unless the configuration is in development mode). The catalog service allocates the shards by balancing them so that they are evenly distributed over the available containers.

Reading from replicas
You can configure map sets such that a client is permitted to read from a replica rather than being restricted to primary shards only.

Load balancing across replicas
Load balancing across replicas is typically used only when clients are caching data that is changing all the time or when the clients are using pessimistic locking.

Shard life cycles
Shards go through different states and events to support replication. The life cycle of a shard includes coming online, run time, shut down, fail over and error handling. Shards can be promoted from a replica shard to a primary shard to handle server state changes.

Parent topic:
Availability overview

Related concepts

High availability
Configure zones for replica placement
Multi-master data grid replication topologies
Distribute transactions
Map sets for replication

Related tasks

Configure deployment policies
Configure distributed deployments

+
Search Tips | Advanced Search