


Configure zones for replica placement

Zone support allows sophisticated configurations for replica placement across data centers. With this capability, grids of thousands of partitions can be managed by using a handful of optional placement rules. A zone can correspond to a floor of a building, a building, a city, or any other distinction that you configure with zone rules.


Flexibility of zones

You can place shards into zones. This function gives you more control over how eXtreme Scale places shards on a grid. Java™ virtual machines that host an eXtreme Scale server can be tagged with a zone identifier. The deployment file can include one or more zone rules, and these zone rules are associated with a shard type. The following sections give an overview of zone usage.

Placement zones control how eXtreme Scale assigns primaries and replicas, which allows you to configure advanced topologies.

A Java virtual machine can have multiple containers, but only one server. A container can host multiple shards from a single ObjectGrid.

This capability is useful to make sure that replicas and primaries are placed in different locations or zones for better high availability. Normally, eXtreme Scale does not place a primary and replica shard on Java virtual machines with the same IP address. This simple rule normally prevents two eXtreme Scale servers from being placed on the same physical computer. However, you might require a more flexible mechanism. For example, you might be using two blade chassis and want the primaries striped across both chassis, with the replica for each primary placed on the opposite chassis from its primary.

Striped primaries means that primaries are placed in each zone, and the replica for each primary is located in the opposite zone. For example, primary 0 would be in zoneA and synchronous replica 0 in zoneB, while primary 1 would be in zoneB and synchronous replica 1 in zoneA.

The chassis name would be the zone name in this case. Alternatively, you might name zones after floors in a building and use zones to make sure that primaries and replicas for the same data are on different floors. Buildings and data centers are also possible. Testing has been done across data centers using zones as a mechanism to ensure the data is adequately replicated between the data centers. If you are using the HTTP Session Manager for eXtreme Scale, you can also use zones. With this feature, you can deploy a single Web application across three data centers and ensure that HTTP sessions for users are replicated across data centers so that the sessions can be recovered even if an entire data center fails.

WebSphere eXtreme Scale is aware of the need to manage a large grid over multiple data centers. It can make sure that backups and primaries for the same partition are located in different data centers if that is required. It can put all primaries in data center 1 and all replicas in data center 2, or it can alternate primaries and replicas between the two data centers. The rules are flexible, so many scenarios are possible. eXtreme Scale can also manage thousands of servers, which, together with completely automatic placement with data center awareness, makes such large grids affordable from an administrative point of view. Administrators can specify what they want to do simply and efficiently.

As an administrator, use placement zones to control where primary and replica shards are placed, which allows you to set up advanced high performance and highly available topologies. You can define a zone as any logical grouping of eXtreme Scale processes. These zones can correspond to physical locations such as a data center, a floor of a data center, or a blade chassis. You can stripe data across zones, which provides increased availability, or you can split the primaries and replicas into separate zones when a hot standby is required.


Associate an eXtreme Scale server with a zone when not using WebSphere Extended Deployment

If eXtreme Scale is used with Java Standard Edition or an application server that is not based on WebSphere Extended Deployment v6.1, a JVM that hosts shard containers can be associated with a zone by using the following techniques.


Applications using the startOgServer script

The startOgServer script is used to start an eXtreme Scale application when it is not embedded in an existing server. The -zone parameter specifies the zone to use for all containers within the server.
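For example, a container server can be started with all of its containers assigned to a zone named ALPHA. This is a sketch: the server name, descriptor file names, and catalog service endpoint here are illustrative values for your own environment.

```shell
# Start a stand-alone container server whose containers all belong to zone ALPHA.
# objectgrid.xml and deployment.xml are assumed application descriptor files.
./startOgServer.sh container1 \
    -objectgridFile objectgrid.xml \
    -deploymentPolicyFile deployment.xml \
    -catalogServiceEndPoints cataloghost:2809 \
    -zone ALPHA
```

Because the -zone parameter applies to the whole server process, every container hosted by this JVM reports zone ALPHA to the catalog service.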


Specify the zone when starting a container using APIs

The zone name for a container can be specified as described in the documentation for Embedded server API.
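As a sketch of that approach, the zone name can be set on the server properties before the embedded server starts. The class and method names below are from the eXtreme Scale embedded server API; verify them against the API documentation for your version, and note that the descriptor file names are illustrative.

```java
import com.ibm.websphere.objectgrid.deployment.DeploymentPolicy;
import com.ibm.websphere.objectgrid.deployment.DeploymentPolicyFactory;
import com.ibm.websphere.objectgrid.server.Container;
import com.ibm.websphere.objectgrid.server.Server;
import com.ibm.websphere.objectgrid.server.ServerFactory;
import com.ibm.websphere.objectgrid.server.ServerProperties;

public class ZonedContainerLauncher {
    public static void main(String[] args) throws Exception {
        // Tag this JVM with a zone before the embedded server is created.
        ServerProperties props = ServerFactory.getServerProperties();
        props.setZoneName("ALPHA");

        // Start the embedded server and create a container from the
        // deployment policy; the container's shards belong to zone ALPHA.
        Server server = ServerFactory.getInstance();
        DeploymentPolicy policy = DeploymentPolicyFactory.createDeploymentPolicy(
                ZonedContainerLauncher.class.getResource("/deployment.xml"),
                ZonedContainerLauncher.class.getResource("/objectgrid.xml"));
        Container container = server.createContainer(policy);
    }
}
```

The zone name must be set before the server instance is obtained, because zone membership is read once at server startup.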


Associate WebSphere Extended Deployment nodes with zones

If you are using eXtreme Scale with WebSphere Extended Deployment Java EE applications, you can use WebSphere Extended Deployment node groups to place servers in specific zones.

In eXtreme Scale, a JVM can be a member of only one zone. However, WebSphere allows a node to be part of multiple node groups. To use eXtreme Scale zones with node groups, ensure that each node is a member of only one zone node group.

Use the following syntax to name a node group to declare it a zone: ReplicationZone<UniqueSuffix>. Servers running on a node that belongs to such a node group are included in the zone that the node group name specifies. The following is a description of an example topology.

First, you configure 4 nodes: node1, node2, node3, and node4, each node having 2 servers. Then you create a node group named ReplicationZoneA and a node group named ReplicationZoneB. Next, you add node1 and node2 to ReplicationZoneA and add node3 and node4 to ReplicationZoneB.

When the servers on node1 and node2 are started, they will become part of ReplicationZoneA, and likewise the servers on node3 and node4 will become part of ReplicationZoneB.

A grid-member JVM checks its zone membership at startup only. Adding a new node group or changing node group membership affects only newly started or restarted JVMs.


Zone rules

An eXtreme Scale partition has one primary shard and zero or more replica shards. For these examples, consider the following naming convention for shards: P is the primary shard, S is a synchronous replica, and A is an asynchronous replica. A zone rule has three components: a name, a list of zones, and an inclusive or exclusive flag.

The zone list specifies the set of zones in which a shard can be placed. The inclusive flag indicates that after one shard is placed in a zone from the list, all other shards that use the rule are also placed in that zone. An exclusive setting indicates that each shard for a partition is placed in a different zone from the list. For example, with an exclusive setting, if a partition has three shards (a primary and two synchronous replicas), the zone list must contain at least three zones.

Each shard can be associated with one zone rule, and a zone rule can be shared between shards. When a rule is shared, the inclusive or exclusive flag extends across the shards of all types that share the rule.
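The three components map directly to elements in the deployment policy. A minimal sketch follows; the rule and zone names are illustrative.

```xml
<zoneMetadata>
    <!-- Associate each shard type with a rule: P=primary, S=sync replica -->
    <shardMapping shard="P" zoneRuleRef="sharedRule"/>
    <shardMapping shard="S" zoneRuleRef="sharedRule"/>
    <!-- A rule has a name, an exclusive/inclusive flag, and a zone list -->
    <zoneRule name="sharedRule" exclusivePlacement="true">
        <zone name="ZoneA"/>
        <zone name="ZoneB"/>
    </zoneRule>
</zoneMetadata>
```

Because the rule is shared by both shard types and is exclusive, the primary and its synchronous replica are placed in different zones from the list.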


Examples

A set of examples showing various scenarios and the deployment configuration to implement the scenarios follows.


Striping primaries and replicas across zones

You have three blade chassis, and want primaries distributed across all three, with a single synchronous replica placed in a different chassis than the primary. Define each chassis as a zone with chassis names ALPHA, BETA, and GAMMA. An example deployment XML follows:

<?xml version="1.0" encoding="UTF-8"?>
<deploymentPolicy xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://ibm.com/ws/objectgrid/deploymentPolicy ../deploymentPolicy.xsd"
    xmlns="http://ibm.com/ws/objectgrid/deploymentPolicy">
    <objectgridDeployment objectgridName="library">
        <mapSet name="ms1" numberOfPartitions="37" minSyncReplicas="1"
            maxSyncReplicas="1" maxAsyncReplicas="0">
            <map ref="book" />
            <zoneMetadata>
                <shardMapping shard="P" zoneRuleRef="stripeZone"/>
                <shardMapping shard="S" zoneRuleRef="stripeZone"/>
                <zoneRule name="stripeZone" exclusivePlacement="true">
                    <zone name="ALPHA" />
                    <zone name="BETA" />
                    <zone name="GAMMA" />
                </zoneRule>
            </zoneMetadata>
        </mapSet>
    </objectgridDeployment>
</deploymentPolicy>

This deployment XML defines a grid called library with a single map called book. It uses 37 partitions with a single synchronous replica. The zoneMetadata clause shows the definition of a single zone rule and the association of that rule with shards. The primary and synchronous shards are both associated with the zone rule stripeZone. The zone rule lists all three zones and uses exclusive placement. This rule means that if the primary for partition 0 is placed in ALPHA, then the replica for partition 0 is placed in either BETA or GAMMA. Similarly, primaries for other partitions are placed across the zones, and each replica is placed in a different zone from its primary.


Asynchronous replica in a different zone than primary and synchronous replica

In this example, two buildings exist with a high latency connection between them. You want high availability with no data loss for all scenarios. However, the performance impact of synchronous replication between buildings leads you to a trade-off. You want a primary with a synchronous replica in one building and an asynchronous replica in the other building. Normally, the failures are JVM crashes or computer failures rather than large scale issues. With this topology, you can survive normal failures with no data loss. The loss of a building is rare enough that some data loss is acceptable in that case. You can make two zones, one for each building. The deployment XML file follows:

<?xml version="1.0" encoding="UTF-8"?>
<deploymentPolicy xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://ibm.com/ws/objectgrid/deploymentPolicy ../deploymentPolicy.xsd"
        xmlns="http://ibm.com/ws/objectgrid/deploymentPolicy">

    <objectgridDeployment objectgridName="library">
        <mapSet name="ms1" numberOfPartitions="13" minSyncReplicas="1"
            maxSyncReplicas="1" maxAsyncReplicas="1">
            <map ref="book" />
            <zoneMetadata>
                <shardMapping shard="P" zoneRuleRef="primarySync"/>
                <shardMapping shard="S" zoneRuleRef="primarySync"/>
                <shardMapping shard="A" zoneRuleRef="async"/>
                <zoneRule name="primarySync" exclusivePlacement="false">
                    <zone name="BldA" />
                    <zone name="BldB" />
                </zoneRule>
                <zoneRule name="async" exclusivePlacement="true">
                    <zone name="BldA" />
                    <zone name="BldB" />
                </zoneRule>
            </zoneMetadata>
        </mapSet>
    </objectgridDeployment>
</deploymentPolicy>

The primary and synchronous replica share the primarySync zone rule, which has an exclusivePlacement setting of false. So, after the primary or synchronous replica is placed in a zone, the other is placed in the same zone. The asynchronous replica uses a second zone rule with the same zones as the primarySync rule, but with the exclusivePlacement attribute set to true. This attribute means that a shard cannot be placed in a zone that contains another shard from the same partition. As a result, the asynchronous replica is not placed in the same zone as the primary or synchronous replica.


Place all primaries in one zone and all replicas in another zone

Here, all primaries are placed in one specific zone and all replicas in a different zone. The configuration has a primary and a single asynchronous replica. All primaries are placed in zone A and all replicas in zone B.

<?xml version="1.0" encoding="UTF-8"?>
<deploymentPolicy xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://ibm.com/ws/objectgrid/deploymentPolicy ../deploymentPolicy.xsd"
    xmlns="http://ibm.com/ws/objectgrid/deploymentPolicy">
    <objectgridDeployment objectgridName="library">
        <mapSet name="ms1" numberOfPartitions="13" minSyncReplicas="0"
            maxSyncReplicas="0" maxAsyncReplicas="1">
            <map ref="book" />
            <zoneMetadata>
                <shardMapping shard="P" zoneRuleRef="primaryRule"/>
                <shardMapping shard="A" zoneRuleRef="replicaRule"/>
                <zoneRule name="primaryRule">
                    <zone name="A" />
                </zoneRule>
                <zoneRule name="replicaRule">
                    <zone name="B" />
                </zoneRule>
            </zoneMetadata>
        </mapSet>
    </objectgridDeployment>
</deploymentPolicy>

Here, you can see two rules, one for the primaries (P) and another for the replica (A).


Zones over wide area networks (WAN)

You might want to deploy a single eXtreme Scale grid over multiple buildings or data centers with slower network interconnections. Slower network connections lead to lower bandwidth and higher latency. The possibility of network partitions also increases in this mode because of network congestion and other factors. eXtreme Scale approaches this harsh environment in the following ways.


Limited heartbeating between zones

Java virtual machines that are grouped together into core groups heartbeat each other. When the catalog service organizes Java virtual machines into core groups, those groups do not span zones. A leader within each group pushes membership information to the catalog service. The catalog service verifies any reported failures before taking action, by attempting to connect to the suspect Java virtual machines. If the catalog service detects a false failure, it takes no action, because the core group partition heals in a short period of time.

The catalog service also heartbeats core group leaders periodically, at a slow rate, to handle the case of core group isolation.


Parent topic:

Availability overview


Related concepts

High availability

Replicas and shards

Multi-master data grid replication topologies

Distribute transactions

Map sets for replication