IBM WebSphere DataPower XC10 appliance overview
Introduction
The DataPower XC10 appliance contains a 240 GB data grid used for caching WebSphere application data. To add more memory to the data grid, add additional appliances to the configuration to create a collective.
WebSphere Application Server drop-in scenarios:
- Session management for HTTP requests
- Dynamic cache support
The appliance sits between the appserver tier and the back-end tier. In the appserver tier, the WebSphere eXtreme Scale (WXS) Client is installed on each node, including the deployment manager (dmgr). The WXS client enables the appserver tier to communicate with the appliance tier.
Data grids store application objects. Collectives group appliances together for scalability management. Zones define physical locations for appliances, and are used to determine placement of data in the cache. Each appliance can be a member of one collective and one zone. Each appliance hosts multiple data grids.
Two appliances are required to make the data grid highly available.
When we define a collective, the following information is shared among the appliances in the collective:
- data grids
- monitoring information
- collective and zone members
- users
When information is updated, changes are persisted to all of the other appliances in the collective. The catalog service, a group of catalog servers, enables communication between appliances. Each appliance in the collective can run a catalog server, with a maximum of three catalog servers per collective. If there are more than three appliances in a collective, the catalog service runs on the first three appliances that were added to the collective. If an appliance with a catalog server is removed from the collective, or an appliance with a catalog server becomes unavailable, the next appliance added to the collective runs a catalog server. The catalog server does not fail over to other appliances.
Appliances can only be in one collective. We cannot add an appliance from one collective to a different collective. We also cannot join two collectives into a single collective. To join appliances from separate collectives, remove each appliance from its respective collective, making each appliance stand alone. Then create a new collective that includes all of the appliances.
To add an appliance to a collective, we add the host name and secret key for the appliance to the collective configuration panel in another appliance. Because membership information is persisted among the members, configuration changes can be done from any appliance in the collective.
To make a change, log in to any appliance in the collective, and go to...
- Appliance | Appliance Settings
- Appliance | Troubleshooting
Zones
Zones are associated with a physical location of the appliance, such as a city or rack location in a lab. Zones help the catalog service to define where the data in the data grids is stored. For example, if the primary information for the data grid is stored in a given zone, then the replica data is stored in an appliance that is in a different zone. With this configuration, failover can occur from the primary to a replica if the appliance that holds the data grid primary fails.
Data grids
Data grids hold the cached objects for the applications. There are three types of data grids:
- Simple data grid
- Holds data in key-value pairs, for example, the results of a database query. Uses the ObjectMap API, which works similarly to Java Maps (see the sketch after the Maps topic below).
- Session data grid
- Holds WAS application session data.
- Dynamic cache data grid
- Holds data from the WAS dynamic cache. Used for applications that leverage the Dynamic Cache API or use container-level caching, such as servlets. All cache data is offloaded to the appliance and is no longer stored in appserver memory.
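For the dynamic cache scenario, applications continue to use the standard WebSphere Dynamic Cache API (DistributedMap); the appliance simply becomes the backing store. The following minimal sketch shows the pattern, under the assumption that a cache instance has been mapped to a dynamic cache data grid; the JNDI name services/cache/instance_one and the loadPriceFromDatabase helper are placeholders, not names defined by the product.

    import javax.naming.InitialContext;
    import javax.naming.NamingException;

    import com.ibm.websphere.cache.DistributedMap;

    public class PriceCacheHelper {
        public Object getCachedPrice(String itemId) throws NamingException {
            InitialContext ctx = new InitialContext();
            // Look up the cache instance; the JNDI name is a placeholder for whatever
            // instance is backed by the dynamic cache data grid on the appliance.
            DistributedMap cache = (DistributedMap) ctx.lookup("services/cache/instance_one");

            Object price = cache.get(itemId);          // read from the cache
            if (price == null) {
                price = loadPriceFromDatabase(itemId); // hypothetical back-end lookup
                cache.put(itemId, price);              // cached entries are offloaded to the appliance
            }
            return price;
        }

        private Object loadPriceFromDatabase(String itemId) {
            // Placeholder for the real system-of-record query.
            return Double.valueOf(9.99);
        }
    }

Because the Dynamic Cache API does not change, existing applications and container-level caching can be offloaded to the appliance without code changes; the sketch above only illustrates the cache-aside pattern that such applications typically use.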
Data grid replicas
Replicas are created when there are at least two appliances in the collective. If there is only one appliance, no replicas are created. The maximum number of replicas is n-1, where n is the number of appliances in the collective, because one appliance hosts the primary data grid. Editing replica settings requires that data grids be cleared, so set initial values carefully. As new appliances join the collective, additional replicas are created. Primary and replica data grids are evenly distributed, or striped, across all of the appliances in the collective. As new appliances join the collective, rebalancing occurs to distribute the primary and replica data grids.
Replicas can be synchronous or asynchronous. Synchronous replicas receive updates as part of the transaction on the primary data grid. Asynchronous replicas are updated after the transaction on the primary data grid is committed. Synchronous replicas guarantee data consistency, but can increase the response time of a request when compared with an asynchronous replica. Asynchronous replicas do not have the same guarantee in data consistency, but can make your transactions complete faster. A data grid has one asynchronous replica by default. A placement algorithm controls where the replicas are located.
Maps
Maps contain the data for the grid in key-value pairs. A single data grid has multiple maps. Client apps can connect to a specifically named map, or they can use dynamic maps, which are created automatically.
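As a hedged illustration of the ObjectMap API referenced above, the following sketch connects a WXS client to a simple data grid and reads and writes one of its maps. It assumes the WXS Client libraries are on the classpath, that a simple data grid named MyGrid already exists on the appliance, and that its default map uses the grid name; the host name and port are placeholders.

    import com.ibm.websphere.objectgrid.ClientClusterContext;
    import com.ibm.websphere.objectgrid.ObjectGrid;
    import com.ibm.websphere.objectgrid.ObjectGridManager;
    import com.ibm.websphere.objectgrid.ObjectGridManagerFactory;
    import com.ibm.websphere.objectgrid.ObjectMap;
    import com.ibm.websphere.objectgrid.Session;

    public class SimpleGridClient {
        public static void main(String[] args) throws Exception {
            ObjectGridManager ogm = ObjectGridManagerFactory.getObjectGridManager();

            // Connect to the catalog service of the appliance or collective (placeholder endpoint).
            ClientClusterContext ccc = ogm.connect("xc10.example.com:2809", null, null);

            // Obtain a client-side reference to the simple data grid.
            ObjectGrid grid = ogm.getObjectGrid(ccc, "MyGrid");

            // A Session scopes transactions; an ObjectMap behaves much like a java.util.Map.
            Session session = grid.getSession();
            ObjectMap map = session.getMap("MyGrid"); // assumed default map name of the simple grid

            session.begin();
            if (map.containsKey("customer:1001")) {
                map.update("customer:1001", "Jane Doe");
            } else {
                map.insert("customer:1001", "Jane Doe");
            }
            session.commit();

            session.begin();
            String value = (String) map.get("customer:1001");
            session.commit();

            System.out.println("Cached value: " + value);
        }
    }

Dynamic maps are used in the same way: the client calls session.getMap with a name that the grid recognizes as a dynamic map, and the map is created automatically.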
Collective links
A single collective should not span an unreliable network because false positive failure detections might occur. However, you might still want to replicate data grid data across appliances that have unreliable network connectivity. Some common scenarios where you might want to use this type of topology follow:
- Disaster recovery between data centers where one collective is active and the other is used for backup
- Geographically distributed data centers where all collectives are active for geographically close clients.
After you connect two collectives, any data grids that have the same names are asynchronously replicated between the collectives. These data grids must have the same number of replicas in each collective, and must have the same dynamic map configurations.
Topologies for multiple collectives
Links connecting collectives
A replication data grid infrastructure is a connected graph of collectives with bidirectional links among them. With a link, two collectives can communicate data changes. For example, the simplest topology is a pair of collectives with a single link between them. In the examples that follow, the collectives are named alphabetically: A, B, C, and so on. A link can cross a wide area network (WAN), spanning large distances. Even if the link is interrupted, we can still change data in either collective. The topology reconciles changes when the link reconnects the collectives. Links automatically try to reconnect if the network connection is interrupted.
After you set up the links, the product first tries to make every collective identical. Then, eXtreme Scale tries to maintain the identical conditions as changes occur in any collective. The goal is for each collective to be an exact mirror of every other collective connected by the links. The replication links between the collectives help ensure that any changes made in one collective are copied to the other collectives.
Line topologies
Although it is a simple deployment, a line topology demonstrates some qualities of the links. First, a collective does not need to be connected directly to every other collective to receive changes. Collective B pulls changes from collective A. Collective C receives changes from collective A through collective B, which connects collectives A and C. Similarly, collective D receives changes from the other collectives through collective C. This ability spreads the load of distributing changes away from the source of the changes.
Notice that if collective C fails, the following actions would occur:
- collective D would be orphaned until collective C was restarted
- collective C would synchronize itself with collective B, which is a copy of collective A
- collective D would use collective C to synchronize itself with changes on collective A and B. These changes initially occurred while collective D was orphaned (while collective C was down).
Ultimately, collectives A, B, C, and D would all become identical to one another again.
Ring topologies
Ring topologies are an example of a more resilient topology. When a collective or a single link fails, the surviving collectives can still obtain changes. The changes travel around the ring, away from the failure. Each collective has at most two links to other collectives, no matter how large the ring topology. The latency to propagate changes can be large, however. Changes from a particular collective might need to travel through several links before all the collectives have the changes. A line topology has the same characteristic.
A ring topology can also have a root collective at the center of the ring. The root collective functions as the central point of reconciliation. The other collectives act as remote points of reconciliation for changes occurring in the root collective. The root collective can arbitrate changes among the collectives. If a ring topology contains more than one ring around a root collective, the root collective can arbitrate changes only among the innermost ring. However, the results of the arbitration spread throughout the collectives in the other rings.
Hub-and-spoke topologies
With a hub-and-spoke topology, changes travel through a hub collective. Because the hub is the only intermediate collective specified, hub-and-spoke topologies have lower latency. The hub collective is connected to every spoke collective through a link. The hub distributes changes among the collectives and acts as a point of reconciliation for collisions. In an environment with a high update rate, the hub might need to run on more hardware than the spokes to remain synchronized. The WebSphere DataPower XC10 appliance is designed to scale linearly, meaning we can make the hub larger, as needed, without difficulty. However, if the hub fails, changes are not distributed until the hub restarts. Any changes on the spoke collectives are distributed after the hub is reconnected.
We can also use a strategy with fully replicated clients, a topology variation that uses a pair of servers running as a hub; this variation is described in the Fully replicated clients topic later in this overview.
Tree topologies
We can also use an acyclic directed tree. An acyclic tree has no cycles or loops, and a directed setup allows links only between parents and children. This configuration is useful for topologies that have many collectives. In these topologies, it is not practical to have a central hub that is connected to every possible spoke. This type of topology can also be useful when you add child collectives without updating the root collective.
A tree topology can still have a central point of reconciliation in the root collective. The collectives on the second level can still function as remote points of reconciliation for changes occurring in the collectives beneath them. The root collective can arbitrate changes between the collectives on the second level only. We can also use N-ary trees, each of which has N children at each level. Each collective connects out to N links.
Fully replicated clients
This topology variation involves a pair of servers that are running as a hub. Every client creates a self-contained single container data grid with a catalog in the client JVM. A client uses its data grid to connect to the hub catalog, causing the client to synchronize with the hub as soon as the client obtains a connection to the hub.
Any changes made by the client are local to the client, and are replicated asynchronously to the hub. The hub acts as an arbitration collective, distributing changes to all connected clients. The fully replicated clients topology provides a good L2 cache for an object relational mapper, such as OpenJPA. Changes are distributed quickly among client JVMs through the hub. As long as the cache size can be contained within the available heap space of the clients, this topology is a good architecture for this style of L2.
Use multiple partitions to scale the hub collective on multiple JVMs, if necessary. Because all of the data still must fit in a single client JVM, using multiple partitions increases the capacity of the hub to distribute and arbitrate changes, but it does not change the capacity of a single collective.
New in Version 2.0
WebSphere DataPower XC10 appliance Version 2.0 includes enhanced appliance hardware, the ability to enable capacity limits on data grids, SNMP support, and integration with WebSphere Portal.
Multimaster replication support
Replicate data grid data across appliances that have unreliable network connectivity.
xscmd utility
The supported successor to the xsadmin utility, which was included as an unsupported sample in previous releases.
xsloganalyzer tool
Generate reports from log files to analyze performance and troubleshoot issues.
HTTP command interface
Run operations with HTTP POST JSON statements, which we can combine into scripted operations to configure appliance settings and administer data grids, collectives, and zones.
IPv6 support
Support for Internet Protocol Version 6 (IPv6), the next evolution of the Internet Protocol beyond the IPv4 standard. The key IPv6 enhancement is the expansion of the IP address space from 32 bits to 128 bits, enabling a virtually unlimited number of IP addresses.
Least recently used (LRU) policy support
When setting the maximum capacity on a simple grid, we can choose to specify an LRU policy.
Eviction policy support
With simple data grids, a default map and a set of dynamic maps are created. By default, a time-to-live (TTL) eviction policy is required when you create a dynamic map with one of the provided templates, where we can choose an eviction policy of creation time, last-update time, or last-access time. We can now change this behavior so that an eviction policy is set on both the dynamic maps and the default map of a simple data grid.
Appliance Type 7199-92x
New with Version 2.0. Includes faster processors, more network ports, and more cache capacity.
Data grid capacity limits
Define a maximum capacity for each data grid in a collective. Configuring a maximum capacity limits the amount of data storage that a particular data grid can use. The capacity limit ensures that the available storage capacity for the collective is used in a predictable manner.
Simple Network Monitoring Protocol (SNMP) support
Use SNMP to monitor the status of an appliance as a part of a larger group of systems in a data center.
WebSphere Portal integration
Persist HTTP sessions from WebSphere Portal into a data grid on the appliance.
Data grid inserts are rejected when physical data storage reaches full capacity
When the included 240 GB cache is full, the physical data storage is at its maximum capacity. Any insert or update operations on the data grid are rejected, but read and delete operations can continue. To prevent the physical data storage from reaching full capacity, we can configure capacity limits on the data grids in the collective.
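As a hedged sketch of how a client might cope with this condition, the following fragment wraps a write in a try/catch so that the application can fall back to its system of record when an insert is rejected. It reuses the Session and ObjectMap types from the earlier sketch; the exact exception surfaced when storage is full is not described here, so the code simply handles the checked ObjectGridException declared by the ObjectMap API.

    import com.ibm.websphere.objectgrid.ObjectGridException;
    import com.ibm.websphere.objectgrid.ObjectMap;
    import com.ibm.websphere.objectgrid.Session;

    public class CapacityAwareWriter {
        public static void tryCache(Session session, ObjectMap map, Object key, Object value) {
            try {
                session.begin();
                map.insert(key, value); // may be rejected when physical storage is full
                session.commit();
            } catch (ObjectGridException e) {
                // The write failed; reads and deletes still work, so the caller can
                // continue against the system of record instead of failing the request.
                rollbackQuietly(session);
            }
        }

        private static void rollbackQuietly(Session session) {
            try {
                if (session.isTransactionActive()) {
                    session.rollback();
                }
            } catch (ObjectGridException e) {
                // Ignore secondary failures during rollback.
            }
        }
    }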
Firmware performance improvements
The Version 2.0 firmware includes throughput performance improvements that apply to both Type 9235-92X and Type 7199-92X appliances. Performance varies depending on the workload and environment.
WebSphere Commerce integration
WebSphere Commerce Version 7.0.0.1 now supports the use of WXS client Version 7.1. Use the WebSphere DataPower XC10 appliance to cache dynamic cache data from WebSphere Commerce.
Release notes and technotes