WebSphere eXtreme Scale dynamic cache provider
Overview
The IBM DynaCache API is available to Java EE applications deployed in WebSphere Application Server (WAS).
The dynamic cache can be used to:
- cache business data
- cache generated HTML
- synchronize cached data in the cell using the data replication service (DRS)
Previously, the only service provider for the dynamic cache API was the default dynamic cache engine built into WAS. Customers can use the dynamic cache service provider interface in WAS to plug eXtreme Scale into dynamic cache. By setting up this capability, you can enable applications written with the dynamic cache API or applications using container-level caching (such as servlets) to leverage the features and performance capabilities of eXtreme Scale.
eXtreme Scale significantly increases the distributed capabilities of the dynamic cache API beyond what is offered by the default dynamic cache engine and data replication service. eXtreme Scale creates:
- Caches that are distributed between multiple servers, rather than just replicated and synchronized between the servers.
- Caches that are transactional and highly available, ensuring that each server sees the same contents for the dynamic cache service.
eXtreme Scale offers a higher quality of service for cache replication than DRS.
However, these advantages do not mean that the eXtreme Scale dynamic cache provider is the right choice for every application. Use the decision trees and feature comparison matrix below to determine what technology fits the application best.
Decision tree for migrating existing dynamic cache applications
Decision tree for choosing cache provider for new applications
Feature comparison
| Cache features | Default provider | eXtreme Scale provider | eXtreme Scale API |
|---|---|---|---|
| Local, in-memory caching | x | x | x |
| Distributed caching | Embedded | Embedded, embedded-partitioned and remote-partitioned | Multiple |
| Linearly scalable | | x | x |
| Reliable replication (synchronous) | | ORB | ORB |
| Disk overflow | x | | |
| Eviction | LRU/TTL/heap-based | LRU/TTL (per partition) | Multiple |
| Invalidation | x | x | x |
| Relationships | Dependency IDs, templates | Dependency IDs, templates | x |
| Non-key lookups | | | Query and index |
| Back-end integration | | | Loaders |
| Transactional | | Implicit | x |
| Key-based storage | x | x | x |
| Events and listeners | x | x | x |
| WAS integration | Single cell only | Multiple cell | Cell independent |
| Java Standard Edition support | | x | x |
| Monitor and statistics | x | x | x |
| Security | x | x | x |
Technology integration
| Cache features | Default provider | eXtreme Scale provider | eXtreme Scale API |
|---|---|---|---|
| WAS servlet/JSP results caching | V5.1+ | V6.1.0.25+ | |
| WAS Web Services (JAX-RPC) result caching | V5.1+ | V6.1.0.25+ | |
| HTTP session caching | | | x |
| Cache provider for OpenJPA and Hibernate | | | x |
| Database synchronization using OpenJPA and Hibernate | | | x |
Programming interfaces
| Cache features | Default provider | eXtreme Scale provider | eXtreme Scale API |
|---|---|---|---|
| Command-based API | Command framework API | Command framework API | DataGrid API |
| Map-based API | DistributedMap API | DistributedMap API | ObjectMap API |
| EntityManager API | | | x |
For a more detailed description of how eXtreme Scale distributed caches work, see "Deployment configurations for eXtreme Scale" in the Programming Guide.
An eXtreme Scale distributed cache can only store entries where the key and the value both implement the java.io.Serializable interface.
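Because only serializable keys and values can be stored, it can help to check candidate entries before they are put into a distributed cache. The following is a minimal sketch in plain Java and is not part of the dynamic cache or eXtreme Scale APIs; the names SerializableCheck, canSerialize, and isDistributable are hypothetical. It performs a trial Java serialization, which also catches non-serializable objects nested inside an otherwise serializable value.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

/** Hypothetical helper; not part of the dynamic cache or eXtreme Scale APIs. */
final class SerializableCheck {

    private SerializableCheck() {}

    /** Returns true only if the whole object graph can be written with Java serialization. */
    static boolean canSerialize(Object candidate) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return true;
        } catch (IOException e) {
            // NotSerializableException is an IOException raised for non-serializable members.
            return false;
        }
    }

    /** A distributed cache entry needs a serializable key and a serializable value. */
    static boolean isDistributable(Object key, Object value) {
        return canSerialize(key) && canSerialize(value);
    }

    public static void main(String[] args) {
        // An ArrayList of strings is serializable; a bare Object is not.
        System.out.println(isDistributable("customer:42", new ArrayList<String>()));
        System.out.println(isDistributable("customer:42", new Object()));
    }
}
```

A trial serialization is more expensive than an instanceof test, so a check like this is probably best kept to development or test builds.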
Topology types
A dynamic cache service created with the eXtreme Scale provider can be deployed in any of three available topologies, allowing you to tailor the cache specifically to performance, resource, and administrative needs. These topologies are embedded, embedded partitioned, and remote.
Embedded topology
The embedded topology is similar to the default dynamic cache and DRS provider. Distributed cache instances created with the embedded topology keep a full copy of the cache within each eXtreme Scale process that accesses the dynamic cache service, allowing all read operations to occur locally. All write operations go through a single-server process, in which the transactional locks are managed, before being replicated to the rest of the servers. Consequently, this topology is better for workloads where cache-read operations greatly outnumber cache-write operations.
With the embedded topology, new or updated cache entries are not immediately visible on every single server process. A cache entry will not be visible, even to the server that generated it, until it propagates through the asynchronous replication services of eXtreme Scale. These services operate as fast as the hardware will allow, but there is still a small delay.
Embedded partitioned topology
For workloads where cache-writes occur as often as or more frequently than reads, the embedded partitioned or remote topologies are recommended.
The embedded partitioned topology keeps all of the cache data within the WAS processes that access the cache. However, each process only stores a portion of the cache data. All reads and writes for the data located on this “partition” go through the process that hosts it, meaning that most requests to the cache will be fulfilled with a remote procedure call. This results in a higher latency for read operations than the embedded topology, but the capacity of the distributed cache to handle read and write operations will scale linearly with the number of WAS processes accessing the cache. Also, with this topology, the maximum size of the cache is not bound by the size of a single WebSphere process. Because each process only holds a portion of the cache, the maximum cache size becomes the aggregate size of all the processes, minus the overhead of each process.
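The idea that each key belongs to exactly one partition can be illustrated with a small sketch. The code below is an assumption-laden model, not eXtreme Scale code: it routes keys with a simple hash modulo the partition count, whereas the product's real placement and routing logic is not described in this document. The class name PartitionRoutingSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of a partitioned cache; hash-modulo routing is an assumption. */
final class PartitionRoutingSketch {
    private final int partitionCount;
    // One map per partition; in a real grid each map would live in a different process.
    private final List<Map<String, String>> partitions = new ArrayList<>();

    PartitionRoutingSketch(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        this.partitionCount = partitionCount;
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new HashMap<>());
        }
    }

    /** Every key maps to exactly one partition, so reads and writes for it go to one place. */
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    void put(String key, String value) {
        partitions.get(partitionFor(key)).put(key, value);
    }

    String get(String key) {
        return partitions.get(partitionFor(key)).get(key);
    }

    int entriesIn(int partition) {
        return partitions.get(partition).size();
    }

    public static void main(String[] args) {
        PartitionRoutingSketch cache = new PartitionRoutingSketch(4);
        for (int i = 0; i < 1000; i++) {
            cache.put("key-" + i, "value-" + i);
        }
        for (int p = 0; p < 4; p++) {
            System.out.println("partition " + p + " holds " + cache.entriesIn(p) + " entries");
        }
    }
}
```

Because no partition holds more than its share of the keys, adding processes adds both storage and request-handling capacity, which is the scaling property described above.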
For example, assume you have a grid of four server processes with 256 megabytes of free heap each to host a dynamic cache service. The default dynamic cache provider and the eXtreme Scale provider using the embedded topology would both be limited to an in-memory cache size of 256 megabytes minus overhead. The eXtreme Scale provider using the embedded partitioned topology would be limited to a cache size of one gigabyte minus overhead (see the Capacity Planning and High Availability section later in this document). In this manner, the eXtreme Scale provider makes it possible to have in-memory dynamic cache services that are larger than a single server process. The default dynamic cache provider relies on the use of a disk cache to allow cache instances to grow beyond the size of a single process. In many situations, the eXtreme Scale provider can eliminate the need for a disk cache and the expensive disk storage systems needed to make it perform.
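The arithmetic behind this example can be sketched as follows, assuming four processes, which is the count at which 256 megabytes per process adds up to one gigabyte. The class CacheCapacitySketch, its method names, and the per-process overhead parameter are hypothetical, and the sketch ignores replicas, which would reduce usable capacity further.

```java
/** Hypothetical capacity estimate; overhead values are assumptions supplied by the caller. */
final class CacheCapacitySketch {

    private CacheCapacitySketch() {}

    /** Full-copy topologies are bounded by what a single process can hold. */
    static long fullCopyCapacityMb(long freeHeapPerProcessMb, long overheadPerProcessMb) {
        return Math.max(0, freeHeapPerProcessMb - overheadPerProcessMb);
    }

    /** Partitioned topologies add up the usable heap of every process. */
    static long partitionedCapacityMb(int processCount, long freeHeapPerProcessMb,
                                      long overheadPerProcessMb) {
        return (long) processCount * fullCopyCapacityMb(freeHeapPerProcessMb, overheadPerProcessMb);
    }

    public static void main(String[] args) {
        long overheadMb = 0; // assumed to be zero here to mirror the "minus overhead" wording
        System.out.println("full copy:   " + fullCopyCapacityMb(256, overheadMb) + " MB");
        System.out.println("partitioned: " + partitionedCapacityMb(4, 256, overheadMb) + " MB");
    }
}
```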
Remote topology
The remote topology can also be used to eliminate the need for a disk cache.
The only difference between the remote and embedded partitioned topologies is that all of the cache data is stored outside of WAS processes when you are using the remote topology. eXtreme Scale supports standalone container processes for cache data. These container processes have a lower overhead than a WAS process and are also not limited to using a particular Java Virtual Machine (JVM). For example, the data for a dynamic cache service being accessed by a 32-bit WAS process could be located in an eXtreme Scale container process running on a 64-bit JVM. This allows users to leverage the increased memory capacity of 64-bit processes for caching, without incurring the additional overhead of 64-bit for application server processes.
Data compression
Another performance feature offered by the eXtreme Scale dynamic cache provider that can help users manage cache overhead is compression. The default dynamic cache provider does not allow for compression of cached data in memory. With the eXtreme Scale provider, this becomes possible. Cache compression using the deflate algorithm can be enabled on any of the three distributed topologies. Enabling compression will increase the overhead for read and write operations, but will drastically increase cache density for applications like servlet and JSP caching.
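To give a feel for why deflate helps with servlet and JSP output, the sketch below compresses a repetitive HTML fragment with the JDK's java.util.zip classes and restores it. This is only an illustration of the deflate algorithm; it does not show how the eXtreme Scale provider applies compression internally, and the class name DeflateSketch is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical illustration of deflate; not the provider's internal implementation. */
final class DeflateSketch {

    private DeflateSketch() {}

    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return out.toByteArray();
    }

    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("Truncated or unsupported compressed data");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("Invalid compressed data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Generated markup tends to be repetitive, which is what deflate exploits.
        StringBuilder html = new StringBuilder("<table>");
        for (int i = 0; i < 200; i++) {
            html.append("<tr><td>item</td></tr>");
        }
        html.append("</table>");
        byte[] original = html.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(original);
        System.out.println(original.length + " bytes -> " + packed.length + " bytes");
    }
}
```

The extra compress and decompress calls on every write and read are the overhead the text refers to; the benefit depends on how repetitive the cached content is.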
The eXtreme Scale dynamic cache provider can also be used to back dynamic cache instances that have replication disabled. Like the default dynamic cache provider, these caches can store non-serializable data. They can also offer better performance than the default dynamic cache provider on large multi-processor enterprise servers because the eXtreme Scale code path is designed to maximize in-memory cache concurrency.
Dynamic cache engine and eXtreme Scale functional differences
For local in-memory caches where replication is disabled, there should be no appreciable functional difference between caches backed by the default dynamic cache provider and caches backed by eXtreme Scale, except that eXtreme Scale backed caches do not support disk offload or the statistics and operations related to the size of the cache in memory.
For caches where replication is enabled, most dynamic cache API calls return results with no appreciable difference, regardless of whether the customer is using the default dynamic cache provider or the eXtreme Scale dynamic cache provider. For some operations, however, eXtreme Scale cannot emulate the behavior of the default dynamic cache engine.
Dynamic cache statistics
Dynamic cache statistics are reported via the CacheMonitor application or the dynamic cache MBean. When using the eXtreme Scale dynamic cache provider, statistics will still be reported through these interfaces, but the context of the statistical values will be different.
With the default dynamic cache provider, if a dynamic cache instance is shared between three servers named A, B, and C, the dynamic cache statistics object only returns statistics for the copy of the cache on the server where the call was made. If the statistics are retrieved on server A, they only reflect the activity on server A.
With eXtreme Scale, there is only a single distributed cache shared among all the servers, so it is not possible to track most statistics on a server-by-server basis like the default dynamic cache provider does. A list of the statistics reported by the Cache Statistics API and what they represent when you are using the eXtreme Scale dynamic cache provider follows. Like the default provider, these statistics are not synchronized and therefore can vary up to 10% for concurrent workloads.
- Cache Hits: Cache hits are tracked per server. If traffic on Server A generates 10 cache hits and traffic on Server B generates 20 cache hits, the cache statistics will report 10 cache hits on Server A and 20 cache hits on Server B.
- Cache Misses: Cache misses are tracked per server just like cache hits.
- Memory Cache Entries: This statistic reports the number of cache entries in the distributed cache. Every server that accesses the cache will report the same value for this statistic, and that value will be the total number of cache entries in memory over all the servers.
- Memory Cache Size in MB: This metric is not currently supported and will always return -1.
- Cache Removes: This statistic reports the total number of entries removed from the cache by any method, and is an aggregate value for the whole distributed cache. If traffic on Server A generates 10 invalidations and traffic on Server B generates 20 invalidations, then the value on both servers will be 30.
- Cache Least Recently Used (LRU) Removes: This statistic is aggregate, like cache removes. It tracks the number of entries that were removed to keep the cache under its maximum size.
- Timeout Invalidations: This is also an aggregate statistic, and it tracks the number of entries that were removed because they timed out.
- Explicit Invalidations: Also an aggregate statistic, this tracks the number of entries that were removed with direct invalidation by key, dependency ID or template.
- Extended Stats: The eXtreme Scale dynamic cache provider exports the following extended stat key strings.
  - com.ibm.websphere.xs.dynacache.remote_hits: The total number of cache hits tracked at the eXtreme Scale container. This is an aggregate statistic, and its value in the extended stats map is a long.
  - com.ibm.websphere.xs.dynacache.remote_misses: The total number of cache misses tracked at the eXtreme Scale container. An aggregate statistic, its value in the extended stats map is a long.
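As an illustration of how the two extended stat keys might be consumed, the sketch below computes a grid-wide hit ratio from a map of extended statistics. Only the two key strings come from this document; how the map is obtained from the statistics API is not shown, and the class ExtendedStatsSketch and its methods are hypothetical. A plain java.util.Map stands in for the extended stats map.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical reader for the extended stats map; only the key strings come from the text. */
final class ExtendedStatsSketch {
    static final String REMOTE_HITS = "com.ibm.websphere.xs.dynacache.remote_hits";
    static final String REMOTE_MISSES = "com.ibm.websphere.xs.dynacache.remote_misses";

    private ExtendedStatsSketch() {}

    /** Values are documented as longs; anything else is treated as zero in this sketch. */
    static long readLong(Map<String, ?> extendedStats, String key) {
        Object value = extendedStats.get(key);
        return (value instanceof Long) ? (Long) value : 0L;
    }

    /** Aggregate hit ratio across the whole distributed cache, or 0.0 when there is no traffic. */
    static double remoteHitRatio(Map<String, ?> extendedStats) {
        long hits = readLong(extendedStats, REMOTE_HITS);
        long misses = readLong(extendedStats, REMOTE_MISSES);
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        Map<String, Object> stats = new HashMap<>();
        stats.put(REMOTE_HITS, 30L);
        stats.put(REMOTE_MISSES, 10L);
        System.out.println("remote hit ratio: " + remoteHitRatio(stats));
    }
}
```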
Reporting reset statistics
The dynamic cache provider allows you to reset cache statistics. With the default provider the reset operation only clears the statistics on the affected server. The eXtreme Scale dynamic cache provider tracks most of its statistical data on the remote cache containers. This data is not cleared or changed when the statistics are reset. Instead the default dynamic cache behavior is simulated on the client by reporting the difference between the current value of a given statistic and the value of that statistic the last time reset was called on that server.
For example, if traffic on Server A generates 10 cache removes, the statistics on Server A and on Server B will report 10 removes. Now, if the statistics on Server B are reset and traffic on Server A generates an additional 10 removes, the statistics on Server A will report 20 removes and the stats on Server B will report 10 removes.
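The reset behavior in this example can be modeled with an aggregate counter plus a per-server baseline. The sketch below is a simplified illustration of that idea, not the provider's implementation; the class ResetStatisticsSketch and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model: an aggregate counter that is never cleared, plus per-server baselines. */
final class ResetStatisticsSketch {
    // Stands in for the value tracked on the remote cache containers.
    private final AtomicLong aggregateRemoves = new AtomicLong();
    // Value of the aggregate counter the last time each server called reset.
    private final Map<String, Long> baselineByServer = new HashMap<>();

    /** Removes caused by traffic on any server increase the single aggregate value. */
    void recordRemoves(long count) {
        aggregateRemoves.addAndGet(count);
    }

    /** Reset only remembers the current value for that server; the aggregate is untouched. */
    void reset(String server) {
        baselineByServer.put(server, aggregateRemoves.get());
    }

    /** Each server reports the difference between the aggregate and its own baseline. */
    long reportedRemoves(String server) {
        return aggregateRemoves.get() - baselineByServer.getOrDefault(server, 0L);
    }

    public static void main(String[] args) {
        ResetStatisticsSketch stats = new ResetStatisticsSketch();
        stats.recordRemoves(10);   // traffic on Server A
        stats.reset("B");          // statistics reset on Server B
        stats.recordRemoves(10);   // more traffic on Server A
        System.out.println("A reports " + stats.reportedRemoves("A"));
        System.out.println("B reports " + stats.reportedRemoves("B"));
    }
}
```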
Dynamic cache events
The dynamic cache API allows users to register event listeners. When you are using eXtreme Scale as the dynamic cache provider, the event listeners work as expected for local in-memory caches.
For distributed caches, event behavior will depend on the topology being used. For caches using the embedded topology, events will be generated on the server that handles the write operations, also known as the primary shard. This means that only one server will receive event notifications, but it will have all the event notifications normally expected from the dynamic cache provider. Because eXtreme Scale chooses the primary shard at runtime, it is not possible to ensure that a particular server process always receives these events.
Embedded partitioned caches will generate events on any server that hosts a partition of the cache. So if a cache has 11 partitions and each server in an 11-server WebSphere Network Deployment grid hosts one of the partitions, then each server will receive the dynamic cache events for the cache entries that it hosts. No single server process would see all of the events unless all 11 partitions were hosted in that server process. As with the embedded topology, it is not possible to ensure that a particular server process will receive a particular set of events or any events at all.
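The way events spread across servers in the embedded partitioned topology can be modeled as follows. This sketch assumes a fixed partition-to-server assignment and hash-modulo key routing, neither of which is taken from eXtreme Scale (which assigns shards at runtime); the class PartitionedEventSketch and the event strings are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of event delivery; placement and routing here are assumptions. */
final class PartitionedEventSketch {
    // Index is the partition number, value is the name of the server hosting it.
    private final String[] ownerByPartition;
    private final Map<String, List<String>> eventsByServer = new HashMap<>();

    PartitionedEventSketch(String[] ownerByPartition) {
        this.ownerByPartition = ownerByPartition.clone();
        for (String server : this.ownerByPartition) {
            eventsByServer.putIfAbsent(server, new ArrayList<>());
        }
    }

    String ownerOf(String key) {
        return ownerByPartition[Math.floorMod(key.hashCode(), ownerByPartition.length)];
    }

    /** The event is raised only on the server that hosts the entry's partition. */
    void invalidate(String key) {
        eventsByServer.get(ownerOf(key)).add("INVALIDATED " + key);
    }

    List<String> eventsSeenBy(String server) {
        return eventsByServer.getOrDefault(server, new ArrayList<>());
    }

    public static void main(String[] args) {
        PartitionedEventSketch grid = new PartitionedEventSketch(new String[] {"A", "B", "C"});
        for (int i = 0; i < 30; i++) {
            grid.invalidate("key-" + i);
        }
        for (String server : new String[] {"A", "B", "C"}) {
            System.out.println(server + " saw " + grid.eventsSeenBy(server).size() + " events");
        }
    }
}
```

A listener registered on only one server in a model like this sees just that server's share of the events, which is why application logic should not depend on one process observing every event.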
Caches that use the remote topology do not support dynamic cache events.
MBean calls
The eXtreme Scale dynamic cache provider does not support disk caching. Any MBean calls relating to disk caching will not work.
Dynamic cache replication policy mapping
The WAS built-in dynamic cache provider supports multiple cache replication policies. These policies can be configured globally or on each cache entry.
The eXtreme Scale dynamic cache provider does not honor these policies directly and ignores DRS replication policy settings on a cache or cache entry. The replication characteristics of a cache are determined by the configured eXtreme Scale distributed topology type and apply to all values placed in that cache, regardless of the replication policy set on the entry by the dynamic cache service. Users must therefore choose the topology that is appropriate to their replication needs. The following list covers the replication policies supported by the dynamic cache service and shows which eXtreme Scale topology provides similar replication characteristics.
- NOT_SHARED – Currently, none of the topologies provided by the eXtreme Scale dynamic cache provider can approximate this policy. This means that all data stored into the cache must have keys and values that implement java.io.Serializable.
- SHARED_PUSH – The embedded topology approximates this replication policy. When a cache entry is created it is replicated to all the servers. Servers only look for cache entries locally. If an entry is not found locally, it is assumed to be non-existent and the other servers are not queried for it.
- SHARED_PULL and SHARED_PUSH_PULL – The embedded partitioned and remote topologies approximate this replication policy. The distributed state of the cache is completely consistent between all the servers.
This information is provided mainly so you can make sure that the topology meets your distributed consistency needs. For example, if the embedded topology is a better choice for your deployment and performance needs, but you require the level of cache consistency provided by SHARED_PUSH_PULL, then consider using embedded partitioned, even though the performance may be slightly lower.
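The policy-to-topology mapping listed above can be captured as a small lookup table, for example to review existing cache configurations before migration. The policy names come from this document; the enum and class names below are hypothetical and are not types from the dynamic cache or eXtreme Scale APIs.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical lookup table built from the policy list in this document. */
final class ReplicationPolicyMappingSketch {

    enum Policy { NOT_SHARED, SHARED_PUSH, SHARED_PULL, SHARED_PUSH_PULL }

    enum Topology { EMBEDDED, EMBEDDED_PARTITIONED, REMOTE }

    private static final Map<Policy, Set<Topology>> APPROXIMATIONS = new EnumMap<>(Policy.class);

    static {
        // No distributed topology approximates NOT_SHARED.
        APPROXIMATIONS.put(Policy.NOT_SHARED, EnumSet.noneOf(Topology.class));
        APPROXIMATIONS.put(Policy.SHARED_PUSH, EnumSet.of(Topology.EMBEDDED));
        APPROXIMATIONS.put(Policy.SHARED_PULL,
                EnumSet.of(Topology.EMBEDDED_PARTITIONED, Topology.REMOTE));
        APPROXIMATIONS.put(Policy.SHARED_PUSH_PULL,
                EnumSet.of(Topology.EMBEDDED_PARTITIONED, Topology.REMOTE));
    }

    private ReplicationPolicyMappingSketch() {}

    /** Topologies with replication characteristics similar to the given policy. */
    static Set<Topology> approximatingTopologies(Policy policy) {
        return Collections.unmodifiableSet(APPROXIMATIONS.get(policy));
    }

    public static void main(String[] args) {
        for (Policy policy : Policy.values()) {
            System.out.println(policy + " -> " + approximatingTopologies(policy));
        }
    }
}
```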
Security
You can secure dynamic cache instances that are running in embedded or embedded partitioned topologies with the security functionality built into WAS.
When a cache is running in remote topology, it is possible for a standalone eXtreme Scale client to connect to the cache and affect the contents of the dynamic cache instance. The eXtreme Scale dynamic cache provider has a low overhead encryption feature that can prevent cache data from being read or changed by non-WAS clients.
To enable this feature, set the optional parameter com.ibm.websphere.xs.dynacache.encryption_password to the same value on every WAS instance that accesses the dynamic cache provider. This will encrypt the value and user metadata for the CacheEntry using 128-bit AES encryption.
It is very important that the same value be set on all servers. Servers will not be able to read data put into the cache by servers with a different value for this parameter.
If the eXtreme Scale provider detects that different values are set for this variable in the same cache, it generates a warning in the log of the eXtreme Scale container process.
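The reason every server needs the same password can be shown with a generic sketch of password-based 128-bit AES encryption using the JDK's javax.crypto classes. This is not the provider's implementation: the key derivation (PBKDF2), cipher mode (GCM), fixed salt, and the class name SharedPasswordEncryptionSketch are all assumptions chosen for illustration, because this document only states that 128-bit AES is used.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical illustration; PBKDF2, GCM, and the fixed salt are assumptions of this sketch. */
final class SharedPasswordEncryptionSketch {
    // A fixed salt lets every server derive the same key from the same password.
    private static final byte[] SALT = "example-fixed-salt".getBytes(StandardCharsets.UTF_8);
    private static final int IV_BYTES = 12;
    private final SecretKeySpec key;

    SharedPasswordEncryptionSketch(String password) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password.toCharArray(), SALT, 10_000, 128);
            byte[] raw = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec).getEncoded();
            key = new SecretKeySpec(raw, "AES"); // 128-bit AES key
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns a random IV followed by the ciphertext. */
    byte[] encrypt(byte[] plain) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            byte[] body = cipher.doFinal(plain);
            byte[] out = Arrays.copyOf(iv, iv.length + body.length);
            System.arraycopy(body, 0, out, iv.length, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns null when the data cannot be read, for example after a password mismatch. */
    byte[] decryptOrNull(byte[] stored) {
        if (stored == null || stored.length < IV_BYTES) {
            return null;
        }
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, stored, 0, IV_BYTES));
            return cipher.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] entry = "cached page".getBytes(StandardCharsets.UTF_8);
        byte[] stored = new SharedPasswordEncryptionSketch("s3cret").encrypt(entry);
        System.out.println(new SharedPasswordEncryptionSketch("s3cret").decryptOrNull(stored) != null);
        System.out.println(new SharedPasswordEncryptionSketch("other").decryptOrNull(stored) != null);
    }
}
```

In this model, a server configured with a different password derives a different key and cannot read the entry, which mirrors the behavior described above.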
See the eXtreme Scale documentation on eXtreme Scale security if SSL or client authentication is required.
Additional information
- Dynamic cache Redbook
- Dynamic cache documentation
- DRS documentation
See also:
Configure the dynamic cache provider for eXtreme Scale.