WebSphere Commerce v7 performance with eXtreme Scale
WebSphere Commerce sites heavily leverage dynacache to reduce database roundtrips, and thus gain an important performance boost. WebSphere Commerce uses dynacache to cache...
- whole pages (JSPs)
- page fragments
- commands
Dynacache is an in-process cache: each WebSphere Commerce instance in a cluster holds its own dynacache instance within its JVM, including its own copy of each cache entry.
Each instance of dynacache contains the same, although not necessarily identical (more on this later), cache entries. The dynacache-based application server topology therefore has to keep the many dynacache instances in sync. If cache entry "a" in the figure above is invalidated by one of the application servers, dynacache informs the other cluster members by dispatching an invalidation message via the WAS Domain Replication Service (DRS). Failure to do so would leave many shoppers viewing stale data.
WebSphere eXtreme Scale is a distributed shared cache technology. The eXtreme Scale-based topology has a single logical instance of the cache that is shared among commerce servers. Because this cache is shared, each server does not need its own copy of the same pages and fragments, unlike dynacache. Instead, a single cache entry is created on the first request for a page or fragment and is then available to all commerce servers sharing the cache.
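The difference between the two topologies can be illustrated with a small toy model. This is only a sketch: the class names, the `exercise` helper, and the way messages are counted are assumptions made for illustration, not the dynacache, DRS, or eXtreme Scale APIs.

```python
class PerServerCaches:
    """Dynacache-style model: every application server keeps a private cache."""

    def __init__(self, servers):
        self.caches = [{} for _ in range(servers)]
        self.renders = 0                # times an entry had to be generated
        self.invalidation_messages = 0  # notifications sent to peer servers

    def get(self, server, key):
        cache = self.caches[server]
        if key not in cache:
            self.renders += 1
            cache[key] = "rendered:" + key
        return cache[key]

    def invalidate(self, origin, key):
        # The originating server drops its own copy, then notifies each peer
        # (in the product this notification travels over DRS).
        self.caches[origin].pop(key, None)
        for server, cache in enumerate(self.caches):
            if server != origin:
                self.invalidation_messages += 1
                cache.pop(key, None)


class SharedCache:
    """eXtreme Scale-style model: one logical cache shared by all servers."""

    def __init__(self, servers):
        self.servers = servers
        self.cache = {}
        self.renders = 0
        self.invalidation_messages = 0

    def get(self, server, key):
        if key not in self.cache:
            self.renders += 1
            self.cache[key] = "rendered:" + key
        return self.cache[key]

    def invalidate(self, origin, key):
        # Assumption: one request to the shared cache removes the only copy.
        self.invalidation_messages += 1
        self.cache.pop(key, None)


def exercise(topology, servers):
    """Request entry "a" from every server, then invalidate it from server 0."""
    for server in range(servers):
        topology.get(server, "a")
    topology.invalidate(0, "a")
    return topology.renders, topology.invalidation_messages


if __name__ == "__main__":
    print("per-server:", exercise(PerServerCaches(4), 4))
    print("shared:    ", exercise(SharedCache(4), 4))
```

In this model, four servers with private caches generate entry "a" four times and need three peer notifications to invalidate it, while the shared cache generates it once and invalidates it once.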
The following sections discuss how eXtreme Scale can improve performance for key scenarios frequently experienced by high-volume commerce customers. We will discuss how eXtreme Scale can potentially reduce the impact of a full or partial site restart for end users. We will also discuss the cache invalidation scenarios needed by retailers to reflect catalog updates, and consider how eXtreme Scale potentially improves the end user experience during these events.
WebSphere Commerce and WebSphere eXtreme Scale integration
eXtreme Scale V7.0 supports many caching topologies. However, not all of them are supported by Feature Pack 1. WebSphere Commerce support is focused on the eXtreme Scale dynacache plugin component as a configuration choice for commerce customers. eXtreme Scale is not included as part of Feature Pack 1.
Topology supported by Feature Pack 1:
- The eXtreme Scale grid is installed on a separate physical machine or partition (for example, LPAR) from commerce to avoid CPU, memory, and network contention.
- IBM Power™ and AIX customers can install commerce and eXtreme Scale on different LPARs. For customers using Power virtualization features, such as micropartitions, consider the CPU, memory, and network requirements for each LPAR. Sharing critical system resources among LPARs may have implications for performance.
- WebSphere Commerce communicates with eXtreme Scale via the eXtreme Scale dynamic cache provider.
- Consider possible optimizations for commerce and eXtreme Scale placement, such as sharing the same frame (p-series architecture) to further reduce network latency.
- We recommend that eXtreme Scale JVMs run in different LPARs than those containing commerce servers.
WebSphere Commerce sites, as with most IT assets, may need to take components offline periodically for maintenance. Likewise, the site may want to keep components in reserve for failover or peak events when the site might need additional capacity.
Bringing cold components into a working site presents an interesting challenge when caching is involved. The component (typically an LPAR, JVM, or application) performs optimally when the cache is warm. However, if the component was offline, the cache may have been lost entirely or become significantly stale.
This requires the component to work harder until the cache reaches its optimal state. During this period, the component may not support its full capacity because it uses extra resources, such as mid-tier CPU and database capacity, to generate responses that will eventually reside in the cache.
Cache warm-up after a cold start is particularly problematic if the component is immediately thrust into a heavy workload that requires it to respond at full capacity. For example, a JVM that is restarted during a Black Friday sales event becomes active in a farm that is handling the peak load for the year.
Improving startup performance under heavy load
Restarting parts of a commerce site under heavy load presents special challenges. Without a populated cache to boost the site performance, end users who are sent to the newly started portion of the site might experience response time degradation. Also, the mid-tier CPU and the database would see increased utilization.
It is important for the retail site to rebuild the cache and bring site response times and utilization to a steady state quickly.
The key difference between a dynacache-based topology and the eXtreme Scale-based topology is that the latter has only a single cache instance that is populated by multiple application server JVMs. Because the whole site shares that single cache, an entry generated by one JVM is immediately usable by the others, which reduces startup times.
In this example, the team tested an extreme situation: Restarting the entire site prior to releasing a full load against it. This simulates some failover scenarios involving active-passive datacenter concepts.
The figure below shows the data comparing a traffic surge to a newly-started site using commerce with traditional dynacache versus commerce with the new eXtreme Scale central caching capability. The red arrow indicates the point where a steady state is considered to be reached. Spikes in I/O are produced by garbage collection of cached objects stored in the disk offload file.
The figure below shows CPU utilization on the commerce machine for a restart scenario with eXtreme Scale. The disk I/O activity is at a low level with minor spikes due to application server logging. The red arrow indicates the point where steady state is considered to be reached.
Overall, in our cold start tests, we observed that with eXtreme Scale the commerce site tends to reach steady state in about 60% of the time required for dynacache, which is roughly 1.7 times as fast. In our tests, we used four commerce cluster members. A larger commerce cluster would likely see a bigger benefit.
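The intuition behind the cluster-size remark can be sketched with a toy simulation. It is not the test harness used for the measurements above: the function name, the uniform random routing, the uniform page popularity, and the definition of "warm" (every cache holds every page) are all assumptions chosen to keep the example small.

```python
import random


def requests_until_warm(servers, pages, seed=0):
    """Replay one random request stream against both topologies.

    Returns (per_server_warm_at, shared_warm_at): the number of requests
    needed before every private cache, or the single shared cache, holds
    every page.
    """
    rng = random.Random(seed)
    private_seen = set()   # (server, page) pairs held in per-server caches
    shared_seen = set()    # pages held in the single shared cache
    shared_warm_at = None
    requests = 0
    while len(private_seen) < servers * pages:
        requests += 1
        server = rng.randrange(servers)
        page = rng.randrange(pages)
        private_seen.add((server, page))
        shared_seen.add(page)
        if shared_warm_at is None and len(shared_seen) == pages:
            shared_warm_at = requests
    return requests, shared_warm_at


if __name__ == "__main__":
    for cluster_size in (2, 4, 8):
        per_server, shared = requests_until_warm(cluster_size, pages=50)
        print(cluster_size, "servers:", per_server, "vs", shared, "requests")
```

Because both topologies see the same request stream, the shared cache can never become warm later than the per-server caches in this model, and the gap generally widens as servers are added. Real warm-up times also depend on page popularity, render cost, and database capacity, which this sketch ignores.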
Improvement in average response time is another notable benefit of employing eXtreme Scale to help with the cold start situation. This is shown in the figure below.
The eXtreme Scale topology shows less statistical scatter in response times, and overall consistently faster response times. In our testing, we observed up to 25% improvement in average response times. This produces a better end user experience for the shoppers and will tend to improve revenue (due to fewer shoppers leaving the site) during major high-volume events.
Improving startup performance under light load and load ramp-up
For a system with load, when one of the JVMs is shut down for maintenance, two important things happen:
- Live shopper sessions on the JVM under maintenance fail over to other JVMs.
- The commerce JVM loses its cache and disk offload file during restart.
It is possible to flush the cache contents to disk, but this is generally not desirable because the cache contents can become stale during the maintenance window. The "flush to disk on stop" approach also has performance implications.
Once the JVM is restarted after maintenance, the HTTP plug-in will route a proportion of requests to that JVM. The cache instance for that JVM is cold (not populated with cache entries). Given that the restart occurs under lighter load conditions, the system has sufficient resources to operate until the cache is warm. However, a proportion of shoppers whose requests are directed to this JVM by the HTTP plug-in may still experience slower response times until the cache is warm.
The figure below shows CPU and I/O utilization for the commerce server for both dynacache-based and eXtreme Scale-based topologies. For the commerce server running dynacache-based topology, you can clearly see a CPU and disk utilization disturbance. The commerce server running the eXtreme Scale topology remains virtually undisturbed by the JVM restart.
The response time curve for the eXtreme Scale-based topology is undisturbed. For the eXtreme Scale topology, all JVMs share a single cache instance that remains undisturbed during JVM restart. eXtreme Scale, therefore, brings significant benefit during a warm restart. Shoppers get a superior end user experience, and retailers are consequently able to realize higher revenue during operational procedures.
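The warm restart behavior can be pictured with a minimal sketch. The function name and the request pattern (each server receives one request for every page after the restart) are assumptions for illustration only.

```python
def cold_responses_after_restart(servers, pages, shared):
    """Count responses that must be regenerated after server 0 restarts.

    Assumes every cache was fully warm before the restart and that,
    afterwards, each server receives one request for every page.
    """
    if shared:
        # One cache object referenced by every server; the restart of a
        # commerce JVM does not touch it.
        caches = [set(range(pages))] * servers
    else:
        caches = [set(range(pages)) for _ in range(servers)]
        caches[0].clear()  # restart: the private cache of server 0 is lost
    cold = 0
    for server in range(servers):
        for page in range(pages):
            if page not in caches[server]:
                cold += 1  # this response is generated without cache help
                caches[server].add(page)
    return cold


if __name__ == "__main__":
    print("per-server:", cold_responses_after_restart(4, 100, shared=False))
    print("shared:    ", cold_responses_after_restart(4, 100, shared=True))
```

In this model the restarted JVM regenerates every page once with private caches, while the shared topology serves every request from the cache that survived the restart.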
Optimizing invalidation performance
Another important set of scenarios reflects the fact that many web retailers need to periodically invalidate the contents of their cache. Caches need to be regularly invalidated for a number of reasons. The most common of these are:
- Need to display accurate inventory levels. Customers frequently have the commerce site integrated via a live feed with an ERP that stores inventory information.
- Need to display the most up-to-date price information.
These cache invalidation scenarios generally fall into two categories:
- Full cache invalidation: the web retailer periodically invalidates the entire cache manually, under load, and rebuilds it.
- Partial, continuous cache invalidation: the web retailer has a live price or inventory feed that periodically invalidates a proportion of the cache.
The figure below shows a comparison of I/O rate on the database server for both the dynacache-based and eXtreme Scale-based topologies. Database I/O rate is an important indicator of overall site performance. As the cache is completely invalidated, all customer requests are sent to the database server for processing. The database server can bottleneck on disk I/O, resulting in reduced throughput and poor response times.
When the cache is invalidated, the database I/O rate for both topologies rapidly increases to a high level. The I/O rate for the eXtreme Scale-based topology recovers faster and settles at a lower level than for the dynacache-based topology. In the case of the eXtreme Scale topology, the single cache instance is populated by several JVMs. Once a particular cache entry is refreshed by any cluster member, it is immediately available to all of the other cluster members. This causes the cache to become populated faster than in the dynacache case. With dynacache, larger numbers of commerce cluster members will show higher degradation, because each member repopulates its own cache and all of them compete for the single database instance during this warm-up time.
Continuously invalidating part of the cache
To simulate this scenario, we developed an algorithm that invalidated 25% of the cache entries every twenty minutes - similar to the effect that is caused by a live inventory feed. As with the previous scenario, where we invalidate the entire cache, the database server I/O rate is a good indicator of overall site performance. The figure below shows a comparison of database server I/O for both dynacache-based and eXtreme Scale-based topologies.
In our test, the overall I/O rate (yellow line) for the dynacache-based topology remains consistently high at about 800 I/O operations per second. The test site remains bottlenecked on the database server I/O and bound by performance characteristics of the database server disk.
The database I/O for eXtreme Scale-based topology is able to recover and drop to a much lower level of about 300 I/O operations per second. The test site is bottlenecked on database I/O for short periods of time following each partial invalidation event, but recovers quickly. Again, this good result is due to the fact that with eXtreme Scale, multiple JVMs work to populate a single cache instance. The cache is populated faster, allowing the database server I/O rate to relax.
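The effect of a partial invalidation can be approximated by counting how many entries have to be regenerated after one invalidation event. The sketch below is not the algorithm used in our test; the function name and the assumption that every server re-requests every invalidated entry before the next event are illustrative. It counts regenerations, not disk I/O operations, so it does not reproduce the I/O rates quoted above.

```python
import random


def regenerations_per_cycle(servers, entries, fraction, shared, seed=0):
    """Count entry regenerations caused by one partial invalidation event."""
    rng = random.Random(seed)
    keys = list(range(entries))
    cache_count = 1 if shared else servers
    caches = [set(keys) for _ in range(cache_count)]  # start fully warm

    # Invalidate the chosen fraction of entries in every cache instance.
    stale = rng.sample(keys, int(entries * fraction))
    for cache in caches:
        cache.difference_update(stale)

    regenerated = 0
    for server in range(servers):
        cache = caches[0] if shared else caches[server]
        for key in keys:
            if key not in cache:
                regenerated += 1  # stands in for a trip to the database
                cache.add(key)
    return regenerated


if __name__ == "__main__":
    print("per-server:", regenerations_per_cycle(4, 100, 0.25, shared=False))
    print("shared:    ", regenerations_per_cycle(4, 100, 0.25, shared=True))
```

With four servers and 100 entries, invalidating 25% of the cache leads to 100 regenerations when each server has a private cache and 25 when the cache is shared, which is consistent with the faster recovery described above.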
WebSphere Application Server dynacache with disk offload can provide commerce sites with excellent performance characteristics. There are, however, a number of important advantages to deploying WebSphere eXtreme Scale together with WebSphere Commerce. These include:
- Potentially up to 25% reduction in average response time in many scenarios.
- Less statistical fluctuation in response time produces a more consistent end user experience.
- Potentially up to 40% improvement in time to reach steady state after full or partial site restart, or after full cache invalidation.
- Simplified tuning and operational maintenance, because with eXtreme Scale you do not need to tune the size of the disk offload file or the file system cache performance.
- Reduced I/O volume to high-speed disk - eXtreme Scale keeps data in RAM.
- Coherent and consistent cache. With eXtreme Scale, only one version of a cache entry is cached:
  - Same version of the page is always shown.
  - Pages and page fragments are invalidated only once, rather than once per JVM.
  - Edge-caching with Akamai is facilitated due to the fact that each page or page fragment will have the same last-modified date.
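The last-modified point in the list above can be made concrete with a small sketch. It assumes, purely for illustration, that the last-modified value of an entry is the time at which it was rendered, and it uses a fake clock that advances by one tick per render; the function name is hypothetical.

```python
import itertools


def last_modified_values(servers, shared):
    """Return the set of last-modified values seen for one page."""
    clock = itertools.count(start=1)  # fake clock: one tick per render
    caches = [{}] if shared else [{} for _ in range(servers)]
    seen = set()
    for server in range(servers):
        cache = caches[0] if shared else caches[server]
        if "page" not in cache:
            # Assumption: the render time is stored as the last-modified value.
            cache["page"] = next(clock)
        seen.add(cache["page"])
    return seen


if __name__ == "__main__":
    print("per-server:", sorted(last_modified_values(4, shared=False)))
    print("shared:    ", sorted(last_modified_values(4, shared=True)))
```

With private caches, each JVM renders the page at a different time, so an edge cache can observe several last-modified values for the same page. With a shared cache there is a single render and therefore a single value.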
WebSphere eXtreme Scale works best for caching commerce pages, page fragments, and commands. For best performance, do not cache distributed maps in eXtreme Scale.
Some customers are especially likely to benefit from integrating their site with eXtreme Scale:
- High-volume customers with large clusters, many commerce JVM instances, and large caches will see the most benefit, particularly for cached pages and page fragments.
- WebSphere Commerce customers performing a high volume of cache invalidations due to promotions, inventory, or price feeds.
- WebSphere Commerce customers who are unable to leverage DRS for cache invalidation due to network bottlenecks.
Customer results may vary due to differences in scenario and hardware environment. We encourage customers to evaluate adding WebSphere eXtreme Scale to their commerce site.