11.6 Summary


HACMP failover behavior can be set up in various ways; for example, one system can be the backup for several primary systems. In our environment, we set up a simple cascading resource group to demonstrate how WAS leverages the HACMP features to provide a highly available environment.

The WAS software is installed on the disk array shared by the primary and standby machines. When the primary machine fails, the WAS configuration on the disk array is mounted on the standby machine. The HACMP start script on the standby machine then runs its commands to start all necessary servers, and the standby machine takes over the service adapter of the primary machine. During this failover process, WebSphere is unavailable to its clients, so client programs need error recovery logic to handle the server failure.
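The client-side error recovery mentioned above can be sketched as a retry loop with increasing delays. This is a minimal illustration only, not code from our test environment: the function name call_with_retry, its parameters, and the choice of exceptions to retry on are all assumptions, and a real WebSphere client would typically be written in Java and catch the exceptions raised by its own client libraries.

```python
import time


def call_with_retry(operation, attempts=5, initial_delay=1.0, backoff=2.0,
                    retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call operation() and retry while the server is unreachable.

    All names and defaults here are hypothetical. The delay grows by the
    backoff factor after every failed attempt so that the client does not
    flood the standby machine while it is still starting its servers.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(delay)      # wait before the next attempt
            delay *= backoff  # lengthen the wait for the following attempt
```

With the defaults shown, the client waits 1, 2, 4, and 8 seconds between its five attempts. In practice, the total retry window would need to be longer than the failover time measured in the target environment.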

In addition to testing the WAS administrative servers (Deployment Manager and Node Agent), we tested the failover of application servers with two-phase commit transactions and active messaging engines to make sure that potential in-doubt transactions are recovered and unprocessed messages are processed after the failover occurs.

The adjustments to several settings given in this paper are based on our lab environment. As with performance tuning, the appropriate values might vary for different business environments.

ibm.com/redbooks