7.5.3 WebSphere HACMP configuration

To configure a pair of LPARs to fail over WebSphere processes with HACMP, we need to set up a resource group that defines the two LPARs, the resources involved (a service IP address and a shared file system), and the WebSphere Application Server processes themselves. HACMP can then use this resource group to move all of its resources from the primary machine to the standby machine when the primary machine fails.

The WebSphere Application Server software is installed on a volume group that the VIO Server makes available to both the primary and standby LPARs. When the primary LPAR fails, the volume holding the WebSphere Application Server configuration is mounted on the standby LPAR, and the standby LPAR takes over the IP address of the primary LPAR. The HACMP start script on the standby LPAR then runs each of its commands in turn to start all necessary server processes. While this failover is in progress, the WebSphere Application Server processes are not available to service clients.
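The start script behavior described above (run each command in order to bring up the server processes) can be sketched as follows. This is a minimal illustration, not the script HACMP ships or requires. The profile path passed in, the server names, and the choice to start a node agent first are all assumptions. Only the startNode.sh and startServer.sh command names are standard WebSphere Application Server commands.

```python
import subprocess
from typing import Callable, List, Sequence


def build_start_commands(profile_bin: str,
                         servers: Sequence[str],
                         with_node_agent: bool = False) -> List[List[str]]:
    """Return the start commands in the order they should run.

    profile_bin is the bin directory of a WebSphere profile on the shared
    file system (a hypothetical location supplied by the caller).
    """
    commands: List[List[str]] = []
    if with_node_agent:
        # Assumption: the node is federated, so its node agent starts first.
        commands.append([f"{profile_bin}/startNode.sh"])
    for server in servers:
        commands.append([f"{profile_bin}/startServer.sh", server])
    return commands


def run_start_script(commands: Sequence[List[str]],
                     runner: Callable[[List[str]], int] = subprocess.call) -> bool:
    """Run each command in order; stop and report failure on a non-zero exit."""
    for command in commands:
        if runner(command) != 0:
            return False
    return True
```

The runner parameter exists only so the sequence can be exercised without a WebSphere installation. With the default, each command is executed through subprocess.call.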

When a failure occurs (such as a network or hardware failure), HACMP on the primary machine notifies its peer services on the standby machine through heartbeat communication. HACMP on the standby machine recognizes the failure event, takes over the service IP address of the primary machine, mounts the shared file system, and starts all registered servers, such as WebSphere Application Server.
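The takeover sequence in the preceding paragraph can be modeled in a few lines. This is a toy model for illustration only. It does not use or represent any HACMP interface, and the class names, field names, and example values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    alive: bool = True


@dataclass
class ResourceGroup:
    service_ip: str            # address clients connect to
    shared_filesystem: str     # mount point of the shared file system
    applications: List[str]    # registered servers to start
    owner: Optional[str] = None


def take_over(group: ResourceGroup, primary: Node, standby: Node) -> List[str]:
    """Return the ordered actions the standby performs after a primary failure.

    Returns an empty list when no takeover happens (the primary is still
    alive, or the standby is itself unavailable).
    """
    if primary.alive or not standby.alive:
        return []
    actions = [
        f"acquire service IP {group.service_ip}",
        f"mount {group.shared_filesystem}",
    ]
    actions += [f"start {app}" for app in group.applications]
    group.owner = standby.name
    return actions
```

For example, with a group holding service IP 192.0.2.10, file system /wasshared, and one application named server1, marking the primary as not alive and calling take_over yields three actions in the order given in the text: acquire the IP address, mount the file system, start the server.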

As a simple HACMP configuration for illustration, imagine a cluster consisting of two AIX logical partitions with shared disk storage provided by the VIO Server. The two partitions are connected through an Ethernet-based IP network, over which the heartbeat is exchanged. The WebSphere product binaries are installed on the shared file system and are used by whichever partition is active at a given moment.