Configuration and setup for HACMP

We want to emphasize here that the configuration files of the WASs must not contain any physically dependent properties, such as a physical host IP address. You must install and run the WASs using name-based virtual hosts (the failover unit) so that the clustering software can operate the WASs independently of the physical hosts. In other words, change all physical host names to virtual host names in your servers and clients, or change your host name to a virtual one. If you want to install more than one unit on a physical host, you need to specify different ports (see the WebSphere InfoCenter for port settings).

Follow these steps to configure and install all necessary software:

1. Install HACMP 4.3.1, 4.4, 4.4.1, 4.5, or HACMP/ES 4.5: Before you configure HACMP, the network adapters must be defined, the AIX operating system must be updated, and the cluster nodes must be granted permission to access one another. Modify the following configuration files: /etc/netsvc.conf, /etc/hosts, and /.rhosts.

Make sure that each node's service adapters and boot addresses are listed in the /.rhosts file on each cluster node so that the /usr/sbin/cluster/utilities/clruncmd command and the /usr/sbin/cluster/godm command can run. Use smit to install HACMP 4.3.1, 4.4, 4.4.1, 4.5, or HACMP/ES 4.5 on both nodes. For installation details, see the HACMP for AIX Installation Guide at:

http://www-1.ibm.com/servers/eServer/pseries/library/hacmp_docs.html

You can also install HACMP after you configure the network adapter and shared disk subsystem.
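For illustration, entries in /etc/netsvc.conf, /etc/hosts, and /.rhosts might look like the following sketch. The labels and addresses are taken from our sample cluster (see Example 9-10); adapt them to your environment.

/etc/netsvc.conf (resolve host names locally first):

hosts = local, bind

/etc/hosts (identical on both nodes):

10.2.157.68    hacmp1s
10.2.157.207   hacmp1b
10.10.0.30     hacmp1sb
10.2.157.196   hacmp2s
10.2.157.206   hacmp2b
10.10.0.40     hacmp2sb

/.rhosts (on both nodes, listing each node's service and boot labels):

hacmp1s root
hacmp1b root
hacmp2s root
hacmp2b root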

2. Public network configuration:

The public network is used to service client requests (for example, WebSphere Deployment Manager administration, WAS and application traffic, and LDAP requests). We defined two TCP/IP public networks in our configuration. The public network consists of a service/boot adapter and any standby adapters. It is recommended that you use one or more standby adapters. Define standby IP addresses and boot IP addresses. For each adapter, use smit mktcpip to define the IP label, IP address, and network mask. HACMP will define the service IP address. Since this configuration process also changes the host name, configure the adapter with the desired default host name last. Then use smit chinet to change the service adapters so that they will boot from the boot IP addresses. Check your configuration using lsdev -Cc if. Finally, try to ping the nodes to test the public TCP/IP connections.
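For example, the adapter configuration can be verified from the command line on each node (the interface and address labels below come from our sample configuration):

lsdev -Cc if     # list the defined network interfaces
netstat -in      # show the addresses currently configured on each interface
ping hacmp2b     # test the public TCP/IP connection to the other node's boot address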

3. Private network configuration:

A private network is used by the cluster manager for the heartbeat messages and can be implemented as a serial network in an HACMP cluster. A serial network allows the cluster manager to continuously exchange keep-alive packets, should the TCP/IP-based subsystem, networks, or network adapters fail. The private network can be either a raw RS232 serial line, a target mode SCSI, or a target mode SSA loop. We used an HACMP for AIX serial line (a null-modem, serial-to-serial cable) to connect the nodes. Use smitty to create the TTY device. After creating the TTY device on both nodes, test the communication over the serial line by entering the command stty </dev/ttyx on both nodes (where /dev/ttyx is the newly added TTY device). Both nodes should display their TTY settings and return to the prompt if the serial line is okay. After testing, define the RS232 serial line to HACMP for AIX.
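For example, assuming the newly created TTY device is /dev/tty0 on both nodes, the serial connection can be checked like this:

lsdev -Cc tty    # confirm that the TTY device exists and is Available
stty </dev/tty0  # run on both nodes; each should print its TTY settings if the line is good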

4. Shared disk array installation and shared volume group configuration: The administrative data, application data, session data, LDAP data, transaction log files, other log files, and other file systems that need to be highly available are stored on the shared disks, which use RAID technologies or mirroring to protect the data. The shared disk array must be connected to both nodes with at least two paths to eliminate the single point of failure. We used an IBM 7133 Serial Storage Architecture (SSA) Disk Subsystem. You can configure the shared volume group for either concurrent or non-concurrent access. A non-concurrent access environment typically uses journaled file systems to manage data, while concurrent access environments use raw logical volumes. There is a graphical interface called TaskGuide to simplify the task of creating a shared volume group within an HACMP cluster configuration. In Version 4.4, the TaskGuide has been enhanced to automatically create a JFS log and display the physical location of available disks. After one node has been configured, import the volume groups to the other node by using smit importvg.
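As a sketch, assuming the shared volume group is named havg, resides on hdisk2, and holds the /ha file system (havg and /ha are the names used later in our configuration; the hdisk number is only an example), the import on the second node could look like this:

# on the first node, release the shared volume group
umount /ha
varyoffvg havg
# on the second node, import it and disable automatic varyon at boot (HACMP varies it on)
importvg -y havg hdisk2
chvg -a n havg
varyoffvg havg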

5. Installation of: WAS, WebSphere Network Deployment (or WebSphere Enterprise), DB2 client or Oracle client if using OCI, configuration, instance and database or remote catalog creation: For installation details, see the manuals for these products, the WebSphere InfoCenter, or the redbook IBM WAS V5.0 System Management and Configuration: WebSphere Handbook Series, SG24-6195. You can install the products either on local disks or on the shared disk array. As mentioned before, you need to use virtual host names for this setup. It is better to change to virtual host names before you install WebSphere. Otherwise you may need to manually change the host names to virtual ones in all WebSphere configuration XML files afterwards. (Doing so is not recommended nor is it supported at present, though a set of scripts for doing so is slated for Web delivery as part of WAS V5.1.1.)

Important: Although you can install the WebSphere code on each node's local disks, it is very important to keep all shared data, such as the configuration repository and log files, on the shared disk array. This makes sure that another node can access this data when the current Deployment Manager node fails.

You will need to install the binaries for WAS twice:

i. Mount the shared disk array to the first node, for example as /ha. Install WAS into /ha as root, so the installation path is /ha/WebSphere/AppServer.

ii. After installation, start server1 to make sure everything is OK.

iii. Delete all /ha/WebSphere/AppServer files.

iv. Mount the shared disk array to the same mount point on the other node, which again is /ha. Install WebSphere into /ha on the second node.

v. When the installation is completed on the second node, start server1 and test the environment again.

Doing this makes sure that all nodes pick up the WebSphere packages correctly.
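A minimal sketch of the disk handling between the two installations, assuming the shared volume group havg and mount point /ha used elsewhere in our configuration:

# on the first node, after installing and testing WebSphere in /ha
rm -rf /ha/WebSphere/AppServer
umount /ha
varyoffvg havg
# on the second node
varyonvg havg
mount /ha
# now run the WebSphere installer again into /ha/WebSphere/AppServer, then start and test server1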

Note: If you choose to install the software onto local disks rather than on the shared disk array, then install the same software version on both systems!

Install the WebSphere Network Deployment code on a separate system or on one of your WAS nodes.

Note: You might also want to make the Deployment Manager highly available. Instructions for doing this are found in 10.5.2, Make Deployment Manager highly available using clustering software and hardware.

Use addNode.sh to federate your WAS nodes to your Deployment Manager. The Node Agent is created and started automatically during the federation operation. Create WebSphere clusters and cluster members. Be aware that you cannot fail over individual parts of the environment, for example a single appserver; you always have to fail over the entire WebSphere Failover Unit. For example: the WASs must fail over together with their Node Agent, all appservers on a node must fail over together as a WebSphere Failover Unit, and any appserver must fail over together with all of its dependent resources. If "thick" database clients are used, install the database clients on your WebSphere nodes and configure them to connect to the database server. For example, install DB2 clients on all WebSphere nodes and catalog the remote node and database server. The DB2 client must also fail over together with the appservers.
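A minimal sketch of the federation and DB2 client catalog commands (the Deployment Manager host name dmgrhost, its SOAP port 8879, and the node, host, database, and port names in the catalog commands are examples only):

/ha/WebSphere/AppServer/bin/addNode.sh dmgrhost 8879

su - db2inst1 <<MYDB2
db2 catalog tcpip node dbnode remote dbserverhost server 50000
db2 catalog database appdb at node dbnode
db2 terminate
MYDB2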

6. Define the cluster topology: The cluster topology is comprised of the cluster definition, cluster nodes, network adapters, and network modules. The cluster topology is defined by entering information about each component in HACMP-specific ODM classes. These tasks can be performed using smit hacmp. For details, see the HACMP for AIX Installation Guide at:

http://www-1.ibm.com/servers/eServer/pseries/library/hacmp_docs.html

7. Define the HACMP appservers: Please note that an HACMP application server is different from a WAS. An application server, in HACMP or HACMP/ES, is a cluster resource that is made highly available by the HACMP or HACMP/ES software. In our case the HACMP application server is the WebSphere Failover Unit (WFOU), which includes the components mentioned above: the WASs, Node Agents, Deployment Manager, and database client. Do not confuse it with a WAS. Use smit hacmp to define the HACMP (HACMP/ES) application server with a name and its start and stop scripts, that is, the start and stop scripts for the WebSphere Failover Unit.

Our sample WFOU service start script is:

# start the WebSphere Node Agent and both cluster members
/ha/WebSphere/AppServer/bin/startNode.sh
/ha/WebSphere/AppServer/bin/startServer.sh WLMmember1
/ha/WebSphere/AppServer/bin/startServer.sh WLMmember2
# start the DB2 instance as the instance owner
su - db2inst1 <<MYDB2
db2start
MYDB2

And our sample WFOU service stop script is:

# stop the WebSphere Node Agent and both cluster members
/ha/WebSphere/AppServer/bin/stopNode.sh
/ha/WebSphere/AppServer/bin/stopServer.sh WLMmember1
/ha/WebSphere/AppServer/bin/stopServer.sh WLMmember2
# force off remaining database connections and stop the DB2 instance
su - db2inst1 <<MYDB2
db2 force applications all
db2stop
MYDB2

You must test each configuration part and script individually before you bundle them all into the Failover Unit.
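For example, after running the start script by hand you might quickly confirm that all parts of the failover unit came up (the server names are from our sample scripts):

/ha/WebSphere/AppServer/bin/serverStatus.sh -all   # the Node Agent and both cluster members should report STARTED
ps -ef | grep db2sysc                              # the DB2 engine process should be running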

Remember that the WebSphere Failover Unit includes the WAS, Node Agent(s), and database client. You must fail over all of these together as an integrated failover unit; an individual server failover will not work.

8. Define and configure the resource groups: For HACMP and HACMP/ES to provide a highly available appserver service, you need to define the service as a set of cluster-wide resources essential to uninterrupted processing. The resource group can have both hardware and software resources such as disks, volume groups, file systems, network addresses, and application servers themselves. The resource group is configured to have a particular kind of relationship with a set of nodes. There are three kinds of node relationships: cascading, concurrent access, or rotating. For the cascading resource group, setting the cascading without fallback (CWOF) attribute will minimize the client failure time. We used this configuration in our tests. Use smit to configure resource groups and resources in each group. Finally, you need to synchronize cluster resources to send the information contained on the current node to the other node.

9. Cluster verification: Use /usr/sbin/cluster/diag/clverify on one node to check that all cluster nodes agree on the cluster configuration and the assignment of HACMP for AIX resources. You can also use smit hacmp to verify the cluster. If all nodes do not agree on the cluster topology and you want to define the cluster as it is defined on the local node, you can force agreement of the cluster topology onto all nodes by synchronizing the cluster configuration. Once the cluster verification is satisfactory, start the HACMP cluster services using smit hacmp on both nodes, and monitor the log file using tail -f /tmp/hacmp.out. Check the database processes.
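For example, once cluster services are started you can watch the resources being acquired and confirm the processes of the failover unit (a quick check, not a substitute for clverify):

lssrc -g cluster        # the HACMP cluster subsystems should be active
tail -f /tmp/hacmp.out  # follow the HACMP event processing
ps -ef | grep java      # WebSphere Node Agent and appserver processes
ps -ef | grep db2sysc   # DB2 engine process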

10. Takeover verification: To test a failover, use smit hacmp to stop the cluster service with the takeover option. On the other node, enter the tail -f /tmp/hacmp.out command to watch the takeover activity.
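After the takeover you can verify on the surviving node that the service address and the failover unit have moved over (10.2.157.68 is the hacmp1s service address from our sample configuration):

netstat -in | grep 10.2.157.68   # the service address should now be configured on this node
ps -ef | grep java               # the WebSphere processes of the failover unit should be running here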

Example 9-10 shows the complete cluster configuration for a successful WAS failover.

Example 9-10 Cluster configuration for the WASs Failover Unit

Cluster Description of Cluster hacluster
Cluster ID: 99
There were 2 networks defined : haserial, tr157
There are 2 nodes in this cluster.
NODE hacmp1: 
        This node has 2 service interface(s):
        
        Service Interface hacmp1-tty:
                IP address:     /dev/tty0
                Hardware Address:
                Network:        haserial
                Attribute:      serial  
                Aliased Address?:       False
        
        Service Interface hacmp1-tty has no boot interfaces.
        Service Interface hacmp1-tty has no standby interfaces.
        
        
        Service Interface hacmp1s:
                IP address:     10.2.157.68
                Hardware Address:
                Network:        tr157
                Attribute:      public
                Aliased Address?:       False
        Service Interface hacmp1s has 1 boot interfaces.
                Boot (Alternate Service) Interface 1: hacmp1b
                IP address:     10.2.157.207
                Network:        tr157
                Attribute:      public
        Service Interface hacmp1s has 1 standby interfaces.
                Standby Interface 1: hacmp1sb
                IP address:     10.10.0.30
                Network:        tr157
                Attribute:      public
NODE hacmp2:    
        This node has 2 service interface(s):
        
        Service Interface hacmp2-tty:
                IP address:     /dev/tty0
                Hardware Address:
                Network:        haserial
                Attribute:      serial
                Aliased Address?:       False
        
        Service Interface hacmp2-tty has no boot interfaces.
        Service Interface hacmp2-tty has no standby interfaces.
        
        
        Service Interface hacmp2s:
                IP address:     10.2.157.196
                Hardware Address:
                Network:        tr157
                Attribute:      public
                Aliased Address?:       False
        Service Interface hacmp2s has 1 boot interfaces.
                Boot (Alternate Service) Interface 1: hacmp2b
                IP address:     10.2.157.206
                Network:        tr157
                Attribute:      public
        Service Interface hacmp2s has 1 standby interfaces.
                Standby Interface 1: hacmp2sb
                IP address:     10.10.0.40
                Network:        tr157
                Attribute:      public
                
Breakdown of network connections:
Connections to network haserial
        Node hacmp1 is connected to network haserial by these interfaces:
                hacmp1-tty
                
        Node hacmp2 is connected to network haserial by these interfaces:
                hacmp2-tty
                
                
Connections to network tr157
        Node hacmp1 is connected to network tr157 by these interfaces:
                hacmp1b
                hacmp1s
                hacmp1sb
                
        Node hacmp2 is connected to network tr157 by these interfaces:
                hacmp2b
                hacmp2s
                hacmp2sb
                
Resource Group Name                          WASND
Node Relationship                            cascading
Participating Node Name(s)                   hacmp1 hacmp2
Dynamic Node Priority
Service IP Label                             hacmp1s
Filesystems                                  /ha
Filesystems Consistency Check                fsck
Filesystems Recovery Method                  sequential
Filesystems/Directories to be exported
Filesystems to be NFS mounted
Network For NFS Mount
Volume Groups                                havg
Concurrent Volume Groups
Disks
Connections Services
Fast Connect Services
Shared Tape Resources
Application Servers                          WASFOU
Highly Available Communication Links
Miscellaneous Data
Auto Discover/Import of Volume Groups        true
Inactive Takeover                            false
Cascading Without Fallback                   true
9333 Disk Fencing                            false
SSA Disk Fencing                             false
Filesystems mounted before IP configured     true
Run Time Parameters:
Node Name                                    hacmp1
Debug Level                                  high
Host uses NIS or Name Server                 false
Format for hacmp.out                         Standard
Node Name                                    hacmp2
Debug Level                                  high
Host uses NIS or Name Server                 false
Format for hacmp.out                         Standard
