Initiating various failures and tests

We performed individual tests for each kind of EJB, as well as for servlets, JSPs, applets, and Java clients, and ran a two-day stress test of a WebSphere application (Trade3). We also initiated various failures.

For HA database nodes:

1. Kill the database processes to verify that automatic failover works:

   a. Obtain the PIDs of the database processes (DB2 or Oracle) using ps -ef | grep <name>.
   b. Kill the database processes using kill <PIDs>.
   c. View the database package status using cmviewcl -v. You should see the database package running on the specified adoptive node.
   d. Further verify the database processes on that node.
   e. Your WebSphere server and WebSphere clients should function normally once the failover has finished (the failover time).
   f. Move the database package back to the primary node using SAM.
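For example, a minimal test sequence for DB2 might look like the following (the instance name and the package name dbpkg are placeholders for your own configuration; the DB2 engine process is typically named db2sysc):

   # On the primary node: find and abruptly kill the database engine processes
   ps -ef | grep db2sysc        # note the PIDs
   kill -9 <PIDs>               # simulate an unplanned database failure

   # Watch MC/ServiceGuard move the package to the adoptive node
   cmviewcl -v                  # dbpkg should now be running on the adoptive node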

2. Stop the database processes to verify that automatic failover works:

   a. Stop the database servers using db2stop for DB2 or shutdown for Oracle.
   b. View the database package status using cmviewcl -v. You should see the database package running on the specified adoptive node.
   c. Further verify the database processes on that node.
   d. Your WebSphere server and WebSphere clients should function normally after the failover.
   e. Move the database package back to the primary node using SAM.
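A sketch of this graceful-stop variant, again with placeholder names:

   # As the DB2 instance owner on the primary node
   su - db2inst1 -c "db2stop force"

   # For Oracle, the equivalent is SHUTDOWN IMMEDIATE, issued from
   # SQL*Plus as a privileged user.

   # Confirm that the package has restarted on the adoptive node
   cmviewcl -v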

3. Turn off the power for the node to verify that automatic failover works:

   a. Turn off the power for the primary node.
   b. Observe the automatic database package failover using cmviewcl -v. The database package should be switched automatically to the backup node.
   c. Further verify the database processes on that node.
   d. Your WebSphere server and WebSphere clients should function normally after the failover.
   e. Turn on the power for the primary node and observe that it rejoins the cluster, using cmviewcl -v.
   f. Move the database package back to the primary node using SAM.
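One simple way to watch both the failover and the primary node's later rejoin is to poll the cluster state from a surviving node:

   # Re-run cmviewcl until the package appears on the backup node,
   # and later until the primary node shows as joined again
   while true
   do
       cmviewcl -v
       sleep 10
   done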

4. Disconnect the network to the primary node to verify that automatic failover works:

   a. Identify the primary and standby LAN cards using lanscan and cmviewcl -v.
   b. Disconnect the LAN connection from the primary card.
   c. Observe local or remote switching using cmviewcl -v.
   d. Your WebSphere server and WebSphere clients should function normally after the failover takes place.
   e. Reconnect the LAN to the primary card of the primary node and check its status using cmviewcl -v.
   f. Move the database package back to the primary node using SAM.
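A sketch of the commands involved (the unplugging itself is, of course, a manual step):

   lanscan          # list the LAN interfaces and their hardware paths
   cmviewcl -v      # shows which interface is PRIMARY and which is STANDBY

   # Unplug the cable from the primary LAN card, then re-check:
   cmviewcl -v      # expect a local switch to the standby card, or a
                    # remote switch (package failover) if none is available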

5. Use SAM to further test whether packages can be moved from one node to another.
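If you prefer the command line to SAM, the standard MC/ServiceGuard package commands perform the same moves (dbpkg and node2 are placeholders for your package and node names):

   cmhaltpkg dbpkg            # halt the package on its current node
   cmrunpkg -n node2 dbpkg    # start it on the other node
   cmmodpkg -e dbpkg          # re-enable automatic package switching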

For the other components of our WebSphere test topologies, we initiated failures by:

   - Killing the HTTP, appserver, administrative server (Node Agent and Deployment Manager), and Load Balancer processes
   - Stopping the HTTP, appserver, administrative server (Node Agent and Deployment Manager), and Load Balancer processes
   - Disconnecting the network cable to each node
   - Powering off each node, one at a time
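For example, killing an application server process follows the same pattern as the database tests (server1 is a placeholder for your server name):

   # Find the JVM for the target application server
   ps -ef | grep java | grep server1
   kill -9 <PID>              # abrupt process failure

   # The graceful-stop variants use the scripts shipped with WebSphere,
   # for example stopServer.sh server1, stopNode.sh, or stopManager.sh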

We initiated these various kinds of failures in our test topologies and counted the failed client requests against the total client requests. The failure rate was very small (less than 1%); it depended on how frequently the failures were initiated, on MC/ServiceGuard detection and recovery time, and on network characteristics. WebSphere process failures were recovered almost instantly through the WebSphere WLM mechanism and contributed very little to the failed requests. Database recovery, however, took minutes to complete and contributed the most to the failed requests. Therefore, in the next section we discuss how to tune the clustering parameters.
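As a rough illustration of the failure-rate calculation, assuming a hypothetical load-client log with one line per request and an OK or FAIL status in the second field:

   awk '{ total++ } $2 == "FAIL" { failed++ }
        END { printf "failed %d of %d requests (%.2f%%)\n",
                     failed, total, 100 * failed / total }' requests.log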

