Plug-in Connection timeout



 


Overview

When a cluster member exists on a machine that is removed from the network (because its network cable is unplugged or it has been powered off, for example), the plug-in, by default, cannot determine the cluster member's status until the operating system TCP/IP timeout expires. Only then will the plug-in be able to forward the request to another available cluster member.

Changing the operating system timeout value has unpredictable side effects. For instance, it might seem sensible to lower this value so that the plug-in can fail over quickly.

However, on some operating systems the timeout value applies not only to outgoing traffic (from the Web server to the application server) but also to incoming traffic. Any change to this value therefore also changes how long clients have to connect to your Web server. If clients are using dial-up or other slow connections and you set this value too low, they will not be able to connect.

To overcome this problem, WebSphere Application Server V6 offers an option within the plug-in configuration file that allows you to bypass the operating system timeout.

You can change the connection timeout between the plug-in and each application server, which makes the plug-in use a non-blocking connect, as shown in Figure 18-3. To configure this setting, go to:

Servers | Application servers | AppServer_Name | Web server plug-in properties

Setting the connect timeout attribute for a server to zero (the default) is equivalent to selecting the No Timeout option: the plug-in performs a blocking connect and waits until the operating system times out. Set this attribute to an integer greater than zero to specify how many seconds the plug-in should wait for a response when attempting to connect to a server. A setting of 10, for example, means that the plug-in waits up to 10 seconds before the connection attempt times out.
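The difference between the two modes can be illustrated with a small Python sketch. This is not the plug-in's own code; it only shows the general technique of a non-blocking connect followed by a bounded wait, on a POSIX-style system. The function name connect_with_timeout and its parameters are hypothetical.

```python
import errno
import select
import socket


def connect_with_timeout(host, port, connect_timeout=0):
    """Return a connected, blocking socket, or None if the connect fails.

    connect_timeout == 0 mimics the default: a plain blocking connect that
    gives up only when the operating system does.
    connect_timeout > 0 uses a non-blocking connect and waits at most that
    many seconds for it to complete.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    if connect_timeout <= 0:
        try:
            sock.connect((host, port))  # blocks until the OS decides
            return sock
        except OSError:
            sock.close()
            return None

    # Switch the socket to non-blocking mode for the connect only.
    sock.setblocking(False)
    rc = sock.connect_ex((host, port))
    if rc not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        return None

    if rc != 0:
        # Wait until the socket becomes writable or the timeout expires.
        _, writable, failed = select.select([], [sock], [sock], connect_timeout)
        so_error = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if not writable or failed or so_error != 0:
            sock.close()
            return None

    # Return the socket to blocking mode for normal request traffic.
    sock.setblocking(True)
    return sock
```

With a value of 10, a call such as connect_with_timeout("appserver1.example.com", 9080, 10) would give up after roughly 10 seconds instead of waiting for the operating system timeout; the host name and port here are placeholders.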


Finding the correct setting

To determine what setting should be used, you need to take into consideration how fast your network and servers are. Complete some testing to see how fast your network is, and take into account peak network traffic and peak server usage. If the server cannot respond before the connection timeout, the plug-in will mark it as down.
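One rough way to carry out this kind of testing is to time a few TCP connects from the Web server machine to each application server. The sketch below assumes Python is available on that machine; the function names, the number of attempts, and the safety factor of 3 are illustrative assumptions, not values recommended by the product documentation.

```python
import math
import socket
import time


def measure_connect_times(host, port, attempts=5, probe_timeout=30.0):
    """Return observed TCP connect times in seconds; failed attempts are skipped."""
    samples = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=probe_timeout):
                samples.append(time.monotonic() - start)
        except OSError:
            pass  # a failed probe contributes no sample
    return samples


def suggest_connect_timeout(samples, safety_factor=3.0, minimum=1):
    """Suggest whole seconds: the slowest sample times a safety factor, rounded up.

    The safety factor is an arbitrary starting point, not a documented rule.
    """
    if not samples:
        raise ValueError("no successful connects to base a suggestion on")
    return max(minimum, math.ceil(max(samples) * safety_factor))
```

Run the measurement during peak traffic as well as during quiet periods, and treat the suggested number only as a starting point for further testing.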

Because this setting is defined on each application server, you can set it individually for each cluster member. For instance, suppose you have a system with four cluster members, two of which are on a remote node. The remote node is on another subnet, and network traffic sometimes takes longer to reach it. In this case, you might want to set up your cluster with different connection timeout values for the local and remote members.
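The four-member scenario can be sketched in Python as follows. The member names, host names, port, and timeout values are all hypothetical, and the loop simply tries members in order; the real plug-in applies its own load-balancing and retry logic, which this sketch does not attempt to reproduce.

```python
import socket

# Hypothetical cluster: name -> (host, port, connect timeout in seconds).
# The two remote members get a longer timeout because of the slower subnet.
MEMBERS = {
    "local1": ("app1.local.example.com", 9080, 5),
    "local2": ("app2.local.example.com", 9080, 5),
    "remote1": ("app1.remote.example.com", 9080, 15),
    "remote2": ("app2.remote.example.com", 9080, 15),
}


def first_available(members, down):
    """Try each member not marked down, using that member's own timeout.

    Returns (name, socket) for the first member that accepts the connection,
    or (None, None) if none does. Members that fail are added to `down`.
    """
    for name, (host, port, timeout) in members.items():
        if name in down:
            continue
        try:
            return name, socket.create_connection((host, port), timeout=timeout)
        except OSError:
            down.add(name)  # could not connect within this member's timeout
    return None, None
```

A caller would keep one `down` set across requests, for example `down = set()` followed by `name, conn = first_available(MEMBERS, down)`, and clear entries from it after whatever retry interval applies.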

If a non-blocking connect is used, you will see a slightly different trace output. Example 18-4 shows what you see in the plug-in trace if a non-blocking connect is successful.

Example 18-4 Plug-in trace when ConnectTimeout is set

...
TRACE: ws_common: websphereGetStream: Have a connect timeout of 10; Setting socket to not block for the connect
TRACE: errno 55
TRACE: RET 1
TRACE: READ SET 0
TRACE: WRITE SET 32
TRACE: EXCEPT SET 0
TRACE: ws_common: websphereGetStream: Reseting socket to block