
(ZOS) Timeout values: guidelines for altering timeout values

This file lists common timer variables and the tools available for monitoring the related timeout conditions.

Generally speaking, increasing the timeout values should be your last resort, or only a temporary action taken to prevent multiple timeout-abend dumps from causing system performance problems. If we increase timeout values without properly diagnosing the timeout condition, the only results we might see are less frequent abends and dumps for the same timeout condition, or slower system or application performance.

For information on how to set values for these timer variables, and how these variables map to internal variables, see Controlling behavior through timeout values.


Each entry below gives, in order: the WebSphere variable and its relationship, if any, to other timers; how to monitor processing for this type of timeout condition; and what to consider when adjusting the value.

WLM timeout

For HTTP work and Scalable Messaging Support, the WLM timer is not set and only the ConnectionResponseTimeout is in effect (covering the entire dispatch window).

SMF provides data on WLM queue time.

How long work takes to get to a servant depends on the number of servants that WLM starts, how many you let it start, how many service classes the work is spread across, how much work is arriving, and so on.

ConnectionIOTimeOut

None.

This behavior is not easily monitored. Turning on a trace point would indicate whether a client failed because of this input timeout setting, but tracing has performance consequences.

  • How long are we willing to allow a control region worker thread to be blocked while it is waiting for a message?
  • How big are incoming HTTP requests? The larger they are, the longer it might take to get the whole request through the network.
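
The request-size question above can be turned into rough arithmetic. The following is a minimal sketch, not a sizing rule: the function names, the throughput figure, and the safety factor are illustrative assumptions, and real networks add latency and jitter that this ignores.

```python
def min_transfer_seconds(request_bytes: int, throughput_bytes_per_sec: float) -> float:
    """Lower bound on the time needed to receive a whole request."""
    if throughput_bytes_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return request_bytes / throughput_bytes_per_sec


def fits_within_timeout(request_bytes: int,
                        throughput_bytes_per_sec: float,
                        timeout_seconds: float,
                        safety_factor: float = 2.0) -> bool:
    """True if the padded transfer time stays within the input timeout.

    safety_factor is a hypothetical margin for network variability.
    """
    needed = min_transfer_seconds(request_bytes, throughput_bytes_per_sec)
    return needed * safety_factor <= timeout_seconds


# Example: a 1 MB request over a link that sustains 250 KB/s needs
# at least 4 seconds, so a 5-second timeout leaves little margin.
print(min_transfer_seconds(1_000_000, 250_000))
print(fits_within_timeout(1_000_000, 250_000, timeout_seconds=5))
```

If the largest expected request does not fit comfortably, that is a hint to look at request sizes or network capacity before raising ConnectionIOTimeOut.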

ConnectionResponseTimeout

If the application component starts transactions, then the transaction timers also might be involved.

This behavior is not easily monitored, but the controller will terminate the servant (region) with abend EC3 for this timeout condition.

  • How long are we willing to let a client hang waiting for a response?
  • How long are we willing to let a thread in a servant (region) be tied up working on a response before concluding that the request has taken too long?

  • If we have multiple application threads in the servant (region), all of them will be terminated when only one of them times out. This loss of work might make us want to allow these timeouts to occur less frequently.

ConnectionKeepAliveTimeout

None. All the other timers relate to work processing, whereas this one relates to what happens when there is no work.

None.

How much time passes between requests, and how much does it cost to establish a new session? We want to keep idle sessions around for a while to avoid the startup cost of a new session, but not forever, because accumulated resource usage will eventually become a problem.

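
The trade-off between session reuse and idle resource usage can be explored with observed gaps between requests. This is a minimal sketch under a stated assumption: a session is treated as reused when the idle gap is no longer than the keep-alive timeout. The function name and the sample gaps are hypothetical.

```python
def keepalive_summary(idle_gaps_seconds, keepalive_timeout_seconds):
    """Summarize how a candidate keep-alive timeout treats observed idle gaps.

    Assumption: a session survives a gap only if the gap does not exceed
    the timeout; otherwise the next request pays for a new session.
    """
    total = len(idle_gaps_seconds)
    reused = sum(1 for gap in idle_gaps_seconds
                 if gap <= keepalive_timeout_seconds)
    # Each gap holds an idle session until the gap ends or the timeout fires.
    idle_held = sum(min(gap, keepalive_timeout_seconds)
                    for gap in idle_gaps_seconds)
    return {
        "reuse_fraction": reused / total if total else 0.0,
        "new_sessions": total - reused,
        "idle_seconds_held": idle_held,
    }


# Hypothetical gaps between consecutive requests, in seconds.
print(keepalive_summary([1, 5, 40, 120], keepalive_timeout_seconds=30))
```

A longer timeout raises the reuse fraction but also the idle seconds held, which is the cost the adjustment guidance warns about.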
Request Timeout (ORB Service)

None. This variable is a client-side timeout, and IIOP only.

None, other than to observe the timeouts occurring on the client side.

How long are we willing to let the client wait?

ORB listener keep alive
ORB SSL listener keep alive

None. These variables relate to session activity during idle periods and only for IIOP, so these timers do not interact with the ConnectionKeepAliveTimeout timer.

We should read TCP/IP APAR PQ18618 for information about the SOCK_TCP_KEEPALIVE values and their consequences.

Is it useful to have idle sessions time out? They normally don't, which can consume resources. However, detecting a timeout requires network traffic between TCP/IP stacks. Creating traffic on otherwise idle sessions may have network consequences we don't want.

Total Transaction Lifetime Timeout

This variable can be overridden by applications up to the maximum indicated by the Maximum Transaction Timeout variable, which limits the amount of time an application can set for its transactions to complete. Output timers also might cause work to time out, but the transaction timers and output timers are not aware of each other.

The controller issues message BBOT0003W to indicate a timeout condition, and terminates the servant (region) with abend EC3 reason codes 04130002 or 04130005.

  • How long are we willing to let a client hang waiting for a response?
  • How long are we willing to let a thread in a servant (region) be tied up working on a response before concluding that the request has taken too long?

  • If we have multiple application threads in the servant (region), all of them will be terminated when only one of them times out. This loss of work might make us want to allow these timeouts to occur less frequently.
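
Because the controller reports this condition with message BBOT0003W and abend EC3 reason codes 04130002 or 04130005, a simple scan of captured log text can show how often it occurs. This is a minimal sketch that only looks for those identifiers as substrings; the sample lines are invented for illustration and do not reflect the real message layout.

```python
# Identifiers taken from the text above; everything else is illustrative.
TOKENS = ("BBOT0003W", "04130002", "04130005")


def summarize_timeouts(log_lines):
    """Count the lines that mention each timeout-related identifier."""
    counts = {token: 0 for token in TOKENS}
    for line in log_lines:
        for token in TOKENS:
            if token in line:
                counts[token] += 1
    return counts


# Hypothetical log excerpts, not real output.
sample = [
    "example line mentioning BBOT0003W",
    "example line mentioning EC3 reason 04130002",
    "unrelated line",
]
print(summarize_timeouts(sample))
```

Counting occurrences before and after a change helps confirm whether the underlying condition was fixed or only made less frequent.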

Maximum Transaction Timeout

If set, this variable limits the amount of time an application can set for its transactions to complete. If the Maximum Transaction Timeout variable is not set, application transactions are controlled by the time limit set on the Total Transaction Lifetime Timeout variable.

None.

Same considerations as for transaction_defaultTimeout.

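
The interaction between the Total Transaction Lifetime Timeout and the Maximum Transaction Timeout can be sketched as a small function. This is one reading of the descriptions above, not the product's implementation: the function and parameter names are hypothetical, and the case where an application requests a value below the default is assumed to be honored.

```python
from typing import Optional


def effective_transaction_timeout(total_lifetime: int,
                                  maximum: Optional[int] = None,
                                  requested: Optional[int] = None) -> int:
    """Return the timeout (seconds) that would govern a transaction.

    total_lifetime: the Total Transaction Lifetime Timeout value.
    maximum: the Maximum Transaction Timeout value, or None if not set.
    requested: the value the application asks for, or None if it asks for none.
    """
    # With no application override, or no maximum configured,
    # the total lifetime value controls the transaction.
    if requested is None or maximum is None:
        return total_lifetime
    # Otherwise the application's request is capped at the maximum.
    return min(requested, maximum)


print(effective_transaction_timeout(120, maximum=300, requested=600))
print(effective_transaction_timeout(120, maximum=None, requested=600))
```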
transaction_recoveryTimeout

None.

None.

Locks are held while one controller waits for other controllers required to resolve in-doubt transactions. How long can you afford to have these resources held?

server_region_request_cputimeused_limit

This behavior is not easily monitored, but the controller terminates a request when the specified CPU time limit is reached.

  • How much of our CPU time are we willing to let a single application request consume before taking some action against that request?

  • If we have multiple application threads in the servant, all of them are terminated if terminating this request means that the servant now has enough unresponsive threads for the controller to terminate the servant. The setting for the server_region_stalled_thread_threshold_percent property determines how many threads need to be unresponsive before the servant is terminated.

server_region_stalled_thread_threshold_percent

This behavior is not easily monitored, but the controller will terminate the servant with abend EC3 when the percentage of unresponsive threads meets this condition.

  • How long are we willing to let a client wait for a response?
  • How long are we willing to let a thread in a servant be tied up working on a response before concluding that the request has taken too long?

  • If we have multiple application threads in the servant, all of them are terminated when the controller terminates the servant because the percentage of unresponsive threads is reached. This loss of work might make us want to allow a larger percentage of request threads to become unresponsive before having the controller terminate the servant.
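
The percentage check described for server_region_stalled_thread_threshold_percent can be illustrated with a short calculation. This is a minimal sketch: the function name is hypothetical, and the exact comparison the controller uses (including how it treats boundary values) is an assumption here.

```python
def servant_would_be_terminated(unresponsive_threads: int,
                                total_threads: int,
                                threshold_percent: float) -> bool:
    """Estimate whether the stalled-thread percentage reaches the threshold.

    Assumption: termination is considered once at least one thread is
    unresponsive and the unresponsive share is at or above the threshold.
    """
    if total_threads <= 0:
        raise ValueError("total_threads must be positive")
    if unresponsive_threads <= 0:
        return False
    percent = 100.0 * unresponsive_threads / total_threads
    return percent >= threshold_percent


# With 10 threads and a 20 percent threshold, one stalled thread is
# tolerated under this model, but a second one is not.
print(servant_would_be_terminated(1, 10, 20))
print(servant_would_be_terminated(2, 10, 20))
```

A higher threshold protects the work on healthy threads for longer, at the cost of leaving stalled threads in place.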

  • Troubleshoot administration
  • Timer overview
  • Timeout conditions: analyzing diagnostic data
  • Timeout conditions - possible causes and fixes