Configure the hang detection policy

A common error in J2EE applications is a hung thread. A hung thread can result from a simple software defect (such as an infinite loop) or a more complex cause (for example, a resource deadlock). A hung thread that runs an unbounded code path, such as an infinite loop, can consume system resources such as CPU time. Alternatively, a system can become unresponsive even though all resources are idle, as in a deadlock scenario. Unless an end user or a monitoring tool reports the problem, the system can remain in this degraded state indefinitely.

The hang detection option for WebSphere Application Server is turned on by default. You can configure a hang detection policy to accommodate your applications and environment so that potential hangs can be reported, providing earlier detection of failing servers. When a hung thread is detected, WebSphere Application Server notifies you so that you can troubleshoot the problem.

Using the hang detection policy, you can specify how long a unit of work can run before it is considered to be taking too long. The thread monitor checks all managed threads in the system (for example, Web container threads and object request broker (ORB) threads). Unmanaged threads, which are threads created by applications, are not monitored. For more information, read Hung threads in J2EE applications.
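
The check that the thread monitor performs can be pictured with a small model. The following Python sketch is illustrative only and is not the product's implementation: the `ManagedThread` type, the `find_hung_threads` function, and the thread names are hypothetical, and the sketch assumes that "hung" means a managed thread has been active for strictly longer than the configured time.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ManagedThread:
    """Hypothetical record of a managed thread and when its current work began."""
    name: str
    active_since: float  # time, in seconds, at which the unit of work started


def find_hung_threads(threads: List[ManagedThread], now: float,
                      threshold_seconds: float) -> List[str]:
    """Return the names of threads active for longer than the threshold.

    Assumption: only managed threads are passed in; unmanaged threads
    created by applications are never examined.
    """
    return [t.name for t in threads
            if now - t.active_since > threshold_seconds]


# Two made-up managed threads: one started work at t=0, one at t=500.
threads = [
    ManagedThread("WebContainer : 0", 0.0),
    ManagedThread("ORB.thread.pool : 1", 500.0),
]
reported = find_hung_threads(threads, now=601.0, threshold_seconds=600)
```

In this model, a thread whose work finishes before the next check is never reported, which is why the check frequency and the time limit are configured separately.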

The thread hang detection option is enabled by default. To adjust the hang detection policy values, or to disable hang detection completely:

  1. From the administrative console, click Servers > Application Servers > server_name

  2. Under Server Infrastructure, click Administration > Custom Properties

  3. Click New.

  4. Add the following properties:
    Name:     com.ibm.websphere.threadmonitor.interval
    Value:    The frequency (in seconds) at which managed threads in the
              selected application server are interrogated.
    Default:  180 seconds (three minutes).

    Name:     com.ibm.websphere.threadmonitor.threshold
    Value:    The length of time (in seconds) that a thread can be active
              before it is considered hung. Any thread that is detected as
              active for longer than this length of time is reported as hung.
    Default:  600 seconds (ten minutes).

    Name:     com.ibm.websphere.threadmonitor.false.alarm.threshold
    Value:    The number of times (T) that false alarms can occur before
              the threshold is automatically increased. A thread that is
              reported as hung might eventually complete its work,
              resulting in a false alarm. A large number of these events
              indicates that the threshold value is too small. The hang
              detection facility can automatically respond to this
              situation: for every T false alarms, the threshold is
              increased by a factor of 1.5. Set the value to zero (or less)
              to disable the automatic adjustment. For information on false
              alarms, see Detecting hung threads in J2EE applications.
    Default:  100

    To disable the hang detection option, set the com.ibm.websphere.threadmonitor.interval property to less than or equal to zero.

  5. Click Apply.

  6. Click OK.

  7. Save the changes and make sure a file synchronization is performed before restarting the servers.

  8. Restart the Application Server for the changes to take effect.
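
To see how the three properties in step 4 interact, it can help to work through the arithmetic. The following Python sketch is a simplified model under stated assumptions, not the product's algorithm: the function names are hypothetical, the monitor is assumed to run at exact multiples of the interval, a thread is assumed to be reported at the first check where it has been active for strictly longer than the threshold, and the threshold is assumed to grow by a factor of 1.5 once for each complete group of T false alarms, with no rounding.

```python
def adjusted_threshold(threshold_seconds, false_alarms,
                       false_alarm_threshold=100, factor=1.5):
    """Model of the automatic adjustment: for every T false alarms,
    the threshold is multiplied by 1.5. A value of T that is zero
    or less disables the adjustment."""
    if false_alarm_threshold <= 0:
        return threshold_seconds
    increases = false_alarms // false_alarm_threshold
    return threshold_seconds * factor ** increases


def first_report_time(work_start, threshold_seconds=600, interval_seconds=180):
    """Time of the first check that reports a thread which started work
    at work_start (whole seconds) and never finishes.

    Assumption: checks run at interval, 2 * interval, 3 * interval, ...
    Returns None when the interval is zero or less (detection disabled).
    """
    if interval_seconds <= 0:
        return None
    # Smallest k such that k * interval - work_start > threshold.
    k = (work_start + threshold_seconds) // interval_seconds + 1
    return k * interval_seconds


# With the default values, a thread that hangs at t=0 is reported at t=720,
# and one that hangs at t=120 is reported at t=900 (780 seconds later).
print(first_report_time(0), first_report_time(120))

# After 100 false alarms the default 600-second threshold becomes 900 seconds.
print(adjusted_threshold(600, 100))
```

Under this model, the delay between a thread hanging and the report can approach the threshold plus one interval, so lowering the interval shortens detection time at the cost of more frequent checks.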



 

Sub-topics


Hung threads in J2EE applications

Hang detection policy of a running server

Searchable topic ID: ttrb_confighangdet