Hung threads in Java Platform, Enterprise Edition applications

WebSphere Application Server monitors thread activity and performs diagnostic actions when a thread appears to be hung. When WebSphere detects that a thread has been active longer than the time defined by the thread monitor threshold, the application server takes the following actions:


False Alarms

If the work actually completes, a second set of messages, notifications and PMI events is produced to identify the false alarm. The following message is written to the log:

where threadname is the name that appears in a JVM thread dump, hangtime gives an approximation of how long the thread has been active and totalthreads gives an overall assessment of the system threads.
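The hang and false-alarm reporting described above can be illustrated with a small simulation. This is only a sketch of the general idea, not WebSphere code: the `ThreadMonitor` class, its method names, the `HUNG`/`CLEARED` event text, the 600-second threshold, and the thread names are all assumptions made for illustration. Timestamps are passed in explicitly so the behavior is deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class ThreadMonitor:
    """Toy model of threshold-based hang detection (hypothetical, not WebSphere code)."""
    threshold_s: float                             # active time before a thread is flagged
    started: dict = field(default_factory=dict)    # thread name -> start time in seconds
    flagged: set = field(default_factory=set)      # threads currently reported as hung
    log: list = field(default_factory=list)        # recorded events, oldest first

    def begin(self, name: str, now: float) -> None:
        """Record that a thread started a unit of work at time `now`."""
        self.started[name] = now

    def check(self, now: float) -> None:
        """Periodic check: flag every thread active longer than the threshold."""
        for name, start in self.started.items():
            if name not in self.flagged and now - start > self.threshold_s:
                self.flagged.add(name)
                self.log.append(
                    f"HUNG {name} active={now - start:.0f}s total={len(self.flagged)}"
                )

    def end(self, name: str, now: float) -> None:
        """Work finished; a previously flagged thread is reported as a false alarm."""
        start = self.started.pop(name)
        if name in self.flagged:
            self.flagged.remove(name)
            self.log.append(
                f"CLEARED {name} active={now - start:.0f}s total={len(self.flagged)}"
            )


monitor = ThreadMonitor(threshold_s=600)          # illustrative threshold
monitor.begin("WebContainer : 1", now=0)
monitor.begin("WebContainer : 2", now=0)
monitor.end("WebContainer : 2", now=5)            # finishes quickly, never flagged
monitor.check(now=700)                            # thread 1 exceeds the threshold
monitor.end("WebContainer : 1", now=900)          # completes after all: false alarm
for line in monitor.log:
    print(line)
```

Each event carries the same three pieces of information the text names: the thread name, its approximate active time, and the number of threads still considered hung.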


Automatic adjustment of the hang time threshold

If the thread monitor determines that too many false alarms are issued (determined by the number of pairs of hang and clear messages), it can automatically adjust the threshold. When this adjustment occurs, the following message is written to the log:

where thresholdtime is the time, in seconds, that a thread can be active before it is considered hung.
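One way to picture the automatic adjustment is as a counter of false alarms that, once it reaches a limit, raises the threshold and resets. The sketch below is an assumption-laden illustration, not the product's algorithm: the function name, the limit of 100 false alarms, and the growth factor of 1.5 are placeholder values chosen for the example.

```python
def adjust_threshold(threshold_s, false_alarms, max_false_alarms=100, factor=1.5):
    """Return (threshold, false_alarm_count) after one adjustment check.

    Hypothetical helper: the limit and growth factor are illustrative
    assumptions, not values taken from this document.
    """
    if false_alarms >= max_false_alarms:
        # Too many hang/clear pairs: lengthen the threshold and restart the count.
        return threshold_s * factor, 0
    # Below the limit: leave both values unchanged.
    return threshold_s, false_alarms


print(adjust_threshold(600, 3))      # below the limit, nothing changes
print(adjust_threshold(600, 100))    # limit reached, threshold grows
```

Resetting the count after each increase means the threshold only grows again if false alarms keep accumulating at the new setting.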

We can prevent WebSphere Application Server from automatically adjusting the hang time threshold. See Configure the hang detection policy.


System Alarms

An application server monitors the activity of threads on which system alarms execute. When a system alarm thread has been active longer than the time defined by the alarm thread monitor threshold, the application server logs the following warning in the system log. This message indicates the name of the thread that is not responding, the length of time that the thread has already been active, and the exception stack of the thread, which identifies the system component.

In this message, threadname is the name that appears in a JVM thread dump, n is approximately how long the thread was active, totalthreads is an overall assessment of the system threads, and threadstack is the exception stack of the thread.

If the alarm work eventually completes, the following message is written to the system log. This message identifies the thread that produced the false alarm.

UTLS0009W: Alarm Thread threadname was previously reported to be hung but has 
   completed.  It was active for approximately n milliseconds.

In this message, threadname is the name that appears in a JVM thread dump, and n is approximately how long the thread was active.
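When reviewing logs, it can help to pull threadname and n out of UTLS0009W entries automatically. The following is a minimal sketch that assumes the message text matches the wording shown above; the function name, the sample thread name `Alarm : 2`, and the sample duration are made up for illustration, and real log lines may carry extra prefixes such as timestamps.

```python
import re

# Matches the UTLS0009W wording shown above, after whitespace is normalized.
UTLS0009W = re.compile(
    r"UTLS0009W: Alarm Thread (?P<thread>.+?) was previously reported to be hung "
    r"but has completed\. It was active for approximately (?P<ms>\d+) milliseconds\."
)


def parse_utls0009w(text):
    """Return (threadname, milliseconds) or None if the text does not match."""
    # The message may wrap across lines, so collapse all runs of whitespace.
    normalized = " ".join(text.split())
    match = UTLS0009W.search(normalized)
    if match is None:
        return None
    return match.group("thread"), int(match.group("ms"))


# Hypothetical sample entry, wrapped the same way as the message above.
sample = (
    "UTLS0009W: Alarm Thread Alarm : 2 was previously reported to be hung but has \n"
    "   completed.  It was active for approximately 12345 milliseconds."
)
print(parse_utls0009w(sample))
```

Collapsing whitespace before matching keeps the pattern readable and tolerates the line wrap and double space that appear in the message layout.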

Typically, system alarms do not process heavy loads because such activity might slow the processing of later system alarms, which in turn might impact server behavior. The UTLS0008W message is intended to help IBM Support personnel investigate problems potentially caused by system alarm behavior.

All of the system alarms share a common alarm thread pool. The properties that govern the monitoring of this thread pool can be tuned using the administrative console. We can reduce the frequency at which WebSphere generates hung alarm thread messages by adjusting the alarm thread monitor check interval or threshold. See the topic Configure the hang detection policy for a description of how to change these settings.


  • Configure the hang detection policy
  • Monitor performance with Tivoli Performance Viewer
  • Example: Adjusting the thread monitor to affect server hang detection