Hung threads in J2EE applications
WebSphere Application Server (WAS) monitors thread activity and performs diagnostic actions when a thread appears to be hung.
When WAS detects that a thread has been active longer than the time defined by the thread monitor threshold, the application server takes the following actions:
- Logs a warning that indicates the name of the thread that is hung and how long it has been active:
WSVR0605W: Thread threadname has been active for hangtime and may be hung. There are totalthreads threads in total in the server that may be hung.
where:
- threadname is the name that appears in a JVM thread dump
- hangtime gives an approximation of how long the thread has been active
- totalthreads is the number of threads in the server that may be hung, which gives an overall assessment of the system threads
- Issues a JMX notification.
This notification enables third-party tools to catch the event and take appropriate action, such as:
- Triggering a JVM thread dump of the server
- Issuing an electronic page or e-mail
The following JMX notification events are defined in the com.ibm.websphere.management.NotificationConstants class:
- TYPE_THREAD_MONITOR_THREAD_HUNG
Triggered by the detection of a (potentially) hung thread.
- TYPE_THREAD_MONITOR_THREAD_CLEAR
Triggered if a thread that was previously reported as hung completes its work. See false alarms.
- Triggers changes in the PMI data counters.
These PMI data counters are used by various tools, such as the Tivoli Performance Viewer (TPV), to provide a performance analysis.
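As an illustration of how a third-party tool might react to the hung and clear notifications described above, the following is a minimal sketch of a JMX listener that tracks how many threads are currently suspected of being hung. It uses only the standard javax.management API. The class name HungThreadListener and the type strings in main are placeholders invented for this example; in a WAS environment one would presumably pass NotificationConstants.TYPE_THREAD_MONITOR_THREAD_HUNG and TYPE_THREAD_MONITOR_THREAD_CLEAR instead. Registering the listener with the server MBean is omitted.

```java
import javax.management.Notification;
import javax.management.NotificationListener;
import java.util.concurrent.atomic.AtomicInteger;

/** Tracks how many threads are currently reported as (potentially) hung. */
class HungThreadListener implements NotificationListener {
    private final String hungType;   // notification type for "thread may be hung"
    private final String clearType;  // notification type for "thread completed after all"
    private final AtomicInteger suspected = new AtomicInteger();

    HungThreadListener(String hungType, String clearType) {
        this.hungType = hungType;
        this.clearType = clearType;
    }

    @Override
    public void handleNotification(Notification n, Object handback) {
        if (hungType.equals(n.getType())) {
            suspected.incrementAndGet();
            // A real tool might trigger a thread dump or page an operator here.
        } else if (clearType.equals(n.getType())) {
            // Never drop below zero if a clear arrives without a matching hang.
            suspected.updateAndGet(v -> Math.max(0, v - 1));
        }
    }

    int suspectedHungThreads() {
        return suspected.get();
    }

    public static void main(String[] args) {
        // Placeholder type strings (assumption, not the real WAS values).
        HungThreadListener l =
            new HungThreadListener("example.thread.hung", "example.thread.clear");
        l.handleNotification(new Notification("example.thread.hung", "server1", 1L), null);
        l.handleNotification(new Notification("example.thread.hung", "server1", 2L), null);
        l.handleNotification(new Notification("example.thread.clear", "server1", 3L), null);
        System.out.println(l.suspectedHungThreads()); // prints 1
    }
}
```

Taking the type strings as constructor arguments keeps the sketch compilable without the WAS libraries on the classpath.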
False Alarms
If a thread that was reported as hung actually completes its work, a second set of messages, notifications, and PMI events is produced to identify the false alarm. The following message is written to the log:
WSVR0606W: Thread threadname was previously reported to be hung but has completed. It was active for approximately hangtime. There are totalthreads threads in total in the server that still may be hung.
where threadname is the name that appears in a JVM thread dump, hangtime gives an approximation of how long the thread has been active, and totalthreads gives an overall assessment of the system threads.
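Because each false alarm shows up as a WSVR0605W message followed later by a WSVR0606W message for the same thread, a log scan can separate threads that recovered from threads that are still suspect. The sketch below pairs the two messages by thread name. The regular expressions are built from the message templates quoted in this topic; the exact rendering of threadname and hangtime in a real log may differ, so treat the patterns, the class name HangLogScanner, and the sample lines as assumptions to adapt.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Pairs WSVR0605W (hung) and WSVR0606W (cleared) log messages by thread name. */
class HangLogScanner {
    // Patterns derived from the message templates; real log lines may vary.
    private static final Pattern HUNG = Pattern.compile(
        "WSVR0605W: Thread (.+?) has been active for .+? and may be hung\\.");
    private static final Pattern CLEARED = Pattern.compile(
        "WSVR0606W: Thread (.+?) was previously reported to be hung but has completed\\.");

    /** Returns the names of threads reported hung and never cleared, in log order. */
    static Set<String> stillHung(List<String> lines) {
        Set<String> hung = new LinkedHashSet<>();
        for (String line : lines) {
            Matcher m = HUNG.matcher(line);
            if (m.find()) {
                hung.add(m.group(1));
                continue;
            }
            m = CLEARED.matcher(line);
            if (m.find()) {
                hung.remove(m.group(1)); // false alarm: the thread finished its work
            }
        }
        return hung;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines written to match the templates above.
        List<String> log = List.of(
            "WSVR0605W: Thread worker-1 has been active for 700 seconds and may be hung."
                + " There are 1 threads in total in the server that may be hung.",
            "WSVR0605W: Thread worker-2 has been active for 650 seconds and may be hung."
                + " There are 2 threads in total in the server that may be hung.",
            "WSVR0606W: Thread worker-1 was previously reported to be hung but has completed."
                + " It was active for approximately 720 seconds."
                + " There are 1 threads in total in the server that still may be hung.");
        System.out.println(stillHung(log)); // prints [worker-2]
    }
}
```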
Automatic adjustment of the hang time threshold
If the thread monitor determines that too many false alarms are issued (determined by the number of pairs of hang and clear messages), it can automatically adjust the threshold. When this adjustment occurs, the following message is written to the log:
WSVR0607W: Too many thread hangs have been falsely reported. The hang threshold is now being set to thresholdtime.
where thresholdtime is the time (in seconds) in which a thread can be active before it is considered hung.
You can prevent WAS from automatically adjusting the hang time threshold.
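To make the adjustment mechanism concrete, here is a rough sketch of how a monitor could raise its threshold after counting hang and clear pairs. This is not the WAS implementation: this topic does not state how many false alarms trigger an adjustment or by how much the threshold grows, so falseAlarmLimit and growthFactor are illustrative parameters, and treating a limit of zero as "adjustment disabled" is likewise an assumption made for the example.

```java
/** Illustrative model of raising a hang threshold after repeated false alarms. */
class HangThresholdAdjuster {
    private final int falseAlarmLimit;  // hypothetical: pairs tolerated before adjusting
    private final double growthFactor;  // hypothetical: multiplier applied on adjustment
    private int thresholdSeconds;       // time a thread may be active before it is considered hung
    private int falseAlarms;

    HangThresholdAdjuster(int thresholdSeconds, int falseAlarmLimit, double growthFactor) {
        this.thresholdSeconds = thresholdSeconds;
        this.falseAlarmLimit = falseAlarmLimit;
        this.growthFactor = growthFactor;
    }

    /** Records one hang/clear pair; returns true if the threshold was raised. */
    boolean recordFalseAlarm() {
        if (falseAlarmLimit <= 0) {
            return false; // assumption: a non-positive limit disables adjustment
        }
        falseAlarms++;
        if (falseAlarms < falseAlarmLimit) {
            return false;
        }
        falseAlarms = 0; // start counting again against the new threshold
        thresholdSeconds = (int) Math.ceil(thresholdSeconds * growthFactor);
        return true;
    }

    int thresholdSeconds() {
        return thresholdSeconds;
    }

    public static void main(String[] args) {
        HangThresholdAdjuster adjuster = new HangThresholdAdjuster(600, 3, 1.5);
        adjuster.recordFalseAlarm();
        adjuster.recordFalseAlarm();
        boolean raised = adjuster.recordFalseAlarm(); // third pair reaches the limit
        System.out.println(raised + " " + adjuster.thresholdSeconds()); // prints true 900
    }
}
```

Resetting the counter after each adjustment means the new, longer threshold gets a fresh run of observations before it is raised again.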
Related tasks
Configure the hang detection policy
Related Reference
Example: Adjusting the thread monitor to affect server hang detection