Web module or application server dies or hangs


If an application server dies (its process closes spontaneously) or freezes (its Web modules stop responding to new requests), try the following steps:

Isolate the problem by installing Web modules on different servers.
Monitor performance with Tivoli performance viewer ...
Use the performance viewer to determine which resources have reached their maximum capacity, such as...

Java heap memory (indicating a possible memory leak)
database connections

If a particular resource appears to have reached its maximum capacity, review the application code for a possible cause:

If database connections are used and never freed, ensure that application code performs a close() on any opened Connection object within a finally{} block.
If there is a steady increase in servlet engine threads in use, review application synchronized code blocks for possible deadlock conditions.
If there is a steady increase in the JVM heap size, review application code for memory leak opportunities, such as static (class-level) collections, that can prevent objects from ever being garbage-collected.
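The first check in the list above, closing every opened Connection within a finally block, can be sketched as follows. This is a minimal illustration, not code from the product: the class name ConnectionCleanup, the Work interface, and the withConnection method are hypothetical names introduced here, and the DataSource is assumed to be obtained elsewhere (for example, through a JNDI lookup).

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical helper: runs a unit of work and always returns the connection.
final class ConnectionCleanup {

    /** Hypothetical callback holding the application's database logic. */
    interface Work {
        void run(Connection conn) throws SQLException;
    }

    static void withConnection(DataSource ds, Work work) throws SQLException {
        Connection conn = null;
        try {
            conn = ds.getConnection();
            work.run(conn);
        } finally {
            // Runs whether work.run() returned normally or threw an exception,
            // so the connection is never left open.
            if (conn != null) {
                try {
                    conn.close();
                } catch (SQLException closeError) {
                    // Assumption: a failure to close is logged, not rethrown,
                    // so that it does not hide the original exception.
                    System.err.println("close() failed: " + closeError);
                }
            }
        }
    }
}
```

On Java 7 and later, a try-with-resources statement gives the same guarantee with less code. The explicit finally form is shown because it matches the wording of the check above.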

Enable verbose garbage collection on the application server. This feature adds detailed statements to the JVM error log file of the application server about the amount of available and in-use memory.
To set up verbose garbage collection:

Select...
Servers | Application Servers | servername | Process Definition | Java Virtual Machine | Verbose Garbage Collection

Stop and restart the application server.
Browse the log file for garbage collection statements.
Look for statements beginning with "allocation failure". The string indicates that a need for memory allocation has triggered a JVM garbage collection (freeing of unused memory). Allocation failures themselves are normal and not necessarily indicative of a problem. The allocation failure statement is followed by statements showing how many bytes are needed and how many are allocated.
If there is a steady increase in the total amount of free and used memory (the JVM keeps allocating more memory for itself), or if the JVM becomes unable to allocate as much memory as it needs (indicated by the bytes needed statement), there might be a memory leak.
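As a rough illustration of the "steady increase" check described above, the following sketch samples the heap from inside a JVM and tests whether a series of readings keeps rising. It does not parse the verbose garbage collection log, because the log format varies by JVM. The class HeapTrend and its methods are hypothetical, and treating "every reading larger than the previous one" as a steady increase is a simplifying assumption; real heap usage fluctuates, so in practice compare readings taken after garbage collections over a long period.

```java
// Hypothetical helper for judging whether heap readings keep growing.
final class HeapTrend {

    /** Heap currently claimed by the JVM (free plus used), in bytes. */
    static long totalBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** Heap currently in use: claimed heap minus its free portion, in bytes. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /**
     * Simplified test: true only if each sample is larger than the one
     * before it. Fewer than two samples cannot show a trend.
     */
    static boolean isSteadilyIncreasing(long[] samples) {
        if (samples.length < 2) {
            return false;
        }
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] <= samples[i - 1]) {
                return false;
            }
        }
        return true;
    }
}
```

The samples can come from totalBytes() or usedBytes() called at intervals, or from figures copied by hand out of the garbage collection statements.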

If either the performance viewer or verbose garbage collection output indicates that the application server is running out of memory, one of the following problems might be present:

There is a memory leak in application code that you must address.
To pinpoint the cause of a memory leak, enable the RunHProf function in...

Go ...
Servers | Application Servers | servername | Process Definition | Java Virtual Machine pane

In the same JVM pane, set the HProf Arguments field to a value similar to depth=20,file=heapdmp.txt. This value records stack traces to a maximum depth of 20 levels, and saves the heapdump output to...
install_root/bin/heapdmp.txt

Save the settings.
Stop and restart the application server.
Reenact the scenario or access the resource that causes the hang or crash.
Stop the application server. If this is not possible, wait until the hang or crash happens again and stop the application server.
Examine the file into which the heapdump was saved. For example, examine the install_root/bin/heapdmp.txt file:

Search for the string, "SITES BEGIN". This finds the location of a list of Java objects in memory, which shows the amount of memory allocated to the objects.
The list contains an entry for each place in the JVM where memory was allocated. Each entry records the type of object that was instantiated and the identifier of a trace stack, listed elsewhere in the dump, that shows the Java method that made the allocation.
The list of Java objects is in descending order by number of bytes allocated.
Depending on the nature of the leak, the problem class should show up near the top of the list, but this is not always the case. Look throughout the list for large amounts of memory or frequent instances of the same class being instantiated. In the latter case, use the ID in the trace stack column to identify allocations occurring repeatedly in the same class and method.
Examine the source code indicated in the related trace stacks for the possibility of memory leaks.

The default maximum heap size of the application server needs to be increased.
There is a defect in the WAS product that you must either report or correct by installing a fix or fix pack from a maintenance download. Contact IBM support.
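For the first cause in the list above, a leak in application code, a common pattern is a static (class-level) collection that only grows. The sketch below contrasts such a collection with a size-bounded alternative. All names (StaticCollectionLeak, MAX_ENTRIES, the handleRequest methods) are hypothetical, and the bounded cache built on LinkedHashMap.removeEldestEntry is one possible remedy chosen for illustration, not a recommendation from the product documentation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example of a leaking static collection and a bounded one.
final class StaticCollectionLeak {

    // Leak: entries are added on every request and never removed, so the
    // objects stay reachable and can never be garbage-collected.
    private static final List<byte[]> LEAKY = new ArrayList<>();

    // Assumed upper limit for the bounded variant.
    static final int MAX_ENTRIES = 100;

    // Bounded: the least recently accessed entry is evicted once the
    // map holds more than MAX_ENTRIES entries.
    private static final Map<Integer, byte[]> BOUNDED =
        new LinkedHashMap<Integer, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
                return size() > MAX_ENTRIES;
            }
        };

    // Methods are synchronized because neither collection is thread-safe.
    static synchronized void handleRequestLeaky(int requestId) {
        LEAKY.add(new byte[1024]);
    }

    static synchronized void handleRequestBounded(int requestId) {
        BOUNDED.put(requestId, new byte[1024]);
    }

    static synchronized int leakySize() {
        return LEAKY.size();
    }

    static synchronized int boundedSize() {
        return BOUNDED.size();
    }
}
```

After many requests, the leaky list holds one entry per request, while the bounded map never holds more than MAX_ENTRIES entries. In an HProf dump, the leaky variant would be expected to show up as a large or frequently repeated allocation at the same trace stack.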

If an application server spontaneously dies, look for a Java thread dump file. The JVM creates the file in the product directory structure, with a name like javacore[number].txt.
Force an application server to create a thread dump (or javacore). Here is the process for forcing a thread dump, which is different from the process in earlier releases of the product:

Using the wsadmin command prompt, get a handle to the problem application server:
wsadmin>set jvm [$AdminControl completeObjectName type=JVM,process=server1,*]

Generate the thread dump:
wsadmin>$AdminControl invoke $jvm dumpThreads

Look for an output file in the installation root directory with a name like...
javacore.date.time.id.txt

Browse the thread dump for clues:

If the JVM creates the thread dump as it closes (the thread dump is not manually forced), there might be "error" or "exception information" strings at the beginning of the file. These strings indicate the thread that caused the application server to die.
The thread dump contains a snapshot of each thread in the process, starting in the section labeled "Full thread dump."

Look for threads with a description that contains "state:R".
Such threads were active and running when the dump was forced or the process exited.
Look for multiple threads in the same Java application code source location.
Multiple threads from the same location might indicate a deadlock condition (multiple threads waiting on a monitor) or an infinite loop, and help identify the application code with the problem.
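The idea of looking for multiple threads in the same source location can also be sketched in code that runs inside a JVM. The example below is an in-process approximation, not a parser for javacore files: it uses the standard Thread.getAllStackTraces() call and counts threads by the innermost stack frame that belongs to a given package. The class ThreadLocationSummary, the method countByLocation, and the packagePrefix parameter are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: counts live threads per code location.
final class ThreadLocationSummary {

    /**
     * For each live thread, finds the innermost stack frame whose class
     * name starts with packagePrefix and counts the thread under
     * "ClassName.methodName". Threads with no matching frame are skipped.
     */
    static Map<String, Integer> countByLocation(String packagePrefix) {
        Map<String, Integer> counts = new TreeMap<>();
        for (StackTraceElement[] stack : Thread.getAllStackTraces().values()) {
            for (StackTraceElement frame : stack) {
                if (frame.getClassName().startsWith(packagePrefix)) {
                    String location = frame.getClassName() + "." + frame.getMethodName();
                    counts.merge(location, 1, Integer::sum);
                    break; // count each thread once, at its innermost matching frame
                }
            }
        }
        return counts;
    }
}
```

For example, calling countByLocation("com.example.") with the application's own package prefix (the prefix shown is a placeholder) returns a map in which a location with an unusually high count is a candidate for the deadlock or infinite loop described above.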

If these steps do not fix your problem, search to see if the problem is known and documented, using the methods identified in the available online support (hints and tips, technotes, and fixes) topic. If you find that your problem is not known, contact IBM support to report it.
For current information available from IBM Support on known problems and their resolution, see the IBM Support page.
IBM Support has documents that can save you time gathering information needed to resolve this problem. Before opening a PMR, see the IBM Support page.

Related Tasks

Troubleshooting by task
Troubleshooting by component