Health management
With the health management feature in Liberty, we can take a policy-driven approach to monitoring the application server environment and respond when unhealthy conditions are detected.
We define health policies, each of which pairs a health condition to monitor in the environment with the health actions to take if that condition is met.
Health conditions
Health conditions define the variables we want to monitor in the environment. The condition element defines what behavior can trigger this health policy. Only one condition element can be defined per health policy. We can choose from the following predefined health conditions:
- Excessive request timeout condition
  Specifies a percentage of HTTP requests that can time out. When the percentage of timed-out requests exceeds the defined value, the health actions run. The timeout value depends on the environment configuration.
  <excessiveRequestTimeout timeoutPercentage="5"/>
- Excessive response time condition
  <excessiveResponseTime responseTime="10s"/>
  Note: Requests that exceed the timeout value configured for the excessive request timeout condition are not counted toward this health condition. For example, if the default timeout value is 60 seconds, then any request that exceeds 60 seconds times out and is not included in the average response time calculation. This restriction applies even if we do not define an excessive request timeout condition.
- Excessive memory usage condition
  <excessiveMemoryUsage heapSizePercentage="85" timePeriod="5m"/>
- Memory leak condition
  <memoryLeak/>
Important:
- Dynamic Routing must be enabled to use either the excessive request timeout or excessive response time conditions.
- The healthAnalyzer-1.0 feature must be enabled in the server.xml file to use either the excessive memory usage or memory leak conditions. This feature can be enabled only for collective members.
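As a rough sketch of the second requirement, the server.xml file of a collective member might enable the feature as shown here. The featureManager syntax is standard Liberty configuration; any other features that the member already uses are omitted and would sit alongside this entry.

```xml
<server>
    <!-- Collective member: required for the excessive memory usage
         and memory leak conditions -->
    <featureManager>
        <feature>healthAnalyzer-1.0</feature>
    </featureManager>
</server>
```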
Health actions
Health actions define the activities to perform when a health condition is met. Each action is defined by an <action> element, and the action attribute determines which action is taken. Multiple actions can be defined for each health policy, and they run in the order in which they are specified in the policy. The following table lists the health actions supported in Liberty server environments:
Health action | Liberty servers that run in the same collective controller |
---|---|
Restart server. | Supported |
Take thread dumps. | Supported |
Take Java virtual machine (JVM) heap dumps. | Supported for servers that are running on the IBM JRE or Java Developer Kit |
Enter server into maintenance mode. | Supported |
Exit server out of maintenance mode. | Supported |
<action action="generateThreadDump"/>
<action action="generateHeapDump"/>
<action action="restartServer"/>
<action action="enterMaintenanceMode"/>
<action action="exitMaintenanceMode"/>
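Because actions run in the order in which they are listed, one plausible arrangement is to capture diagnostics before restarting the server. The following is a minimal sketch, not a definitive configuration: it assumes that the condition, action, and target elements are nested together inside a single healthPolicy element, and it reuses the sample attribute values shown on this page.

```xml
<healthPolicy>
    <!-- Condition: sample values from this page (85% of heap, 5-minute period) -->
    <excessiveMemoryUsage heapSizePercentage="85" timePeriod="5m"/>

    <!-- Actions run top to bottom: collect dumps first, restart last -->
    <action action="generateThreadDump"/>
    <action action="generateHeapDump"/>
    <action action="restartServer"/>

    <!-- Target: placeholder cluster name -->
    <cluster clusterName="someCluster"/>
</healthPolicy>
```

Placing restartServer last keeps the dumps tied to the unhealthy process; if it came first, the later dumps would presumably describe the freshly restarted server instead.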
Health targets
Target elements define the scope of the topology being monitored for the condition. Three target types are available:
- A host
  <host hostName="someHost"/>
- A cluster
  <cluster clusterName="someCluster"/>
- A server
  <server hostName="Host" wlpUsrDirectory="/opt/ibm/liberty/wlp" serverName="Server"/>
Each target type has a unique element used to define it within the healthPolicy element. More than one target can be specified per health policy.
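To illustrate several targets in one policy, the sketch below combines a host target and a server target under a single condition. It is an assumption-laden example rather than a reference configuration: the host, directory, and server values are the placeholders used on this page, and the choice of condition and action is arbitrary.

```xml
<healthPolicy>
    <!-- Only one condition element is allowed per policy;
         this one requires Dynamic Routing to be enabled -->
    <excessiveResponseTime responseTime="10s"/>

    <action action="generateThreadDump"/>

    <!-- Two targets in the same policy: a whole host and one specific server -->
    <host hostName="someHost"/>
    <server hostName="Host" wlpUsrDirectory="/opt/ibm/liberty/wlp" serverName="Server"/>
</healthPolicy>
```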