Custom health condition subexpression builder
Use the custom health condition subexpression builder to define a custom health condition for our health policy. Use the build subexpression utility to build complex rule conditions from subexpressions using AND, OR, NOT and parenthetical grouping. The subexpression builder validates the rule when applying the changes, and alerts you to mismatched parentheses and unsupported logic operators.
From the admin console, click...
Operational policies > Health policies > New
If we choose a custom health condition, the Run reaction plan when field is displayed. Click Subexpression builder to build the custom health condition.
Select the properties to include in our custom health condition, and click Generate subexpression. The subexpression value displays. To append the subexpression to our custom health condition, click Append.
Logical operator
Operator used to append this subexpression to the previous subexpression in the custom health condition.
- and
- Both subexpressions that are around the and operator must be true for actions to be taken on the health policy.
- or
- At least one of the two subexpressions that are around the or operator must be true for actions to be taken on the health policy.
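As a rough illustration of how subexpressions combine, the following Python sketch joins two subexpressions with a logical operator and runs a simple balanced-parentheses check of the kind the builder is described as performing. The helper names append_subexpression and parentheses_balanced are hypothetical and are not part of the product. The exact text that the builder generates, including operator casing, may differ.

```python
def parentheses_balanced(expression: str) -> bool:
    """Return True when every '(' has a matching ')' in the right order."""
    depth = 0
    for ch in expression:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared before its '('
                return False
    return depth == 0


def append_subexpression(condition: str, logical_operator: str, subexpression: str) -> str:
    """Append a subexpression to a condition with AND or OR (hypothetical helper)."""
    op = logical_operator.upper()
    if op not in ("AND", "OR"):
        raise ValueError(f"unsupported logical operator: {logical_operator}")
    if not condition:  # first subexpression: nothing to join with yet
        return subexpression
    return f"{condition} {op} {subexpression}"


cpu = "PMIMetric_FromServerStart$systemModule$cpuUtilization > 90L"
slow = "PMIMetric_FromLastInterval$webAppModule$responseTime > 200L"
condition = append_subexpression(append_subexpression("", "and", cpu), "or", slow)
print(condition)
print(parentheses_balanced("(" + condition + ")"))
```

The check above does not account for parentheses that are escaped with a backslash inside an operand, so treat it as a starting point only.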
Operand
The PMI metric: From server start
Uses an average of the reported values from the time that the server started.
The PMI metric: From last interval
Uses an average of the reported values in the last interval. The interval is the length of the health controller cycle.
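The difference between the two operands can be pictured with a small Python sketch. It assumes that reported values arrive as (timestamp, value) pairs and that the last interval is simply the most recent health controller cycle. The sample numbers, the 120 second cycle length, and the function names are made up for illustration and do not reflect how the product stores PMI data.

```python
from statistics import mean


def average_from_server_start(samples):
    """Average of every reported value since the server started."""
    return mean(value for _, value in samples)


def average_from_last_interval(samples, now, cycle_seconds):
    """Average of the values reported within the most recent controller cycle."""
    recent = [value for ts, value in samples if now - cycle_seconds < ts <= now]
    return mean(recent) if recent else None


# (seconds since server start, reported value); made-up numbers
samples = [(60, 20), (120, 30), (180, 40), (240, 90), (300, 100)]

print(average_from_server_start(samples))             # 56
print(average_from_last_interval(samples, 300, 120))  # 95
```

With these numbers a threshold of 90 would be exceeded by the last-interval average but not by the average from server start, which is why the choice of operand matters for short-lived spikes.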
Both PMI operands have the following PMI modules:
- Connection pool module (JDBC):
  - Faults: Number of connection timeouts in the pool.
  - Number of creates: Total number of connections created.
  - Percent used: Average percent of the pool that is in use. The value is based on the total number of configured connections in the connection pool, not the current number of connections.
  - Prepared statement cache discards: The number of statements that are discarded because the cache is full.
  - Number of destroys: Total number of connections that are closed.
  - Pool size: Size of the connection pool.
  - Number of connection handles: The number of connection objects that are discarded because the cache is full.
  - Concurrent waiters: Average number of threads that are waiting for a connection at the same time.
  - Number of managed connections: The number of ManagedConnection objects in use for a particular connection pool. This metric applies to v5.0 data sources only.
  - Percent maxed: Average percent of the time that all connections are in use.
  - JDBC time (milliseconds): Average time, in milliseconds, that is spent running JDBC calls. This time includes time that is spent in the JDBC driver, network, and database. This metric applies to v5.0 data sources only.
  - Average use time (milliseconds): The average time in milliseconds that a connection is used. The value is the difference between the time at which the connection is allocated and returned. This value includes the JDBC operation time.
  - Number of returns: Total number of connections that are returned to the pool.
  - Free pool size: Number of free connections in the pool.
  - Number of allocates: Number of connections that are allocated.
  - Average wait time (milliseconds): The average waiting time in milliseconds.
- System module:
These metrics can be used on servers running WAS or other middleware servers. The system module has the following metrics:
  - CPU utilization: Since server start: The average CPU utilization since the server started.
  - CPU utilization: Last interval: The average CPU utilization since the last query.
  - Free memory (KB): Specifies a snapshot of free memory, in kilobytes.
- Process module (for other servers):
These metrics can be used on servers running WAS or other middleware servers. The process module has the following metrics:
  - Process resident memory (KB): The process resident memory, in kilobytes.
  - Process CPU utilization: Since server start: Process CPU utilization since the server start.
  - Process CPU utilization: Last interval: Process CPU utilization in the last interval.
  - Process total memory (KB): Process total memory, in kilobytes.
- EJB module:
  - Average concurrent active methods: The average number of methods that are active at the same time.
  - Total method calls: Number of calls to the remote methods of the bean.
  - Methods submodule: Method loads: The method load in the methods submodule.
  - Stores: Number of times that the bean data was stored in persistent storage.
  - Message count: Number of messages that were delivered to the onMessage method of the bean. This message count applies to message-driven beans.
  - Average concurrent live beans: The average number of beans that are alive at the same time.
  - Removes: Number of times that beans were removed.
  - Returns to pool: Number of calls that are returning an object to the pool.
  - Passivates: Number of times that beans were moved to passivated state.
  - Gets from pool: Number of calls that are retrieving an object from the pool.
  - Drains from pool: Number of times that the daemon found the pool idle and attempted to clean the pool.
  - Ready count: Number of bean instances in ready state.
  - Average create time (milliseconds): The average time, in milliseconds, to run a bean create call. This time includes the time to load the bean.
  - Returns discarded: Number of times that the returning object was discarded because the pool was full.
  - Activates: Number of times that beans were activated.
  - Server session usage (percentage): The percentage of the ServerSession pool that is being used. This metric applies to message-driven beans.
  - Loads: Number of times that bean data was loaded from persistent storage.
  - Message backout count: Number of backed out messages that failed to be delivered to the onMessage method of the bean. This metric applies to message-driven beans.
  - Methods submodule: Method response time (milliseconds): Method response time, in milliseconds.
  - Passivation count: Number of beans in a passivated state.
  - Pool size: Average number of objects in the pool.
  - Load time (milliseconds): Average time in milliseconds for loading the bean data from persistent storage.
  - Average remove time (milliseconds): The average time, in milliseconds, to run a bean remove call. This time includes the time at the database.
  - Gets found: Number of times that a retrieve call found an available object in the pool.
  - Activation time: Average time, in milliseconds, for activating a bean object.
  - Average drain size: Average number of objects that are discarded in each drain.
  - Methods submodule: Method calls: The number of method calls.
  - Destroys: Number of times that beans were destroyed.
  - Average server session wait time (milliseconds): Average time, in milliseconds, required to obtain a server session from the pool. This metric applies to message-driven beans.
  - Creates: Number of times that beans were created.
  - Average method response time (milliseconds): Average response time, in milliseconds, that elapses for remote method calls on the bean.
  - Instantiates: Number of times that beans were instantiated.
  - Store time (milliseconds): Average time, in milliseconds, for storing the bean data to persistent storage.
  - Passivation time (milliseconds): The average time, in milliseconds, for bean object passivation to occur.
- Web application module:
  - Number of errors: Total number of times an error was received from the servlet or JavaServer Pages (JSP) files.
  - Total requests: Total number of requests that a servlet processed.
  - Response time (milliseconds): The average response time, in milliseconds, in which servlet requests finish.
  - Concurrent requests: Number of requests that are currently processing.
  - Number of reloads: Number of servlets that are reloaded.
  - Number of loaded servlets: Number of servlets that are loaded.
- JVM runtime module:
These metrics can be used on servers running WAS or other middleware servers. The Java Virtual Machine (JVM) runtime module has the following metrics:
  - Free memory (KB): Free memory, in kilobytes, in the JVM runtime.
  - Up time (seconds): Amount of time, in seconds, that the JVM has been running.
  - Total memory (KB): Total memory, in kilobytes, in the JVM runtime.
  - Used memory (KB): Amount of used memory, in kilobytes, in the JVM runtime.
- Thread pool module:
  - Number of thread stops: Number of threads that are declared as stopped.
  - Percent maxed: Average percent of the time that all threads are in use.
  - Average active time (milliseconds): The average time in milliseconds that the threads are in active state.
  - Thread destroys: Total number of threads that are destroyed.
  - Pool size: Average number of threads in a pool.
  - Thread creates: Total number of threads created.
  - Concurrently hung threads: Number of threads that are concurrently stopped.
  - Number of cleared thread stops: The number of thread stops that are cleared.
  - Active threads: Number of threads that are concurrently active.
Subexpression format for PMI Metric: From server start:
PMIMetric_FromServerStart$moduleName$metricName operator LongValueL (with "L" suffix)
Example:
PMIMetric_FromServerStart$systemModule$cpuUtilization > 90L
Subexpression format for PMI Metric: Last reported interval:
PMIMetric_FromLastInterval$moduleName$metricName operator LongValueL (with "L" suffix)
Example:
PMIMetric_FromLastInterval$webAppModule$responseTime > 200L
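The two PMI formats above differ only in their prefix, so a small helper can assemble either one. The following Python sketch is an assumption-laden convenience, not a product API: the function name and the scope keys are hypothetical, and the module and metric names are passed through unchecked because only systemModule, cpuUtilization, webAppModule, and responseTime are confirmed by the examples above.

```python
# Prefixes taken from the two documented formats.
PMI_OPERANDS = {
    "server_start": "PMIMetric_FromServerStart",
    "last_interval": "PMIMetric_FromLastInterval",
}

# Comparison operators that take a single long value.
# BETWEEN and IN are left out because their value syntax is not shown here.
LONG_OPERATORS = {"=", "<>", ">", ">=", "<", "<="}


def pmi_subexpression(scope, module_name, metric_name, operator, value):
    """Build a PMI subexpression string with the required L suffix."""
    if scope not in PMI_OPERANDS:
        raise ValueError(f"unknown scope: {scope}")
    if operator not in LONG_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    return f"{PMI_OPERANDS[scope]}${module_name}${metric_name} {operator} {int(value)}L"


print(pmi_subexpression("server_start", "systemModule", "cpuUtilization", ">", 90))
print(pmi_subexpression("last_interval", "webAppModule", "responseTime", ">", 200))
```

Appending the L suffix in one place avoids the most common hand-typing mistake, a long value without its suffix.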
- ODR server level metric: From server start
Use a subset of server level metrics that the on demand router publishes. These metrics are cumulative and reported since the server start.
- Metric name
Use the following server level metrics:
- Departs: Number of requests that are dispatched from the queue to the server during the reported interval. A request is considered to be dispatched just once, even if it fails at the first server and is retried at another time. The next event after dispatch is return.
- Response time (milliseconds): Specifies an average response time for requests. To calculate this average, sum the response times for requests that returned from the server to the client during the reported interval. The sum is in units of milliseconds. Divide by the serviced metric value to get the average response time. The response time of a request is the sum of the request's waiting time and service time.
- Currently executing requests: The number of requests running at the end of the reported interval.
- Service time (milliseconds): The average service time for requests. To calculate this average, sum the service times for requests that returned to the server during the reported interval. The sum is in units of milliseconds. Divide by the serviced metric value in the same interval to get the average. The service time of a request is the time from dispatch to return.
- Wait time (milliseconds): Average wait time for requests. To calculate this average, sum the time that each request spent waiting in the queue over the reported interval. The sum is in units of milliseconds. Divide by the number of departs to get the average wait time. Dropped requests do not contribute to this sum.
- Errors: Number of requests that returned from the server with an error indicator during the reported interval.
- Serviced: Specifies the number of requests that returned from server to client during the reported interval.
- Timeouts: Number of requests that returned due to service timeout during the reported interval.
All of these metrics can be used on servers running WAS or other middleware servers.
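The three averages described above are plain ratios of a millisecond sum to a count. The following Python sketch works through them with made-up interval totals. The dictionary keys and the average helper are hypothetical names chosen for this illustration.

```python
def average(total_ms: float, count: int) -> float:
    """Divide a millisecond sum by a count, returning 0.0 for an empty interval."""
    return total_ms / count if count else 0.0


# Made-up totals for one reported interval.
interval = {
    "serviced": 400,             # requests returned from server to client
    "departs": 500,              # requests dispatched from the queue
    "response_ms_sum": 120_000,  # sum of response times of serviced requests
    "service_ms_sum": 100_000,   # sum of service times of serviced requests
    "wait_ms_sum": 25_000,       # sum of queue wait times
}

response_time = average(interval["response_ms_sum"], interval["serviced"])
service_time = average(interval["service_ms_sum"], interval["serviced"])
wait_time = average(interval["wait_ms_sum"], interval["departs"])

print(response_time, service_time, wait_time)  # 300.0 250.0 50.0
```

Because wait time is divided by departs while response time and service time are divided by serviced, the three averages do not have to add up exactly within a single interval.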
Subexpression format:
ODRServerMetric_FromServerStart$metricName operator LongValueL (with "L" suffix)
Example:
ODRServerMetric_FromServerStart$errors > 100L
- ODR server level metric: Last reported interval
We can use the same set of metrics as the ODR server level metric: From server start operand. This operand uses an average of the reported values in the last interval. The interval is the length of the health controller cycle.
- Departs: Number of requests that are dispatched from the queue to the server during the reported interval. A request is considered to be dispatched just once, even if it fails at the first server and is retried at another time. The next event after dispatch is return.
- Response time (milliseconds): Specifies an average response time for requests. To calculate this average, sum the response times for requests that returned from the server to the client during the reported interval. The sum is in units of milliseconds. Divide by the serviced metric value to get the average response time. The response time of a request is the sum of the request's waiting time and service time.
- Currently executing requests: The number of requests running at the end of the reported interval.
- Service time (milliseconds): The average service time for requests. To calculate this average, sum the service times for requests that returned to the server during the reported interval. The sum is in units of milliseconds. Divide by the serviced metric value in the same interval to get the average. The service time of a request is the time from dispatch to return.
- Wait time (milliseconds): Average wait time for requests. To calculate this average, sum the time that each request spent waiting in the queue over the reported interval. The sum is in units of milliseconds. Divide by the number of departs to get the average wait time. Dropped requests do not contribute to this sum.
- Errors: Number of requests that returned from the server with an error indicator during the reported interval.
- Serviced: Specifies the number of requests that returned from server to client during the reported interval.
- Timeouts: Number of requests that returned due to service timeout during the reported interval.
All of these metrics can be used on servers running WAS or other middleware servers.
Subexpression format:
ODRServerMetric_FromLastInterval$metricName operator LongValueL (with "L" suffix)
Example:
ODRServerMetric_FromLastInterval$serviced > 10000L
- ODR cell level metric: From ODR start
Use a subset of cell level metrics that the ODR publishes. These metrics are cumulative and reported since the server start. Use the following set of metrics:
- Departs: Number of requests that are dispatched from the queue to the server during the reported interval. A request is considered to be dispatched just once, even if it fails at the first server and is retried at another time. The next event after dispatch is return. This metric can be used on servers running WAS or other middleware servers.
- Response time (milliseconds): Specifies an average response time for requests. To calculate this average, sum the response times for requests that returned from the server to the client during the reported interval. The sum is in units of milliseconds. Divide by the serviced metric value to get the average response time. The response time of a request is the sum of the request's waiting time and service time. This metric can be used on servers running WAS or other middleware servers.
- Current queue length: Length of the queue at the end of the reported interval.
- Service time (milliseconds): The average service time for requests. To calculate this average, sum the service times for requests that returned to the server during the reported interval. The sum is in units of milliseconds. Divide by the serviced metric value in the same interval to get the average. The service time of a request is the time from dispatch to return. This metric can be used on servers running WAS or other middleware servers.
- Errors: Number of requests that returned from the server with an error indicator during the reported interval. This metric can be used on servers running WAS or other middleware servers.
- Average queue length: Average length of the queue. To calculate this average, sum the queue lengths that are reported at each request arrival before insertion, and divide the sum by the number of arrivals. This metric can be used on servers running WAS or other middleware servers.
- Serviced: Specifies the number of requests that returned from server to client during the reported interval. This metric can be used on servers running WAS or other middleware servers.
- Timeouts: Number of requests that returned due to service timeout during the reported interval. This metric can be used on servers running WAS or other middleware servers.
- Currently executing requests: The number of requests running at the end of the reported interval. This metric can be used on servers running WAS or other middleware servers.
- Arrivals: Number of requests that arrived during the reported interval. The next event, if any, after arrival is either dispatch or drop. This metric can be used on servers running WAS or other middleware servers.
- Queue overflow drops: The number of requests that were initially accepted into the queue, and then ejected from the queue during the reported interval. This metric can be used on servers running WAS or other middleware servers.
- Queue drops: Number of requests that were initially accepted into the queue at some time and then were ejected from the queue at some time during the reported interval. This metric can be used on servers running WAS or other middleware servers.
- Delayed: Number of requests that arrived during the reported interval and were not immediately dispatched or dropped. This metric can be used on servers running WAS or other middleware servers.
- Wait time (milliseconds): Average wait time for requests. To calculate this average, sum the time that each request spent waiting in the queue over the reported interval. The sum is in units of milliseconds. Divide by the number of departs to get the average wait time. Dropped requests do not contribute to this sum. This metric can be used on servers running WAS or other middleware servers.
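The average queue length in the list above is computed per arrival rather than per unit of time. The following Python sketch shows that calculation on a made-up series of observations. The function name and the sample values are hypothetical.

```python
def average_queue_length(queue_lengths_at_arrival):
    """Sum the queue lengths seen at each arrival and divide by the number of arrivals."""
    arrivals = len(queue_lengths_at_arrival)
    return sum(queue_lengths_at_arrival) / arrivals if arrivals else 0.0


# Queue length observed by each arriving request, before the request was inserted.
observed = [0, 1, 2, 2, 1, 0]

print(average_queue_length(observed))  # 1.0
```

Because each arrival contributes one observation, a burst of arrivals into a long queue weighs more heavily than a quiet period with an empty queue.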
Subexpression format:
ODRCellMetric_FromServerStart$metricName operator LongValueL (with "L" suffix)
Example:
ODRCellMetric_FromServerStart$arrivals > 10000L
- ODR cell level metric: Last reported interval
We can use the same set of metrics as the ODR cell level metric: From server start operand. This operand uses an average of the reported values in the last interval. The interval is the length of the health controller cycle.
- Departs: Number of requests that are dispatched from the queue to the server during the reported interval. A request is considered to be dispatched just once, even if it fails at the first server and is retried at another time. The next event after dispatch is return. This metric can be used on servers running WAS or other middleware servers.
- Response time (milliseconds): Specifies an average response time for requests. To calculate this average, sum the response times for requests that returned from the server to the client during the reported interval. The sum is in units of milliseconds. Divide by the serviced metric value to get the average response time. The response time of a request is the sum of the request's waiting time and service time. This metric can be used on servers running WAS or other middleware servers.
- Current queue length: Length of the queue at the end of the reported interval.
- Service time (milliseconds): The average service time for requests. To calculate this average, sum the service times for requests that returned to the server during the reported interval. The sum is in units of milliseconds. Divide by the serviced metric value in the same interval to get the average. The service time of a request is the time from dispatch to return. This metric can be used on servers running WAS or other middleware servers.
- Errors: Number of requests that returned from the server with an error indicator during the reported interval. This metric can be used on servers running WAS or other middleware servers.
- Average queue length: Average length of the queue. To calculate this average, sum the queue lengths that are reported at each request arrival before insertion, and divide the sum by the number of arrivals. This metric can be used on servers running WAS or other middleware servers.
- Serviced: Specifies the number of requests that returned from server to client during the reported interval. This metric can be used on servers running WAS or other middleware servers.
- Timeouts: Number of requests that returned due to service timeout during the reported interval. This metric can be used on servers running WAS or other middleware servers.
- Currently executing requests: The number of requests running at the end of the reported interval. This metric can be used on servers running WAS or other middleware servers.
- Arrivals: Number of requests that arrived during the reported interval. The next event, if any, after arrival is either dispatch or drop. This metric can be used on servers running WAS or other middleware servers.
- Queue overflow drops: The number of requests that were initially accepted into the queue, and then ejected from the queue during the reported interval. This metric can be used on servers running WAS or other middleware servers.
- Queue drops: Number of requests that were initially accepted into the queue at some time and then were ejected from the queue at some time during the reported interval. This metric can be used on servers running WAS or other middleware servers.
- Delayed: Number of requests that arrived during the reported interval and were not immediately dispatched or dropped. This metric can be used on servers running WAS or other middleware servers.
- Wait time (milliseconds): Average wait time for requests. To calculate this average, sum the time that each request spent waiting in the queue over the reported interval. The sum is in units of milliseconds. Divide by the number of departs to get the average wait time. Dropped requests do not contribute to this sum. This metric can be used on servers running WAS or other middleware servers.
Subexpression format:
ODRCellMetric_FromLastInterval$metricName operator LongValueL (with "L" suffix)
Example:
ODRCellMetric_FromLastInterval$timeouts > 100L
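When ODR subexpressions are kept in scripts or reviewed by hand, it can help to split one back into its parts. The following Python sketch does that with a regular expression that assumes the simple shape shown in the ODR examples: a known prefix, a metric name, one comparison operator, and a long value with the L suffix. The function name is hypothetical, and BETWEEN and IN are not handled because their value syntax is not shown in this section.

```python
import re

# Matches the four ODR operand prefixes, a metric name, an operator and a long value.
ODR_PATTERN = re.compile(
    r"^(?P<operand>ODR(?:Server|Cell)Metric_From(?:ServerStart|LastInterval))"
    r"\$(?P<metric>\w+)\s*(?P<operator><>|>=|<=|=|>|<)\s*(?P<value>\d+)L$"
)


def parse_odr_subexpression(text):
    """Split an ODR subexpression into operand, metric, operator and integer value."""
    match = ODR_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognized ODR subexpression: {text!r}")
    parts = match.groupdict()
    parts["value"] = int(parts["value"])
    return parts


print(parse_odr_subexpression("ODRCellMetric_FromLastInterval$timeouts > 100L"))
```

A value without the L suffix is rejected, which mirrors the suffix requirement stated in the format lines.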
- MBean operation metric: Long return type and MBean operation metric: String return type
For Managed Bean (MBean) operation metric operands, specify the Object name query string and the MBean method name. These metrics can be used only on servers running WAS.
- Object name query string
- When creating the Object name query string, escape all special characters with a backslash character.
The value that you enter for the object name query string must have both the process=<process_name> and node=<node> strings specified, or neither. If we specify both process=<process_name> and node=<node>, the backend creates a singleton MBeanSensor sensor that senses the particular MBean on that server and node. If we specify neither, the backend appends the name of the current server as the process name and the name of the current node as the node name, creating an MBeanSensor sensor for each server to which the health policy applies. If we specify only one of the two, process=<process_name> or node=<node>, an error results.
- MBean method name
- Name of the MBean method to invoke.
Subexpression format for long metrics:
MBeanOperationMetric_TypeLong$objectNameQueryString$methodName operator LongValueL (with "L" suffix)
Example for long metrics:
MBeanOperationMetric_TypeLong$WebSphere\:\*\,type\=HealthConditionLanguageInitializer\,node\=hipods3\,process\=nodeagent$getNumberOfOperands > 10L
Subexpression format for string metrics:
MBeanOperationMetric_TypeString$objectNameQueryString$methodName operator StringValue
Example for string metrics:
MBeanOperationMetric_TypeString$WebSphere\:\*\,type\=HealthConditionLanguageInitializer\,node\=hipods3\,process\=nodeagent$getOperands = 't'
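Escaping the object name query string by hand is error-prone, so a small helper may be useful. The following Python sketch assumes that every character other than a letter, digit, or underscore counts as special, which matches the characters escaped in the examples (colon, asterisk, comma, equals sign) but is not confirmed as the complete rule. The function names are hypothetical.

```python
import re


def escape_special(text):
    """Prefix every character other than letters, digits and underscore with a backslash."""
    return re.sub(r"([^A-Za-z0-9_])", r"\\\1", text)


def unescape_special(text):
    """Reverse escape_special by dropping the backslash in front of each escaped character."""
    return re.sub(r"\\(.)", r"\1", text)


query = "WebSphere:*,type=HealthConditionLanguageInitializer,node=hipods3,process=nodeagent"
escaped = escape_special(query)
print(escaped)
print(unescape_special(escaped) == query)
```

Keeping the readable form in source control and escaping it only when the subexpression is generated makes the query string easier to review.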
- MBean attribute metric: Long return type and MBean attribute metric: String return type
The MBean attribute metrics are used for querying an attribute of an MBean rather than invoking a method on the MBean. The operand takes the Object name query string and the attribute name as inputs. These metrics can be used only on servers running WAS.
- Object name query string
- When creating the Object name query string, escape all special characters with a backslash character.
The value that you enter for the object name query string must have both the process=<process_name> and node=<node> strings specified, or neither. If we specify both process=<process_name> and node=<node>, the backend creates a singleton MBeanSensor sensor that senses the particular MBean on that server and node. If we specify neither, the backend appends the name of the current server as the process name and the name of the current node as the node name, creating an MBeanSensor sensor for each server to which the health policy applies. If we specify only one of the two, process=<process_name> or node=<node>, an error results.
- Attribute name
- Attribute that is queried on the MBean.
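The both-or-neither rule for process and node can be checked before a query string is pasted into the builder. The following Python sketch works on the unescaped form of the query string and returns which kind of sensor the rule implies. The function name and the return labels are hypothetical. The real validation happens in the product backend and may be stricter.

```python
def sensor_scope(object_name_query):
    """Classify an unescaped object name query by its process= and node= keys."""
    # Everything after the first ':' is the comma-separated key property list.
    _, _, properties = object_name_query.partition(":")
    keys = {item.split("=", 1)[0].strip() for item in properties.split(",") if "=" in item}
    has_process = "process" in keys
    has_node = "node" in keys
    if has_process and has_node:
        return "singleton"   # one sensor for the named server and node
    if not has_process and not has_node:
        return "per-server"  # one sensor for each server the policy applies to
    raise ValueError("specify both process= and node=, or neither")


print(sensor_scope("WebSphere:*,type=HealthConditionLanguageInitializer,node=hipods3,process=nodeagent"))
print(sensor_scope("WebSphere:*,type=HealthConditionLanguageInitializer"))
```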
Subexpression format for long metrics:
MBeanAttributeMetric_TypeLong$objectNameQueryString$attributeName operator LongValueL (with "L" suffix)
Example for long metrics:
MBeanAttributeMetric_TypeLong$WebSphere\:\*\,type\=HealthConditionLanguageInitializer\,node\=hipods3\,process\=nodeagent$NumberOfOperands > 10L
Subexpression format for string metrics:
MBeanAttributeMetric_TypeString$objectNameQueryString$attributeName operator StringValue
Example for string metrics:
MBeanAttributeMetric_TypeString$WebSphere\:\*\,type\=HealthConditionLanguageInitializer\,node\=hipods3\,process\=nodeagent$OperatorList = 'test'
- URL return code metric
With this operand, we can ping any relative path (URI) on the server that is the target of this policy. The return value is used in the condition expression for the custom health policy.
- URL port number
- Port number to ping.
- URL relative path
- The URL to ping. Any special characters in the string must be escaped with a backslash (\) character.
- Value
- Integer that is the expected return code of the ping.
Use this operand to ping any general purpose URL by selecting the on demand router as the target of the health policy and by setting the appropriate routing rules in the ODR.
This operand can be used with members running WAS or other middleware servers.
Subexpression format:
URLReturnCodeMetric$portNumber$relativePath operator IntValue
Example:
URLReturnCodeMetric$9060$ibm\/console\/login\.do = 200
The URL sensor returns 0 if the Web site cannot be reached:
URLReturnCodeMetric$9060$ibm\/console\/login\.do = 0
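To preview the value that such a condition would compare against, we can reproduce the ping outside the product. The following Python sketch is an approximation built on the standard urllib module, not the product's URL sensor: it requests the relative path over plain HTTP, returns the HTTP status code, and returns 0 when the site cannot be reached. The function name, the timeout, and the use of plain HTTP are assumptions.

```python
import urllib.error
import urllib.request

# Ignore any proxy configured in the environment so the request goes straight to the host.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def url_return_code(host, port, relative_path, timeout=5.0):
    """Return the HTTP status code for the path, or 0 if the site cannot be reached."""
    url = f"http://{host}:{port}/{relative_path.lstrip('/')}"
    try:
        with _OPENER.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with a 4xx or 5xx status
    except OSError:
        return 0           # refused, timed out or unresolvable (URLError is an OSError)
```

For example, url_return_code("localhost", 9060, "ibm/console/login.do") queries the same port and path as the example above, written without the backslash escapes. urllib follows redirects, so the result reflects the final response rather than any intermediate redirect status.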
- External URL return code metric
With this operand, we can enter an absolute URL instead of a relative URL. By doing so, we can periodically send a ping request to targets other than application servers or on demand routers.
Always create a custom health action without specifying a target server.
Example:
ExternalURLReturnCodeMetric$http://foo.bar.com <> 200
If http://foo.bar.com returns a response code other than 200, the health policy is triggered. Use the custom action to run a custom script, for example a script that recycles a web server.
Operator
- Equals (=): The equality operator expresses a case-sensitive match.
- Not Equals (<>): The not equal operator expresses that the operand value is not equal to the value you enter.
- Greater Than (>): The greater-than operator is for use with numbers.
- Greater Than or Equals (>=): The greater-than or equal to operator is for use with numbers.
- Less Than (<): The less-than operator is for use with numbers.
- Less Than or Equals (<=): The less-than or equal to operator is for use with numbers.
- Between (BETWEEN): The value must be between the lower bound and upper bound that you specify.
- In (IN): The value must be in a list of values. We can type in values and add them to a list.
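The operator list can be summarized as a small evaluation table. The following Python sketch shows one way to interpret each operator against a sampled value. The evaluate function is hypothetical, and treating BETWEEN as inclusive of both bounds is an assumption, because the list above does not say whether the bounds are included.

```python
import operator as op

# Simple comparisons map directly onto Python operators.
COMPARATORS = {
    "=": op.eq,   # case-sensitive for strings, as Python string equality is
    "<>": op.ne,
    ">": op.gt,
    ">=": op.ge,
    "<": op.lt,
    "<=": op.le,
}


def evaluate(operator_name, actual, expected):
    """Return True when the sampled value satisfies the operator and expected value."""
    if operator_name in COMPARATORS:
        return COMPARATORS[operator_name](actual, expected)
    if operator_name == "BETWEEN":
        lower, upper = expected          # assumption: both bounds are inclusive
        return lower <= actual <= upper
    if operator_name == "IN":
        return actual in expected        # expected is a list of allowed values
    raise ValueError(f"unsupported operator: {operator_name}")


print(evaluate("=", "test", "Test"))       # False
print(evaluate("<>", 404, 200))            # True
print(evaluate("BETWEEN", 50, (10, 90)))   # True
print(evaluate("IN", 200, [200, 302]))     # True
```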
Value
Depending on the operator chosen, type in a value for the subexpression to create.
Subexpression
After clicking Generate subexpression, this field displays the generated subexpression fragment based on the options that we selected. To add this subexpression to our custom health condition, click Append.