
IBM Tivoli Monitoring, Version 6.3 Fix Pack 2


Configure the heartbeat interval

IBM Tivoli Monitoring uses a heartbeat mechanism to monitor the status of remote monitoring servers and monitoring agents.

The different monitoring components in the monitoring architecture form a hierarchy (shown in Figure 1) across which the heartbeat information is propagated.

The hub monitoring server maintains status for all monitoring agents. Remote monitoring servers offload processing from the hub monitoring server by receiving and processing heartbeat requests from monitoring agents, and communicating only status changes to the hub monitoring server.

Figure 1. Hierarchy for the heartbeat interval

At the highest level, the hub monitoring server receives heartbeat requests from remote monitoring servers and from any monitoring agents that are configured to access the hub monitoring server directly (rather than through a remote monitoring server). Remote monitoring servers communicate their status to the hub monitoring server at a default heartbeat interval of 3 minutes. This default is suitable for most environments and should not need to be changed. If you decide to modify it, carefully monitor the system behavior before and after making the change.

At the next level, remote monitoring servers receive heartbeat requests from monitoring agents that are configured to access them. The default heartbeat interval used by monitoring agents to communicate their status to the monitoring server is 10 minutes.

You can specify the heartbeat interval for a node (either a remote monitoring server or a monitoring agent) by setting the CTIRA_HEARTBEAT environment variable. The value is expressed in minutes. For example, specifying CTIRA_HEARTBEAT=5 sets the heartbeat interval to 5 minutes. The minimum heartbeat interval that can be configured is 1 minute.
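
As an illustration only, the following Python sketch shows how a deployment script might read and validate a CTIRA_HEARTBEAT value before applying it. The function name `read_heartbeat_minutes`, its `default` argument, and the error handling are assumptions made for this example; they are not part of IBM Tivoli Monitoring.

```python
import os

# Lowest value that can be configured, as stated in this topic.
MIN_CONFIGURABLE_MINUTES = 1


def read_heartbeat_minutes(env=None, default=10):
    """Return the heartbeat interval, in minutes, from CTIRA_HEARTBEAT.

    `env` is any mapping of environment variables (defaults to os.environ).
    `default` is used when the variable is unset or empty; 10 matches the
    documented agent default, and 3 would match the monitoring server default.
    """
    env = os.environ if env is None else env
    raw = env.get("CTIRA_HEARTBEAT")
    if raw is None or raw.strip() == "":
        return default
    minutes = int(raw)  # raises ValueError for non-numeric input
    if minutes < MIN_CONFIGURABLE_MINUTES:
        raise ValueError(
            f"CTIRA_HEARTBEAT must be at least {MIN_CONFIGURABLE_MINUTES} minute"
        )
    return minutes
```

For example, `read_heartbeat_minutes({"CTIRA_HEARTBEAT": "5"})` returns 5, and calling it with an empty mapping returns the fallback of 10.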

When a monitoring agent becomes active and sends an initial heartbeat request to the monitoring server, it communicates the desired heartbeat interval for the agent in the request. The monitoring server stores the time the heartbeat request was received and sets the expected time for the next heartbeat request based on the agent heartbeat interval. If no heartbeat interval was set at the agent, the default value is used.
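
The bookkeeping described above can be pictured with a small Python sketch. It is a simplified model under stated assumptions, not the monitoring server's actual implementation: the class `HeartbeatTable`, its `record` method, and the in-memory dictionary are hypothetical names chosen for illustration.

```python
from datetime import datetime, timedelta

# Documented default interval for monitoring agents, in minutes.
DEFAULT_AGENT_INTERVAL_MINUTES = 10


class HeartbeatTable:
    """Toy model of per-node heartbeat tracking (illustrative only)."""

    def __init__(self):
        # node name -> (time last heartbeat was received, time next is expected)
        self.entries = {}

    def record(self, node, received_at, requested_minutes=None):
        """Store the receipt time and compute when the next heartbeat is due.

        If the node did not request an interval, the default value is used.
        """
        if requested_minutes is None:
            requested_minutes = DEFAULT_AGENT_INTERVAL_MINUTES
        next_expected = received_at + timedelta(minutes=requested_minutes)
        self.entries[node] = (received_at, next_expected)
        return next_expected
```

With a heartbeat received at 12:00 and a requested interval of 5 minutes, `record` returns 12:05 as the expected time of the next request; with no requested interval it returns 12:10.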

A node's status typically changes to offline only after two heartbeat requests have been missed. Offline status is indicated by the node being disabled in the portal client's Navigator View. If the heartbeat interval is set to 10 minutes, an offline status change can be expected to take between 10 and 20 minutes to appear in the portal client's Navigator View, depending on when in the interval the node stopped responding.
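
The 10 to 20 minute range follows from simple arithmetic, sketched below in Python. The function `offline_detection_window` is a hypothetical helper, and generalizing the rule to any number of missed heartbeats is an assumption; the text only states that two missed requests are typical.

```python
def offline_detection_window(interval_minutes, missed_required=2):
    """Return (earliest, latest) minutes from node failure to offline status.

    Earliest case: the node fails just before its next heartbeat is due, so
    the first miss is immediate and the remaining misses take one interval each.
    Latest case: the node fails just after sending a heartbeat, so every
    required miss takes a full interval.
    """
    earliest = (missed_required - 1) * interval_minutes
    latest = missed_required * interval_minutes
    return earliest, latest
```

Under these assumptions, `offline_detection_window(10)` returns `(10, 20)`, and the default monitoring server interval of 3 minutes gives `(3, 6)`.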

Lower heartbeat intervals increase CPU utilization on the monitoring servers processing the heartbeat requests. CPU utilization is also affected by the number of agents being monitored. A low heartbeat interval combined with a high number of monitored agents could cause the CPU utilization on the monitoring server to increase to the point that performance-related problems occur. If you reduce the heartbeat interval, you must monitor the resource usage on your servers. Although an interval as low as 1 minute can be configured, a heartbeat interval lower than 3 minutes is not supported.
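
To get a rough feel for how the interval and the agent count interact, the following Python sketch estimates the average number of heartbeat requests a monitoring server receives per minute. It is a back-of-the-envelope model only: the function name is hypothetical, it assumes requests are spread evenly over time, and it says nothing about the CPU cost of each request.

```python
def heartbeat_requests_per_minute(agent_count, interval_minutes):
    """Estimate the average heartbeat requests per minute at one server.

    Each agent sends one request per interval, so the average rate is the
    number of agents divided by the interval length in minutes.
    """
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    return agent_count / interval_minutes
```

For a hypothetical server handling 1500 agents, the estimate is 150 requests per minute at the default 10-minute interval and 500 requests per minute at a 3-minute interval.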


