IBM Tivoli Monitoring, Version 6.3 Fix Pack 2


Clustered hub monitoring server

When the hub server is configured as a cluster and failover or failback occurs, the connected Tivoli Monitoring components operate as if the hub had been restarted: they automatically reconnect to the hub, and some synchronization takes place.

After reconnection, as part of the synchronization between the remote monitoring server and the hub, all situations that are the responsibility of the remote monitoring server (distributed to the monitoring server itself or to one of its connected agents) are restarted. This restarting of situations is the current behavior for every reconnection between a remote monitoring server and the hub, whether or not the environment is clustered. See Situations for more information.

For agents directly connected to the hub, there might be periods in which situation thresholding activity on the connected agents does not occur because, when a connection failure to the reporting hub is detected, the situations are stopped. As soon as the connection is reestablished, the synchronization process takes place and situations are restarted. (Note that historical metric collection is not stopped.)
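
The behavior described above for agents that connect directly to the hub can be pictured with a small model. This is only an illustrative sketch, not Tivoli Monitoring code: the class HubConnectedAgent, its methods, and the situation names are hypothetical and exist only to show the sequence of stopping situations on connection loss, continuing historical collection, and restarting situations after reconnection.

```python
from dataclasses import dataclass, field


@dataclass
class HubConnectedAgent:
    """Toy model of an agent reporting directly to the hub (all names hypothetical)."""
    situations: list
    connected: bool = True
    running: set = field(default_factory=set)
    history_collecting: bool = True  # historical collection is never stopped in this model

    def __post_init__(self):
        # While connected, every distributed situation is being evaluated.
        self.running = set(self.situations)

    def on_connection_lost(self):
        # Connection failure detected: situations stop, so no thresholding occurs.
        self.connected = False
        self.running.clear()

    def on_reconnected(self):
        # Connection reestablished: synchronization takes place, situations restart.
        self.connected = True
        self.running = set(self.situations)

    def evaluates(self, situation):
        # True only while the situation is running on this agent.
        return situation in self.running
```

In this model, the interval between on_connection_lost() and on_reconnected() corresponds to the period in which situation thresholding does not occur, while history_collecting stays true throughout.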

The portal server, the Summarization and Pruning Agent, and the Warehouse Proxy Agent reconnect to the hub and perform any necessary synchronization steps.

The effect of a hub restart depends on the size of the environment (including the number of agents and situations). Initially, you are notified that the portal server has lost contact with the monitoring server, and views might be unavailable. When the portal server reconnects to the hub, the Enterprise default workspace is displayed, allowing access to the Navigator Physical view. However, some agents might be delayed in returning online (because of their reconnection timers), and polled situations might trigger again when the agents restart them.

Although the hub failover or failback itself (including the startup of the new hub) might be quick (approximately 1-3 minutes), the resynchronization of all elements to their normal state might take longer in large-scale environments with thousands of agents. This behavior is not specific to a clustered environment; it applies any time the hub is restarted, or when connections from the other Tivoli Monitoring components to the hub are lost and later reestablished.

