
IBM Tivoli Monitoring, Version 6.3 Fix Pack 2


Customize event status processing behavior when agent switching is used or the agent goes offline

This section describes the variables for customizing event status processing when agent switching is used or when an agent goes offline.

The variables in Table 1 can be added to the monitoring server's environment file to customize the behavior of event status processing when agent switching is used or when the agent goes offline. The first two variables help ensure that events are not closed by the agent's primary monitoring server after the agent has switched to its secondary monitoring server. The monitoring server's environment file can be found in these locations:

You must recycle the Tivoli Enterprise Monitoring Server after modifying the environment file for your changes to be picked up.
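As a rough illustration of the edit described above, the following sketch sets variables in the text of an environment file. It assumes the file holds simple NAME=VALUE lines; the helper name set_env_var and the EXISTING_SETTING entry are hypothetical, and the file location is deliberately left out.

```python
def set_env_var(env_text: str, name: str, value: str) -> str:
    """Return env_text with NAME=value set, replacing an existing entry if present.

    Assumption: the environment file uses plain NAME=VALUE lines.
    """
    lines = env_text.splitlines()
    entry = f"{name}={value}"
    for i, line in enumerate(lines):
        if line.strip().startswith(name + "="):
            lines[i] = entry  # overwrite the existing setting in place
            break
    else:
        lines.append(entry)  # variable not present yet, so append it
    return "\n".join(lines) + "\n"


# Hypothetical file content; EXISTING_SETTING is a placeholder, not a real variable.
sample = "EXISTING_SETTING=value\nIRA_MIN_NO_DATA_WAIT_TIME=0\n"
updated = set_env_var(sample, "IRA_MIN_NO_DATA_WAIT_TIME", "300")
updated = set_env_var(updated, "CMS_SIT_TIME_VALIDATION", "Y")
print(updated)
```

The helper only transforms text; writing the result back to the file and recycling the monitoring server remain manual steps.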


Table 1. Variables to customize the behavior of event status processing when agent switching is used

Variable | Architecture Type | Details | Administrator
Variable: IRA_MIN_NO_DATA_WAIT_TIME
Architecture type: Unidirectional and bidirectional
Details: The minimum time, in seconds, to wait before the monitoring server closes a situation event. The default value is zero.

By default, after an agent is disconnected from a Tivoli Enterprise Monitoring Server, situations that are already open remain open for three situation polling intervals. For example, take two sampled situations, S1 and S2, with intervals of 30 seconds and 15 minutes respectively. Both situations are open when the agent loses its connection. Situation S1 closes after at least 90 seconds (three 30-second intervals). Situation S2 closes after at least 45 minutes (three 15-minute intervals). With agent switching, a situation that closes too soon might generate duplicate events, because the agent does not have sufficient time to connect to the backup monitoring server before the primary server closes the original event. This is particularly true for situations with very short polling intervals.

In such a scenario, you can use the IRA_MIN_NO_DATA_WAIT_TIME variable to set the minimum wait time before a situation is closed. Using the example above, if IRA_MIN_NO_DATA_WAIT_TIME is set to 300 (5 minutes), S1 closes after 5 minutes rather than 90 seconds. S2 is unaffected and closes after 45 minutes as before.
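The arithmetic in this example can be sketched as follows. This is a minimal model, assuming the close delay is the larger of three polling intervals and the configured minimum wait; the function name close_delay_seconds is hypothetical, and the values are lower bounds, as in the text.

```python
# Number of polling intervals a situation stays open after the agent disconnects.
POLLING_INTERVALS_BEFORE_CLOSE = 3


def close_delay_seconds(polling_interval_s: int, min_no_data_wait_s: int = 0) -> int:
    """Earliest time, in seconds, at which an open situation is closed.

    Assumption: the delay is the larger of three polling intervals and
    IRA_MIN_NO_DATA_WAIT_TIME (default zero).
    """
    default_delay = POLLING_INTERVALS_BEFORE_CLOSE * polling_interval_s
    return max(default_delay, min_no_data_wait_s)


s1_default = close_delay_seconds(30)             # S1 with the default of zero
s2_default = close_delay_seconds(15 * 60)        # S2 with the default of zero
s1_with_min = close_delay_seconds(30, 300)       # S1 with IRA_MIN_NO_DATA_WAIT_TIME=300
s2_with_min = close_delay_seconds(15 * 60, 300)  # S2 is unaffected by the minimum

print(s1_default, s2_default, s1_with_min, s2_with_min)
```

With these inputs, S1 moves from 90 seconds to 300 seconds, while S2 stays at 2700 seconds (45 minutes).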

  1. You should set this variable in the environment file for all of your monitoring servers.

  2. IRA_MIN_NO_DATA_WAIT_TIME does not prevent a situation from closing if an agent switches from a remote monitoring server that has been stopped. The variable only applies when an agent switches from a running remote monitoring server to the backup remote monitoring server because of a connectivity issue or when the agent is stopped.

Administrator: IBM Tivoli Monitoring
Variable: CMS_SIT_TIME_VALIDATION
Architecture type: Unidirectional and bidirectional
Details: Valid entries are Y or N. The default is N.

By default, the monitoring server handles situation events on a first-come, first-served basis. In a scenario where agent switching is enabled, an agent might send events through two different monitoring servers. The events that arrive first are not necessarily the earlier events if one of the monitoring servers encountered connection issues. This generally has little impact on situation event processing, except when a monitoring server is shut down: some situations might then be closed prematurely even though the agent is already connected to a different monitoring server.

You must perform two actions to avoid this scenario:

  1. All monitoring server hosts should synchronize time, preferably through Network Time Protocol (NTP) clients.

  2. You should add the CMS_SIT_TIME_VALIDATION=Y variable to all monitoring server environment files. This switches the LCLTMSTMP column in situation events to use UTC time instead of local time, which is then used to determine event order.
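The difference between arrival order and timestamp order can be illustrated with a small sketch. It is not the monitoring server's actual algorithm; the event records, server names, and timestamps are hypothetical, and it assumes both servers report synchronized UTC times.

```python
from datetime import datetime, timezone

# Events listed in the order they arrive at the hub (hypothetical data).
# The agent switched from remote_a to remote_b; the close from remote_a
# was generated earlier but delivered later.
events_in_arrival_order = [
    {"server": "remote_b", "status": "open",
     "utc": datetime(2024, 1, 1, 12, 5, 0, tzinfo=timezone.utc)},
    {"server": "remote_a", "status": "close",
     "utc": datetime(2024, 1, 1, 12, 1, 0, tzinfo=timezone.utc)},
]


def status_by_arrival(events: list) -> str:
    """First-come, first-served: the last event to arrive wins."""
    return events[-1]["status"]


def status_by_timestamp(events: list) -> str:
    """Timestamp validation: the event with the latest UTC time wins."""
    return max(events, key=lambda e: e["utc"])["status"]


print(status_by_arrival(events_in_arrival_order))    # close (premature)
print(status_by_timestamp(events_in_arrival_order))  # open
```

Comparing UTC timestamps only gives a meaningful order when the server clocks agree, which is why time synchronization is listed as the first action.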

Administrator: IBM Tivoli Monitoring
Variable: CMS_SIT_CHECK_NODESTS
Architecture type: Unidirectional and bidirectional
Details: Valid entries are Y or N. The default is N. This variable is only applicable for users of event integration with Tivoli Enterprise Console and Netcool/OMNIbus. When the CMS_SIT_CHECK_NODESTS variable is set to Y in the environment file, the hub monitoring server checks the agent status whenever a close status update event is forwarded to Netcool/OMNIbus. The CMS_SIT_CHECK_NODESTS variable should only be added to the hub monitoring server environment file. If the agent is offline, the close status update event is tagged with a special OFFLINE indicator in the situation_eventdata EIF slot.

If you do not want events to be closed in Netcool/OMNIbus when an agent goes offline, you can customize the EIF probe rules to ignore close events where the situation_status slot is set to N and the situation_eventdata EIF slot is set to OFFLINE. See Customize the rules file for details on how to add customizations to the EIF probe rules. You should also consider setting the IRA_MIN_NO_DATA_WAIT_TIME environment variable described in this table so that close status events are not sent to Netcool/OMNIbus until after the agent offline condition has been detected. (If the default agent heartbeat interval is used, it can take between 10 and 20 minutes before the monitoring server detects that the agent is no longer online.)
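The filter condition described above can be sketched as a simple predicate. This shows only the logic, written in Python rather than EIF probe rules syntax; the function name should_ignore_close is hypothetical, and the sketch assumes the slot values are exactly N and OFFLINE as stated in the text.

```python
def should_ignore_close(event: dict) -> bool:
    """Return True for a close event that was raised because the agent is offline.

    Assumption: the event is a mapping of EIF slot names to string values.
    """
    return (
        event.get("situation_status") == "N"
        and event.get("situation_eventdata") == "OFFLINE"
    )


# Hypothetical slot values for three incoming events.
offline_close = {"situation_status": "N", "situation_eventdata": "OFFLINE"}
normal_close = {"situation_status": "N", "situation_eventdata": ""}
open_event = {"situation_status": "Y", "situation_eventdata": ""}

print(should_ignore_close(offline_close))  # True
print(should_ignore_close(normal_close))   # False
print(should_ignore_close(open_event))     # False
```

Only the close event tagged OFFLINE is ignored; ordinary close events still clear the corresponding event in Netcool/OMNIbus.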

You should only set the CMS_SIT_CHECK_NODESTS variable in the environment file of your hub monitoring server.

Administrator: IBM Tivoli Monitoring

