IBM Tivoli Monitoring > Version 6.3 > User's Guides > System p Agent User's Guides > AIX Premium User's Guide > Troubleshooting > Problems and workarounds


Agent troubleshooting

Problems can occur with the agent after it is installed. Table 1 lists these problems and their solutions.


Agent problems and solutions

Problem: Log data accumulates too rapidly.

Solution: Check the RAS trace option settings, which are described in Set RAS trace parameters using the GUI. The trace option settings that you specify on the KBB_RAS1= and KDC_DEBUG= lines can generate large amounts of data.

Problem: No data is displayed in the Tivoli Enterprise Portal for any attribute group.

Solution: Inspect the data in the Performance Object Status attribute group and restart the agent.
Problem: Empty workspace views are displayed in the Tivoli Enterprise Portal.

Solution: IBM Tivoli Monitoring uses timeout settings during agent metric gathering as a way to avoid prolonged waits for data at the Tivoli Enterprise Portal client. When an agent takes longer than the portal timeout period to provide data, the requesting portal workspaces show empty views.

The IBM Tivoli Monitoring System p agents implement metric caching to mitigate these timeouts when metric data acquisition takes a long time. When the agent retrieves data, it caches the attribute group that is returned to the portal. Metrics gathered within the portal timeout period are displayed immediately on the console. Attribute groups that take longer are displayed from the cache while the agent continues to collect data in the background for the original request.

Because of the way some metrics are gathered, certain metrics take longer than the default timeout and do not reach the cache before the portal timeout expires.

Typically, this problem is caused by network traffic, SSH communication overhead, the HMC IPC communication layer, the Logical Volume Manager communication layer, and possibly other circumstances. As a result, the portal displays empty workspace views for these attribute groups. The workspace shows data only after the data has been cached.

The following attribute groups are affected by these behaviors:

  • File Systems

  • Physical Volumes

  • Logical Volumes

  • Network Adapter Totals

  • Network Adapter Rates

Problem: CPU, network interface, and Workload Manager (WLM) metrics in the CPU Detail, Workload Manager, and Internet Protocol Detail attribute groups are not dynamically updated when these resources are added or removed after the AIX Premium agent is started.

Solution: Metrics for these attribute groups are taken from the System Performance Measurement Interface (SPMI) shared library. When the SPMI is initialized, it creates a list of the CPUs, network interfaces, and WLM classes that are configured. The SPMI library does not reinitialize these lists until one of the following occurs:

  1. The system is restarted.

  2. The number of consumers using the library goes to zero because all programs that were using the library have ended their SPMI connections gracefully.

  3. The SPMI shared library is manually restarted.

Restarting the IBM Tivoli Monitoring agent might not solve the problem if other SPMI consumers are active. A consumer is any program that has established a connection with the SPMI to acquire data. It is also possible for a program to be a Dynamic Data Supplier (DDS) that provides data to the SPMI. Examples of both are topas, xmtopas, xmservd, xmtrend, and the IBM Tivoli Monitoring: AIX Premium Agent.

To recycle the SPMI without restarting the system:

  1. Stop all SPMI data consumers and DDS programs.

  2. Check whether any shared memory segments remain with a key that starts with 0x78 (for example, by using the ipcs -m command).

  3. If any remain, remove each one by issuing ipcrm -m id.

  4. Issue slibclean.
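The steps above can be sketched as a small shell script. This is a sketch only: the helper function name is illustrative, and the assumption that SPMI segments carry keys beginning with 0x78 and that ipcs -m lists the ID in column 2 and the key in column 3 follows the AIX conventions described above; verify the output on your system before removing any segments.

```shell
#!/bin/sh
# Sketch: recycle the SPMI shared library without rebooting (AIX).
# Assumes all SPMI consumers and DDS programs (topas, xmservd, xmtrend,
# the AIX Premium agent, and so on) have already been stopped.

# Print the IDs of shared memory segments whose key begins with 0x78
# (the SPMI segments). On AIX, `ipcs -m` shows the ID in column 2 and
# the key in column 3.
spmi_segment_ids() {
    ipcs -m | awk '$3 ~ /^0x78/ { print $2 }'
}

for id in $(spmi_segment_ids); do
    echo "Removing leftover SPMI shared memory segment $id"
    ipcrm -m "$id"
done

# Unload shared libraries that are no longer in use, so that the SPMI
# rebuilds its CPU, interface, and WLM lists on next use.
slibclean
```

After the agent is restarted, the SPMI reinitializes and picks up resources that were added or removed.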

Problem: A configured and running instance of the monitoring agent is not displayed in the Tivoli Enterprise Portal, but other instances of the monitoring agent on the same system are displayed in the portal.

Solution: IBM Tivoli Monitoring products use Remote Procedure Call (RPC) to define and control product behavior. RPC is the mechanism that a client process uses to make a subroutine call (such as GetTimeOfDay or ShutdownServer) to a server process somewhere in the network. Tivoli processes can be configured to use TCP/UDP, TCP/IP, SNA, or SSL as the protocol (or delivery mechanism) for RPCs.

IP.PIPE is the name given to the Tivoli TCP/IP protocol for RPCs. The RPCs are socket-based operations that use TCP/IP ports to form socket addresses. IP.PIPE implements virtual sockets and multiplexes all virtual socket traffic across a single physical TCP/IP port (visible from the netstat command).

A Tivoli process derives the physical port for IP.PIPE communications based on the configured, well-known port for the hub Tivoli Enterprise Monitoring Server. (This well-known port or BASE_PORT is configured using the 'PORT:' keyword on the KDC_FAMILIES / KDE_TRANSPORT environment variable and defaults to '1918'.)

The physical port allocation method is defined as (BASE_PORT + 4096*N), where N=0 for a Tivoli Enterprise Monitoring Server process and N={1, 2, ..., 15} for any other type of monitoring process. Two architectural limits result from this physical port allocation method:

  • No more than one Tivoli Enterprise Monitoring Server reporting to a specific Tivoli Enterprise Monitoring Server hub can be active on a system image.

  • No more than 15 IP.PIPE processes can be active on a single system image.

A single system image can support any number of Tivoli Enterprise Monitoring Server processes (address spaces) if each Tivoli Enterprise Monitoring Server on that image reports to a different hub. By definition, one Tivoli Enterprise Monitoring Server hub is available per monitoring enterprise, so this architectural limit effectively reduces to one Tivoli Enterprise Monitoring Server per system image.

No more than 15 IP.PIPE processes or address spaces can be active on a single system image. With the first limit expressed earlier, this second limitation refers specifically to Tivoli Enterprise Monitoring Agent processes: no more than 15 agents per system image.
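The arithmetic behind these limits can be checked with a short shell loop, assuming the default BASE_PORT of 1918:

```shell
#!/bin/sh
# Physical IP.PIPE port = BASE_PORT + 4096*N, where N=0 is reserved for
# the Tivoli Enterprise Monitoring Server and N=1..15 are the only
# slots available to other (non-ephemeral) monitoring processes.
BASE_PORT=1918
N=0
while [ "$N" -le 15 ]; do
    port=$((BASE_PORT + 4096 * N))
    if [ "$N" -eq 0 ]; then
        echo "N=$N  port=$port  (Tivoli Enterprise Monitoring Server)"
    else
        echo "N=$N  port=$port  (agent slot)"
    fi
    N=$((N + 1))
done
```

The first line shows the well-known server port 1918; the remaining 15 lines are the only physical ports available to non-ephemeral agents on the image, which is where the 15-agent limit comes from. The physical port actually in use on a system can be observed with netstat -an.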


This limitation can be circumvented (at current maintenance levels, IBM Tivoli Monitoring V6.1, Fix Pack 4 and later) if the Tivoli Enterprise Monitoring Agent process is configured to use ephemeral IP.PIPE connections (that is, IP.PIPE configured with the 'EPHEMERAL:Y' keyword in the KDC_FAMILIES / KDE_TRANSPORT environment variable). There is no limit on the number of ephemeral IP.PIPE connections per system image. If ephemeral endpoints are used, the Warehouse Proxy agent must be reachable from the Tivoli Enterprise Monitoring Server associated with the agents that use ephemeral connections, either by running the Warehouse Proxy agent on the same computer or by using the Firewall Gateway feature. (The Firewall Gateway feature relays the Warehouse Proxy agent connection from the Tivoli Enterprise Monitoring Server computer to the Warehouse Proxy agent computer when the Warehouse Proxy agent cannot run on the same computer.)
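As an illustration only, an agent environment file might enable ephemeral IP.PIPE along the following lines. The exact keyword syntax and quoting vary by platform and release, so verify against the configuration reference for your installation; the port value shown is the default 1918.

```
# Hypothetical agent environment settings -- verify against your release.
# Non-ephemeral (default): the agent occupies one of the 15 physical
# IP.PIPE port slots on the system image.
KDC_FAMILIES='IP.PIPE PORT:1918'

# Ephemeral: the agent uses outbound virtual connections, consumes no
# physical port slot, and the 15-agent limit does not apply.
KDC_FAMILIES='IP.PIPE PORT:1918 EPHEMERAL:Y'
```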

Problem: Some network adapters are shown as "Stopped" in the Tivoli Enterprise Portal, and some network interfaces are not shown at all.

Solution: The AIX Premium agent reports the state of network adapters and network interfaces. An "Available" state signifies that the adapter hardware is physically installed and configured successfully, and that the network interface is configured with an IP address and is in an "Up" state. These network adapters are shown as "Available" in the Tivoli Enterprise Portal. Network interfaces that are not configured with an IP address, or that are in a "Defined" state, are not shown in the Tivoli Enterprise Portal. The portal can also show some network adapters, such as "ent0" and "ent1", as "Stopped". It is common for unused ports that are not configured with an IP address to have no network cables plugged in, for security purposes. Keep this consideration in mind when you view interfaces that are described as "Stopped" in the Tivoli Enterprise Portal.


Parent topic:

Problems and workarounds
