Known problems and limitations
It is important to keep in mind the specific characteristics and constraints of Tivoli Monitoring installation and setup, and their effects on the cluster setup.
During certification testing of Tivoli Monitoring clustering, issues encountered while setting up the clustered environment were formally reported as defects. These defects typically relate to setting up Tivoli Monitoring in a non-default manner rather than being specific to the cluster environment, and they are handled as part of the Tivoli Monitoring service stream. The following is a list of known problems and workarounds.
- The Tivoli Monitoring installer configures the components to auto start by default and does not give you an option to disable auto start. To remove this behavior, you must edit an operating system script.
Under Linux, another limitation is that the Tivoli Monitoring installer places the auto start commands under every OS run level. Because all of these files are actually links, editing one of them removes this behavior at all run levels.
This same behavior occurs whether installing for the first time, or applying Fix Packs.
See the installation sections and the maintenance section for details on how to configure the components not to auto start.
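For illustration only, a minimal sketch of disabling auto start on a Linux node, assuming the installer created the script /etc/init.d/ITMAgents1 (the script name can differ in your installation; because the run-level entries are links to this one script, a single change covers all run levels):
# List the run-level links created by the installer (script name ITMAgents1 is an assumption)
ls -l /etc/rc?.d/*ITMAgents1*
# Disable auto start through the run-level tools of your distribution, for example:
chkconfig ITMAgents1 off           # Red Hat style systems
update-rc.d -f ITMAgents1 remove   # Debian style systems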
- Chapter 5, “Installing and uninstalling service,” in the Tivoli System Automation for Multiplatforms Installation Guide is missing two steps required to install service on a node:
After running the samctrl -u a Node command in step 2, you need to run the following command to stop the node:
stoprpnode Node
Before running the samctrl -u d Node command in step 6, you need to run the following command to start the node:
startrpnode Node
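Putting the documented steps and the missing commands together, a sketch of the sequence for one node might look as follows (the node name node1 is illustrative):
samctrl -u a node1    # step 2: exclude the node from automation
stoprpnode node1      # missing step: stop the node before installing service
# ... install the service on the node ...
startrpnode node1     # missing step: start the node again
samctrl -u d node1    # step 6: remove the node from the excluded list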
- In some cases, removing a cluster domain from one node does not remove it from the second node. This becomes evident when you attempt to create a domain with the same name as the one you just removed and receive an error that the domain already exists. To remedy this, run the rmrpdomain command on both nodes when removing a domain.
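For example, assuming a domain named itmsa (the name is illustrative), run the command on each node; the -f option forces removal of a leftover domain definition:
rmrpdomain itmsa       # run on the first node
rmrpdomain -f itmsa    # run on the second node as well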
- If a resource stays in Stuck online status for a long time, as shown in the results of lsrg -m or lssam -top, the configured stop command timeout for the resource might be too low. Time how long it takes to stop the resource and double that value to get a good timeout. When you have this value, do the following to change it for the resource:
- Run the following commands to take the resource group offline (this causes Tivoli Monitoring to stop):
export CT_MANAGEMENT_SCOPE=2
chrg -o offline RG_NAME
- Stop any remaining processes that have not ended for the resource.
- Run the appropriate command for the resource that you would like to change, where n is the new timeout value, in seconds:
Change Resource (chrsrc) commands for setting the Timeout value:
- Hub monitoring server: chrsrc -s "Name = 'TEMSSRV'" IBM.Application StopCommandTimeout=n
- Portal server: chrsrc -s "Name = 'TEPSSRV'" IBM.Application StopCommandTimeout=n
- Warehouse Proxy Agent: chrsrc -s "Name = 'TDWProxy'" IBM.Application StopCommandTimeout=n
- Summarization and Pruning Agent: chrsrc -s "Name = 'TDWProxy'" IBM.Application StopCommandTimeout=n
- Run the following command to bring the resource group online:
chrg -o online RG_NAME
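As a worked example, if stopping the hub monitoring server takes about 90 seconds, doubling it gives a timeout of 180 seconds (RG_NAME is a placeholder for your own resource group name):
export CT_MANAGEMENT_SCOPE=2
chrg -o offline RG_NAME
# stop any remaining hub monitoring server processes manually, then:
chrsrc -s "Name = 'TEMSSRV'" IBM.Application StopCommandTimeout=180
chrg -o online RG_NAME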
Problems related to IBM Tivoli System Automation for Multiplatforms on AIX 5.3
- During testing, a problem was encountered where shared disks were not properly released or mounted after failures. This problem was solved by installing the latest storageRM file set, which can be downloaded from http://www-1.ibm.com/support/docview.wss?rs=1207&context=SG11P&dc=DB510&dc=DB550&q1=rsct.opt.storagerm.2.4.7.1&uid=isg1fileset-671500801&loc=en_US&cs=UTF-8&lang=all.
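To check which level of the storageRM file set is currently installed on an AIX node, you can query the file set, for example:
lslpp -l rsct.opt.storagerm    # shows the installed level of the storageRM file set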
- In the test bed, a line in the inittab of the operating system prevented RSCT processes from starting automatically after a reboot. If such a line exists in your environment and is located above the lines related to the RSCT processes, make sure that you comment it out:
install_assist:2:wait:/usr/sbin/install_assist </dev/console >/dev/console 2>&1
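On AIX you can comment out the entry with an editor, or remove it with the standard inittab commands, for example:
lsitab install_assist    # display the entry, if present
rmitab install_assist    # remove the entry so it cannot block the RSCT entries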
- In order for Java Web Start to work, you must change all occurrences of $HOST$ to your fully qualified host name in the .jnlpt file. You can locate the .jnlpt file in the CANDLE_HOME/config directory.
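A minimal sketch of making the substitution, assuming the portal server runs on Linux or UNIX, CANDLE_HOME is set, and the file is named tep.jnlpt (both the file name and the use of hostname -f are assumptions; check your config directory for the actual .jnlpt file names and substitute the fully qualified host name directly if hostname -f is not available):
cd $CANDLE_HOME/config
cp tep.jnlpt tep.jnlpt.orig    # keep a backup of the template
# replace every occurrence of $HOST$ with the fully qualified host name
sed "s|[$]HOST[$]|$(hostname -f)|g" tep.jnlpt.orig > tep.jnlpt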