addNode command best practices
Use the addNode command to add a stand-alone node into a cell.
The addNode command does the following:
- Copies the base WebSphere Application Server cell configuration to a new cell structure. This new cell structure matches the structure of the deployment manager.
- Creates a new node agent definition for the node that the cell incorporates.
- Sends commands to the deployment manager to add the documents from the new node to the cell repository.
- Performs the first configuration synchronization for the new node, and verifies that this node is synchronized with the cell.
- Launches the node agent process for the new node.
- Updates the setupCmdLine.bat or setupCmdLine.sh file and the wsadmin.properties file to point to the new cell.
- Backs up the plugin-cfg.xml file from the app_server_root/config/cells directory to the config/backup/base/cells directory after federating the node. The addNode command then regenerates the plugin-cfg.xml file at the deployment manager, and the nodeSync operation copies the files to the node level.
(Dist) For information about port numbers, see the Port number settings topic.
Tips for using the addNode command:
- If we add a node to a cell, the cell name of the node we are federating must be different from the name of the cell to which the node is federated. Otherwise, we receive the ADMU0027E message, and the addNode command does not add the node to the cell.
- Verify that the deployment manager and nodes are updated to the same WAS revision level. For example, a deployment manager at level 6.0.1 is unable to federate with nodes at level 6.0.2.
- Do not put WAS .jar files on the generic CLASSPATH variable (default class path) for the overall system.
- If the WAS ND product cannot resolve the host name of the server, problems can occur when adding or administering nodes, or when the node agent contacts the application server. To resolve the host name, the product opens a port or queries for an IP address, and then waits for the operating system to return the correct information. The operating system might look in multiple places to find the IP address; the order does not matter to the product, as long as the correct information is returned. If the host name of the server cannot be resolved, refer to the network administration documentation to resolve the problem. The following additional information might help ensure that the host name is resolved.
- Some operating systems create an association between the host name of the machine and the loopback address of 127.0.0.1. Red Hat installations create the association by default. SuSE installations create a similar association with the loopback address 127.0.0.2. The association is in the hosts file.
If the hosts file contains mappings from the 127.0.0.1 or 127.0.0.2 IP address to a host name other than localhost, remove the mappings. The following example illustrates what might happen if the mappings are not removed: When a node agent communicates with the deployment manager, it sends its IP address to the deployment manager. The node agent resolves its own host name to 127.0.0.1 if the operating system returns a mapping for the host name from the hosts file. This resolution prevents the deployment manager from sending a message to the node agent, because 127.0.0.1 is also the IP address for the local machine of the deployment manager.
(UNIX) The hosts file is located at /etc/hosts.
(Windows) The hosts file is located at \WINDOWS\system32\drivers\etc\hosts.
- (AIX) By default, the AIX operating system queries the domain name server (DNS) first when it resolves a host name. If the host name cannot be resolved, or cannot be resolved in a reasonable amount of time, add the following statement to the /etc/netsvc.conf file so that the AIX operating system checks the local hosts file first for the host name.
hosts=local,bind
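The loopback problem described above can be checked before running addNode. The following Python code is a minimal sketch, not part of the product: the function name loopback_addresses and the injectable resolver parameter are hypothetical, and the check only reports what the local operating system returns for the host name.

```python
import ipaddress
import socket


def loopback_addresses(host_name, resolver=socket.getaddrinfo):
    """Return the sorted loopback addresses that host_name resolves to.

    Hypothetical helper for illustration. The resolver argument exists
    only so that the lookup can be replaced in tests.
    """
    found = set()
    for info in resolver(host_name, None):
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the address string for both IPv4 and IPv6.
        address = info[4][0].split("%")[0]  # drop any IPv6 zone identifier
        if ipaddress.ip_address(address).is_loopback:
            found.add(address)
    return sorted(found)


if __name__ == "__main__":
    name = socket.gethostname()
    try:
        bad = loopback_addresses(name)
    except socket.gaierror as exc:
        print(f"{name} could not be resolved: {exc}")
    else:
        if bad:
            print(f"{name} resolves to loopback address(es): {', '.join(bad)}")
        else:
            print(f"{name} does not resolve to a loopback address")
```

If the script reports a loopback address such as 127.0.0.1 or 127.0.0.2 for the host name, the hosts file mapping described above is a likely cause.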
- By default, applications installed on the node are not copied to the cell. If we install an application after using the addNode command, the application is installed on the cell. Specifying the -includeapps option forces the addNode command to copy applications from the node to the cell. Applications with duplicate names are not copied to the cell.
- Cell-level documents are not merged. Any changes that we make to the stand-alone cell-level documents (for example, virtual hosts) before using the addNode command must be repeated on the new cell.
- If we receive an OutOfMemory exception while using the addNode command, we might need to increase the heap size of the deployment manager by adjusting the Maximum heap size parameter. For example, in the administrative console, go to System administration > Deployment manager > Java and Process Management > Process definition > Java Virtual Machine and increase the Maximum heap size value.
On HP-UX or Solaris operating systems, a java.lang.OutOfMemoryError: PermGen space problem might occur during large and complex tasks. For example, we might encounter this problem when we run commands such as addNode on nodes with large applications. When the demands for resources exceed the default storage size, the task can fail with a java.lang.OutOfMemoryError: PermGen space error. To resolve this problem, increase the minimum size of the permanent region. Set the -XX:PermSize JVM option to a value such as 128MB, which is sufficient for many situations in which this problem occurs:
-XX:PermSize=128m
- In some instances it may take longer than anticipated for the deployment manager to respond to the addNode command. The default timeout value, which determines how long the client will wait for a server response, is appropriate in the majority of cases. However, you may require more time for the server to respond under heavier processing conditions. For example, if you include the -includeapps option and have a large number of applications, or the applications are very large, the default value of 180 seconds may be insufficient. To change the default timeout value, edit...
app_server_root/profiles/profile/properties/soap.client.props
...and find the following line...
com.ibm.SOAP.requestTimeout=180
We can edit this line to set the timeout to a value more appropriate for our situation.
Setting the timeout value to 0 disables the timeout check.
If the timeout value is set too high, we will have to wait a long time to determine whether the addNode command will successfully complete its request to the deployment manager. If the value is set too low, the deployment manager will not have sufficient time to complete the request before the addNode command concludes that the deployment manager is not responding and reports an error. Other factors that may affect server timeouts include the processing load or excessive paging on the deployment manager, and network latency. Some of these conditions may be transient.
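The timeout edit described above can also be scripted. The following Python code is a minimal sketch under stated assumptions: the function name set_request_timeout is hypothetical, the value 600 is only an illustration, and the function works on the text of the properties file, so reading the file, backing it up, and writing it back are left to the caller.

```python
import re

PROPERTY = "com.ibm.SOAP.requestTimeout"


def set_request_timeout(props_text, seconds):
    """Return props_text with the SOAP request timeout set to seconds.

    Hypothetical helper for illustration. A value of 0 disables the
    timeout check, as described in the text above.
    """
    if seconds < 0:
        raise ValueError("timeout must be 0 or a positive number of seconds")
    # Match the whole property line but leave its line ending untouched.
    pattern = re.compile(
        rf"^{re.escape(PROPERTY)}[ \t]*=[^\r\n]*", re.MULTILINE
    )
    updated, count = pattern.subn(f"{PROPERTY}={seconds}", props_text)
    if count == 0:
        raise KeyError(f"{PROPERTY} not found")
    return updated


if __name__ == "__main__":
    # Sample text standing in for the contents of soap.client.props.
    sample = "# sample properties\ncom.ibm.SOAP.requestTimeout=180\n"
    print(set_request_timeout(sample, 600), end="")
```

The function raises an error instead of appending the property when the line is missing, so that a wrong file or a typo in the path is noticed rather than silently ignored.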
- If we receive an addNode error message about clocks that are not synchronized, make sure that the clock on the computer with the node to be federated is synchronized with the clock on the deployment manager computer to which the node is to be federated.
- If we run the addNode command from a node that is already federated to an existing deployment manager, the second deployment manager becomes corrupted, and we will not be able to start it again after we stop it. This happens because the addNode command creates a dmgrProfile/config/cells/dmgrCell/dmgrCell directory in the master configuration, and this directory is an incomplete node configuration directory.
This issue occurs when we have a federated node and run the addNode command again for a different deployment manager. The incomplete node directory then prevents that deployment manager from starting.
Use one of the following solutions to resolve this issue:
- If the deployment manager is running, we can use the cleanupNode command on the deployment manager where the incomplete node resides.
- Manually delete the directory created on the deployment manager configuration during an addNode command operation that was incomplete. For example: app_server_root/profiles/dmgrProfile/config/cells/dmgrCell/nodeName.