Security considerations in a multi-node WAS ND environment
WebSphere Application Server, Network Deployment supports centralized management of distributed nodes and application servers. This support inherently brings complexity, especially when security is included. Because everything is distributed, security plays an even larger role in ensuring that communications are appropriately secured between application servers and node agents (node-specific configuration managers), and between node agents and the deployment manager (a domain-wide, centralized configuration manager).
Because the processes are distributed, LTPA must be used as the authentication mechanism. LTPA tokens are encrypted, signed, and forwardable to remote processes, but they expire. The SOAP connector, which is the default connector, is used for administrative security and has no retry logic for expired tokens. However, because the protocol is stateless, a new token is created for each request when the time left in the current token is not sufficient to run the request. The alternative RMI connector is stateful and has some retry logic: it resubmits a request after an expired-token error is detected.
Because tokens expire at a specific time, synchronized system clocks are crucial to token-based validation. If the clocks differ by too much (approximately 10-15 minutes), we can encounter unrecoverable validation failures, which keeping the clocks in sync avoids. Verify that the time, date, and time zone are set correctly on every system. Nodes can be in different time zones, provided that each clock is correct for its own time zone (for example, 5 PM CST = 6 PM EST).
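The time zone rule can be checked mechanically by comparing clock readings as absolute instants. The following Python sketch is illustrative only: the function names, the fixed UTC offsets, and the 10-minute threshold (the low end of the approximate range quoted above) are assumptions, not WAS settings.

```python
from datetime import datetime, timedelta, timezone

# Assumed tolerance: the low end of the approximate 10-15 minute range.
MAX_SKEW = timedelta(minutes=10)

# Fixed offsets used only for illustration (standard time, no DST handling).
CST = timezone(timedelta(hours=-6), "CST")
EST = timezone(timedelta(hours=-5), "EST")


def clock_skew(a: datetime, b: datetime) -> timedelta:
    """Return the absolute difference between two timezone-aware readings."""
    if a.utcoffset() is None or b.utcoffset() is None:
        raise ValueError("clock readings must be timezone-aware")
    # Aware datetimes are compared as instants, so the zone offsets cancel out.
    return abs(a - b)


def clocks_in_sync(a: datetime, b: datetime,
                   max_skew: timedelta = MAX_SKEW) -> bool:
    """True when the two readings differ by no more than max_skew."""
    return clock_skew(a, b) <= max_skew


if __name__ == "__main__":
    node_a = datetime(2024, 1, 15, 17, 0, tzinfo=CST)   # 5 PM CST
    node_b = datetime(2024, 1, 15, 18, 0, tzinfo=EST)   # 6 PM EST, same instant
    node_c = datetime(2024, 1, 15, 18, 20, tzinfo=EST)  # 20 minutes ahead
    print(clocks_in_sync(node_a, node_b))
    print(clocks_in_sync(node_a, node_c))
```

In this sketch, the first two readings differ by zero even though their wall-clock times differ by an hour, while the third reading falls outside the assumed tolerance.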
Verify that the keystores and truststores that we configure are set up to trust only the servers with which they communicate. Make sure that the trust files of all servers in the domain include the necessary signer certificates from those servers. When a certificate authority (CA) is used to create personal certificates, it is easier to ensure that all servers trust one another by adding the CA root certificate as a signer on every server.
z/OS
Because the processes are distributed, we must select an authentication mechanism that supports an authentication token, such as LTPA. The tokens are encrypted, signed, and forwardable to remote processes. However, the tokens have expiration times, which are set on the WAS administrative console. The SOAP connector, which is the default connector, is used for administrative security and has no retry logic for expired tokens. However, because the protocol is stateless, a new token is created for each request when the time left in the current token is not sufficient to run the request. The alternative RMI connector is stateful and has some retry logic: it resubmits a request after an expired-token error is detected.
Because tokens expire at a specific time, synchronized system clocks are crucial to token-based validation. If the clocks differ by too much (approximately 10-15 minutes), we can encounter unrecoverable validation failures, which keeping the clocks in sync avoids. Verify that the time, date, and time zone are set correctly on every system. Nodes can be in different time zones, provided that each clock is correct for its own time zone (for example, 5 PM CST = 6 PM EST).
We have additional considerations with SSL. WAS for z/OS can use Resource Access Control Facility (RACF) keyrings to store the keys and the truststores used for SSL, but different SSL protocols are used internally. We must be sure to set up both:
- A system SSL repertoire for use by the web container
- A JSSE SSL repertoire for use by the SOAP HTTP connector if the SOAP connector is used for administrative requests
(ZOS) The WebSphere z/OS Profile Management Tool or the zpmt command uses the same certificate authority to generate certificates for all servers within a given cell, including those of the node agents and the deployment manager.
Tasks
- When running system management commands such as the stopNode command, explicitly specify administrative credentials to perform the operation. Most commands accept -username and -password parameters to specify the user ID and password, respectively. Specify the user ID and password of an administrative user: for example, a user who is a member of the console users with Operator or Administrator privileges, or the administrative user ID configured in the user registry. For example:
stopNode -username user -password pass
- Verify that the configuration at the node agents is always synchronized with the deployment manager before starting or restarting a node. To synchronize the configuration manually, issue the syncNode command from each node that is not synchronized. To synchronize the configuration for node agents that are started, click System Administration > Nodes, select all the started nodes, and then click Synchronize.
- (iSeries) Verify that the clocks on all systems are in sync, including the time and date. If they are out of sync, the tokens expire immediately when they reach the target server due to the time differences. Coordinated Universal Time (UTC) is used by default, and all other machines must have the same UTC time. Consult the operating system documentation for information regarding how to ensure this.
- Verify that the LTPA token expiration period is long enough to complete your longest downstream request. Some credentials are cached and therefore the timeout does not always include the length of the request. Specifically for cached credentials, we might need to evaluate our settings for the security cache (WSSecureMap) and LTPA timeout.
- The administrative connector used by default for system management is SOAP. The SOAP connector runs over HTTP and is stateless. For most situations, this connector is sufficient. If we have a problem using the SOAP connector, we might want to change the default connector on all the servers from SOAP to RMI. The RMI connector uses Common Secure Interoperability v2 (CSIv2), a stateful, interoperable protocol, and can be configured to use identity assertion (downstream delegation), message-layer authentication (BasicAuth or Token), and client certificate authentication (for server trust isolation). To change the default connector on a given server, go to Administration Services under Additional properties for that server.
- An error message might occur within the administrative subsystem security. This error indicates that the sending process did not supply a credential to the receiving process. Typically, the cause is that the sending process has security disabled while the receiving process has security enabled. This setup usually indicates that one of the two processes is not synchronized with the cell. Having security disabled for a specific application server does not have any effect on administrative security.
- (iSeries) An error message might occur within the administrative subsystem security. This error indicates that the sending process did not supply a credential to the receiving process. Typically the causes of this problem are:
- The sending process has security disabled while the receiving process has security enabled. This setup usually indicates that one of the two processes is not synchronized with the cell. Having security disabled for a specific application server does not have any effect on administrative security.
- The clocks between the systems are not synchronized; this immediately makes the credential tokens not valid. Verify that the time, date, and time zones are consistent between the two machines. An error similar to the following might occur:
[9/18/02 16:48:23:859 CDT] 3b9cef35 RoleBasedAuth A CWSCJ0305I: Role based authorization check failed for security name <null>, accessId NO_CRED_NO_ACCESS_ID while invoking method propagateNotifications:[Ljavax.management.Notification; on resource NotificationService and module NotificationService.
- (iSeries) If an error similar to the following occurs, verify that the clocks are synchronized between all servers within the cell and that the configurations are synchronized between all nodes and the deployment manager:
[9/18/02 16:48:22:859 CDT] 3bd06f34 LTPAServerObj E CWSCJ0372E: Validation of the token failed.
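The two messages shown above can be picked out of a server log mechanically. The following Python sketch assumes the line layout seen in those samples (bracketed timestamp, thread ID, component, one-letter level, message ID). The regular expression, function names, and hint wording are illustrative assumptions and are not part of WAS.

```python
import re

# Layout inferred from the sample lines (an assumption):
# [timestamp] thread component level MSGID: text
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<thread>[0-9a-fA-F]+)\s+"
    r"(?P<component>\S+)\s+"
    r"(?P<level>[A-Z])\s+"
    r"(?P<msg_id>[A-Z]+\d{4}[A-Z]):\s*"
    r"(?P<text>.*)$"
)

# msg_id -> (substring that must appear in the text, or None; hint).
# The hints paraphrase the checks recommended in the task list.
HINTS = {
    "CWSCJ0305I": (
        "NO_CRED_NO_ACCESS_ID",
        "No credential supplied: check that both processes are "
        "synchronized with the cell and that system clocks agree.",
    ),
    "CWSCJ0372E": (
        None,
        "Token validation failed: check clock synchronization and "
        "configuration synchronization with the deployment manager.",
    ),
}


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def diagnose(lines):
    """Return (msg_id, hint) pairs for the known security messages found in lines."""
    findings = []
    for line in lines:
        entry = parse_line(line)
        if entry is None or entry["msg_id"] not in HINTS:
            continue
        required, hint = HINTS[entry["msg_id"]]
        if required is None or required in entry["text"]:
            findings.append((entry["msg_id"], hint))
    return findings
```

To use the sketch, pass the lines of a log file to `diagnose`, for example `diagnose(open(path))`, where `path` is a placeholder for the location of the server log.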
Proper understanding of the security interactions between distributed servers greatly reduces the problems that are encountered with secure communications. Security adds complexity because additional function must be managed. For security to work properly, it needs thorough consideration during the planning of our infrastructure.
What to do next
When we have security problems related to the WAS ND environment, see Troubleshoot security configurations to find additional information about the problem. Because servers are distributed, when trace is needed to solve a problem, it is often necessary to gather trace on all servers simultaneously while recreating the problem. This trace can be enabled dynamically or statically, depending on the type of problem that is occurring.
Subtopics
Related:
- (ZOS) Java thread identity and an operating system thread identity
- Troubleshoot security configurations