Troubleshoot the cluster

If you experience a problem with the operation of WebSphere Portal, refer to the troubleshooting section of the WebSphere Portal Information Center to determine whether it is a known problem for which a workaround exists. You can also review the list of problems addressed by interim fixes and service packs at the WebSphere Portal support site. Also check the Technotes section of the support site for known issues and workarounds identified after the initial release.
WebSphere Portal uses and depends on the WebSphere Application Server infrastructure. Often an underlying issue in WebSphere Application Server can affect WebSphere Portal, so it can be helpful to monitor the WebSphere Application Server support site for known issues, fixes, and workarounds.
For information related to specific components, see the appropriate troubleshooting topic.

Receiving a "DRSW0002e replicators are down and cannot be recovered" error message
"Error 503: Failed to load target servlet [portal]" received when attempting to access portal
"EJPAQ1319E: Cannot install the selected WAR file" error generated when attempting to deploy portlets
Basic configuration fails when installing on a federated node
Portlet larger than 10MB/10000000 bytes will exceed the default PostSizeLimit
WebSphere Portal does not start due to missing class file: com/ibm/wps/services/puma/AccessBean

Problem: Receiving a "DRSW0002e replicators are down and cannot be recovered" error message

A "DRSW0002e replicators are down and cannot be recovered" error message is received in a WebSphere Portal cluster.
Solution: There are two possible solutions to this problem:

If the WebSphere Portal sysout.log shows that the DRSW0002e error message is being issued every few seconds, you might have created a replicator on server1 while server1 is down and not in use. In that case, the error message is issued when memory-to-memory replication starts, because server1 is not accessible. To resolve the problem, remove the replicator on server1 and create a replicator on the portal server for that node.
The DRSW0002e error message can also be issued if the maintenance level of the Network Deployment Manager does not match the maintenance level of the WebSphere Portal clustered servers. Check the maintenance level of each server in the cluster with the update.sh command.
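
To confirm how often the message recurs, you can pull the matching lines out of the log. The following Python sketch is an illustration only: the function name is hypothetical, and it filters on the message ID without interpreting timestamps, because the timestamp format of the log is not described in this topic.

```python
import re

# Match the message ID regardless of the case of the trailing severity letter
# (this topic writes it as "DRSW0002e").
DRSW_PATTERN = re.compile(r"DRSW0002E", re.IGNORECASE)


def find_drsw_lines(log_text):
    """Return the log lines that mention the DRSW0002E message ID."""
    return [line for line in log_text.splitlines() if DRSW_PATTERN.search(line)]
```

Pass the contents of the log file to find_drsw_lines and inspect the returned lines to see whether the message repeats every few seconds.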

Problem: "Error 503: Failed to load target servlet [portal]" received when attempting to access portal

After federating a WebSphere Portal node and then attempting to access the portal, you might receive an "Error 503: Failed to load target servlet [portal]" message in the browser. In addition, a message similar to the following is written to the wps_timestamp.log file:
2004.09.15 11:18:08.428 E com.ibm.wps.engine.Servlet init EJPFD0016E: Initialization of service failed. - StackTrace follows...

Solution: When this error occurs, it could indicate that you have not updated the deployment manager configuration for the new WebSphere Portal node. To perform this update, complete the following steps:

Ensure that the CellName property in the wpconfig.properties file is set for the new WebSphere Portal.

Specify the name of the cell to which the WebSphere Portal node belongs.
The cell name can be identified by the was_root/config/cells/cell_name directory on the node, where cell_name indicates the cell to which the node belongs.

Update the deployment manager configuration for the new WebSphere Portal.
Run the following command from the wp_root/config directory:

Windows: WPSconfig.bat post-portal-node-federation-configuration
UNIX: ./WPSconfig.sh post-portal-node-federation-configuration
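
Before running the configuration task, it can help to verify that the CellName value from the first step names a cell directory that exists on the node. The following Python sketch shows one way to do this. The function names are hypothetical, and the properties parser is simplified: it handles only key=value lines and comment lines, not the full Java properties syntax.

```python
from pathlib import Path


def read_property(properties_path, key):
    """Return the value of key from a key=value properties file, or None."""
    # latin-1 is the traditional encoding for Java properties files.
    text = Path(properties_path).read_text(encoding="latin-1")
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def list_cell_names(was_root):
    """Return the directory names found under was_root/config/cells."""
    cells_dir = Path(was_root) / "config" / "cells"
    return sorted(p.name for p in cells_dir.iterdir() if p.is_dir())


def cell_name_matches(was_root, wpconfig_path):
    """True if CellName in wpconfig.properties names an existing cell directory."""
    return read_property(wpconfig_path, "CellName") in list_cell_names(was_root)
```

Call cell_name_matches with your was_root directory and the path to wpconfig.properties. A False result suggests that CellName is missing or does not match any cell on the node.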

Problem: "EJPAQ1319E: Cannot install the selected WAR file" error generated when attempting to deploy portlets

When attempting to deploy portlets in a clustered environment, you might receive a series of error messages similar to the following messages:
EJPAQ1319E: Cannot install the selected WAR file. com.ibm.portal.WpsException: EJPAQ1319E: Cannot install the selected WAR file. ... EJPPD0015E: Portlet application manager failed when user uid=wpsadmin,cn=users,o=ibm executed command InstallPortletApplication. ... EJPPH0019E: Installation of Web module wp.struts.legacy.examples.StockQuote from WAR file /opt/WebSphere/PortalServer/deployed/SPFLegacyStockQuote.war failed ( display name: Struts Leg_ock Quotes (PA_1_0_9D), options: AppServerDeploymentData: id = wp.struts.legacy.examples.StockQuote displayName = Struts Leg_ock Quotes (PA_1_0_9D) warfileName = /opt/WebSphere/PortalServer/deployed/SPFLegacyStockQuote.war contextRoot = /wps/PA_1_0_9D policyFile = /opt/WebSphere/PortalServer/deployed/temp/SPFLegacyStockQuote.war.0/META-INF/was.policy ). ... EJPPH0056E: The installation of portlet application /opt/WebSphere/PortalServer/deployed/SPFLegacyStockQuote.war did not complete successfully. Please check the WAS log files for a possible explanation.

Solution: When these errors occur, edit the DeploymentService.properties file and ensure that you have set the wps.appserver.name property to the name of the cluster you defined.

Windows/UNIX location: wp_root/shared/app/config/services/DeploymentService.properties
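
The property check can also be scripted. The following Python sketch is a minimal illustration, not part of the product: the function names are hypothetical, and the parser assumes simple key=value lines.

```python
def get_property(properties_text, key):
    """Return the value of key from key=value properties text, or None."""
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def appserver_name_matches(properties_text, cluster_name):
    """True if wps.appserver.name is set to the given cluster name."""
    return get_property(properties_text, "wps.appserver.name") == cluster_name
```

Read the contents of DeploymentService.properties and pass them to appserver_name_matches together with the name of the cluster you defined. The comparison is exact, so differences in case or stray characters are reported as a mismatch.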

Problem: Basic configuration fails when installing on a federated node

If basic configuration of WebSphere Portal fails during installation on a federated node, you might experience errors if you attempt to reinstall. This is because during the first installation, the WebSphere_Portal appserver and its related resource definitions were created in the deployment manager configuration. When you attempt to reinstall and thus run basic configuration again, you will receive errors because the WebSphere_Portal appserver already exists.
Solution: To clean up the deployment manager configuration before reinstalling WebSphere Portal, complete the following steps:

Access the administrative console on the deployment manager.
Click Servers > Application Servers.
Select the WebSphere_Portal appserver associated with the node on which you intend to reinstall, and click Delete.
Click Resources > JDBC Providers.
Select the resources associated with the WebSphere Portal node on which you intend to reinstall, and click Delete.
Click Applications > Enterprise Applications.
Select the enterprise applications associated with the WebSphere Portal node on which you intend to reinstall, and click Delete.
Save your changes.

Problem: Portlet larger than 10MB/10000000 bytes will exceed the default PostSizeLimit

The default PostSizeLimit in plugin-cfg.xml for a single server is 10 million bytes.
Solution: In a federated or clustered environment, search the plugin-cfg.xml file for the ServerCluster element for the WebSphere Portal cluster and change the PostSizeLimit value in that element to -1. When PostSizeLimit is set to -1, the size of a request that can be made to an appserver is not limited, so requests larger than 10MB can be made without error.
Before
<ServerCluster Name="Portal_Cluster" CloneSeparatorChange="false" LoadBalance="Round Robin" PostSizeLimit="10000000" RemoveSpecialHeaders="true" RetryInterval="60">

After
<ServerCluster Name="Portal_Cluster" CloneSeparatorChange="false" LoadBalance="Round Robin" PostSizeLimit="-1" RemoveSpecialHeaders="true" RetryInterval="60">
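
The same change can be made programmatically. The following Python sketch uses the standard xml.etree.ElementTree module to set PostSizeLimit on the ServerCluster element with a given name. The function name is hypothetical, and the sketch assumes that the cluster is identified by its Name attribute, as in the Before and After lines. ElementTree does not preserve comments or the XML declaration, so compare the result with the original file before replacing it.

```python
import xml.etree.ElementTree as ET


def unlimit_post_size(plugin_cfg_xml, cluster_name):
    """Return the XML text with PostSizeLimit="-1" on the named ServerCluster."""
    root = ET.fromstring(plugin_cfg_xml)
    changed = 0
    for cluster in root.iter("ServerCluster"):
        if cluster.get("Name") == cluster_name:
            cluster.set("PostSizeLimit", "-1")  # -1 removes the request size limit
            changed += 1
    if changed == 0:
        raise ValueError(f"no ServerCluster named {cluster_name!r} found")
    return ET.tostring(root, encoding="unicode")
```

Raising an error when no element matches guards against silently leaving the file unchanged because of a misspelled cluster name.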

Problem: WebSphere Portal does not start due to missing class file: com/ibm/wps/services/puma/AccessBean

When attempting to access a WebSphere Portal cluster for the first time through an external Web server, you might receive an Error 503 response. In addition, the following messages are generated in the WebSphere Portal log file:
com.ibm.hrl.pse.portlets.WebScannerManager getSecretKeyFromCredentialVault java.lang.NoClassDefFoundError: com/ibm/wps/services/puma/AccessBean

Solution: This problem can occur when the PortalAdminId property values specified for each node do not match and you enabled security with an LDAP registry. In this situation, the value of PortalAdminId is the fully qualified distinguished name (DN) of the WebSphere Portal administrator. Because the DN is case sensitive, differences in case between the PortalAdminId values on the various nodes can cause a problem.
For example, the error can be generated if you have two nodes in the cluster with the following values for PortalAdminId in the wpconfig.properties files:

Node 1: uid=wpsadmin,ou=People,dc=raleigh,dc=ibm,dc=com
Node 2: uid=wpsadmin,ou=people,dc=raleigh,dc=ibm,dc=com

To correct this problem, ensure that you have used the proper case for the value of the PortalAdminId property and that the property values match on each node in the cluster.
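
A case-only difference is easy to miss by eye. The following Python sketch, with a hypothetical function name, compares the PortalAdminId values collected from each node and distinguishes an exact match from a mismatch that differs only in case.

```python
def check_admin_ids(values_by_node):
    """Classify PortalAdminId values as 'match', 'case_mismatch', or 'mismatch'."""
    distinct = set(values_by_node.values())
    if len(distinct) <= 1:
        return "match"  # every node uses exactly the same DN
    if len({value.lower() for value in distinct}) == 1:
        return "case_mismatch"  # the DNs differ only in letter case
    return "mismatch"


nodes = {
    "Node 1": "uid=wpsadmin,ou=People,dc=raleigh,dc=ibm,dc=com",
    "Node 2": "uid=wpsadmin,ou=people,dc=raleigh,dc=ibm,dc=com",
}
print(check_admin_ids(nodes))  # prints "case_mismatch"
```

With the two example values above, the check reports a case mismatch because the values differ only in ou=People and ou=people.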

See also

Clustering and WebSphere Portal

WebSphere is a trademark of the IBM Corporation in the United States, other countries, or both.

IBM is a trademark of the IBM Corporation in the United States, other countries, or both.