Detecting and fixing problems with WS-ReliableMessaging
The nature of WS-ReliableMessaging is that network and server failures are assumed, and therefore the target Web service or message store might not be available. In these cases, message sequences cannot be completed and collections of Web service messages are held awaiting transmission. Use SystemOut.log, system events, and the runtime admin panels to monitor the system and detect and fix problems with WS-ReliableMessaging.
If a sequence fails, a message is written to the application server SystemOut.log file and a system event is generated. We can therefore detect failed sequences by looking at SystemOut.log, or by writing an event listener (or using third-party software) to monitor system events.
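As a rough illustration of the log-based approach, the following Python sketch scans log lines for text that suggests a failed sequence. The exact wording of the failure message is not given in this topic, so the search pattern and the function names are assumptions to adapt to the messages that actually appear in SystemOut.log.

```python
import re
from typing import Iterable, List

# Assumption: the real failure message text is not documented here, so this
# pattern is a placeholder. Replace it with the wording or message ID that
# appears in your own SystemOut.log.
FAILED_SEQUENCE_PATTERN = re.compile(r"sequence.*fail|fail.*sequence", re.IGNORECASE)


def find_failed_sequence_lines(lines: Iterable[str],
                               pattern: re.Pattern = FAILED_SEQUENCE_PATTERN) -> List[str]:
    """Return the log lines that match the failure pattern, without newlines."""
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


def scan_log(path: str) -> List[str]:
    """Scan the log file at the given path (hypothetical helper)."""
    with open(path, encoding="utf-8", errors="replace") as log_file:
        return find_failed_sequence_lines(log_file)
```

We could run scan_log against the SystemOut.log file of the application server on a schedule and raise an alert whenever it returns any lines.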
After a sequence has been established, WS-ReliableMessaging provides retransmission of messages to a service. However, if the sequence is not established (that is, if the initial CreateSequence request is refused), then the messages are not transmitted to the service.
See the troubleshooting tip A sequence is not established, and therefore WS-ReliableMessaging cannot ensure messages are transmitted.
For more detailed status information at run time, and facilities to help you fix problems, use the WS-ReliableMessaging admin console runtime panels. These panels are available at many different scopes (for example cell; appserver; messaging engine). For a full list of the WS-ReliableMessaging runtime panels, and details of the scopes at which they are available, see WS-ReliableMessaging - admin console panels. At all scopes, the parent panel is Reliable messaging state settings. From this panel we can investigate each of the three key runtime aspects of reliable messaging:
- Message stores
- Inbound sequences
- Outbound sequences
The following icons are displayed here and on several other reliable messaging runtime panels:
OK Everything here, and (if there is a link) in all runtime panels below this link, is running normally.
Warning Something here, or (if there is a link) in one of the runtime panels below this link, is in an unusual state and we might have to take some action to resolve it. For example, the system might be awaiting a response from an endpoint. In this case, either the response will be received (in which case we need take no action and the runtime information will be updated to "OK") or the reliable messaging destination has stopped acknowledging messages (in which case we have to take some action to resolve the failed sequence).
Error There is a definite error that we have to take some action to resolve, either here or (if there is a link) in one of the runtime panels below this link. Note that for troubleshooting purposes you only have to follow links to the sub-panels if states other than "OK" are displayed.
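The way a link reflects the state of the panels below it can be modelled as a simple roll-up. The following Python sketch is an illustration only: the Panel class, its field names, and the rule that "Error" outranks "Warning" are assumptions made for the example, not a description of how the admin console is implemented.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class Status(IntEnum):
    # Assumed severity order: OK < WARNING < ERROR.
    OK = 0
    WARNING = 1
    ERROR = 2


@dataclass
class Panel:
    # Hypothetical model of a runtime panel and its sub-panels.
    name: str
    own_status: Status = Status.OK
    children: List["Panel"] = field(default_factory=list)

    def displayed_status(self) -> Status:
        """Worst status of this panel and of every panel below it."""
        return max([self.own_status] + [c.displayed_status() for c in self.children])


def panels_to_visit(panel: Panel) -> List[str]:
    """Names of the panels worth opening: branches whose status is not OK."""
    if panel.displayed_status() == Status.OK:
        return []
    names = [panel.name]
    for child in panel.children:
        names.extend(panels_to_visit(child))
    return names
```

Under this model, a branch that displays "OK" never appears in the result, which matches the guidance that we only have to follow links when a state other than "OK" is displayed.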
To use the reliable messaging runtime panels to detect and fix problems with WS-ReliableMessaging, complete the following steps:
- Investigate problems with message stores.
In the navigation pane, click one of the paths to this panel. For example, Servers > Server Types > WebSphere application servers > server_name > [Additional Properties] Reliable messaging state > Runtime > Message store. The list of reliable messaging storage managers for the current scope is displayed in the Message store collection form.
For the managed qualities of service, the messages are written to a messaging engine. For the unmanaged non-persistent quality of service, the messages are stored in memory. For in-memory stores, the only possible status value is "Running". For messages stored by a messaging engine, the possible status values are "Running" and "Messaging engine not contactable"; the latter is probably displayed because the messaging engine is not running. The "OK" icon indicates that the message store is running. If the messaging engine is not contactable, the "Error" icon is displayed.
For each message store in the list, the name of the associated reliable messaging application is given in the description column. If a messaging engine is not contactable, restart the message store for that application.
- Investigate problems with inbound sequences.
In the navigation pane, click one of the paths to this panel. For example, Servers > Server Types > WebSphere application servers > server_name > [Additional Properties] Reliable messaging state > Runtime > Inbound sequences. The runtime state of each of the inbound sequences for the current scope is displayed in the Inbound sequence collection form.
Use a filter to look at sequences that are in a particular state (for example Failed due to missing message) or that have a large number of messages awaiting dispatch to applications. If the sequence status is Error, there is a problem with the sequence and the source server hosting the other end of the sequence has terminated it. If the sequence is active and there are a large number of messages awaiting dispatch to the application, then there could be a problem with the application or, if in-order delivery is specified, delivery could be held up because the sequence has gaps in it.
We can select one or more sequences, then use the buttons provided to dispatch the messages to their associated applications, to export the messages to ZIP files, to close or terminate the selected sequences, or to delete the selected sequences and all their messages. To see more detailed information about a particular sequence, click the Sequence identifier field. The Inbound sequences settings form is displayed. This detailed information includes addressing information to help you identify the source of the sequence, and the value (true or false) for in-order delivery for the sequence. From this panel we can also display the following forms:
- The Acknowledgement state collection form. (The ranges of message sequence numbers received from the WS-ReliableMessaging source. If more than one range is displayed, this indicates a gap in the messages received. If "In-order delivery" is selected for the sequence manager, messages with a sequence number greater than the lowest gap cannot be delivered to the application until the gap is closed.)
- The Inbound message collection form. (The messages on the inbound sequence. You can use this form to delete individual messages.)
- The Message settings form. (The contents of an individual message in the sequence.)
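To make the effect of a gap in the acknowledgement ranges concrete, the following Python sketch takes the ranges of received message numbers and works out which messages are missing and, for in-order delivery, the highest message number that can be delivered to the application. The function names and the representation of a range as an inclusive (first, last) pair are assumptions for the example. Message numbering is taken to start at 1, as in the WS-ReliableMessaging specification.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (first, last) message numbers


def find_gaps(acknowledged: List[Range]) -> List[Range]:
    """Return the ranges of message numbers that have not been received."""
    gaps = []
    highest = 0  # message numbers start at 1, so nothing is received yet
    for first, last in sorted(acknowledged):
        if first > highest + 1:
            gaps.append((highest + 1, first - 1))
        highest = max(highest, last)
    return gaps


def highest_deliverable(acknowledged: List[Range]) -> int:
    """Highest message number deliverable in order (0 if none)."""
    highest = 0
    for first, last in sorted(acknowledged):
        if first > highest + 1:
            break  # the lowest gap blocks everything above it
        highest = max(highest, last)
    return highest
```

For example, if the form shows the ranges 1 to 4 and 7 to 9, messages 5 and 6 are missing, and with in-order delivery only messages 1 to 4 can be dispatched until the gap is closed.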
For more guidance on diagnosing problems with inbound sequences, see Diagnosing the problem when a reliable messaging source cannot deliver its messages.
- Investigate problems with outbound sequences.
In the navigation pane, click one of the paths to this panel. For example, Servers > Server Types > WebSphere application servers > server_name > [Additional Properties] Reliable messaging state > Runtime > Outbound sequences. The runtime state of each of the outbound sequences for the current scope is displayed in the Outbound sequence collection form.
Use a filter to look at sequences that are in a particular state. For example, the state Cannot contact the remote endpoint indicates that the sequence has been established but the reliable messaging destination has stopped acknowledging messages (which, coupled with a high number of messages awaiting transmission, could indicate a potential problem). If the sequence status is Error, there is a problem with the sequence and the server hosting the other end of the sequence has terminated it.
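The combination described above (a sequence that cannot contact the remote endpoint and also has a high number of messages awaiting transmission) can be expressed as a simple filter. The following Python sketch is illustrative only: the OutboundSequence record, its field names, and the default threshold of 100 messages are assumptions, and the data would have to be copied or exported from the panel by some means not covered here.

```python
from dataclasses import dataclass
from typing import List

# State text as it is quoted in this topic.
CANNOT_CONTACT = "Cannot contact the remote endpoint"


@dataclass
class OutboundSequence:
    # Hypothetical record of one row in the Outbound sequence collection form.
    sequence_id: str
    state: str
    awaiting_transmission: int


def needs_attention(sequences: List[OutboundSequence],
                    backlog_threshold: int = 100) -> List[OutboundSequence]:
    """Sequences that cannot reach the endpoint and have a large backlog."""
    return [
        s for s in sequences
        if s.state == CANNOT_CONTACT and s.awaiting_transmission >= backlog_threshold
    ]
```

What counts as a high number of messages depends on the normal traffic of the application, so the threshold is best chosen from observed workload rather than taken from this sketch.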
We can select one or more sequences, and use one of the buttons provided to reallocate messages to new sequences, to export the messages to ZIP files, to close or terminate the selected sequences, or to delete the selected sequences and all their messages.
Before deleting a sequence, we can reassign its messages to new sequences or export them to a ZIP file.
See Delete a failed WS-ReliableMessaging outbound sequence.
To see more detailed information about a particular sequence, click the Sequence identifier field. The Outbound sequences settings form is displayed. This detailed information includes addressing information to help you identify the server at which the sequence is targeted. From this panel we can also display the following forms:
- The Outbound message collection form. (The messages on the outbound sequence. You can use this form to delete individual messages.)
- The Message settings form. (The contents of an individual message in the sequence.)
For more guidance on diagnosing problems with outbound sequences, see Diagnosing and recovering a WS-ReliableMessaging outbound sequence that is in retransmitting state.
Diagnosing the problem when a reliable messaging source cannot deliver its messages
Diagnosing and recovering a WS-ReliableMessaging outbound sequence that is in retransmitting state
Delete a failed WS-ReliableMessaging outbound sequence
WS-ReliableMessaging troubleshooting tips
Related tasks
Learn about WS-ReliableMessaging
Building a reliable Web service application
Set WS-SecureConversation to work with WS-ReliableMessaging
Add assured delivery to Web services through WS-ReliableMessaging
Related
WS-ReliableMessaging - requirements for interaction with other implementations
WS-ReliableMessaging - admin console panels
WS-ReliableMessaging: supported specifications and standards