Use this topic to understand what happens when the CICS adapter restarts, and how to deal with any
unresolved units of recovery that arise.
What happens when the CICS adapter restarts
Whenever a connection is broken, the adapter has to go through a restart phase during the
reconnect process. The restart phase resynchronizes resources. Resynchronization between
CICS and IBM MQ enables in-doubt units of work to be identified and
resolved.
An implicit request when a connection is made to IBM MQ
If the resynchronization is caused by connecting to IBM MQ, the sequence of events is:
The connection process retrieves a list of in-doubt unit of work (UOW) IDs from IBM MQ.
The UOW IDs are displayed on the console in CSQC313I messages.
The UOW IDs are passed to CICS.
CICS initiates a resynchronization task (CRSY) for each in-doubt UOW ID.
The result of the task for each in-doubt UOW is displayed on the console.
Check the messages that are displayed during the connect process:
CSQC313I
Shows that a UOW is in doubt.
CSQC400I
Identifies the UOW and is followed by one of these messages:
CSQC402I or CSQC403I shows that the UOW was resolved successfully (committed or backed out).
CSQC404E, CSQC405E, CSQC406E, or CSQC407E shows that the UOW was not resolved.
CSQC409I
Shows that all UOWs were resolved successfully.
CSQC408I
Shows that not all UOWs were resolved successfully.
CSQC314I
Warns that UOW IDs highlighted with a * are not resolved automatically. These UOWs must be
resolved explicitly by the distributed queuing component when it is restarted.
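When a console log is reviewed with a script rather than by eye, the message list above can be expressed as a small lookup table. The following Python sketch is illustrative only: the category labels and the `classify` function are hypothetical names, and the pattern assumes that each console line starts with an optional `+` followed by the message ID.

```python
import re

# Meanings are taken from the message list above; the labels are illustrative.
MESSAGE_CATEGORIES = {
    "CSQC313I": "in_doubt",
    "CSQC400I": "uow_identifier",
    "CSQC402I": "resolved",
    "CSQC403I": "resolved",
    "CSQC404E": "not_resolved",
    "CSQC405E": "not_resolved",
    "CSQC406E": "not_resolved",
    "CSQC407E": "not_resolved",
    "CSQC408I": "resync_incomplete",
    "CSQC409I": "resync_complete",
    "CSQC314I": "needs_explicit_resolution",
}

# Assumption: a message ID is "CSQC", three digits, and a severity letter,
# optionally preceded by "+" at the start of the console line.
_MESSAGE_ID = re.compile(r"^\+?(CSQC\d{3}[A-Z])\b")


def classify(line):
    """Return a category for a console line, or None if it has no CSQC message ID."""
    match = _MESSAGE_ID.match(line.strip())
    if match is None:
        return None
    # Adapter messages that are not in the list above are reported as "other".
    return MESSAGE_CATEGORIES.get(match.group(1), "other")
```

Lines that carry no adapter message ID, such as wrapped continuation lines, return `None` so that a caller can skip them.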
Figure 1 shows an example set of restart messages displayed on the z/OS console.
Figure 1. Example restart messages
CSQ9022I +CSQ1 CSQYASCP ' START QMGR' NORMAL COMPLETION
+CSQC323I VICIC1 CSQCQCON CONNECT received from TERMID=PB62 TRANID=CKCN
+CSQC303I VICIC1 CSQCCON CSQCSERV loaded. Entry point is 850E8918
+CSQC313I VICIC1 CSQCCON UOWID=VICIC1.A6E5A6F0E2178D25 is in doubt
+CSQC313I VICIC1 CSQCCON UOWID=VICIC1.A6E5A6F055B2AC25 is in doubt
+CSQC313I VICIC1 CSQCCON UOWID=VICIC1.A6E5A6EFFD60D425 is in doubt
+CSQC313I VICIC1 CSQCCON UOWID=VICIC1.A6E5A6F07AB56D22 is in doubt
+CSQC307I VICIC1 CSQCCON Successful connection to subsystem VC2
+CSQC472I VICIC1 CSQCSERV Server subtask (TCB address=008BAD18) connect
successful
+CSQC472I VICIC1 CSQCSERV Server subtask (TCB address=008BAA10) connect
successful
+CSQC472I VICIC1 CSQCSERV Server subtask (TCB address=008BA708) connect
successful
+CSQC472I VICIC1 CSQCSERV Server subtask (TCB address=008CAE88) connect
successful
+CSQC472I VICIC1 CSQCSERV Server subtask (TCB address=008CAB80) connect
successful
+CSQC472I VICIC1 CSQCSERV Server subtask (TCB address=008CA878) connect
successful
+CSQC472I VICIC1 CSQCSERV Server subtask (TCB address=008CA570) connect
successful
+CSQC472I VICIC1 CSQCSERV Server subtask (TCB address=008CA268) connect
successful
+CSQC403I VICIC1 CSQCTRUE Resolved BACKOUT for
+CSQC400I VICIC1 CSQCTRUE UOWID=VICIC1.A6E5A6F0E2178D25
+CSQC403I VICIC1 CSQCTRUE Resolved BACKOUT for
+CSQC400I VICIC1 CSQCTRUE UOWID=VICIC1.A6E5A6F055B2AC25
+CSQC403I VICIC1 CSQCTRUE Resolved BACKOUT for
+CSQC400I VICIC1 CSQCTRUE UOWID=VICIC1.A6E5A6F07AB56D22
+CSQC403I VICIC1 CSQCTRUE Resolved BACKOUT for
+CSQC400I VICIC1 CSQCTRUE UOWID=VICIC1.A6E5A6EFFD60D425
+CSQC409I VICIC1 CSQCTRUE Resynchronization completed successfully
The total number of CSQC313I messages should equal the total number of CSQC402I plus CSQC403I
messages. If the totals are not equal, there are UOWs that the connection process cannot resolve.
UOWs that cannot be resolved are caused by problems with CICS (for example, a cold start), with IBM MQ, or with distributed queuing. When these problems have been
fixed, you can initiate another resynchronization by disconnecting and then reconnecting.
Alternatively, you can resolve each outstanding UOW yourself using the RESOLVE INDOUBT command
and the UOW ID shown in message CSQC400I. You must then initiate a disconnect and a connect to clean
up the unit of recovery descriptors in CICS.
To resolve UOWs manually, you need to know the correct outcome of each UOW.
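The comparison of the CSQC313I total with the CSQC402I plus CSQC403I total can be sketched in Python as follows. This is a minimal illustration, not part of the product: the `check_resync` function and its result keys are hypothetical, and it assumes the console layout of Figure 1, where each CSQC402I or CSQC403I message is immediately followed by the CSQC400I message that names the UOW.

```python
import re

UOWID = re.compile(r"UOWID=(\S+)")
RESOLVED_IDS = ("CSQC402I", "CSQC403I")


def check_resync(lines):
    """Compare in-doubt UOWs with resolved UOWs in a list of console lines."""
    in_doubt = []
    resolved_ids = []
    resolved_count = 0
    previous_id = None
    for raw in lines:
        line = raw.strip().lstrip("+")
        message_id = line.split(" ", 1)[0]
        found = UOWID.search(line)
        if message_id == "CSQC313I" and found:
            in_doubt.append(found.group(1))
        elif message_id in RESOLVED_IDS:
            resolved_count += 1
        elif message_id == "CSQC400I" and found and previous_id in RESOLVED_IDS:
            # Assumption: the outcome message comes directly before CSQC400I.
            resolved_ids.append(found.group(1))
        if message_id.startswith("CSQC"):
            # Wrapped continuation lines do not change the previous message ID.
            previous_id = message_id
    return {
        "in_doubt": len(in_doubt),
        "resolved": resolved_count,
        "balanced": len(in_doubt) == resolved_count,
        "unresolved": sorted(set(in_doubt) - set(resolved_ids)),
    }
```

If `balanced` is false, the `unresolved` list gives the UOW IDs to investigate before disconnecting and reconnecting, or to resolve manually.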
All messages that are associated with unresolved UOWs are locked by IBM MQ and no Batch, TSO, or CICS task can access them.
If CICS fails and an emergency restart is necessary, do not vary the GENERIC APPLID of the CICS system.
If you do and then reconnect to IBM MQ, data integrity with IBM MQ cannot be guaranteed.
This is because IBM MQ treats the new instance of CICS as a different CICS (because the APPLID is different).
In-doubt resolution is then based on the wrong CICS log.
How to resolve CICS units of recovery manually
If the adapter ends abnormally, CICS and IBM MQ build in-doubt lists either dynamically or during restart,
depending on which subsystem caused the abend.
Note: If you use the DFH$INDB sample program to show units of work, you might find that it does not
always show IBM MQ UOWs correctly.
When CICS connects to IBM MQ, there might be one or more units of recovery that have
not been resolved.
One of the following messages is sent to the console:
CICS retains details of units of recovery that were
not resolved during connection startup. An entry is purged when it no longer appears on the list
presented by IBM MQ.
Any units of recovery that CICS cannot resolve
must be resolved manually using IBM MQ commands. This
manual procedure is rarely used within an installation, because it is required only where
operational errors or software problems have prevented automatic resolution. Any inconsistencies
found during in-doubt resolution must be investigated.
To resolve the units of recovery:
Obtain a list of the units of recovery from IBM MQ using the following command:
For CICS
connections, the NID consists of the CICS applid and
a unique number provided by CICS at the time the
syncpoint log entries are written. This unique number is stored in records written to both the
CICS system log and the IBM MQ log at syncpoint processing time. This value is referred
to in CICS as the recovery token.
Scan the CICS log for entries related to a
particular unit of recovery.
Look for a PREPARE record for the task-related installation where
the recovery token field (JCSRMTKN) equals the value obtained from the network ID. The network ID is
supplied by IBM MQ in the DISPLAY CONN command output.
The PREPARE record in the CICS log for the units
of recovery provides the CICS task number. All other
entries on the log for this CICS task can be located
using this number.
You can use the CICS journal print utility DFHJUP
when scanning the log. For details of using this program, see the
CICS Operations and Utilities Guide.
Scan the IBM MQ log for records with the NID related
to a particular unit of recovery. Then use the URID from this record to obtain the rest of the log
records for this unit of recovery.
When scanning the IBM MQ log, note that the IBM MQ startup message CSQJ001I provides the start RBA for this
session.
The print log records program (CSQ1LOGP) can be used for that purpose.
If necessary, perform in-doubt resolution in IBM MQ.
IBM MQ can be directed to take the recovery action for a
unit of recovery using an IBM MQ RESOLVE INDOUBT command.
To recover all threads associated with a specific connection-name, use the NID(*) option.
The command produces one of the following messages showing whether the thread is committed or
backed out:
When performing in-doubt resolution, CICS and the
adapter are not aware of the commands to IBM MQ to
commit or back out units of recovery, because only IBM MQ resources are affected. However, CICS keeps
details about the in-doubt threads that could not be resolved by IBM MQ. This information is purged either when the list presented
is empty, or when the list does not include a unit of recovery of which CICS has details.
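As an illustration of the manual procedure, the following Python sketch splits a NID into the CICS applid and the recovery token, and formats the text of a RESOLVE INDOUBT command. It rests on assumptions: the NID is taken to have the same applid.token shape as the UOW IDs in Figure 1, the ACTION keyword with the values COMMIT and BACKOUT should be checked against the MQSC command reference for your release, and both function names are hypothetical. The sketch only builds a string; it does not issue any command.

```python
def split_nid(nid):
    """Split a NID of the assumed form 'applid.token' into its two parts."""
    applid, separator, token = nid.partition(".")
    if not separator or not applid or not token:
        raise ValueError(f"unexpected NID format: {nid!r}")
    return applid, token


def resolve_indoubt_command(connection_name, action, nid="*"):
    """Format a RESOLVE INDOUBT command string for review by an operator.

    Assumption: the keywords follow the MQSC form
    RESOLVE INDOUBT(connection-name) ACTION(COMMIT|BACKOUT) NID(network-id|*).
    """
    action = action.upper()
    if action not in ("COMMIT", "BACKOUT"):
        raise ValueError("action must be COMMIT or BACKOUT")
    return f"RESOLVE INDOUBT({connection_name}) ACTION({action}) NID({nid})"
```

The recovery token returned by `split_nid` is the value to compare with the JCSRMTKN field when scanning the CICS log. Because the correct outcome must be known before a UOW is resolved, the action is a required argument with no default.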