Recovery manager messages (CSQR...)

CSQR001I

RESTART INITIATED

Explanation:

This message delimits the beginning of the restart process within startup. The phases of restart are about to begin. These phases are necessary to restore the operational environment to that which existed at the time of the previous termination and to perform any recovery actions that might be necessary to return MQ-managed resources to a consistent state.

CSQR002I

RESTART COMPLETED

Explanation:

This message delimits the completion of the restart process within startup.

System Action:

Startup continues.

CSQR003I

RESTART - PRIOR CHECKPOINT RBA=rba

Explanation:

This message indicates that the first phase of the restart process is in progress. It identifies the log positioning RBA of the checkpoint from which the restart process will obtain its initial recovery information.

System Action:

Restart processing continues.

CSQR004I

RESTART - UR COUNTS - IN COMMIT=nnnn, INDOUBT=nnnn, INFLIGHT=nnnn, IN BACKOUT=nnnn

Explanation:

This message indicates the completion of the first phase of the restart process. The counts indicate the number of units of recovery whose execution state during a previous queue manager termination was such that (to ensure MQ resource consistency) some recovery action must be performed during this restart process. The counts might provide an indication of the time required to perform the remaining two phases of restart (forward and backward recovery).

The IN COMMIT count specifies the number that had started, but not completed, phase-2 of the commit process. These must undergo forward recovery to complete the commit process.

The INDOUBT count specifies the number that were interrupted between phase-1 and phase-2 of the commit process. These must undergo forward recovery to ensure that resources modified by them are unavailable until their INDOUBT status is resolved.

The INFLIGHT count specifies the number that neither completed phase-1 of the commit process nor began the process of backing out. These must undergo backward recovery to restore resources modified by them to their previous consistent state.

The IN BACKOUT count specifies the number that were in the process of backing out. These must undergo backward recovery to restore resources modified by them to their previous consistent state.

System Action:

Restart processing continues.
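As an illustration of how the four counts relate to the remaining restart phases, the following minimal Python sketch extracts them from a CSQR004I line and totals them by phase. The function name parse_ur_counts and the sample line are hypothetical; real job log lines may carry additional prefixes (such as a timestamp or command prefix), which the pattern tolerates but which you should check against your own logs.

```python
import re

# Pattern follows the CSQR004I message text; nnnn is assumed to be a decimal count.
CSQR004I = re.compile(
    r"CSQR004I.*?IN COMMIT=(\d+),\s*INDOUBT=(\d+),"
    r"\s*INFLIGHT=(\d+),\s*IN BACKOUT=(\d+)"
)

def parse_ur_counts(line):
    """Return the UR counts from a CSQR004I line, or None if the line does not match."""
    m = CSQR004I.search(line)
    if m is None:
        return None
    in_commit, indoubt, inflight, in_backout = map(int, m.groups())
    return {
        "in_commit": in_commit,
        "indoubt": indoubt,
        "inflight": inflight,
        "in_backout": in_backout,
        # IN COMMIT and INDOUBT URs undergo forward recovery.
        "forward": in_commit + indoubt,
        # INFLIGHT and IN BACKOUT URs undergo backward recovery.
        "backward": inflight + in_backout,
    }

# Hypothetical sample line, for illustration only.
sample = ("CSQR004I RESTART - UR COUNTS - IN COMMIT=0001, "
          "INDOUBT=0002, INFLIGHT=0010, IN BACKOUT=0000")
counts = parse_ur_counts(sample)
```

The forward and backward totals give only a rough sense of how much work each later phase has to do; the log range each phase must read is reported separately in messages CSQR030I and CSQR032I.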

CSQR005I

RESTART - FORWARD RECOVERY COMPLETE - IN COMMIT=nnnn, INDOUBT=nnnn

Explanation:

The message indicates the completion of the forward recovery restart phase. The counts indicate the number of units of recovery whose recovery actions could not be completed during the phase. Typically, those in an IN COMMIT state remain because the recovery actions of some subcomponents have not been completed. Those units of recovery in an INDOUBT state will remain until connection is made with the subsystem that acts as their commit coordinator.

System Action:

Restart processing continues.

Operator Response:

No action is required unless the conditions persist beyond some installation-defined period of time. Recovery action will be initiated when the resource is brought online. Indoubt resolution will be initiated as part of the process of reconnecting the subsystems.

CSQR006I

RESTART - BACKWARD RECOVERY COMPLETE - INFLIGHT=nnnn, IN BACKOUT=nnnn

Explanation:

The message indicates the completion of the backward recovery restart phase. The counts indicate the number of units of recovery whose recovery actions could not be completed during the phase. Typically, those in either state remain because the recovery actions of some subcomponents have not been completed.

System Action:

Restart processing continues.

Operator Response:

No action is required unless the condition persists beyond some installation-defined period of time. Recovery action will be initiated when the resource collection is brought online.

CSQR007I

UR STATUS

Explanation:

This message precedes a table showing the status of units of recovery (URs) after each restart phase. The message and the table will accompany the CSQR004I, CSQR005I, or CSQR006I message after each nested phase. At the end of the first phase, it shows the status of any URs that require processing. At the end of the second (forward recovery) and third (backout) phases, it shows the status of only those URs that needed processing but were not processed. The table helps to identify the URs that were active when the queue manager stopped, and to determine the log scope required to restart.

The format of the table is:

 T  CON-ID     THREAD-XREF     S   URID     TIME

The columns contain the following information:

T

Connection type. The values can be:

B: Batch: From an application using a batch connection
R: RRS: From an RRS-coordinated application using a batch connection
C: CICS: From CICS
I: IMS: From IMS
S: System: From an internal function of the queue manager or from the channel initiator.

CON-ID

Connection identifier for related URs. Batch connections are not related to any other connection. Subsystem connections with the same identifier indicate URs that originated from the same subsystem.

THREAD-XREF

The recovery thread cross-reference identifier associated with the thread; see the WebSphere MQ for z/OS System Administration Guide for more information.

S

Restart status of the UR. When the queue manager stopped, the UR was in one of these situations:

B: INBACKOUT: the UR was in the 'must-complete' phase of backout, and is yet to be completed
C: INCOMMIT: the UR was in the 'must-complete' phase of commit, and is yet to be completed
D: INDOUBT: the UR had completed the first phase of commit, but MQ had not received the second phase instruction (the UR must be remembered so that it can be resolved when the owning subsystem reattaches)
F: INFLIGHT: the UR had not completed the first phase of commit, and will be backed out.

URID

UR identifier: the log RBA of the beginning of this unit of recovery. It is the earliest RBA required to process the UR during restart.

TIME

The time the UR was created, in the format yyyy-mm-dd hh:mm:ss. It is approximately the time of the first MQ API call of the application or the first MQ API call following a commit point.
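The column descriptions above can be turned into a small decoder. This is a sketch only: it assumes each table row is whitespace-separated in the column order shown in the header, and that THREAD-XREF may be blank, so it reads the S, URID, and TIME columns from the right-hand end of the row. The function name parse_ur_row and the sample rows are hypothetical; verify the actual row layout against your own job log before relying on it.

```python
from datetime import datetime

# Code letters as documented for the T and S columns.
CONNECTION_TYPES = {"B": "Batch", "R": "RRS", "C": "CICS", "I": "IMS", "S": "System"}
RESTART_STATUS = {"B": "INBACKOUT", "C": "INCOMMIT", "D": "INDOUBT", "F": "INFLIGHT"}

def parse_ur_row(row):
    """Decode one UR status row (assumed layout: T CON-ID [THREAD-XREF] S URID date time)."""
    tokens = row.split()
    if len(tokens) < 6:
        raise ValueError("row has too few columns: %r" % row)
    conn_type, con_id = tokens[0], tokens[1]
    # Read from the right so that a blank THREAD-XREF does not shift the columns.
    status, urid = tokens[-4], tokens[-3]
    created = datetime.strptime(" ".join(tokens[-2:]), "%Y-%m-%d %H:%M:%S")
    return {
        "connection_type": CONNECTION_TYPES[conn_type],
        "con_id": con_id,
        "thread_xref": " ".join(tokens[2:-4]),
        "status": RESTART_STATUS[status],
        "urid": urid,  # log RBA of the start of the UR, kept as text
        "created": created,
    }

# Hypothetical rows, for illustration only.
with_xref = parse_ur_row("S CSQ1 0001 F 00000012AB34 2024-01-15 10:30:00")
without_xref = parse_ur_row("B JOBNAME C 00000012AB34 2024-01-15 10:30:00")
```

An unknown code letter raises KeyError, which is deliberate here: it signals that the row did not have the assumed layout rather than silently producing a wrong decode.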

CSQR009E

NO STORAGE FOR UR STATUS TABLE, SIZE REQUESTED=xxxx, REASON CODE=yyyyyyyy

Explanation:

There was not enough storage available during the creation of the recoverable UR (unit of recovery) display table.

System Action:

Restart continues but the status table is not displayed.

System Programmer Response:

Increase the region size of the xxxxMSTR region before restarting the queue manager.

Problem Determination:

The size requested is approximately 110 bytes for each unit of recovery (UR). See message CSQR004I to determine the total number of URs to process. Use this value with the storage manager reason code from this message to determine the reason for the shortage. The reason codes are documented in Storage manager codes (X'E2').
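As a rough worked example of the sizing guidance above, the following Python sketch estimates the table size from the CSQR004I counts. It assumes that the total number of URs is the sum of the four counts in that message; the function name is hypothetical and the result is an approximation, since the 110-byte figure is itself approximate.

```python
BYTES_PER_UR = 110  # approximate per-UR size given for CSQR009E

def estimated_table_bytes(in_commit, indoubt, inflight, in_backout):
    """Estimate the UR status table size from the four CSQR004I counts."""
    # Assumption: every UR counted in CSQR004I needs one table entry.
    total_urs = in_commit + indoubt + inflight + in_backout
    return total_urs * BYTES_PER_UR

# Example: 1 IN COMMIT, 2 INDOUBT, 10 INFLIGHT, 0 IN BACKOUT gives 13 URs.
estimate = estimated_table_bytes(1, 2, 10, 0)
```

Comparing this estimate with the SIZE REQUESTED value in the message can help show whether the shortage reflects an unusually large number of URs or a more general lack of storage in the region.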

CSQR010E

ERROR IN UR STATUS TABLE SORT/TRANSLATE, ERROR LOCATION CODE=xxxx

Explanation:

An internal error has occurred.

System Action:

Restart continues but the status table is not displayed.

System Programmer Response:

Note the error code in the message and contact your IBM support center.

CSQR011E

ERROR IN UR STATUS TABLE DISPLAY, ERROR LOCATION CODE=xxxx

Explanation:

An internal error has occurred.

System Action:

Restart continues but the status table is not displayed.

System Programmer Response:

Note the error code in the message and contact your IBM support center.

CSQR015E

CONDITIONAL RESTART CHECKPOINT RBA rba NOT FOUND

Explanation:

The checkpoint RBA in the conditional restart control record, which is deduced from the end RBA or LRSN value that was specified, is not available. This is probably because the log data sets available for use at restart do not include that end RBA or LRSN.

System Action:

Restart ends abnormally with reason code X'00D99001' and the queue manager terminates.

System Programmer Response:

Run the change log inventory utility (CSQJU003) specifying an ENDRBA or ENDLRSN value on the CRESTART control statement that is in the log data sets that are to be used for restarting the queue manager.

CSQR020I

OLD UOW FOUND

Explanation:

During restart, a unit of work was found that predates the oldest active log. Information about the unit of work is displayed in a table in the same format as in message CSQR007I.

System Action:

Message CSQR021D is issued and the operator's reply is awaited.

Operator Response:

The operator has two options:

Commit the unit of work, by replying 'Y'.
Continue, by replying 'N'. The unit of work will be handled by the normal restart recovery processing. Because the unit of work is old, this is likely to involve using the archive logs.

CSQR021D

REPLY Y TO COMMIT OR N TO CONTINUE

Explanation:

An old unit of work was found, as indicated in the preceding CSQR020I message.

System Action:

The queue manager waits for the operator's reply.

Operator Response:

See message CSQR020I.

CSQR022I

OLD UOW COMMITTED, URID=urid

Explanation:

This message is sent if the operator answers 'Y' to message CSQR021D.

System Action:

The indicated unit of work is committed.

CSQR023I

OLD UOW UNCHANGED, URID=urid

Explanation:

This message is sent if the operator answers 'N' to message CSQR021D.

System Action:

The indicated unit of work is left for handling by the normal restart recovery process.

CSQR026I

Long-running UOW shunted to RBA=rba, URID=urid connection name=name

Explanation:

During checkpoint processing, an uncommitted unit of recovery was encountered that has been active for at least 3 checkpoints. The associated log records have been rewritten ('shunted') to a later point in the log, at RBA rba. The unit of recovery identifier urid together with the connection name name identify the associated thread.

System Action:

Processing continues.

System Programmer Response:

Uncommitted units of recovery can lead to difficulties later, so consult with the application programmer to determine if there is a problem that is preventing the unit of recovery from being committed, and to ensure that the application commits work frequently enough.

CSQR027I

Long-running UOW shunting failed, URID=urid connection name=name

Explanation:

During checkpoint processing, an uncommitted unit of recovery was encountered that has been active for at least 3 checkpoints. However, the associated log records could not be rewritten ('shunted') to a later point in the log. The unit of recovery identifier urid together with the connection name name identify the associated thread.

System Action:

The unit of recovery is not shunted, and will not participate in any future log shunting.

System Programmer Response:

The most likely cause is that too few active log data sets are available, in which case you should add more log data sets for the queue manager to use. Use the DISPLAY LOG command or the print log map utility (CSQJU004) to determine how many log data sets there are and what their status is.

CSQR029I

INVALID RESPONSE - NOT Y OR N

Explanation:

The operator did not respond correctly to the reply message CSQR021D. Either 'Y' or 'N' must be entered.

System Action:

The original message is repeated.

Operator Response:

Reply as indicated in the repeated message.

CSQR030I

Forward recovery log range from RBA=from-rba to RBA=to-rba

Explanation:

This indicates the log range that must be read to perform forward recovery during restart.

System Action:

Restart processing continues.

CSQR031I

Reading log forwards, RBA=rba

Explanation:

This is issued periodically during restart recovery processing to show the progress of the forward recovery phase. The log range that needs to be read is shown in the preceding CSQR030I message.

System Action:

Restart processing continues.

Operator Response:

If this message is issued repeatedly with the same RBA value, investigate the cause; for example, MQ might be waiting for a tape with an archive log data set to be mounted.
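The RBA in this message can be compared with the range from CSQR030I to gauge how far forward recovery has progressed. The following Python sketch shows one way to do that, assuming the RBA values appear as hexadecimal strings; the function name forward_progress and the sample values are hypothetical.

```python
def forward_progress(from_rba, to_rba, current_rba):
    """Return the fraction (0.0 to 1.0) of the forward recovery log range already read."""
    # Assumption: RBAs are displayed as hexadecimal strings.
    start, end, current = (int(value, 16) for value in (from_rba, to_rba, current_rba))
    if end <= start:
        return 1.0  # empty range: nothing left to read
    fraction = (current - start) / (end - start)
    # Clamp in case the reported RBA falls slightly outside the range.
    return min(1.0, max(0.0, fraction))

# Hypothetical values: range from CSQR030I, current position from CSQR031I.
halfway = forward_progress("000000000000", "000000001000", "000000000800")
```

The fraction measures log bytes read, not elapsed time, so it is only a coarse indicator; reading from archive logs is typically slower than reading from active logs.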

CSQR032I

Backward recovery log range from RBA=from-rba to RBA=to-rba

Explanation:

This indicates the log range that must be read to perform backward recovery during restart.

System Action:

Restart processing continues.

CSQR033I

Reading log backwards, RBA=rba

Explanation:

This is issued periodically during restart recovery processing to show the progress of the backward recovery phase. The log range that needs to be read is shown in the preceding CSQR032I message.

System Action:

Restart processing continues.

Operator Response:

If this message is issued repeatedly with the same RBA value, investigate the cause; for example, MQ might be waiting for a tape with an archive log data set to be mounted.
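The "issued repeatedly with the same RBA value" condition, which applies to both CSQR031I and CSQR033I, can be checked mechanically. The following Python sketch scans log lines and reports any RBA that is repeated a given number of times in a row. The function name find_stalls, the threshold of three repetitions, and the sample lines are all assumptions chosen for illustration.

```python
import re

# Matches the two progress messages and captures the reported RBA (assumed hexadecimal).
PROGRESS = re.compile(r"(CSQR031I|CSQR033I).*?RBA=([0-9A-Fa-f]+)")

def find_stalls(lines, threshold=3):
    """Return (message_id, rba) pairs reported `threshold` times in a row."""
    stalls = []
    last = None
    run = 0
    for line in lines:
        m = PROGRESS.search(line)
        if m is None:
            continue  # other messages do not interrupt a run
        key = (m.group(1), m.group(2).upper())
        if key == last:
            run += 1
        else:
            last, run = key, 1
        if run == threshold:
            stalls.append(key)  # report each stalled RBA once
    return stalls

# Hypothetical log excerpt, for illustration only.
log = [
    "CSQR033I Reading log backwards, RBA=0000000A0000",
    "CSQR033I Reading log backwards, RBA=000000090000",
    "CSQR033I Reading log backwards, RBA=000000090000",
    "CSQR033I Reading log backwards, RBA=000000090000",
]
stalled = find_stalls(log)
```

A reported stall is only a prompt to investigate, for example to check whether an archive log tape mount is outstanding; a suitable threshold depends on how often the progress messages are issued on your system.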