How consistency is maintained in IBM MQ for z/OS

Data in IBM MQ for z/OS must be consistent with data in batch, CICS, IMS, or TSO. Any data changed in one must be matched by a change in the other.

Before one system commits the changed data, it must know that the other system can make the corresponding change. So, the systems must communicate.

During a two-phase commit (for example, under CICS), one subsystem coordinates the process. That subsystem is called the coordinator; the other is the participant. CICS or IMS is always the coordinator in interactions with IBM MQ, and IBM MQ is always the participant. In the batch or TSO environment, IBM MQ can participate in two-phase commit protocols coordinated by z/OS RRS.

During a single-phase commit (for example, under TSO or batch), IBM MQ is always the coordinator in the interactions and completely controls the commit process.

In a WebSphere Application Server environment, the semantics of the JMS session object determine whether single-phase or two-phase commit coordination is used.

Consistency with CICS or IMS

The connection between IBM MQ and CICS or IMS supports the following syncpoint protocols:

Two-phase commit - for transactions that update resources owned by more than one resource manager.
This is the standard distributed syncpoint protocol. It involves more logging and message flows than a single-phase commit.
Single-phase commit - for transactions that update resources owned by a single resource manager (IBM MQ).
This protocol is optimized for logging and message flows.
Bypass of syncpoint - for transactions that involve IBM MQ but which do nothing in the queue manager that requires a syncpoint (for example, browsing a queue).

In each case, CICS or IMS acts as the syncpoint manager.
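
The choice between the three protocols can be pictured as a simple decision rule. The following Python sketch is illustrative only: the function name choose_protocol, the resource manager names, and the idea of passing a set of updated resource managers are assumptions made for this example, not part of any CICS, IMS, or IBM MQ interface.

```python
from enum import Enum


class SyncpointProtocol(Enum):
    TWO_PHASE = "two-phase commit"
    SINGLE_PHASE = "single-phase commit"
    BYPASS = "bypass of syncpoint"


def choose_protocol(updated_resource_managers):
    """Pick a protocol from the resource managers a transaction updated.

    `updated_resource_managers` is a hypothetical set of names such as
    {"MQ", "DB2"}. Read-only work, for example browsing a queue, does
    not add "MQ" to the set.
    """
    managers = set(updated_resource_managers)
    if "MQ" not in managers:
        # Nothing in the queue manager requires a syncpoint.
        return SyncpointProtocol.BYPASS
    if managers == {"MQ"}:
        # Only one resource manager (IBM MQ) owns the updated resources.
        return SyncpointProtocol.SINGLE_PHASE
    # More than one resource manager owns updated resources.
    return SyncpointProtocol.TWO_PHASE
```

For example, choose_protocol({"MQ", "DB2"}) selects two-phase commit, while choose_protocol(set()) selects bypass of syncpoint.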

The stages of the two-phase commit that IBM MQ uses to communicate with CICS or IMS are as follows:

In phase 1, each system determines independently whether it has recorded enough recovery information in its log, and can commit its work.
At the end of the phase, the systems communicate. If they agree, each begins the next phase.
In phase 2, the changes are made permanent. If one of the systems abends during phase 2, the operation is completed by the recovery process during restart.

Illustration of the two-phase commit process

Figure 1 illustrates the two-phase commit process. Events in the CICS or IMS coordinator are shown on the upper line, events in IBM MQ on the lower line.

The numbers in the following section are linked to those shown in the figure.

1. The data in the coordinator is at a point of consistency.
2. An application program in the coordinator calls IBM MQ to update a queue by adding a message.
3. This starts a unit of recovery in IBM MQ.
4. Processing continues in the coordinator until an application synchronization point is reached.
5. The coordinator then starts commit processing. CICS programs use a SYNCPOINT command or a normal application termination to start the commit. IMS programs can start the commit by using a CHKP call, a SYNC call, a GET UNIQUE call to the IOPCB, or a normal application termination. Phase 1 of commit processing begins.
6. As the coordinator begins phase 1 processing, so does IBM MQ.
7. IBM MQ successfully completes phase 1, writes this fact in its log, and notifies the coordinator.
8. The coordinator receives the notification.
9. The coordinator successfully completes its phase 1 processing. Now both subsystems agree to commit the data changes, because both have completed phase 1 and could recover from any errors. The coordinator records in its log the instant of commit - the irrevocable decision of the two subsystems to make the changes.
10. The coordinator now begins phase 2 of the processing - the actual commitment.
11. The coordinator notifies IBM MQ to begin its phase 2.
12. IBM MQ logs the start of phase 2.
13. Phase 2 is successfully completed, and this is now a new point of consistency for IBM MQ. IBM MQ then notifies the coordinator that it has finished its phase 2 processing.
14. The coordinator finishes its phase 2 processing. The data controlled by both subsystems is now consistent and available to other applications.
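
The sequence described above can be sketched as a small simulation. This is a minimal model, not IBM MQ, CICS, or IMS code: the Participant and Coordinator classes, their method names, and the log record strings are all hypothetical, and the sketch leaves out real logging, messaging, and failure handling.

```python
class Participant:
    """Stands in for IBM MQ. All names here are hypothetical."""

    def __init__(self, can_commit=True):
        self.can_commit = can_commit
        self.log = []
        self.pending = []    # updates inside the current unit of recovery
        self.committed = []  # updates visible to other applications

    def update(self, message):
        # The first update from the application starts a unit of recovery.
        if not self.pending:
            self.log.append("begin unit of recovery")
        self.pending.append(message)

    def prepare(self):
        # Phase 1: record that enough recovery information is logged.
        if not self.can_commit:
            return False
        self.log.append("end phase 1")
        return True

    def commit(self):
        # Phase 2: make the changes permanent.
        self.log.append("begin phase 2")
        self.committed.extend(self.pending)
        self.pending.clear()
        self.log.append("end phase 2")

    def back_out(self):
        self.pending.clear()
        self.log.append("backed out")


class Coordinator:
    """Stands in for CICS or IMS. All names here are hypothetical."""

    def __init__(self, participant):
        self.participant = participant
        self.log = []

    def syncpoint(self):
        # Phase 1: ask the participant whether it can commit.
        if not self.participant.prepare():
            self.log.append("backout")
            self.participant.back_out()
            return False
        # Both sides completed phase 1: log the irrevocable decision.
        self.log.append("instant of commit")
        # Phase 2: tell the participant to commit, then finish.
        self.participant.commit()
        self.log.append("end phase 2")
        return True
```

A short usage example: after mq = Participant(), cics = Coordinator(mq), and mq.update("msg-1"), the call cics.syncpoint() returns True and "msg-1" moves from mq.pending to mq.committed. The order of the entries in mq.log mirrors the order in which the participant's events are described above.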

How consistency is maintained after an abnormal termination

When a queue manager is restarted after an abnormal termination, it must determine whether to commit or to back out units of recovery that were active at the time of termination. For some units of recovery, IBM MQ has enough information to make the decision. For others, it does not, and must get information from the coordinator when the connection is reestablished.

Figure 1 shows four periods within the two phases: a, b, c, and d. The status of a unit of recovery depends on the period in which termination happened. The status can be one of the following:

In flight: The queue manager ended before finishing phase 1 (period a or b); during restart, IBM MQ backs out the updates.
In doubt: The queue manager ended after finishing phase 1 and before starting phase 2 (period c); only the coordinator knows whether the error happened before or after the commit (point 9). If it happened before, IBM MQ must back out its changes; if it happened after, IBM MQ must make its changes and commit them. At restart, IBM MQ waits for information from the coordinator before processing this unit of recovery.
In commit: The queue manager ended after it began its own phase 2 processing (period d); during restart, IBM MQ makes the committed changes.
In backout: The queue manager ended after a unit of recovery began to be backed out but before the process was complete (not shown in the figure); during restart, IBM MQ continues to back out the changes.
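
The four statuses and their restart actions can be summarized as a lookup over what was logged before termination. The following Python sketch assumes a simplified, hypothetical set of log record names ("end phase 1", "begin phase 2", "begin backout"); real IBM MQ log records are not represented this way, and the function names are invented for illustration.

```python
from enum import Enum


class Status(Enum):
    IN_FLIGHT = "in flight"
    IN_DOUBT = "in doubt"
    IN_COMMIT = "in commit"
    IN_BACKOUT = "in backout"


def classify(log_records):
    """Classify a unit of recovery from hypothetical log record names."""
    records = set(log_records)
    if "begin backout" in records:
        return Status.IN_BACKOUT
    if "begin phase 2" in records:
        return Status.IN_COMMIT
    if "end phase 1" in records:
        return Status.IN_DOUBT
    # Phase 1 never finished.
    return Status.IN_FLIGHT


def restart_action(status, coordinator_committed=None):
    """Return the restart action for a unit of recovery.

    `coordinator_committed` is None until the coordinator reconnects
    and reports its decision; it only matters for in-doubt units.
    """
    if status in (Status.IN_FLIGHT, Status.IN_BACKOUT):
        return "back out"
    if status is Status.IN_COMMIT:
        return "commit"
    # In doubt: only the coordinator knows the outcome.
    if coordinator_committed is None:
        return "wait for coordinator"
    return "commit" if coordinator_committed else "back out"
```

For example, a unit of recovery whose log holds only "end phase 1" is classified as in doubt, and restart_action returns "wait for coordinator" until the coordinator's decision is supplied.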
