Active log problems
Use this topic to resolve different problems with the active logs.
This topic covers the following active log problems:
- Dual logging is lost
- Active log stopped
- One or both copies of the active log data set are damaged
- Write I/O errors on an active log data set
- I/O errors occur while reading the active log
- Active log is becoming full
- Active log is full
Dual logging is lost
- Symptoms
- IBM MQ issues the following message:
CSQJ004I +CSQ1 ACTIVE LOG COPY n INACTIVE, LOG IN SINGLE MODE, ENDRBA=...
Having completed one active log data set, IBM MQ found that the subsequent (COPY n) data sets were not offloaded or were marked stopped.
- System action
- IBM MQ continues in single mode until offloading has been completed, then returns to dual mode.
- System programmer action
- None.
- Operator action
- Check that the offload process is proceeding and is not waiting for a tape mount. You might need to run the print log map utility to determine the state of all data sets. You might also need to define additional data sets.
Active log stopped
- Symptoms
- IBM MQ issues the following message:
CSQJ030E +CSQ1 RBA RANGE startrba TO endrba NOT AVAILABLE IN ACTIVE LOG DATA SETS
- System action
- The active log data sets that contain the RBA range reported in message CSQJ030E are unavailable to IBM MQ. The status of these logs is STOPPED in the BSDS. The queue manager terminates with a dump.
- System programmer action
- You must resolve this problem before restarting the queue manager. The log RBA range must be available for IBM MQ to be recoverable. An active log that is marked as STOPPED in the BSDS is never reused or archived, which creates a hole in the log.
Look for messages that indicate why the log data set has stopped, and follow the instructions for those messages.
Modify the BSDS active log inventory to reset the STOPPED status. To do this, follow this procedure after the queue manager has terminated:
- Use the print log map utility (CSQJU004) to obtain a copy of the BSDS log inventory. This shows the status of the log data sets.
- Use the DELETE function of the change log inventory utility (CSQJU003) to delete the active log data sets that are marked as STOPPED.
- Use the NEWLOG function of CSQJU003 to add the active logs back into the BSDS inventory. You must specify the starting and ending RBA for each active log data set on the NEWLOG statement. (The correct values are shown in the print log map utility report obtained in step 1.)
- Rerun CSQJU004. The active log data sets that were marked as STOPPED are now shown as NEW and NOT REUSABLE. These active logs will be archived in due course.
- Restart the queue manager.
Note: If your queue manager is running in dual BSDS mode, you must update both BSDS inventories.
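The DELETE and NEWLOG steps of the procedure above can be prepared with a short script once the STOPPED entries have been read from the print log map report. The following Python sketch is illustrative only. The data set names and RBA values are hypothetical, the `ActiveLog` class and `build_sysin` function are invented for this example, and the records are assumed to have been transcribed by hand from the report. It emits DELETE and NEWLOG control statements for CSQJU003, each DELETE ahead of its matching NEWLOG. It does not check statement length or any other SYSIN formatting rule, so check the output against the CSQJU003 documentation before running the utility.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ActiveLog:
    """One active log entry, transcribed by hand from the print log map report."""
    dsname: str
    copy: int        # 1 for COPY1, 2 for COPY2
    start_rba: str   # hexadecimal, as shown in the report
    end_rba: str     # hexadecimal, as shown in the report
    status: str      # for example "STOPPED" or "REUSABLE"


def build_sysin(logs: Iterable[ActiveLog]) -> List[str]:
    """Return DELETE/NEWLOG statements for every log marked STOPPED."""
    statements = []
    for log in logs:
        if log.status.upper() != "STOPPED":
            continue
        if log.copy not in (1, 2):
            raise ValueError(f"copy must be 1 or 2, got {log.copy}")
        # DELETE first, then NEWLOG with the original RBA range.
        statements.append(f" DELETE DSNAME={log.dsname}")
        statements.append(
            f" NEWLOG DSNAME={log.dsname},COPY{log.copy},"
            f"STARTRBA={log.start_rba},ENDRBA={log.end_rba}"
        )
    return statements


# Hypothetical inventory: only DS02 is marked STOPPED.
inventory = [
    ActiveLog("MQ.CSQ1.LOGCOPY1.DS01", 1, "000000000000", "0000009FFFFF", "REUSABLE"),
    ActiveLog("MQ.CSQ1.LOGCOPY1.DS02", 1, "000000A00000", "000000BFFFFF", "STOPPED"),
]
sysin_lines = build_sysin(inventory)
print("\n".join(sysin_lines))
```

In dual BSDS mode, the same statements apply to both BSDS inventories.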
One or both copies of the active log data set are damaged
- Symptoms
- IBM MQ issues the following messages:
CSQJ102E +CSQ1 LOG RBA CONTENT OF LOG DATA SET DSNAME=..., STARTRBA=..., ENDRBA=..., DOES NOT AGREE WITH BSDS INFORMATION
CSQJ232E +CSQ1 OUTPUT DATA SET CONTROL INITIALIZATION PROCESS FAILED
- System action
- Queue manager startup processing is terminated.
- System programmer action
- If one copy of the data set is damaged, carry out these steps:
- Rename the damaged active log data set and define a replacement data set.
- Copy the undamaged data set to the replacement data set.
- Use the change log inventory utility to:
- Remove information relating to the damaged data set from the BSDS.
- Add information relating to the replacement data set to the BSDS.
- Restart the queue manager.
If both copies of the active log data sets are damaged, the current page sets are available, and the queue manager shut down cleanly, carry out these steps:
- Rename the damaged active log data sets and define replacement data sets.
- Use the change log inventory utility to:
- Remove information relating to the damaged data sets from the BSDS.
- Add information relating to the replacement data sets to the BSDS.
- Rename the current page sets and define replacement page sets.
- Use CSQUTIL (FORMAT and RESETPAGE) to format the replacement page sets and copy the renamed page sets to them. The RESETPAGE function also resets the log information in the replacement page sets.
If the queue manager did not shut down cleanly, you must either restore the system from a previous known point of consistency, or perform a cold start (described in Reinitializing a queue manager).
- Operator action
- None.
Write I/O errors on an active log data set
- Symptoms
- IBM MQ issues the following message:
CSQJ105E +CSQ1 csect-name LOG WRITE ERROR DSNAME=..., LOGRBA=..., ERROR STATUS=ccccffss
- System action
- IBM MQ carries out these steps:
- Marks the log data set that has the error as TRUNCATED in the BSDS.
- Goes on to the next available data set.
- If dual active logging is used, truncates the other copy at the same point.
The data in the truncated data set is offloaded later, as usual.
The data set will be reused on the next cycle.
- System programmer action
- None.
- Operator action
- If errors on this data set still exist, shut down the queue manager after the next offload process. Then use Access Method Services (AMS) and the change log inventory utility to add a replacement. (For instructions, see Change the BSDS.)
I/O errors occur while reading the active log
- Symptoms
- IBM MQ issues the following message:
CSQJ106E +CSQ1 LOG READ ERROR DSNAME=..., LOGRBA=..., ERROR STATUS=ccccffss
- System action
- This depends on when the error occurred:
- If the error occurs during the offload process, the process tries to read the RBA range from a second copy.
- If no second copy exists, the active log data set is stopped.
- If the second copy also has an error, only the original data set that triggered the offload process is stopped. The archive log data set is then terminated, leaving a gap in the archived log RBA range.
- This message is issued:
CSQJ124E +CSQ1 OFFLOAD OF ACTIVE LOG SUSPENDED FROM RBA xxxxxx TO RBA xxxxxx DUE TO I/O ERROR
- If the second copy is satisfactory, the first copy is not stopped.
- If the error occurs during recovery, IBM MQ provides data from specific log RBAs requested from another copy or archive. If this is unsuccessful, recovery does not succeed, and the queue manager terminates abnormally.
- If the error occurs during restart and dual logging is used, IBM MQ continues with the alternative log data set; otherwise, the queue manager ends abnormally.
- System programmer action
- Look for system messages, such as IEC prefixed messages, and try to resolve the problem using the recommended actions for these messages.
If the active log data set has been stopped, it is not used for logging. The data set is not deallocated; it is still used for reading. Even if the data set is not stopped, an active log data set that gives persistent errors should be replaced.
- Operator action
- None.
- Replacing the data set
How you replace the data set depends on whether you are using single or dual active logging.
If you are using dual active logging:
- Ensure that the data has been saved.
The data is saved on the other active log and this can be copied to a replacement active log.
- Stop the queue manager and delete the data set with the error using Access Method Services.
- Define a new log data set using Access Method Services DEFINE so that it can be written to. Use DFDSS or Access Method Services REPRO to copy the good log into the new data set so that you have two consistent, correct logs again.
- Use the change log inventory utility, CSQJU003, to update the information in the BSDS about the corrupted data set as follows:
- Use the DELETE function to remove information about the corrupted data set.
- Use the NEWLOG function to name the new data set as the new active log data set and give it the RBA range that was successfully copied.
You can run the DELETE and NEWLOG functions in the same job step. Put the DELETE statement before the NEWLOG statement in the SYSIN input data set.
- Restart the queue manager.
If you are using single active logging:
- Ensure that the data has been saved.
- Stop the queue manager.
- Determine whether the data set with the error has been offloaded:
- Use the print log map utility (CSQJU004) to list information about the archive log data sets from the BSDS.
- Search the list for a data set with an RBA range that includes the RBA of the corrupted data set.
- If the corrupted data set has been offloaded, copy its backup in the archive log to a new data set. Then, skip to step 6.
- If an active log data set is stopped, an RBA is not offloaded. Use DFDSS or Access Method Services REPRO to copy the data from the corrupted data set to a new data set.
If further I/O errors prevent you from copying the entire data set, a gap occurs in the log.
Note: Queue manager restart will not be successful if a gap in the log is detected.
- Use the change log inventory utility, CSQJU003, to update the information in the BSDS about the corrupted data set as follows:
- Use the DELETE function to remove information about the corrupted data set.
- Use the NEWLOG function to name the new data set as the new active log data set and to give it the RBA range that was successfully copied.
The DELETE and NEWLOG functions can be run in the same job step. Put the DELETE statement before the NEWLOG statement in the SYSIN input data set.
- Restart the queue manager.
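The archive log search in the single-logging procedure above (finding a data set whose RBA range includes that of the corrupted data set) amounts to an interval check on RBA values. The following Python sketch shows that check under stated assumptions: the archive entries are transcribed by hand from the print log map report, RBAs are hexadecimal strings, and the data set names, RBA values, and the `archives_covering` helper are all hypothetical.

```python
from typing import List, Tuple

# (data set name, start RBA, end RBA), with RBAs as hexadecimal strings.
ArchiveEntry = Tuple[str, str, str]


def archives_covering(archives: List[ArchiveEntry],
                      start_rba: str, end_rba: str) -> List[str]:
    """Return names of archive logs whose RBA range includes start_rba..end_rba."""
    wanted_start = int(start_rba, 16)
    wanted_end = int(end_rba, 16)
    if wanted_start > wanted_end:
        raise ValueError("start_rba must not exceed end_rba")
    matches = []
    for dsname, arch_start, arch_end in archives:
        # Compare numerically so that strings of different lengths still work.
        if int(arch_start, 16) <= wanted_start and wanted_end <= int(arch_end, 16):
            matches.append(dsname)
    return matches


# Hypothetical archive log inventory.
archives = [
    ("MQ.CSQ1.ARCLOG1.A0000001", "000000000000", "0000009FFFFF"),
    ("MQ.CSQ1.ARCLOG1.A0000002", "000000A00000", "000000BFFFFF"),
]

# RBA range of the (hypothetical) corrupted active log data set.
offloaded = archives_covering(archives, "000000A00000", "000000BFFFFF")
print(offloaded or "not offloaded: copy from the corrupted data set instead")
```

An empty result corresponds to the case where the data set has not been offloaded, so the data must be copied from the corrupted data set itself.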
Active log is becoming full
The active log can fill up for several reasons, for example, delays in offloading and excessive logging. If an active log runs out of space, this has serious consequences. When the active log becomes full, the queue manager halts processing until an offload process has been completed. If the offload processing stops when the active log is full, the queue manager can end abnormally. Corrective action is required before the queue manager can be restarted.
- Symptoms
- Because of the serious implications of an active log becoming full, the queue manager issues the following warning message when the last available active log data set is 5% full:
CSQJ110E +CSQ1 LAST COPYn ACTIVE LOG DATA SET IS nnn PERCENT FULL
and reissues the message after each additional 5% of the data set space is filled. Each time the message is issued, the offload process is started.
- System action
- Messages are issued and offload processing is started. If the active log becomes full, further actions are taken. See Active log is full.
- System programmer action
- Use the DEFINE LOG command to dynamically add further active log data sets. This permits IBM MQ to continue its normal operation while the error causing the offload problems is corrected. For more information about the DEFINE LOG command, see DEFINE LOG.
Active log is full
- Symptoms
- When the active log becomes full, the queue manager halts processing until an offload process has been completed. If the offload processing stops when the active log is full, the queue manager can end abnormally. Corrective action is required before the queue manager can be restarted. IBM MQ issues the following CSQJ111A message:
CSQJ111A +CSQ1 OUT OF SPACE IN ACTIVE LOG DATA SETS
and an offload process is started. The queue manager then halts processing until the offload process has been completed.
- System action
- IBM MQ waits for an available active log data set before resuming normal IBM MQ processing. Normal shut down, with either QUIESCE or FORCE, is not possible because the shutdown sequence requires log space to record system events related to shut down (for example, checkpoint records). If the offload processing stops when the active log is full, the queue manager stops with an X'6C6' abend; restart in this case requires special attention. For more details, see Problem determination on z/OS.
- System programmer action
- You can provide additional active log data sets before restarting the queue manager. This permits IBM MQ to continue its normal operation while the error causing the offload process problems is corrected. To add new active log data sets, use the change log inventory utility (CSQJU003) when the queue manager is not active. For more details about adding new active log data sets, see Change the BSDS. Consider increasing the number of logs by:
- Making sure that the queue manager is stopped, then using the Access Method Services DEFINE command to define a new active log data set.
- Defining the new active log data set in the BSDS, using the change log inventory utility (CSQJU003).
- Adding additional log data sets dynamically, using the DEFINE LOG command.
When you restart the queue manager, offloading starts automatically during startup, and any work that was in progress when IBM MQ was forced to stop is recovered.
- Operator action
- Check whether the offload process is waiting for a tape drive. If it is, mount the tape. If you cannot mount the tape, force IBM MQ to stop by using the z/OS CANCEL command.