Recovery for journal management after abnormal system end

 

This topic describes the recovery actions that take place in the event of an abnormal system end.

If the system abnormally ends while you are journaling objects, the system does the following:

  1. During the IPL or vary on of an independent disk pool, brings all journals, journal receivers, and objects you are journaling to a usable and predictable condition. This includes any access paths that were being journaled and were in use at the time the system abnormally ended.

  2. Checks all recently recorded entries in the journal receivers that were attached to a journal.

  3. Places an entry in the journal to indicate that an abnormal system end occurred. When the system completes the IPL or vary on of an independent disk pool, all entries are available for processing.

  4. Checks that the journal receivers attached to journals can be used for normal processing of the journal entries. If some of the objects you are journaling could not be synchronized with the journal, the system sends message CPF3172 to the history log (QHST), identifying the journals that could not be synchronized. If a journal or a journal receiver is damaged, the system sends a message to the history log identifying the damage that occurred: message CPF3171 indicates that the journal is damaged, and message CPF3173 or CPF3174 indicates that the journal receiver is damaged. If a journal or journal receiver no longer exists within a library, the system sends message CPI70EE to the history log.

  5. Recovers each object that was in use at the time the system ended abnormally, using the normal system recovery procedures for objects.

In addition, if an object being journaled was opened for output, update, or delete operations, the system performs the following functions so changes to that object will not be lost:

  1. Ensures that the changes recorded in the journal receiver appear in the object. Changes that do not appear in the journal receiver are not in the object.

  2. Places an entry in the journal receiver that indicates whether the object was synchronized with the journal. For database files, if the file could not be synchronized with the journal, the system places message CPF3175 in the history log identifying the failure so that you can correct the problem. For other journaled objects, the system places message CPF700C in the history log identifying the failure so that you can correct the problem.
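The message IDs named in the two lists above can be gathered into a small lookup table when reviewing the history log. The following Python sketch is illustrative only: it assumes the history log has already been exported to plain text lines (the export step is not shown), and the function name, the regular expression, and the sample log text are hypothetical. Only the message IDs and their meanings come from this topic.

```python
import re

# Message IDs and meanings, taken from the description in this topic.
JOURNAL_RECOVERY_MESSAGES = {
    "CPF3171": "journal is damaged",
    "CPF3172": "objects could not be synchronized with the journal",
    "CPF3173": "journal receiver is damaged",
    "CPF3174": "journal receiver is damaged",
    "CPI70EE": "journal or journal receiver no longer exists in the library",
    "CPF3175": "database file could not be synchronized with the journal",
    "CPF700C": "journaled object could not be synchronized with the journal",
}

# Assumed pattern: three uppercase letters followed by four hexadecimal
# digits. It is deliberately broad; unknown IDs are filtered out below.
MESSAGE_ID = re.compile(r"\b[A-Z]{3}[0-9A-F]{4}\b")


def find_recovery_messages(log_lines):
    """Return (message_id, meaning, line) for each known ID found in the lines."""
    findings = []
    for line in log_lines:
        for message_id in MESSAGE_ID.findall(line):
            meaning = JOURNAL_RECOVERY_MESSAGES.get(message_id)
            if meaning is not None:
                findings.append((message_id, meaning, line.strip()))
    return findings


if __name__ == "__main__":
    # Hypothetical exported log text, not real system output.
    sample = [
        "CPF3172 (hypothetical text) journal PAYJRN in PAYLIB",
        "CPF1124 (hypothetical text) unrelated job message",
    ]
    for message_id, meaning, _ in find_recovery_messages(sample):
        print(message_id, "-", meaning)
```

Keeping the table as plain data makes it easy to extend with other messages that a site wants to watch for after an abnormal end.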

A synchronization failure can occur if the data portion of the object is damaged, a journal receiver required to perform the synchronization is damaged, or the journal is inoperable.

After an abnormal system end, perform the following steps:

  1. Perform a manual IPL.

  2. Check the history log to determine if there are any damaged objects, objects that are not synchronized, or any damaged journals or journal receivers.

  3. If necessary, recover the damaged journals or journal receivers as described in Recover a damaged journal receiver and Recover a damaged journal.

  4. If there is a damaged object:

    1. Delete the object.

    2. Restore the object from the latest saved version.

    3. Allocate the object so no one else can access it.

    4. Restore the needed journal receivers if they are not online. Journal receivers do not need to be restored in a particular sequence. The system establishes the receiver chains correctly when they are restored.

    5. Use the APYJRNCHG or APYJRNCHGX command to apply the changes to the object.

    6. Deallocate the object.

  5. If an object could not be synchronized, use the information in the history log and in the journal to determine why the object could not be synchronized and how to proceed with recovery. For example, you might need to use the data file utility (DFU) or a user-written program to bring a database file to a usable condition.

  6. Determine which applications or programs were active, and determine where to restart the applications from the information in the history log and in the journal.
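The ordering of the steps above can be sketched as a small planning function. This Python example is a minimal illustration, not a system interface: the HistoryLogFindings class, the plan_recovery function, and the object names are all hypothetical, and the function only produces a checklist of text steps in the order given in this topic. It does not run any commands.

```python
from dataclasses import dataclass, field


@dataclass
class HistoryLogFindings:
    """Hypothetical summary of what the history log review (step 2) turned up."""
    damaged_journals: list = field(default_factory=list)
    damaged_receivers: list = field(default_factory=list)
    damaged_objects: list = field(default_factory=list)
    unsynchronized_objects: list = field(default_factory=list)


def plan_recovery(findings, receivers_online=True):
    """Return an ordered checklist that follows steps 3 through 6 above."""
    steps = []
    # Step 3: recover damaged journal receivers and journals first.
    for name in findings.damaged_receivers:
        steps.append(f"Recover damaged journal receiver {name}")
    for name in findings.damaged_journals:
        steps.append(f"Recover damaged journal {name}")
    # Step 4: delete, restore, allocate, apply changes, deallocate.
    for name in findings.damaged_objects:
        steps.append(f"Delete {name}")
        steps.append(f"Restore {name} from the latest saved version")
        steps.append(f"Allocate {name} so no one else can access it")
        if not receivers_online:
            # Receivers can be restored in any sequence.
            steps.append(f"Restore the journal receivers needed for {name}")
        steps.append(f"Apply changes to {name} with APYJRNCHG or APYJRNCHGX")
        steps.append(f"Deallocate {name}")
    # Step 5: unsynchronized objects need investigation, not a fixed recipe.
    for name in findings.unsynchronized_objects:
        steps.append(f"Determine from the history log and journal why {name} "
                     "could not be synchronized")
    # Step 6: always finish by working out where to restart applications.
    steps.append("Determine which applications were active and where to restart them")
    return steps


if __name__ == "__main__":
    findings = HistoryLogFindings(damaged_objects=["PAYLIB/PAYROLL"])
    for number, step in enumerate(plan_recovery(findings, receivers_online=False), 1):
        print(number, step)
```

The receiver restore step is emitted only when the receivers are not online, which mirrors the condition in step 4 of the procedure.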

If a journaled access path is in use during an abnormal system end, that access path does not appear on the Edit Rebuild Access Path display.

If the maintenance for the access path is immediate or delayed, the system automatically recovers the access path during IPL or vary on of an independent disk pool. A status message is displayed for each access path whose maintenance is immediate or delayed as it is being recovered during an IPL or vary on of an independent disk pool. The system places message CPF3123 in the system history log for each access path that is recovered through the journal during the IPL or vary on of an independent disk pool. This message appears for access paths that are explicitly journaled and for access paths that are protected by SMAPP.

 
