Recovery support for a distributed relational database
Three kinds of failure can occur on a computer system: a system failure (the entire system is not operating); the loss of the site because of fire, flood, or a similar catastrophe; and the damage or loss of an object. In a distributed relational database, a failure on one system in the network prevents users across the entire network from accessing the relational database on that system.
If the relational database is critical to daily business activities at other locations, enterprise operations across the entire network can be disrupted for the duration of one system's recovery time. Clearly, planning for data protection and recovery after a failure is particularly important in a distributed relational database.
Each system in a distributed relational database is responsible for backing up and recovering its own data. Each system in the network also handles its own recovery procedures after an abnormal system end. However, for systems with an inexperienced operator or no operator at all, the distributed relational database administrator can perform backup and recovery procedures using display station pass-through.
The most common type of loss is the loss of an object or group of objects. An object can be lost or damaged because of several factors, including power failures, hardware failures, system program errors, application program errors, or operator errors. The i5/OS® operating system provides several methods for protecting the system programs, application programs, and data from being permanently lost. Depending on the type of failure and the level of protection chosen, most of the programs and data can be protected, and the recovery time can be significantly reduced.
You can use the following methods to protect your data and programs:
Writing data to auxiliary storage
The Force-Write Ratio (FRCRATIO) parameter on the Create File command can be used to force data to be written to auxiliary storage. A force-write ratio of one causes every add, update, and delete request to be written to auxiliary storage immediately for the table in question. However, choosing this option can reduce system performance. Therefore, saving your tables and journaling tables should be considered the primary methods for protecting the database.
Physical protection
Making sure your system is protected from sudden power loss is an important part of ensuring that your application server (AS) is available to an application requester (AR). An uninterruptible power supply, which can be ordered separately, protects the system from loss because of power failure, power interruptions, or drops in voltage by supplying power to the system devices until power can be restored. Normally, an uninterruptible power supply does not provide power to all workstations. With the System i™ product, the uninterruptible power supply allows the system to:
- Continue operations during brief power interruptions or momentary drops in voltage.
- End operations normally by closing files and maintaining object integrity.
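The trade-off behind the force-write ratio described above can be illustrated with a small sketch. This is not i5/OS code and does not use the FRCRATIO parameter itself; it is a Python analogy in which a hypothetical `ForceRatioWriter` class forces buffered records to disk with `os.fsync` after every `force_ratio` changes. The class, the helper `count_forced_writes`, and the file name are assumptions made for illustration only.

```python
import os
import tempfile


class ForceRatioWriter:
    """Append records to a file, forcing them to disk every `force_ratio` writes.

    Illustrative analogy only: this is not how i5/OS implements FRCRATIO.
    """

    def __init__(self, path, force_ratio=1):
        if force_ratio < 1:
            raise ValueError("force_ratio must be at least 1")
        self._file = open(path, "ab")
        self._force_ratio = force_ratio
        self._pending = 0        # changes buffered since the last forced write
        self.forced_writes = 0   # number of times data was forced to disk

    def write(self, record: bytes) -> None:
        self._file.write(record + b"\n")
        self._pending += 1
        if self._pending >= self._force_ratio:
            self._force()

    def _force(self) -> None:
        # Flush Python's buffer, then ask the OS to write to the storage device.
        self._file.flush()
        os.fsync(self._file.fileno())
        self._pending = 0
        self.forced_writes += 1

    def close(self) -> None:
        if self._pending:
            self._force()
        self._file.close()


def count_forced_writes(num_records, force_ratio):
    """Write `num_records` rows and report how many forced writes occurred before close."""
    with tempfile.TemporaryDirectory() as tmp:
        writer = ForceRatioWriter(os.path.join(tmp, "table.dat"), force_ratio)
        for i in range(num_records):
            writer.write(f"row {i}".encode())
        forced = writer.forced_writes
        writer.close()
        return forced
```

With a ratio of one, every change triggers a forced write, which is the safest setting but also the most expensive one. A larger ratio forces data less often, at the cost of leaving more recent changes exposed if the system fails between forced writes.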
- Data recovery after disk failures for distributed relational databases
If a disk failure occurs and the objects were not all saved to tape or disk immediately before the failure, recently entered data cannot be recovered. After previously saved objects are restored, the system is operational, but the database is not current.
- Journal management for distributed relational databases
Journal management can be used as a part of the backup and recovery strategy for relational databases and indexes.
- Transaction recovery through commitment control
Commitment control is an extension of the i5/OS journal management function. The system can identify and process a group of relational database changes as a single unit of work (transaction).
- Save and restore processing for a distributed relational database
Saving and restoring data and programs allows recovery from a program or system failure, exchange of information between systems, or storage of objects or data offline. A comprehensive backup policy at each system in the distributed relational database network ensures that a system can be restored and quickly made available to network users in the event of a problem.
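The idea of processing a group of changes as a single unit of work, as described under commitment control above, can be sketched in a few lines. The example below is an analogy only: it uses Python's standard `sqlite3` module rather than i5/OS commitment control, and the `account` table, the `transfer` function, and the `demo` helper are hypothetical names introduced for illustration.

```python
import sqlite3


def transfer(conn, from_id, to_id, amount):
    """Move `amount` between two rows as one unit of work (hypothetical example)."""
    try:
        # `with conn` commits if the block succeeds and rolls back
        # every change in the block if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, from_id),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (from_id,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, to_id),
            )
        return True
    except ValueError:
        return False


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
    conn.commit()

    ok = transfer(conn, 1, 2, 30)       # both updates are committed together
    failed = transfer(conn, 1, 2, 500)  # the first update is rolled back
    balances = conn.execute(
        "SELECT id, balance FROM account ORDER BY id"
    ).fetchall()
    conn.close()
    return ok, failed, balances
```

In this sketch the second transfer fails partway through, and the rollback leaves both rows exactly as they were after the first transfer, so the database never shows a half-finished transaction.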