General operations for a distributed relational database
To plan for the general operation of a distributed relational database, consider both performance and availability.
The following design considerations can help you improve both the performance and availability of a distributed relational database:
- If an application involves transactions that run frequently or that send or receive a lot of data, you should try to keep it in the same location as the data.
- For data that needs to be shared by applications in different locations, put the data in the location with the most activity.
- If the applications in one location need the data as much as the applications in another location, consider keeping copies of the data at both locations. When keeping copies at multiple locations, ask yourself the following questions about your management strategy:
- Will users be allowed to make updates to the copies?
- How and when will the copies be refreshed with current data?
- Will all copies have to be backed up or will backing up one copy be sufficient?
- How will general administration activities be performed consistently for all copies?
- When is it permissible to delete one of the copies?
- Consider whether the distributed databases will be administered from a central location or from each database location.
You can also improve performance by doing the following things:
- If data and applications must be kept at different locations, do the following things to keep the performance within acceptable limits:
- Keep data traffic across the network as low as possible by only retrieving the data columns that will be used by the application; that is, avoid using * in place of a list of column names as part of a SELECT statement.
- Discourage programmers from coding statements that send large amounts of data to or receive large amounts of data from a remote location; that is, encourage the use of the WHERE clause of the SELECT statement to limit the number of rows of data.
- Use referential integrity, triggers, and stored procedures (an SQL CALL statement after a CONNECT to a remote relational database management system); this improves performance by distributing processing to the application server (AS), which can substantially reduce line traffic.
- Use read-only queries where appropriate by specifying the FOR FETCH ONLY clause.
- Be aware of rules for blocking of queries. For example, in queries between i5/OS® operating systems, blocking of read-only data is done only for COMMIT(*NONE), or for COMMIT(*CHG) and COMMIT(*CS) when ALWBLK(*ALLREAD) is specified.
- Keep the number of accesses to remote data low by using local data in place of remote data whenever possible.
- Use SQL set operations to process multiple rows at the application requester with a single SQL request.
- Try to avoid dropping of connections by using DDMCNV(*KEEP) when running with remote unit of work (RUW) connection management, or by running with distributed unit of work (DUW) connection management.
- Provide sufficient network capacity by doing the following things:
- Increase the capacity of the network by installing high-speed, high-bandwidth lines or by adding lines at appropriate points in the network.
- Reduce the contention or improve the contention balance on certain processors. For example, move existing applications from a host system to a departmental system, or group some distributed relational database work into batch.
- Encourage good table design. At the distributed relational database locations, encourage appropriate use of primary keys, table indexes, and normalization techniques.
- Ensure data types of host variables used in WHERE clauses are consistent with the data types of the associated key column data types. For example, a floating-point host variable has been known to disqualify the use of an index built over a column of a different data type.
You can also improve availability by doing the following things:
- In general, try to limit the amount of data traffic across the network.
- If data and applications must be kept at different locations, do the following things to keep the availability within acceptable limits:
- Establish alternate network routes.
- Consider the effect of time zone differences on availability:
- Will qualified people be available to start the system?
- Will off-hours batch work interfere with processing?
- Ensure good backup and recovery features.
- Ensure people are skilled in backup and recovery.
Parent topic:
Developing a management strategy for a distributed relational database