Core group administration considerations
Core group configuration information is stored in a CoreGroup configuration object that is backed by a coregroup.xml document. Process-specific configuration information for each core group member is stored in a HAManagerService configuration object that is backed by a hamanagerservice.xml document.
The coregroup.xml document is a cell-scoped document. The master copy of this document is stored in the configuration repository for the deployment manager. A copy of this document is shadowed to every node in the cell. The coregroup.xml document includes the following configuration information:
- The list of core group members
- The high availability policies for the core group
- The core group coordinator configuration information
- The core group transport configuration information
The core group member process-specific configuration information stored in the hamanagerservice.xml document includes:
- Whether the high availability manager is enabled.
- The transport buffer size.
- The name of the core group to which the member belongs.
- How frequently the high availability manager checks the health of highly available singletons running on the member, if a time interval is in effect for this function.
Core group configuration document
The master copy of the core group configuration document is modified directly when you change direct attributes, such as the coordinator configuration. It is modified implicitly when a server is created or deleted, or a node is added or removed. In these implicit cases, the list of core group members is updated to reflect which processes are added or removed.
The set of core group members for which the View Synchrony Protocol is established is commonly referred to as a view. Whenever a view is installed, one of the core group members is elected to send its current configuration to all other members of the view. This processing ensures that all members of the view are running with a consistent core group configuration. This processing also means that inconsistencies in a high availability policy or coordinator configuration are tolerated. However, inconsistencies in the list of core group members or the core group transport are not tolerated.
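The member-list requirement above can be illustrated with a minimal Python sketch. It is not product code: the names `MemberListDiff` and `compare_member_lists`, and the sample process names, are hypothetical. It only shows how two configured member lists could be compared to find the members that appear on one side but not the other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemberListDiff:
    """Hypothetical result of comparing two configured member lists."""
    only_local: frozenset   # members defined only in the local configuration
    only_remote: frozenset  # members defined only in the other member's configuration

    @property
    def consistent(self) -> bool:
        # The lists agree only when neither side has extra members.
        return not self.only_local and not self.only_remote


def compare_member_lists(local, remote) -> MemberListDiff:
    """Compare two member lists, ignoring order and duplicates."""
    local_set, remote_set = frozenset(local), frozenset(remote)
    return MemberListDiff(local_set - remote_set, remote_set - local_set)


# Illustrative data: the local node has not yet received the addition of server2.
local_view = ["dmgr", "nodeagent1", "server1"]
remote_view = ["dmgr", "nodeagent1", "server1", "server2"]
diff = compare_member_lists(local_view, remote_view)
print(diff.consistent, sorted(diff.only_local), sorted(diff.only_remote))
```

In this sketch, a difference in either direction makes the lists inconsistent, which mirrors the statement that member-list inconsistencies are not tolerated.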
Before you modify a list of core group members, remember that all core groups must contain at least one administrative process. In a situation where the configuration document is synchronized to all of the nodes in the cell, and you have multiple core group processes running, the running core group administrative processes are notified whenever the configuration document is modified. The high availability manager selects one of the administrative processes to reread the configuration and distribute the updated configuration to all of the other core group members in the same view. These changes are then dynamically picked up by all of these other core group members. If the core group does not contain at least one administrative process that is running when a configuration change is made, the updated configuration is not properly passed on to the core group members.
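As a rough illustration of the precondition above, the following Python sketch checks whether a core group has at least one running administrative process before a configuration change is made. The `Member` type and its fields are hypothetical, and the sketch assumes that administrative processes are the deployment manager and node agents. It does not call any product API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """Hypothetical description of a core group member."""
    name: str
    administrative: bool  # assumed: True for a deployment manager or node agent
    running: bool


def has_running_admin_process(members) -> bool:
    """Return True if at least one administrative member is currently running."""
    return any(m.administrative and m.running for m in members)


# Illustrative core group: the only administrative process is stopped.
core_group = [
    Member("nodeagent1", administrative=True, running=False),
    Member("server1", administrative=False, running=True),
    Member("server2", administrative=False, running=True),
]
print(has_running_admin_process(core_group))
```

A check like this could be run before editing the member list. If it returns False, the running members would not receive the updated configuration dynamically.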
If you modify a list of core group members, do not start a member of that core group until you are sure that the change is fully synchronized to all nodes in the cell. If a node agent is down when the configuration change is made, manually synchronize the configuration change before any processes are started on that node. If you do not manually synchronize the change, a process that starts on that node cannot establish the View Synchrony Protocol with the other core group members. When a core group member starts, it reads the core group configuration information from the repository on the local node, opens connections to the other core group members, and attempts to establish the View Synchrony Protocol with them. If the local copy of the coregroup.xml document is not synchronized with the master core group configuration document, problems occur. For example, if the running processes dynamically reloaded the updated configuration, the configuration for the newly started process is out of sync with the configurations of the other core group members. If the update changed the list of core group members, the list is now inconsistent across the nodes in the cell, and any attempt to establish view synchrony fails because of these inconsistent member lists. When this condition is detected, an error message similar to the following message is logged:
DCSV8022I: DCS Stack {0} at Member {1}: Inconsistency of configured defined set with that of another member. Inconsistent member is {2}. The list of members only in the local defined set is {3}, whereas the list of members only in the defined set at the inconsistent member is {4}.
When a process detects an inconsistent core group membership condition, the process attempts to reread the core group configuration several times, because the configuration document might still be in the process of being synchronized to the node. In such a case, rereading the configuration document can resolve the inconsistency. However, if the process cannot resolve the inconsistency after several attempts to reread the configuration, it stops trying. To recover from this situation, resynchronize the configuration and restart the process.
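The reread behavior described above can be sketched as a bounded retry loop. This is a simplified Python model, not the product implementation: the function `resolve_membership`, the attempt limit of 3, and the sample data are assumptions, and any delay between attempts is omitted.

```python
def resolve_membership(read_local_members, remote_members, max_attempts=3):
    """Reread the local member list until it matches the remote list.

    read_local_members: hypothetical callable that rereads the local configuration.
    Returns the attempt number on which the lists matched, or None if the
    inconsistency was not resolved within max_attempts.
    """
    remote = frozenset(remote_members)
    for attempt in range(1, max_attempts + 1):
        if frozenset(read_local_members()) == remote:
            return attempt
    return None  # give up; resynchronize and restart the process


# Simulated node where synchronization completes between the first and second read.
snapshots = iter([["dmgr", "server1"], ["dmgr", "server1", "server2"]])
result = resolve_membership(lambda: next(snapshots), ["dmgr", "server1", "server2"])
print(result)  # prints 2

# Simulated node that is never synchronized: the retries are exhausted.
stale = resolve_membership(lambda: ["dmgr"], ["dmgr", "server1"])
print(stale)  # prints None
```

The None result corresponds to the case in which the process stops trying and manual resynchronization followed by a restart is required.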
Core group process-specific configuration document
Unlike the cell-scoped core group configuration information that is contained in the coregroup.xml document, the process-specific configuration information for each core group member that is contained in the hamanagerservice.xml document cannot be dynamically reloaded. You must restart a process before core group process-specific configuration changes go into effect.
Related concepts
Core groups (high availability domains)