Configure messaging engine and server behavior when a data store connection is lost
If the connection between a running messaging engine and its data store is lost, whether because of a failure or because the database is stopped for maintenance, we can ensure that the messaging engine functions correctly after the connection is restored by configuring the server to restart automatically.
The behavior described in this topic occurs only if the messaging engine is running and has established exclusive locks on its data store.
By setting the sib.msgstore.jdbcFailoverOnDBConnectionLoss custom property on a messaging engine, we can control how the messaging engine and its hosting server behave if the connection to the data store is lost.
Property value: true (default)
Behavior when the data store connection is lost: The high availability manager stops the messaging engine and its hosting application server when the next core group service Is alive check takes place (the default value is 120 seconds). If a node agent is monitoring the server, and we have enabled automatic restart in the monitoring policy for the server, the server restarts. The messaging engine starts when an appropriate server is available.
Messages with a reliability level that is lower than assured persistent might be accepted by the messaging engine during the interval between Is alive checks, and might be lost.

Property value: false
Behavior when the data store connection is lost: The messaging engine continues to run and accept work, and periodically attempts to regain the connection to the data store. If work continues to be submitted to the messaging engine while the data store is unavailable, the results can be unpredictable, and the messaging engine can be in an inconsistent state when the data store connection is restored. If work continues to be submitted to the messaging engine, even nonpersistent messaging can fail because the messaging engine might need to use the data store, for example to allocate a unique ID to a message, or to move nonpersistent messages out of memory.

Property value: false (z/OS)
Behavior when the data store connection is lost: The messaging engine continues to run and accept work, and periodically attempts to regain the connection to the data store.
On z/OS, where a high availability environment is in place (incorporating clustered application servers and DB2 data sharing groups), the value false is recommended. One scenario where false is not appropriate is a cluster with only one member, which leaves no server for the messaging engine to fail over to.
Tasks
- Click Service integration -> Buses -> bus_name -> [Topology] Messaging engines -> engine_name -> [Additional Properties] Custom properties to navigate to the custom properties panel for the messaging engine.
- Click New.
- Type sib.msgstore.jdbcFailoverOnDBConnectionLoss in the Name field and true in the Value field.
- Click OK.
- Save changes to the master configuration.
- Restart the application server.
- If we have a cluster, repeat the previous steps to add this property for every messaging engine in the cluster.
If the connection between the messaging engine and its data store is lost, the application server hosting the messaging engine shuts down.
If we want the server to restart, ensure that Automatic restart is selected in the monitoring policy for the server.
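The administrative console steps above could also be scripted. The following is a minimal sketch in wsadmin Jython style, not a verified procedure. AdminConfig.list, AdminConfig.create and AdminConfig.save are standard wsadmin calls, but it is an assumption that a Property object created under a SIBMessagingEngine parent is stored as that engine's custom property, so check the result in the custom properties panel. The function names are hypothetical. Note that the sketch applies the property to every messaging engine that AdminConfig.list returns, not only to the engines of one cluster, and it does not check whether the property already exists.

```python
PROPERTY_NAME = "sib.msgstore.jdbcFailoverOnDBConnectionLoss"


def set_failover_property(admin_config, engine_id, value="true"):
    """Create the custom property on one messaging engine (hypothetical helper)."""
    attrs = [["name", PROPERTY_NAME], ["value", value]]
    # Assumption: the new Property lands in the engine's custom properties.
    return admin_config.create("Property", engine_id, attrs)


def set_for_all_engines(admin_config, value="true"):
    """Apply the property to every messaging engine, then save the configuration."""
    # AdminConfig.list returns the configuration IDs as one newline-separated string.
    listing = admin_config.list("SIBMessagingEngine")
    engine_ids = [line for line in listing.splitlines() if line.strip()]
    for engine_id in engine_ids:
        set_failover_property(admin_config, engine_id, value)
    # Equivalent to saving changes to the master configuration.
    admin_config.save()
    return engine_ids
```

Inside a wsadmin session, the call would be set_for_all_engines(AdminConfig). Passing the AdminConfig object as a parameter keeps the functions easy to try out against a stand-in object first. The application servers still need to be restarted afterwards, as in the console procedure.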
What to do next
If a server restarts automatically in this situation, CWSID0039E messages appear in the JVM logs for the server.
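To confirm that such a restart took place, the JVM log can be searched for the message ID. The following is a small sketch that assumes the log is a plain-text file read line by line. The function name and the log path in the usage comment are hypothetical.

```python
MESSAGE_ID = "CWSID0039E"


def find_message_lines(lines, message_id=MESSAGE_ID):
    """Return (line number, text) pairs for every line that contains message_id."""
    hits = []
    for number, line in enumerate(lines, 1):
        if message_id in line:
            hits.append((number, line.rstrip("\r\n")))
    return hits


# Usage, with a hypothetical log location:
# with open("/path/to/profile/logs/server1/SystemOut.log") as log:
#     for number, text in find_message_lines(log):
#         print(number, text)
```

The function accepts any iterable of lines, so it works with an open file as well as with a list of strings.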
After a server restart, click Service integration -> Buses -> bus_name -> [Topology] Messaging engines to view the status of the messaging engine. Check that the messaging engine has been restarted and is running.
If the server is a member of a cluster, check that the cluster members are still enabled for high availability by following the instructions in the topic Manage high availability when messaging engines fail to start.
We might want to tune the system so that the loss of the database connection is detected quickly, and the messaging engine waits for a reasonable amount of time for the data store to become available again before attempting to start on another server.
- Tune the detection of database connection loss
- Service integration custom properties
- Core group service settings
- Monitor policy settings