
Tune the detection of database connection loss

If a messaging engine is configured to use a data store and cannot connect to it, for example because the database containing the data store is not running, the messaging engine does not start. You can tune your system to increase the chance that the messaging engine starts successfully. In a single-server environment, the messaging engine attempts to start when you start the application server. If the database is unavailable for more than 15 minutes, the messaging engine might enter the stopped state and need to be started manually.

You can increase the chance of the messaging engine starting successfully by configuring various parameters, such as the 15-minute default timeout, on the database server or application server.

  1. On the database server, configure the operating system to minimize the amount of time taken to detect the loss of a network connection to an application server. Refer to the documentation for the operating system for details. For example, the following table lists the relevant parameters for Windows and AIX operating systems:

    Table: TCP/IP parameters

    Windows parameter          AIX parameter    Description
    -------------------------  ---------------  -----------
    KeepAliveTime              tcp_keepidle     The amount of time to wait before sending a keepalive request for an inactive connection (in milliseconds on Windows, in units of 0.5 seconds on AIX).
    KeepAliveInterval          tcp_keepintvl    The amount of time to wait for a response (in milliseconds on Windows, in units of 0.5 seconds on AIX).
    TCPMaxDataRetransmissions  tcp_keepcnt      The number of requests to send before ending the connection.

    You can calculate the total amount of time taken for the database server to detect the failure of the connection to the application server by using the following formula:

    time to detect connection failure = keep alive time + (keep alive interval x number of requests)

    For example, for a Windows system with the parameters set according to the following table, the total amount of time taken for the database server to detect the failure of the connection to the application server is 350 seconds (300000 ms + 5 x 10000 ms = 350000 ms).

    Table: Example parameter values

    Parameter                  Value
    -------------------------  -------------------
    KeepAliveTime              300000 milliseconds
    KeepAliveInterval          10000 milliseconds
    TCPMaxDataRetransmissions  5

    Your database product might also have relevant parameters that you can configure, for example, the IDLE THREAD TIMEOUT parameter in DB2 for z/OS.

    When the database server detects the loss of the connection to the application server, the database releases the locks on the data store. The messaging engine can now access the data store and can therefore start successfully.

  2. On the application server, tune the messaging engine to wait for an appropriate amount of time for the data store to become available. By default, the messaging engine will attempt to connect to the data store every 2 seconds for 15 minutes. Complete the rest of this step to adjust these timings.

    1. Click Service integration -> Buses -> bus_name -> [Topology] Messaging engines -> engine_name -> [Additional Properties] Custom properties to navigate to the custom properties panel for the messaging engine.

    2. Click New.

    3. Type sib.msgstore.jdbcInitialDatasourceWaitTimeout in the Name field and an appropriate value in the Value field. This property is the time, in milliseconds, to wait for the data store to become available. The default is 900000 (15 minutes). This time includes the time required to establish a connection to the database and to obtain the required table locks.

      Ensure the value of this property is greater than the total time taken for the database server to detect the loss of a network connection, as configured in step 1.

    4. Click OK.

    5. Click New.

    6. Type sib.msgstore.jdbcStaleConnectionRetryDelay in the Name field and an appropriate value in the Value field. This property is the time, in milliseconds, to wait between attempts to connect to the data store. The default is 2000 (2 seconds). For example, if you set the sib.msgstore.jdbcInitialDatasourceWaitTimeout property to 600000 and the sib.msgstore.jdbcStaleConnectionRetryDelay property to 3000, the messaging engine attempts to connect every 3 seconds until 10 minutes have passed.

    7. Click OK.

    8. Save your changes to the master configuration.

    9. Restart the application server.
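The detection-time formula from step 1 can be checked with a few lines of Python. This is a minimal sketch, not part of the product: the function name detection_time_ms and the unit_ms argument are hypothetical, and the AIX values are illustrative numbers chosen to match the Windows example rather than recommended settings.

```python
# Sketch of the step 1 formula:
#   keep alive time + (keep alive interval x number of requests)
# The function and argument names are illustrative assumptions.

def detection_time_ms(keepalive_time, keepalive_interval, request_count, unit_ms):
    """Return the connection-failure detection time in milliseconds.

    unit_ms converts the two time parameters to milliseconds:
    1 for Windows (values are already in milliseconds),
    500 for AIX (values are in units of 0.5 seconds).
    """
    return (keepalive_time + keepalive_interval * request_count) * unit_ms

# Windows example values from step 1: 300000 ms, 10000 ms, 5 requests.
windows_ms = detection_time_ms(300000, 10000, 5, unit_ms=1)

# Hypothetical AIX values giving the same timings in half-second units:
# tcp_keepidle=600 (300 s), tcp_keepintvl=20 (10 s), tcp_keepcnt=5.
aix_ms = detection_time_ms(600, 20, 5, unit_ms=500)

print(windows_ms / 1000)  # detection time in seconds on Windows
print(aix_ms / 1000)      # detection time in seconds on AIX
```

Both calls produce 350000 milliseconds, which matches the 350 seconds given in step 1 for the Windows example.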


Results

By configuring these parameters and custom properties, you minimize the amount of time taken for the database server to detect the loss of a network connection, and ensure the messaging engine waits for a reasonable amount of time for the database connection to recover before attempting to start.
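Before saving the custom properties, it can help to sanity-check the chosen values against each other. The following sketch uses hypothetical helper names and assumes that each connection attempt takes negligible time, so the attempt count is only an approximation of what the messaging engine actually does.

```python
# Sketch: check messaging engine wait settings against the database
# server's detection time. Helper names are illustrative assumptions.

def approximate_attempts(wait_timeout_ms, retry_delay_ms):
    """Roughly how many connection attempts fit in the wait window.

    Assumes each attempt itself takes no time, which is a simplification.
    """
    return wait_timeout_ms // retry_delay_ms

def wait_timeout_is_sufficient(wait_timeout_ms, detection_time_ms):
    """True if the engine waits longer than the database server needs
    to detect the lost connection, as step 2 requires."""
    return wait_timeout_ms > detection_time_ms

# Values from the example in step 2:
# sib.msgstore.jdbcInitialDatasourceWaitTimeout = 600000
# sib.msgstore.jdbcStaleConnectionRetryDelay    = 3000
wait_timeout = 600000
retry_delay = 3000

# Detection time from the Windows example in step 1 (350 seconds).
detection_time = 350000

print(approximate_attempts(wait_timeout, retry_delay))
print(wait_timeout_is_sufficient(wait_timeout, detection_time))
```

With these values the engine makes roughly 200 attempts over 10 minutes, and the wait timeout exceeds the 350-second detection time. With the default values (900000 and 2000) the same calculation gives roughly 450 attempts.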

You might want to configure the messaging engine and server to restart in the event of a database connection failure. This behavior reduces the risk of the messaging engine being in an inconsistent state when the database connection is restored.


Related concepts:

Data store life cycle


Related

Microsoft support article: TCP/IP and NBT configuration parameters for Windows XP
Configure messaging engine and server behavior when a data store connection is lost


Reference:

Service integration custom properties

