Multi-instance queue managers

Multi-instance queue managers are instances of the same queue manager configured on different servers. One instance of the queue manager is defined as the active instance and another instance is defined as the standby instance. If the active instance fails, the multi-instance queue manager restarts automatically on the standby server.


Example multi-instance queue manager configuration

Figure 1 shows an example of a multi-instance configuration for queue manager QM1. IBM MQ is installed on two servers, one of which is a spare. One queue manager, QM1, has been created. One instance of QM1 is active, and is running on one server. The other instance of QM1 is running in standby on the other server, doing no active processing, but ready to take over from the active instance of QM1, if the active instance fails.

Figure 1. Multi-instance queue manager

When you intend to use a queue manager as a multi-instance queue manager, create a single queue manager on one of the servers using the crtmqm command, placing its queue manager data and logs in shared network storage. On the other server, rather than create the queue manager again, use the addmqinf command to create a reference to the queue manager data and logs on the network storage.
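As a rough sketch of these two steps, the following shell fragment creates QM1 on shared storage and then asks IBM MQ to print the addmqinf command to run on the second server. The paths /MQHA/qmgrs and /MQHA/logs are assumed mount points for the shared storage and are not taken from this topic. The MQ_RUN variable is a hypothetical convenience of this sketch: by default the commands are only echoed, and setting MQ_RUN to an empty string runs them.

```shell
#!/bin/sh
# Sketch only: commands are echoed unless MQ_RUN is set to an empty string.
# /MQHA/qmgrs and /MQHA/logs are assumed shared-storage mount points.
MQ_RUN=${MQ_RUN-echo}
QMGR=QM1
DATA_DIR=/MQHA/qmgrs
LOG_DIR=/MQHA/logs

# First server: create the queue manager with its data (-md) and
# logs (-ld) on the shared network storage.
create_qmgr() {
    $MQ_RUN crtmqm -md "$DATA_DIR" -ld "$LOG_DIR" "$QMGR"
}

# First server: print the addmqinf command that adds a reference to
# the same queue manager data on the second server.
show_addmqinf() {
    $MQ_RUN dspmqinf -o command "$QMGR"
}

create_qmgr
show_addmqinf
```

When run for real, the output of dspmqinf is an addmqinf command line that can be copied to the second server and run there.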

You can now run the queue manager from either of the servers. Each server references the same queue manager data and logs; there is only one queue manager, and it is active on only one server at a time.

The queue manager can run either as a single instance queue manager, or as a multi-instance queue manager. In both cases, only one instance of the queue manager processes requests at any time. The difference is that when running as a multi-instance queue manager, the server that is not running the active instance runs a standby instance, ready to take over from the active instance automatically if the active server fails.

The only control you have over which instance becomes active first is the order in which you start the queue manager on the two servers. The first instance to acquire read/write locks on the queue manager data becomes the active instance.

Once the standby instance has started, you can move the active role to the other server by stopping the active instance with the switchover option, which transfers control to the standby.
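The start order and the switchover can be sketched with the strmqm and endmqm commands. The -x flag on strmqm permits a standby instance, and the -s flag on endmqm requests a switchover. The start and switch arguments and the MQ_RUN dry-run variable are hypothetical conveniences of this sketch, not part of IBM MQ.

```shell
#!/bin/sh
# Sketch only: commands are echoed unless MQ_RUN is set to an empty string.
MQ_RUN=${MQ_RUN-echo}
QMGR=QM1

# Run on both servers. With -x, the first instance to obtain the locks
# becomes active and the second one waits as the standby.
start_instance() {
    $MQ_RUN strmqm -x "$QMGR"
}

# Run on the server that hosts the active instance. With -s, the active
# instance ends and control passes to the standby.
switch_over() {
    $MQ_RUN endmqm -s "$QMGR"
}

# Hypothetical dispatcher: "start" on each server, "switch" on the active one.
case "${1-}" in
    start)  start_instance ;;
    switch) switch_over ;;
esac
```

Starting the queue manager without -x runs it as a single instance that does not permit a standby.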

The active instance of QM1 has exclusive access to the shared queue manager data and logs folders when it is running. The standby instance of QM1 detects when the active instance has failed, and becomes the active instance. It takes over the QM1 data and logs in the state they were left by the active instance, and accepts reconnections from clients and channels.

The active instance might fail for various reasons that result in the standby taking over:

  • Failure of the server hosting the active queue manager instance.
  • Failure of connectivity between the server hosting the active queue manager instance and the file system.
  • Unresponsiveness of queue manager processes, detected by IBM MQ, which then shuts down the queue manager.

You can add the queue manager configuration information to multiple servers, and choose any two servers to run as the active/standby pair. A queue manager can have at most two running instances: one active and one standby. You cannot run two standby instances alongside one active instance.


Additional components needed to build a high availability solution

A multi-instance queue manager is one part of a high availability solution. You need some additional components to build a useful high availability solution.

  • Client and channel reconnection to transfer IBM MQ connections to the computer that takes over running the active queue manager instance.

  • A high-performance shared network file system (NFS) that manages locks correctly and provides protection against media and file server failure. Important: Stop all multi-instance queue manager instances that are running in the environment before you perform maintenance on the NFS drive. Make sure that you have queue manager configuration backups to recover from in the event of an NFS failure.

  • Resilient networks and power supplies to eliminate single points of failure in the basic infrastructure.

  • Applications that tolerate failover. In particular, pay close attention to the behavior of transactional applications, and to applications that browse IBM MQ queues.

  • Monitoring and management of the active and standby instances to ensure that they are running, and to restart active instances that have failed. Although multi-instance queue managers restart automatically, you need to be sure that your standby instances are running, ready to take over, and that failed instances are brought back online as new standby instances.
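For the monitoring component, one possible approach is to inspect the instance list that dspmq -x prints. The sketch below assumes that output uses INSTANCE(...) MODE(Active) and MODE(Standby) lines. The check_instances function, its result strings, and the host names in the sample are all hypothetical.

```shell
#!/bin/sh
# Sketch only: reads `dspmq -x -m QM1` style output on standard input and
# reports whether an active and a standby instance are both present.
check_instances() {
    status=$(cat)
    # grep -c prints 0 and exits non-zero when nothing matches, hence || true.
    active=$(printf '%s\n' "$status" | grep -c 'MODE(Active)' || true)
    standby=$(printf '%s\n' "$status" | grep -c 'MODE(Standby)' || true)
    if [ "$active" -eq 1 ] && [ "$standby" -eq 1 ]; then
        echo "ok"
    elif [ "$active" -eq 1 ]; then
        echo "no-standby"
    else
        echo "no-active"
    fi
}

# Illustrative input only; server1 and server2 are made-up host names.
sample='QMNAME(QM1) STATUS(Running)
    INSTANCE(server1) MODE(Active)
    INSTANCE(server2) MODE(Standby)'
printf '%s\n' "$sample" | check_instances
```

On a server with IBM MQ installed, the same function could be fed live data with dspmq -x -m QM1 | check_instances, and a no-standby result could trigger an alert or a restart of the failed instance.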

IBM MQ MQI clients and channels reconnect automatically to the standby queue manager when it becomes active. Automatic client reconnect is not supported by IBM MQ classes for Java. More information about reconnection and the other components in a high availability solution can be found in the related topics.
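One way to enable automatic reconnection for MQI client applications without changing their code is the DefRecon attribute in the CHANNELS stanza of the client configuration file. The sketch below writes such a file into a temporary directory, which is an assumed location chosen only for illustration, and points the MQCLNTCF environment variable at it. It does not cover the connection details for both servers, which the client also needs.

```shell
#!/bin/sh
# Sketch only: the temporary directory is an assumed location for the
# client configuration file, chosen so the example has no side effects.
conf_dir=$(mktemp -d)

# DefRecon=YES asks MQI clients to reconnect automatically by default.
cat > "$conf_dir/mqclient.ini" <<'EOF'
CHANNELS:
   DefRecon=YES
EOF

# MQCLNTCF tells the MQI client where to find its configuration file.
MQCLNTCF="$conf_dir/mqclient.ini"
export MQCLNTCF
```

Applications can instead request reconnection themselves through their connection options, which takes effect regardless of this default.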


Supported platforms

You can create a multi-instance queue manager on any non-z/OS platform supported by IBM WebSphere MQ Version 7.0.1 and later.

Automatic client reconnection is supported for MQI clients by IBM WebSphere MQ Version 7.0.1 and later.

  • Create a multi-instance queue manager
    Create the queue manager on one server, and configure IBM MQ on another server. Multi-instance queue managers share queue manager data and logs.

  • Delete a multi-instance queue manager
    To delete a multi-instance queue manager completely, use the dltmqm command to delete the queue manager, and then remove instances from other servers using either the rmvmqinf or the dltmqm command.

  • Start and stop a multi-instance queue manager
    Start and stop a queue manager configured on Multiplatforms as either a single instance or a multi-instance queue manager.

  • Shared file system
    A multi-instance queue manager uses a networked file system to manage queue manager instances.

  • Multiple queue manager instances
    A multi-instance queue manager is resilient because it uses a standby queue manager instance to restore queue manager availability after failure.

  • Failover or switchover
    A standby queue manager instance takes over from the active instance either on request (switchover), or when the active instance fails (failover).

  • Channel and client reconnection
    Channel and client reconnection is an essential part of restoring message processing after a standby queue manager instance has become active.

  • Application recovery
    Application recovery is the automated continuation of application processing after failover. Application recovery following failover requires careful design. Some applications need to be aware that failover has taken place.

  • Data recovery and high availability
    High availability solutions using multi-instance queue managers must include a mechanism to recover data after a storage failure.
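The deletion steps summarized in the list above can be sketched as follows, assuming that both instances of QM1 have already been stopped. The primary and other arguments and the MQ_RUN dry-run variable are hypothetical conveniences of this sketch.

```shell
#!/bin/sh
# Sketch only: commands are echoed unless MQ_RUN is set to an empty string.
# Assumes every instance of the queue manager has already been stopped.
MQ_RUN=${MQ_RUN-echo}
QMGR=QM1

# Run on one server: delete the queue manager and its shared data.
delete_qmgr() {
    $MQ_RUN dltmqm "$QMGR"
}

# Run on each remaining server: remove the local configuration
# reference that addmqinf created.
remove_reference() {
    $MQ_RUN rmvmqinf "$QMGR"
}

# Hypothetical dispatcher: "primary" on one server, "other" on the rest.
case "${1-}" in
    primary) delete_qmgr ;;
    other)   remove_reference ;;
esac
```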
