RDQM disaster recovery and high availability

We can configure a replicated data queue manager (RDQM) that runs on a high availability (HA) group at one site, but can fail over to another HA group at a different site if a disaster makes the first group unavailable. This configuration is known as a DR/HA RDQM.

A DR/HA RDQM combines the features of a high availability RDQM (see RDQM high availability) and a disaster recovery RDQM (see RDQM disaster recovery).

The following diagram shows an example DR/HA RDQM.

The replication between the DR/HA RDQMs on the main site and the disaster recovery site is always asynchronous. With asynchronous replication, operations such as IBM MQ PUT or GET complete and return to the application before the event is replicated to the secondary queue manager.
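The consequence of asynchronous replication can be illustrated with a toy model. The sketch below is not the IBM MQ API; the class and method names are hypothetical and exist only to show that a put completes on the primary before the message has reached the secondary, so anything still pending at the moment of a disaster would not be present at the other site.

```python
from collections import deque


class ToyAsyncReplicatedQueue:
    """Toy model of asynchronous replication (hypothetical, not IBM MQ)."""

    def __init__(self):
        self.primary = []        # messages committed on the primary
        self.secondary = []      # messages that have reached the secondary
        self._pending = deque()  # committed locally, not yet replicated

    def put(self, message):
        # The put completes as soon as the primary holds the message;
        # replication is only queued here, not awaited.
        self.primary.append(message)
        self._pending.append(message)
        return True

    def replicate_one(self):
        # Ship the oldest pending message to the secondary, if there is one.
        if self._pending:
            self.secondary.append(self._pending.popleft())

    def unreplicated(self):
        # Messages the secondary would lack if the primary site were lost now.
        return list(self._pending)


q = ToyAsyncReplicatedQueue()
q.put("order-1")
q.put("order-2")
q.replicate_one()

print(q.secondary)       # ['order-1']
print(q.unreplicated())  # ['order-2']
```

In this model both puts return immediately, while only the first message has been copied to the secondary by the time the state is inspected.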

We can have two active sites rather than 'main' and 'recovery' sites, if required, so that some DR/HA RDQMs run on one site and some on the other during normal operation. If a disaster occurs and one site becomes unavailable, all DR/HA RDQMs then run on the HA group at the surviving site.
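As a rough illustration of the two-active-sites arrangement, the following sketch maps each queue manager to the site where it would run before and after a site is lost. The site names, queue manager names, and the `running_site` helper are all hypothetical, and the model says nothing about how a failover is actually triggered or performed.

```python
def running_site(preferred_site, available_sites):
    """Return the site where a DR/HA RDQM runs in this toy model."""
    if preferred_site in available_sites:
        return preferred_site
    # With only two sites, the survivor is whichever one is still available.
    survivors = sorted(available_sites)
    if not survivors:
        raise RuntimeError("no site is available")
    return survivors[0]


# Hypothetical layout: each site is 'main' for one queue manager.
preferred = {"QM1": "site-a", "QM2": "site-b"}

normal = {qm: running_site(site, {"site-a", "site-b"})
          for qm, site in preferred.items()}
after_loss = {qm: running_site(site, {"site-b"})
              for qm, site in preferred.items()}

print(normal)      # {'QM1': 'site-a', 'QM2': 'site-b'}
print(after_loss)  # {'QM1': 'site-b', 'QM2': 'site-b'}
```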

Each HA group is configured in the same way as an ordinary HA group. We can define floating IP addresses for a DR/HA RDQM in each HA group. The floating IP address can be the same or different for each HA group.

We cannot upgrade an existing RDQM to be a DR/HA RDQM; a DR/HA RDQM must be created as one. (If required, we could back up the data of an existing RDQM, delete it, recreate it as a DR/HA RDQM, and then restore the data; see Backing up and restoring IBM MQ queue manager data.)

To configure DR/HA RDQMs, we must complete the following major steps:
  1. Configure an HA group on the 'main' site.
  2. Configure an HA group on the 'recovery' site.
  3. Create a primary/primary DR/HA RDQM on one node of the HA group on the 'main' site.
  4. Create primary/secondary DR/HA RDQMs on the other two nodes on the 'main' site.
  5. Define a floating IP address for an application to access the DR/HA RDQM when it is running on any of the nodes of the HA group on the 'main' site.
  6. Create a secondary/primary DR/HA RDQM on one node of the HA group on the 'recovery' site.
  7. Create secondary/secondary DR/HA RDQMs on the other two nodes on the 'recovery' site.
  8. Define a floating IP address for an application to access the DR/HA RDQM when it is running on any of the nodes of the HA group on the 'recovery' site.

Details about each of these steps are given in the following topics.
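The role layout described in steps 3, 4, 6 and 7 can be summarized with a small planning sketch. It assumes, based only on the pattern in the list above, that the first word of each pair is the DR role (the same for every node on a site) and the second is the HA role (primary on one node, secondary on the other two). The node names and the `plan_roles` helper are hypothetical, and the sketch does not run any IBM MQ command.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeRole:
    site: str
    node: str
    dr_role: str  # assumed to be the first word of each pair in the steps
    ha_role: str  # assumed to be the second word of each pair

    def label(self):
        return f"{self.dr_role}/{self.ha_role}"


def plan_roles(main_nodes, recovery_nodes):
    """Build the six-node role plan for one DR/HA RDQM."""
    plan = []
    sites = (
        ("main", "primary", main_nodes),
        ("recovery", "secondary", recovery_nodes),
    )
    for site, dr_role, nodes in sites:
        for index, node in enumerate(nodes):
            # The first node listed for a site takes the HA primary role.
            ha_role = "primary" if index == 0 else "secondary"
            plan.append(NodeRole(site, node, dr_role, ha_role))
    return plan


# Hypothetical node names for two three-node HA groups.
plan = plan_roles(
    ["main-node-1", "main-node-2", "main-node-3"],
    ["dr-node-1", "dr-node-2", "dr-node-3"],
)

for role in plan:
    print(f"{role.site:<9} {role.node:<12} {role.label()}")
```

Each site yields one node with an HA role of primary and two with secondary, which matches the one-plus-two split in the numbered steps.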
