Queue manager performs the coordination

 


Contents

  1. Database coordination
  2. DB2 configuration
  3. Oracle configuration
  4. Sybase configuration
  5. Multiple database configurations
  6. Security considerations
  7. Administration tasks

 


Database coordination

When the queue manager coordinates global units of work itself, it becomes possible to integrate database updates within the units of work. That is, a mixed MQI and SQL application can be written, and the MQCMIT and MQBACK verbs can be used to commit or roll back the changes to the queues and databases together.

The queue manager achieves this using the two-phase commit protocol described in X/Open Distributed Transaction Processing: The XA Specification. When a unit of work is to be committed, the queue manager first asks each participating database manager whether it is prepared to commit its updates. Only if all the participants, including the queue manager itself, are prepared to commit, are all the queue and database updates committed. If any participant cannot prepare its updates, the unit of work is backed out instead.

In general, a global unit of work is implemented in an application by the following method (in pseudocode):

MQBEGIN
MQGET (include the flag MQGMO_SYNCPOINT in the message options)
MQPUT (include the flag MQPMO_SYNCPOINT in the message options)
SQL INSERT
MQCMIT

The purpose of MQBEGIN is to denote the beginning of a global unit of work. The purpose of MQCMIT is to denote the end of the global unit of work, and to complete it with all participating resource managers, using the two-phase commit protocol.

When the unit of work (also known as a transaction) is completed successfully using MQCMIT, all actions taken within that unit of work are made permanent or irreversible. If, for any reason, the unit of work fails, all actions are instead backed out. It is not acceptable for one action comprising a unit of work to be made permanent while another is forgotten. This is the principle of a unit of work: either all actions within the unit of work are made permanent or none of them are.

Notes:

  1. The application programmer can force a unit of work to be backed out by calling MQBACK. The unit of work is also backed out by the queue manager if the application or database crashes before MQCMIT is called.

  2. If an application calls MQDISC without calling MQCMIT, the queue manager behaves as if MQCMIT had been called, and commits the unit of work.

In between MQBEGIN and MQCMIT, the queue manager does not make any calls to the database to update its resources. That is, the only way a database's tables are changed is by your code (for example, the SQL INSERT in the pseudocode above).

Full recovery support is provided if the queue manager loses contact with any of the database managers during the commit protocol. If a database manager becomes unavailable while it is in doubt, that is, it has successfully prepared to commit, but has yet to receive a commit or backout decision, the queue manager remembers the outcome of the unit of work until that outcome has been successfully delivered to the database. Similarly, if the queue manager terminates with incomplete commit operations outstanding, these are remembered over queue manager restart. If an application terminates unexpectedly, the integrity of the unit of work is not compromised, but the outcome depends on where in the process the application terminated, as described below.

Before the application call to MQCMIT. The unit of work is backed out.
During the application call to MQCMIT, before all databases have indicated that they have successfully prepared. The unit of work is backed out with a reason code of MQRC_BACKED_OUT.
During the application call to MQCMIT, after all databases have indicated that they have successfully prepared, but before all have indicated that they have successfully committed. The unit of work is held in recoverable state by the queue manager, with a reason code of MQRC_OUTCOME_PENDING.
During the application call to MQCMIT, after all databases have indicated that they have successfully committed. The unit of work is committed with a reason code of MQRC_NONE.
After the application call to MQCMIT. The unit of work is committed with a reason code of MQRC_NONE.


Table 15. What happens when an application program crashes

Before the application call to MQCMIT. The unit of work is backed out.
During the application call to MQCMIT, before the queue manager has received the application's MQCMIT request. The unit of work is backed out.
During the application call to MQCMIT, after the queue manager has received the application's MQCMIT request. The queue manager tries to commit using two-phase commit (subject to the database products successfully executing and committing their parts of the unit of work).

In the case where the reason code on return from MQCMIT is MQRC_OUTCOME_PENDING, the unit of work is remembered by the queue manager until it has been able to reestablish contact with the database server, and tell it to commit its part of the unit of work. Refer to Administration tasks for information on how and when recovery is done.

The queue manager communicates with database managers using the XA interface as described in X/Open Distributed Transaction Processing: The XA Specification. Examples of these function calls are xa_open, xa_start, xa_end, xa_prepare, and xa_commit. We use the terms transaction manager and resource manager in the same sense as they are used in the XA specification.

 

Restrictions

The following restrictions apply to the database coordination support:

  • The ability to coordinate database updates within WebSphere MQ units of work is not supported in an MQI client application. The use of MQBEGIN in a client application fails. A program that calls MQBEGIN must run as a server application on the same machine as the queue manager.

    Note:
    A server application is a program that has been linked with the necessary WebSphere MQ server libraries; a client application is a program that has been linked with the necessary WebSphere MQ client libraries.

  • The database server can reside on a different machine from the queue manager server, as long as the database client is installed on the same machine as the queue manager, and it supports this function. Consult the database product's documentation to determine whether their client software can be used for two-phase commit systems.

  • Although the queue manager behaves as a resource manager (for the purposes of being involved in Scenario 2 global units of work), it is not possible to make one queue manager coordinate another queue manager within its Scenario 1 global units of work.

 

Switch load files

The switch load file is a shared library that is loaded by the code in your WebSphere MQ application and the queue manager. Its purpose is to simplify the loading of the database's client shared library, and to return the pointers to the XA functions.

The path to the switch load file is specified before the queue manager is started. The details are placed in the qm.ini file (UNIX systems).

The C source for the switch load file is supplied with the WebSphere MQ installation if it supports Scenario 1 global units of work. The source contains a function called MQStart. When the switch load file is loaded, the queue manager calls this function, which returns the address of a structure called an XA switch.

The XA switch structure exists in the database client shared library, and contains a number of function pointers, as described in Table 16:

Table 16. XA switch function pointers

Function pointer name XA function Purpose
xa_open_entry xa_open Connect to database
xa_close_entry xa_close Disconnect from database
xa_start_entry xa_start Start a branch of a global unit of work
xa_end_entry xa_end Suspend a branch of a global unit of work
xa_rollback_entry xa_rollback Roll back a branch of a global unit of work
xa_prepare_entry xa_prepare Prepare to commit a branch of a global unit of work
xa_commit_entry xa_commit Commit a branch of a global unit of work
xa_recover_entry xa_recover Discover from the database whether it has an in-doubt unit of work
xa_forget_entry xa_forget Allow a database to forget a branch of a global unit of work
xa_complete_entry xa_complete Complete a branch of a global unit of work

During the first MQBEGIN call in your application, the WebSphere MQ code that executes as part of MQBEGIN loads the switch load file, and calls the xa_open function in the database shared library. Similarly, during queue manager startup, and on other subsequent occasions, some queue manager processes load the switch load file and call xa_open.

You can reduce the number of xa_* calls by using dynamic registration. For a complete description of this optimization technique, see XA dynamic registration.

 

Configuring your system for database coordination

There are several tasks that perform before a database manager can participate in global units of works coordinated by the queue manager. These are described here as follows:

 

Installing and configuring the database product

The steps involved in installing and configuring your database product are, of course, described in that product's own documentation. Installation issues are well beyond the scope of this chapter, but we can list general configuration issues, as they relate to the interoperation between WebSphere MQ and the database.

Database connections

An application that establishes a standard connection to the queue manager is associated with a thread in a separate local queue manager agent process. (A connection that is not a fastpath connection is a standard connection in this context. For more information, see "Connecting to a queue manager using the MQCONNX call" in the WebSphere MQ Application Programming Guide.)

When the application issues MQBEGIN, both it and the agent process call the xa_open function in the database client library. In response to this, the database client library code connects to the database that is to be involved in the unit of work from both the application and queue manager processes. These database connections are maintained as long as the application remains connected to the queue manager.

This is an important consideration if the database supports only a limited number of users or connections, because two connections are being made to the database to support the one application program.

Client/server configuration

The database client library that is loaded into the WebSphere MQ queue manager and application processes must be able to send to and receive from its server. Ensure that:

  • The database's client/server configuration files have the correct details

  • The relevant environment variables are set in the environment of the queue manager and the application processes

 

Creating switch load files

WebSphere MQ comes with a sample makefile, used to build switch load files for the supported database managers. This makefile, together with all the associated C source files required to build the switch load files, is installed:

/opt/mqm/samp/xatm/

The sample source modules used to build the switch load files are:

  • For DB2, db2swit.c
  • For Oracle, oraswit.c
  • For Sybase, sybswit.c

 

Adding configuration information to the queue manager

When you have created a switch load file for your database manager, and placed it in a safe location, specify that location to your queue manager. This is done in the queue manager's qm.ini file in the XAResourceManager stanza.

Add an XAResourceManager stanza for the database that your queue manager is going to coordinate. The most common case is for there to be only one database, and therefore only one XAResourceManager stanza. More complicated configurations involving multiple databases, are discussed in Multiple database configurations. The attributes of the XAResourceManager stanza are as follows:

Name=name
User-chosen string that identifies the resource manager. In effect, it gives a name to the XAResourceManager stanza. The name is mandatory and can be up to 31 characters in length.

The name you choose must be unique; there must be only one XAResourceManager stanza with this name in this qm.ini file. The name should also be meaningful, because the queue manager uses it to refer to this resource manager both in queue manager error log messages and in output when the dspmqtrn command is used. (See Displaying outstanding units of work with the dspmqtrn command for more information.)

Once you have chosen a name, and have started the queue manager, do not change the Name attribute. This is discussed in more detail in Changing configuration information.

SwitchFile=name
This is the fully-qualified pathname of the XA switch load file you built earlier. This is a mandatory attribute. The code in the queue manager and WebSphere MQ application processes tries to load the switch load file from this pathname on two occasions:

  1. At queue manager startup
  2. When you make the first call to MQBEGIN in your WebSphere MQ application process

The security and permissions attributes of your switch load file must allow these processes to perform this action.

XAOpenString=string
This is a string of data that WebSphere MQ code passes in its calls to the database manager's xa_open function. This is an optional attribute; if it is omitted a zero-length string is assumed.

The code in the queue manager and WebSphere MQ application processes call the xa_open function on two occasions:

  1. At queue manager startup
  2. When you make the first call to MQBEGIN in your WebSphere MQ application process

The format for this string is particular to each database product, and will be described in the documentation for that product. In general, the xa_open string contains authentication information (user name and password) to allow a connection to the database in both the queue manager and the application processes.

XACloseString=string
This is a string of data that WebSphere MQ code passes in its calls to the database manager's xa_open function. This is an optional attribute; if it is omitted a zero-length string is assumed.

The code in the queue manager and WebSphere MQ application processes call the xa_close function on two occasions:

  1. At queue manager startup
  2. When you make a call to MQDISC in your WebSphere MQ application process, having earlier made a call to MQBEGIN

The format for this string is particular to each database product, and will be described in the documentation for that product. In general, the string is empty, and it is common to omit the XACloseString attribute from the XAResourceManager stanza.

ThreadOfControl=THREAD|PROCESS
The ThreadOfControl value can be THREAD or PROCESS. The queue manager uses it for serialization purposes. This is an optional attribute; if it is omitted, the value PROCESS is assumed.

If the database client code allows threads to call the XA functions without serialization, the value for ThreadOfControl can be THREAD. The queue manager assumes that it can call the XA functions in the database client shared library from multiple threads at the same time, if necessary.

If the database client code does not allow threads to call its XA functions in this way, the value for ThreadOfControl must be PROCESS. In this case, the queue manager serializes all calls to the database client shared library so that only one call at a time is made from within a particular process. You probably also need to ensure that your application performs similar serialization if it runs with multiple threads.

Note that this issue, of the database product's ability to cope with multi-threaded processes in this way, is an issue for that product's vendor. Consult the database product's documentation for details on whether you can set the ThreadOfControl attribute to THREAD or PROCESS. We recommend that, if you can, you set ThreadOfControl to THREAD. If in doubt, the safer option is to set it to PROCESS, although you will lose the potential performance benefits of using THREAD.

 

Writing and modifying your applications

The sample application programs for Scenario 1 global units of work that are supplied with a WebSphere MQ installation are described in the WebSphere MQ Application Programming Guide.

In general, a global unit of work is implemented in an application by the following method (in pseudocode):

MQBEGIN
MQGET
MQPUT
SQL INSERT
MQCMIT

The purpose of MQBEGIN is to denote the beginning of a global unit of work. The purpose of MQCMIT is to denote the end of the global unit of work, and to complete it with all participating resource managers, using the two-phase commit protocol.

In between MQBEGIN and MQCMIT, the queue manager does not make any calls to the database to update its resources. That is, the only way a database's tables are changed is by your code (for example, the SQL INSERT in the pseudocode above).

The role of the queue manager, as far as the database is concerned, is to tell it when a global unit of work has started, when it has ended, and whether the global unit of work should be committed or rolled-back.

As far as your application is concerned, the queue manager performs two roles: a resource manager (where the resources are messages on queues) and the transaction manager for the global unit of work.

We recommend that you start with the supplied sample programs, and work through the various WebSphere MQ and database API calls that are being made in those programs.

 

Testing the system

You only know whether your application and system are correctly configured by running them during testing. You can test the system's configuration (the successful communication between queue manager and database) by building and running one of the supplied sample programs.

 

DB2 configuration

The supported levels of DB2 are defined at:

http://www.ibm.com/software/ts/mqseries/platforms/supported.html

You need to do the following:

  1. Check the environment variable settings.
  2. Create the DB2 switch load file.
  3. Add resource manager configuration information.
  4. Change DB2 configuration parameters if necessary.

Read this information in conjunction with the general information provided in Configuring your system for database coordination.

 

Checking the DB2 environment variable settings

Ensure that your DB2 environment variables are set for queue manager processes as well as in your application processes. In particular, always set the DB2INSTANCE environment variable before you start the queue manager. The DB2INSTANCE environment variable identifies the DB2 instance containing the DB2 databases that are being updated. For example:

export DB2INSTANCE=db2inst1

 

Creating the DB2 switch load file

The easiest way to create the DB2 switch load file is to use the sample file xaswit.mak, which WebSphere MQ provides to build the switch load files for a variety of database products.

On all UNIX systems the directory also contains the C source files that you need to build the switch load file.

Edit xaswit.mak to uncomment the lines appropriate to the version of DB2 you are using. Then execute the makefile using the command:

make -f xaswit.mak db2swit

The generated switch file is placed in your current directory.

 

Adding resource manager configuration information for DB2

The next step is to modify the configuration information for the queue manager, as described in Adding configuration information to the queue manager, to declare DB2 as a participant in global units of work. Use the XAResourceManager stanza of the qm.ini configuration file.

Figure 11 is a UNIX sample, showing some XAResourceManager entries where the database to be updated is called mydbname, this name being specified in the XAOpenString.:

Figure 11. Sample XAResourceManager entry for DB2 on UNIX platforms


XAResourceManager:
  Name=mydb2
  SwitchFile=/home/myuser/mylibraries/db2swit
  XAOpenString=mydbname,myuser,mypasswd
  ThreadOfControl=PROCESS

Note:
DB2 Version 7.x introduces an improved XAOpenString format. However, it continues to support the method shown above, so we have used it as the method that applies to all DB2 versions.

 

Changing DB2 configuration parameters

For each DB2 database that the queue manager is coordinating, you need to:

Set database privileges
The queue manager processes run with effective user and group mqm on UNIX systems. This can be one of:

  1. The user who issued the strmqm command, or
  2. The user under which the IBM MQSeries Service COM server runs

By default, this user is called MUSR_MQADMIN.

If you have not specified a user name and password on the xa_open string, the user under which the queue manager is running is used by DB2 to authenticate the xa_open call. If this user (for example, user mqm on UNIX systems) does not have minimal privileges in the database, the database refuses to authenticate the xa_open call.

The same considerations apply to your application process. If you have not specified a user name and password on the xa_open string, the user under which your application is running is used by DB2 to authenticate the xa_open call that is made during the first MQBEGIN. Again, this user must have minimal privileges in the database for this to work.

For example, give the mqm user connect authority in the mydbname database by issuing the following DB2 commands:

db2 connect to mydbname 
db2 grant connect on database to user mqm 
See Security considerations for more information about security.

Change the tp_mon_name parameter
For DB2 for Windows systems only, change the TP_MON_NAME configuration parameter to name the DLL that DB2 uses to call the queue manager for dynamic registration.

Use the command db2 update dbm cfg using TP_MON_NAME mqmax to name MQMAX.DLL as the library that DB2 uses to call the queue manager. This must be present in a directory within PATH.

Reset the maxappls parameter
You might need to review your setting for the maxappls parameter, which limits the maximum number of applications that can be connected to a database. Refer to Database connections.

 

Oracle configuration

Do the following:

  1. Check environment variable settings.
  2. Create the Oracle switch load file.
  3. Add resource manager configuration information.
  4. Change the Oracle configuration parameters, if necessary.

A current list of levels of Oracle supported by WebSphere MQ is provided at:

http://www.ibm.com/software/ts/mqseries/platforms/supported.html

 

Checking the Oracle environment variable settings

Ensure that your Oracle environment variables are set for queue manager processes as well as in your application processes. In particular, always set the following environment variables before starting the queue manager:

ORACLE_HOME
The Oracle home directory. For example:

export ORACLE_HOME=/opt/oracle/product/8.1.6

ORACLE_SID
The Oracle SID being used. If you are using Net8 for client/server" connectivity, you might not need to set this environment variable. Consult your Oracle documentation.

For example:

export ORACLE_SID=sid1

 

Creating the Oracle switch load file

The easiest way to create the Oracle switch load file is to use the sample file xaswit.mak, which WebSphere MQ provides to build the switch load files for a variety of database products. You can find it in the directory /opt/mqm/samp/xatm. The directory also contains the C source files that you need to build the switch load file.

Edit xaswit.mak to uncomment the lines appropriate to the version of Oracle you are using. Then execute the makefile using the command:

make -f xaswit.mak oraswit

The generated switch file is placed in your current directory.

 

Adding resource manager configuration information for Oracle

The next step is to modify the configuration information for the queue manager, as described in Adding configuration information to the queue manager, to declare Oracle as a participant in global units of work. Use the XAResourceManager stanza of the qm.ini configuration file.

Figure 12 shows a UNIX sample. We recommend that you add a LogDir to the XA open string so that all error and tracing information is logged to the same place.

Figure 12. Sample XAResourceManager entry for Oracle on UNIX platforms


XAResourceManager:
  Name=myoracle
  SwitchFile=/home/myuser/mylibraries/oraswit
  XAOpenString=Oracle_XA+Acc=P/myuser/mypasswd+SesTm=35+LogDir=/tmp
  ThreadOfControl=PROCESS

See the Oracle8 Server Application Developer's Guide for more information on the xa_open string.

 

Changing Oracle configuration parameters

For each Oracle database that the queue manager is coordinating, you need to:

Review your maximum sessions
You might need to review your LICENSE_MAX_SESSIONS and PROCESSES settings to take into account the additional connections required by processes belonging to the queue manager. Refer to Database connections for more details.

Set database privileges
The Oracle user name specified in the xa_open string must have privileges to access the DBA_PENDING_TRANSACTIONS view, as described in the Oracle documentation.

The necessary privilege can be given using the following example command:

 grant select on DBA_PENDING_TRANSACTIONS to myuser; 

 

Multiple database configurations

If you want to configure the queue manager so that updates to multiple databases can be included within global units of work, add an XAResourceManager stanza for each database.

If the databases are all managed by the same database manager, each stanza defines a separate database. Each stanza specifies the same SwitchFile, but the contents of the XAOpenString are different because it specifies the name of the database being updated. For example, the stanzas shown in Figure 15 configure the queue manager with the DB2 databases MQBankDB and MQFeeDB on UNIX systems.

Figure 15. Sample XAResourceManager entries for multiple DB2 databases


XAResourceManager:
  Name=DB2 MQBankDB
  SwitchFile=/usr/bin/db2swit
  XAOpenString=MQBankDB
 
XAResourceManager:
  Name=DB2 MQFeeDB
  SwitchFile=/usr/bin/db2swit
  XAOpenString=MQFeeDB

If the databases to be updated are managed by different database managers, add an XAResourceManager stanza for each. In this case, each stanza specifies a different SwitchFile. For example, if MQFeeDB is managed by Oracle instead of DB2, use the following stanzas on UNIX systems:

Figure 16. Sample XAResourceManager entries for a DB2 and Oracle database


XAResourceManager:
  Name=DB2 MQBankDB
  SwitchFile=/usr/bin/db2swit
  XAOpenString=MQBankDB
 
XAResourceManager:
  Name=Oracle MQFeeDB
  SwitchFile=/usr/bin/oraswit
  XAOpenString=Oracle_XA+Acc=P/scott/tiger+SesTm=35
       +LogDir=/tmp/ora.log+DB=MQFeeDB

In principle, there is no limit to the number of database instances that can be configured with a single queue manager.

 

Security considerations

The following information is provided for guidance only. In all cases, refer to the documentation provided with the database manager to determine the security implications of running your database under the XA model.

An application process denotes the start of a global unit of work using the MQBEGIN verb. The first MQBEGIN call that an application issues connects to all participating databases by calling their client library code at the xa_open entry point. All the database managers provide a mechanism for supplying a user ID and password in their XAOpenString. This is the only time that authentication information flows.

Note that, on UNIX platforms, fastpath applications must run with an effective user ID of mqm while making MQI calls.

 

Administration tasks

In normal operations, only a minimal amount of administration is necessary after you have completed the configuration steps. The administration job is made easier because the queue manager tolerates database managers not being available. In particular this means that:

  • The queue manager can start at any time without first starting each of the database managers.

  • The queue manager does not need to stop and restart if one of the database managers becomes unavailable.

This allows you to start and stop the queue manager independently from the database server.

Whenever contact is lost between the queue manager and a database, they need to resynchronize when both become available again. Resynchronization is the process by which any in-doubt units of work involving that database are completed. In general, this occurs automatically without the need for user intervention. The queue manager asks the database for a list of units of work that are in doubt. It then instructs the database to either commit or roll back each of these in-doubt units of work.

When a queue manager starts, it resynchronizes with each database. When an individual database becomes unavailable, only that database needs to be resynchronized the next time that the queue manager notices it is available again.

The queue manager regains contact with a previously unavailable database automatically as new global units of work are started with MQBEGIN. It does this by calling the xa_open function in the database client library. If this xa_open call fails, MQBEGIN returns with a completion code of MQCC_WARNING and a reason code of MQRC_PARTICIPANT_NOT_AVAILABLE. You can retry the MQBEGIN call later.

Do not continue to attempt a global unit of work that involves updates to a database that has indicated failure during MQBEGIN. There will not be a connection to that database through which updates can be made. Your only options are to end the program, or to retry MQBEGIN periodically in the hope that the database might become available again.

Alternatively, you can use the rsvmqtrn command to resolve explicitly all in-doubt units of work.

 

In-doubt units of work

A database might be left with in-doubt units of work if contact with the queue manager is lost after the database manager has been instructed to prepare. Until the database server receives the outcome from the queue manager (commit or roll back), it needs to retain the database locks associated with the updates.

Because these locks prevent other applications from updating or reading database records, resynchronization needs to take place as soon as possible.

If, for some reason, you cannot wait for the queue manager to resynchronize with the database automatically, you can use facilities provided by the database manager to commit or roll back the database updates manually. In the X/Open Distributed Transaction Processing: The XA Specification, this is called making a heuristic decision. Use it only as a last resort because of the possibility of compromising data integrity; you might, for example, mistakenly roll back the database updates when all other participants have committed their updates.

It is far better to restart the queue manager or use the rsvmqtrn command when the database has been restarted, to initiate automatic resynchronization.

 

Displaying outstanding units of work with the dspmqtrn command

While a database manager is unavailable, you can use the dspmqtrn command to check the state of outstanding global units of work involving that database.

The dspmqtrn command displays only those units of work in which one or more participants are in doubt, awaiting the decision from the queue manager to commit or roll back the prepared updates.

For each of these global units of work, the state of each participant is displayed in the output from dspmqtrn. If the unit of work did not update the resources of a particular resource manager, it is not displayed.

With respect to an in-doubt unit of work, a resource manager is said to have done one of the following things:

Prepared
The resource manager is prepared to commit its updates.

Committed
The resource manager has committed its updates.

Rolled-back
The resource manager has rolled back its updates.

Participated
The resource manager is a participant, but has not prepared, committed, or rolled back its updates.

When the queue manager is restarted, it asks each database having an XAResourceManager stanza for a list of its in-doubt global units of work. If the database has not been restarted, or is otherwise unavailable, the queue manager cannot yet deliver to the database the final outcomes for those units of work. The outcome of the in-doubt units of work is delivered to the database at the first opportunity when the database is again available.

In this case, the database manager is reported as being in prepared state until such time as resynchronization has occurred.

Whenever the dspmqtrn command displays an in-doubt unit of work, it first lists all the possible resource managers that could be participating. These are allocated a unique identifier, RMId, which is used instead of the Name of the resource managers when reporting their state with respect to an in-doubt unit of work.

Figure 17 shows the result of issuing the following command:

dspmqtrn -m MY_QMGR

Figure 17. Sample dspmqtrn output


AMQ7107: Resource manager 0 is MQSeries.
AMQ7107: Resource manager 1 is DB2 MQBankDB.
AMQ7107: Resource manager 2 is DB2 MQFeeDB.
 
AMQ7056: Transaction number 0,1.
    XID: formatID 5067085, gtrid_length 12, bqual_length 4
         gtrid [3291A5060000201374657374]
         bqual [00000001]
AMQ7105: Resource manager 0 has committed.
AMQ7104: Resource manager 1 has prepared.
AMQ7104: Resource manager 2 has prepared.

The output in Figure 17 shows that there are three resource managers associated with the queue manager. The first is resource manager 0, which is the queue manager itself. The other two resource manager instances are the MQBankDB and MQFeeDB DB2 databases.

The example shows only a single in-doubt unit of work. A message is issued for all three resource managers, which means that updates were made to the queue manager and both DB2 databases within the unit of work.

The updates made to the queue manager, resource manager 0, have been committed. The updates to the DB2 databases are in prepared state, which means that DB2 must have become unavailable before it was called to commit the updates to the MQBankDB and MQFeeDB databases.

The in-doubt unit of work has an external identifier called an XID (transaction id). This is a piece of data given to DB2 by the queue manager to identify its portion of the global unit of work.

 

Resolving outstanding units of work with the rsvmqtrn command

The output shown in Figure 17 shows a single in-doubt unit of work in which the commit decision has yet to be delivered to both DB2 databases.

To complete this unit of work, the queue manager and DB2 need to resynchronize when DB2 next becomes available. The queue manager uses the start of new units of work as an opportunity to regain contact with DB2. Alternatively, you can instruct the queue manager to resynchronize explicitly using the rsvmqtrn command.

Do this soon after DB2 has been restarted, so that any database locks associated with the in-doubt unit of work are released as quickly as possible. Use the -a option, which tells the queue manager to resolve all in-doubt units of work. In the following example, DB2 has restarted, so the queue manager can resolve the in-doubt unit of work:

> rsvmqtrn -m MY_QMGR -a
Any in-doubt transactions have been resolved.

 

Mixed outcomes and errors

Although the queue manager uses a two-phase commit protocol, this does not completely remove the possibility of some units of work completing with mixed outcomes. This is where some participants commit their updates and some back out their updates.

Units of work that complete with a mixed outcome have serious implications because shared resources that should have been updated as a single unit of work are no longer in a consistent state.

Mixed outcomes are mainly caused when heuristic decisions are made about units of work instead of allowing the queue manager to resolve in-doubt units of work itself. Such decisions are outside the queue manager's control.

Whenever the queue manager detects a mixed outcome, it produces FFST information and documents the failure in its error logs, with one of two messages:

  • If a database manager rolls back instead of committing:
    AMQ7606 A transaction has been committed but one or more resource
            managers have rolled back.
    

  • If a database manager commits instead of rolling back:
    AMQ7607 A transaction has been rolled back but one or more resource
            managers have committed.
    

Further messages identify the databases that are heuristically damaged. It is then your responsibility to locally restore consistency to the affected databases. This is a complicated procedure in which you need first to isolate the update that has been wrongly committed or rolled back, then to undo or redo the database change manually.

 

Changing configuration information

After the queue manager has successfully started to coordinate global units of work, do not change any of the resource manager configuration information.

To change the configuration information you can do so at any time, but the changes do not take effect until after the queue manager has been restarted.

If you remove the resource manager configuration information for a database, you are effectively removing the ability for the queue manager to contact that database manager.

Never change the Name attribute in any of your resource manager configuration information. This attribute uniquely identifies that database manager instance to the queue manager. If you change this unique identifier, the queue manager assumes that the database has been removed and a completely new instance has been added. The queue manager still associates outstanding units of work with the old Name, possibly leaving the database in an in-doubt state.

 

Removing database manager instances

To remove a database from your configuration permanently, ensure that the database is not in doubt before you restart the queue manager. Database products provide commands for listing in-doubt transactions. If there are any in-doubt transactions, first allow the queue manager to resynchronize with the database. Do this by starting the queue manager. You can verify that resynchronization has taken place by using the rsvmqtrn command or the database's own command for viewing in-doubt units of work. Once you are satisfied that resynchronization has taken place, end the queue manager and remove the database's configuration information.

If you fail to observe this procedure the queue manager still remembers all in-doubt units of work involving that database. A warning message, AMQ7623, is issued every time the queue manager is restarted. If you are never going to configure this database with the queue manager again, use the -r option of the rsvmqtrn command to instruct the queue manager to forget about the database's participation in its in-doubt transactions. The queue manager forgets about such transactions only when in-doubt transactions have been completed with all participants.

There are times when you might need to remove some resource manager configuration information temporarily. On UNIX systems this is best achieved by commenting out the stanza so that it can be reinstated at a later time. You might decide to do this if there are errors every time the queue manager contacts a particular database or database manager. Temporarily removing the resource manager configuration information concerned allows the queue manager to start global units of work involving all the other participants. Here is an example of a commented-out XAResourceManager stanza follows:

Figure 18. Commented- out XAResourceManager stanza on UNIX systems


# This database has been temporarily removed
#XAResourceManager:
#  Name=DB2 MQBankDB
#  SwitchFile=/usr/bin/db2swit
#  XAOpenString=MQBankDB

 

XA dynamic registration

The XA specification provides a way of reducing the number of xa_* calls that a transaction manager makes to a resource manager. This optimization is known as dynamic registration. Dynamic registration is supported by DB2. Other databases might support it; consult the documentation for your database product for details.

Why is the dynamic registration optimization useful? In your application, some global units of work might contain updates to database tables; others might not contain such updates. When no persistent update has been made to a database's tables, there is no need to include that database in the commit protocol that occurs during MQCMIT.

Whether or not your database supports dynamic registration, your application calls xa_open during the first MQBEGIN call on a WebSphere MQ connection. It also calls xa_close on the subsequent MQDISC call. The pattern of subsequent XA calls depends on whether the database supports dynamic registration:

If your database does not support dynamic registration...
Every global unit of work involves several XA function calls made by WebSphere MQ code into the database client library, regardless of whether you made a persistent update to the tables of that database within your unit of work. These include:

  • xa_start and xa_end from the application process. These are used to declare the beginning and end of a global unit of work.

  • xa_prepare, xa_commit, and xa_rollback from the queue manager agent process, amqzlaa0. These are used to deliver the outcome of the global unit of work: the commit or rollback decision.

In addition, the queue manager agent process also calls xa_open during the first MQBEGIN.

If your database supports dynamic registration...
The WebSphere MQ code makes only those XA function calls that are necessary. For a global unit of work that has not involved persistent updates to database resources, there are no XA calls to the database. For a global unit of work that has involved such persistent updates, the calls are to:

  • xa_end from the application process to declare the end of the global unit of work.

  • xa_prepare, xa_commit, and xa_rollback from the queue manager agent process, amqzlaa0. These are used to deliver the outcome of the global unit of work: the commit or rollback decision.

For dynamic registration to work, it is vital that the database has a way of telling WebSphere MQ when it has performed a persistent update that it wants to be included in the current global unit of work. WebSphere MQ provides the ax_reg function for this purpose.

The database's client code that runs in your application process finds the ax_reg function and calls it, to dynamically register the fact it has done persistent work within the current global unit of work. In response to this ax_reg call, WebSphere MQ records that the database has participated. If this is the first ax_reg call on this WebSphere MQ connection, the queue manager agent process calls xa_open.

The database client code make this ax_reg call when it is running in your process, for example, during an SQL UPDATE call or whatever call in the database's client API is responsible

 

Error conditions

There is an opportunity here for a confusing failure in the queue manager. Here is a common example. If you forget to set your database environment variables properly before starting your queue manager, the queue manager's calls to xa_open fail. No global units of work can be used.

To avoid this, ensure that you have set the relevant environment variables before starting the queue manager. Review your database product's documentation, and the advice given in Checking the DB2 environment variable settings, Checking the Oracle environment variable settings, and Checking the Sybase environment variable settings.

With all database products, the queue manager calls xa_open once at queue manager startup, as part of the recovery session (as explained in Administration tasks). This xa_open call fails if you set your database environment variables incorrectly, but it does not cause the queue manager to fail to start. This is because the same xa_open error code is used by the database client library to indicate that the database server is unavailable. We do not treat this as a serious error, as the queue manager must be able to start to continue processing data outside global units of work involving that database.

Subsequent calls to xa_open are made from the queue manager during the first MQBEGIN on a WebSphere MQ connection (if dynamic registration is not being used) or during a call by the database client code to the WebSphere MQ-provided ax_reg function (if dynamic registration is being used).

The timing of any error conditions (or, occasionally, FFST reports) depends on whether you are using dynamic registration:

  • If you are using dynamic registration, your MQBEGIN call could succeed, but your SQL UPDATE (or similar) database call will fail.

  • If you are not using dynamic registration, your MQBEGIN call will fail.

Ensure that your environment variables are set correctly in your application and queue manager processes.

 

Summarizing XA calls

Table 17 lists the calls that are made to the XA functions in a database client library as a result of the various MQI calls that control global units of work. This is not a complete description of the protocol described in the XA specification; we have provided it as a brief overview.

Note that xa_start and xa_end calls are always called by WebSphere MQ code in the application process, whereas xa_prepare, xa_commit, and xa_rollback are always called from the queue manager agent process, amqzlaa0.

The xa_open and xa_close calls shown in this table are all made from the application process. The queue manager agent process calls xa_open in the circumstances described in Error conditions.

Table 17. Summary of XA function calls

MQI call XA calls made with dynamic registration XA calls made without dynamic registration
First MQBEGIN
xa_open
xa_open
xa_start
Subsequent MQBEGIN No XA calls
xa_start
MQCMIT (without ax_reg being called during the current global unit of work) No XA calls
xa_end
xa_prepare
xa_commit
xa_rollback
MQCMIT (with ax_reg being called during the current global unit of work)
xa_end
xa_prepare
xa_commit
xa_rollback
Not applicable. No calls are made to ax_reg in non-dynamic mode.
MQBACK (without ax_reg being called during the current global unit of work) No XA calls
xa_end
xa_rollback
MQBACK (with ax_reg being called during the current global unit of work)
xa_end
xa_rollback
Not applicable. No calls are made to ax_reg in non-dynamic mode.
MQDISC, where MQCMIT or MQBACK was called first. If they were not, MQCMIT processing is first done during MQDISC.
xa_close
xa_close
HEADERS="COL1 COL2 COL3" COLSPAN="3"

Notes:

  1. For MQCMIT, xa_commit is called if xa_prepare is successful. Otherwise, xa_rollback is called.

 

WebSphere is a trademark of the IBM Corporation in the United States, other countries, or both.

 

IBM is a trademark of the IBM Corporation in the United States, other countries, or both.