
Application issues seen when running REFRESH CLUSTER

Issuing REFRESH CLUSTER is disruptive to the cluster. It might make cluster objects invisible for a short time until the REFRESH CLUSTER processing completes. This can affect running applications. These notes describe some of the application issues you might see.


Reason codes that you might see from MQOPEN, MQPUT, or MQPUT1 calls

During REFRESH CLUSTER the following reason codes might be seen. The reason why each of these codes appears is described in a later section of this topic.

All of these reason codes indicate name lookup failures at one level or another in the IBM MQ code, which is to be expected if applications are running throughout the REFRESH CLUSTER operation.

The REFRESH CLUSTER operation that causes these outcomes might be running locally, remotely, or both. The reason codes are especially likely to appear when full repositories are very busy, which happens if REFRESH CLUSTER activities are running locally on the full repository, or remotely on other queue managers in the cluster or clusters that the full repository is responsible for.

For cluster queues that are temporarily absent and will shortly be reinstated, all of these reason codes are temporary, retryable conditions (although for 2041 MQRC_OBJECT_CHANGED it can be a little complicated to decide whether the condition is retryable). If it is consistent with application rules (for example, maximum service times), retry for about a minute to give the REFRESH CLUSTER activities time to complete. For a modest-sized cluster, completion is likely to be much quicker than that.
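
The retry guidance above can be sketched as a small helper. This is a minimal illustration in Python, not an IBM MQ API: `MQError` is a hypothetical stand-in for whatever exception your client library raises, `retry_mq_call` and its parameters are names invented for this example, and the set of retryable reason codes is supplied by the caller rather than assumed here. The 60-second default reflects the "about a minute" suggestion.

```python
import time


class MQError(Exception):
    """Hypothetical stand-in for a client library's MQ exception."""

    def __init__(self, reason):
        super().__init__(f"MQ call failed with reason {reason}")
        self.reason = reason


def retry_mq_call(operation, retryable_reasons, max_wait=60.0, delay=1.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Call operation() until it succeeds or max_wait seconds have passed.

    Only failures whose reason code is in retryable_reasons are retried;
    any other failure is re-raised immediately.
    """
    deadline = clock() + max_wait
    while True:
        try:
            return operation()
        except MQError as err:
            if err.reason not in retryable_reasons:
                raise  # not a temporary name lookup failure
            if clock() + delay > deadline:
                raise  # retry window exhausted; report the last failure
            sleep(delay)
```

An application would wrap its MQOPEN or MQPUT1 call in `operation` and choose `max_wait` to fit its own service-time rules. For 2041 MQRC_OBJECT_CHANGED, decide separately whether a retry is appropriate before adding it to `retryable_reasons`.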

If any of these reason codes is returned from MQOPEN, then no object handle is created, but a later retry should be successful in creating one.

If any of these reason codes is returned from MQPUT, then the object handle is not automatically closed, and retrying should eventually succeed without first needing to close the object handle. However, if the application opened the handle using bind-on-open options, and so requires all messages to go to the same channel, then (contrary to the application's expectations) it is not guaranteed that the retried put goes to the same channel or queue manager as before. In that case, it is wise to close the object handle and open a new one to regain the bind-on-open semantics.
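
For the bind-on-open case, the close-and-reopen advice might look like the following sketch. Everything here is hypothetical scaffolding: `MQError` stands in for a client library's exception, `open_handle` is a caller-supplied function that opens the queue with bind-on-open options, and the handle is assumed to expose `put` and `close` methods. The helper closes the handle on any failure, so after an exception the caller must not reuse the handle it passed in.

```python
import time


class MQError(Exception):
    """Hypothetical stand-in for a client library's MQ exception."""

    def __init__(self, reason):
        super().__init__(f"MQ call failed with reason {reason}")
        self.reason = reason


def put_rebinding(open_handle, handle, message, retryable_reasons,
                  attempts=5, delay=1.0, sleep=time.sleep):
    """Put message; on a retryable failure, close the handle and reopen it.

    Returns the handle that was used for the successful put, which the
    caller should use for subsequent messages.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            if handle is None:
                handle = open_handle()  # fresh open, fresh binding
            handle.put(message)
            return handle
        except MQError as err:
            if handle is not None:
                handle.close()  # discard the old binding
                handle = None
            if err.reason not in retryable_reasons or attempt == attempts:
                raise
            sleep(delay)
```

Because the returned handle can differ from the one passed in, messages put before and after the reopen might be bound to different destinations. Whether that is acceptable depends on why the application chose bind-on-open in the first place.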

If any of these reason codes is returned from MQPUT1, then it is unknown whether the problem happened during the open or the put part of the operation. Whichever it is, the operation can be retried. There are no bind-on-open semantics to worry about in this case, because the MQPUT1 operation is an open-put-close sequence that is performed in one continuous action.


Multi-hop scenarios

If the message flow includes multiple hops, such as that shown in the following example, then a name lookup failure caused by REFRESH CLUSTER can occur on a queue manager that is remote from the application. In that case, the application receives a success (zero) return code, but the name lookup failure, if it occurs, prevents a CLUSRCVR channel program from routing the message to its proper destination queue. Instead, the CLUSRCVR channel program follows the normal rules for writing the message to a dead letter queue, based on the persistence of the message. The reason code associated with that operation is this:

If there are persistent messages, and no dead letter queues have been defined to receive them, you will see channels ending. Here is an example multi-hop scenario:

When you test the multi-hop, you might see the following queue manager error log entries:


More details about why each of these reason codes might be displayed when running REFRESH CLUSTER


Further remarks

If there is any clustered publish/subscribe activity in this environment, then REFRESH CLUSTER can have additional unwanted effects. For example, subscriptions can be lost temporarily, so that subscribers find they have missed a message. See REFRESH CLUSTER considerations for publish/subscribe clusters.