Character string conversions in IBM MQ classes for JMS

From IBM MQ Version 8.0, some of the default behavior regarding character string conversion in the IBM MQ classes for JMS has changed.

Before IBM MQ Version 8.0, string conversions in IBM MQ classes for JMS were done by calling the java.nio.charset.Charset.decode(ByteBuffer) and Charset.encode(CharBuffer) methods.

Using either of these methods results in a default replacement (REPLACE) of malformed or untranslatable data.

This behavior can obscure errors in applications, and lead to unexpected characters, for example ?, in translated data. From IBM MQ Version 8.0, to detect such issues earlier and more effectively, the IBM MQ classes for JMS use CharsetEncoders and CharsetDecoders directly and configure the handling of malformed and untranslatable data explicitly.

From IBM MQ Version 8.0, the default behavior is to REPORT such issues by throwing a suitable MQException.
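The difference between the two approaches can be seen with the standard Java charset API alone. The following is a minimal sketch, not the IBM MQ implementation: the class and method names are hypothetical, and it only contrasts Charset.encode (implicit REPLACE) with a CharsetEncoder configured explicitly for REPORT.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it uses only java.nio.charset, no IBM MQ classes.
public class ReplaceVersusReport {

    // Charset.encode always replaces malformed or unmappable input.
    public static byte[] encodeWithDefaultReplacement(String text, Charset charset) {
        return toArray(charset.encode(CharBuffer.wrap(text)));
    }

    // An explicitly configured encoder throws instead of replacing.
    public static byte[] encodeWithReport(String text, Charset charset)
            throws CharacterCodingException {
        CharsetEncoder encoder = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return toArray(encoder.encode(CharBuffer.wrap(text)));
    }

    private static byte[] toArray(ByteBuffer buffer) {
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        String text = "price: \u20AC5"; // the euro sign has no ISO-8859-1 mapping

        byte[] replaced = encodeWithDefaultReplacement(text, StandardCharsets.ISO_8859_1);
        // The euro sign is silently replaced by the encoder's replacement byte.
        System.out.println(new String(replaced, StandardCharsets.ISO_8859_1));

        try {
            encodeWithReport(text, StandardCharsets.ISO_8859_1);
        } catch (CharacterCodingException e) {
            // With REPORT the problem surfaces as an exception.
            System.out.println("reported: " + e);
        }
    }
}
```

In the first call the application never learns that data was lost. In the second, the failure is visible at the point of conversion, which is the kind of early detection the REPORT default is intended to provide.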

Configure

Translating from UTF-16 (the character representation used in Java) to a native character set, such as UTF-8, is termed encoding, while translating in the opposite direction is termed decoding.

Currently, decoding takes the default behavior for CharsetDecoders, reporting errors by throwing an exception.

One setting is used to specify a java.nio.charset.CodingErrorAction to control error handling on both encoding and decoding. One other setting is used to control the replacement byte, or bytes, when encoding. The default Java replacement String will be used in decoding operations.
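The asymmetry between encoding (a configurable replacement byte) and decoding (the default Java replacement String) mirrors the standard charset API. The following sketch illustrates it with plain java.nio.charset calls; the class and method names are hypothetical, and how IBM MQ applies these settings internally is not shown here.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it uses only java.nio.charset.
public class ReplacementDemo {

    // Encoding: the replacement is a byte sequence chosen by the caller.
    public static byte[] encodeReplacing(String text, Charset charset, byte replacement)
            throws CharacterCodingException {
        CharsetEncoder encoder = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith(new byte[] { replacement }); // must be legal in the target charset
        ByteBuffer out = encoder.encode(CharBuffer.wrap(text));
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }

    // Decoding: the decoder's default replacement String is left unchanged.
    public static String decodeReplacing(byte[] bytes, Charset charset)
            throws CharacterCodingException {
        CharsetDecoder decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        // The euro sign cannot be encoded in ISO-8859-1, so '_' is written instead.
        byte[] encoded = encodeReplacing("a\u20ACb", StandardCharsets.ISO_8859_1, (byte) '_');
        System.out.println(new String(encoded, StandardCharsets.ISO_8859_1));

        // 0xFF is never valid in UTF-8, so the decoder substitutes U+FFFD.
        String decoded = decodeReplacing(new byte[] { 0x41, (byte) 0xFF, 0x42 },
                StandardCharsets.UTF_8);
        System.out.println(decoded.equals("A\uFFFDB"));
    }
}
```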

IBM MQ Classes for JMS

From IBM MQ Version 8.0, two new properties are available. The appropriate constant definitions are in com.ibm.msg.client.wmq.WMQConstants.

JMS_IBM_UNMAPPABLE_ACTION

Sets or gets the CodingErrorAction to apply when a character cannot be mapped in an encoding or decoding operation. Set the value to CodingErrorAction.{REPLACE|REPORT|IGNORE}.toString(). The constant is defined as follows:

public static final String JMS_IBM_UNMAPPABLE_ACTION = "JMS_IBM_Unmappable_Action";

JMS_IBM_UNMAPPABLE_REPLACEMENT

Sets or gets the replacement bytes to apply when a character cannot be mapped in an encoding operation. The default Java replacement String is used in decoding operations.

public static final String JMS_IBM_UNMAPPABLE_REPLACEMENT = "JMS_IBM_Unmappable_Replacement";

The JMS_IBM_UNMAPPABLE_ACTION and JMS_IBM_UNMAPPABLE_REPLACEMENT properties can be set on destinations or messages. A value set on a message overrides the value set on the destination to which the message is being sent.

Note that JMS_IBM_UNMAPPABLE_REPLACEMENT must be set as a single byte.
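As an illustration of the values these two properties take, the following sketch builds them with standard Java only. The class and method names are hypothetical, and the property names are copied from the constant definitions above so that the example compiles without the IBM MQ libraries. In an application, the WMQConstants fields would normally be used instead.

```java
import java.nio.charset.CodingErrorAction;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; it collects the property values without needing JMS on the classpath.
public class UnmappableMessageProperties {

    // Names as given by the WMQConstants definitions quoted in the text.
    static final String ACTION = "JMS_IBM_Unmappable_Action";
    static final String REPLACEMENT = "JMS_IBM_Unmappable_Replacement";

    // Builds the pair of values for REPLACE with a single replacement byte.
    public static Map<String, Object> replaceWith(byte replacement) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put(ACTION, CodingErrorAction.REPLACE.toString()); // the String "REPLACE"
        props.put(REPLACEMENT, Byte.valueOf(replacement));       // a single byte
        return props;
    }

    public static void main(String[] args) {
        Map<String, Object> props = replaceWith((byte) '?');
        System.out.println(props);

        // Assumption: with a javax.jms.Message named message, the values would be applied as
        //   message.setStringProperty(ACTION, (String) props.get(ACTION));
        //   message.setByteProperty(REPLACEMENT, (Byte) props.get(REPLACEMENT));
    }
}
```

The calls in the closing comment are standard javax.jms.Message methods. Whether the replacement is accepted through setByteProperty is an assumption based on the note that it must be set as a single byte.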

Set system defaults

From IBM MQ Version 8.0, the following two Java system properties are available to configure default behavior regarding character string conversion.

com.ibm.mq.cfg.jmqi.UnmappableCharacterAction: Specifies the action to be taken for untranslatable data on encoding and decoding. The value can be REPORT, REPLACE, or IGNORE.
com.ibm.mq.cfg.jmqi.UnmappableCharacterReplacement: Specifies the replacement bytes to apply when a character cannot be mapped in an encoding operation. The default Java replacement string is used in decoding operations.

To avoid confusion between Java character and native byte representations, specify com.ibm.mq.cfg.jmqi.UnmappableCharacterReplacement as a decimal number representing the replacement byte in the native character set.

For example, the decimal value of ?, as a native byte, is 63 if the native character set is ASCII-based, such as ISO-8859-1, while it is 111 if the native character set is EBCDIC.

Note: If an MQMD or MQMessage object has either the unmappableAction or unMappableReplacement fields set, the values of these fields take precedence over the Java system properties. This allows the values specified by the Java system properties to be overridden for each message if required.
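The system properties described in this section can be sketched as follows. The class and method names are hypothetical. The example assumes the properties are set before any IBM MQ classes read them, for instance at application startup or with -D options on the java command line. The charset name IBM037 is one EBCDIC code page and is used only if the running JVM supports it.

```java
import java.nio.charset.Charset;

// Hypothetical helper; it uses only standard Java system properties.
public class UnmappableDefaults {

    // Returns the decimal value of a character as a single native byte.
    public static int replacementByteFor(char c, String charsetName) {
        byte[] bytes = String.valueOf(c).getBytes(Charset.forName(charsetName));
        if (bytes.length != 1) {
            throw new IllegalArgumentException("not a single-byte character in " + charsetName);
        }
        return bytes[0] & 0xFF;
    }

    // Sets both defaults; the replacement is written as a decimal number.
    public static void applyDefaults(String action, int replacementByte) {
        System.setProperty("com.ibm.mq.cfg.jmqi.UnmappableCharacterAction", action);
        System.setProperty("com.ibm.mq.cfg.jmqi.UnmappableCharacterReplacement",
                Integer.toString(replacementByte));
    }

    public static void main(String[] args) {
        int ascii = replacementByteFor('?', "ISO-8859-1");
        applyDefaults("REPLACE", ascii);
        System.out.println("ISO-8859-1 replacement byte: " + ascii);

        // Extended charsets such as IBM037 are not present in every JVM.
        if (Charset.isSupported("IBM037")) {
            System.out.println("IBM037 replacement byte: " + replacementByteFor('?', "IBM037"));
        }
    }
}
```

The same defaults could be supplied at launch, for example with -Dcom.ibm.mq.cfg.jmqi.UnmappableCharacterAction=REPLACE and -Dcom.ibm.mq.cfg.jmqi.UnmappableCharacterReplacement=63 for an ASCII-based native character set.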

Related concepts

Character string conversions in IBM MQ classes for Java