Resource problems
How to determine and resolve problems connected to IBM MQ resources, including resource usage by IBM MQ processes, problems caused by insufficient resources, and your resource limit configurations.
Useful commands and the configuration file for investigating resource issues
Useful commands that display current values on the system or make a temporary change to the system:
- ulimit -a
- Display user limits
- ulimit -Ha
- Display user hard limits
- ulimit -Sa
- Display user soft limits
- ulimit -<paramflag> <value>
- Where paramflag is the flag for the resource name, for example, s for stack.
To make permanent changes to the resource limits on the system, use /etc/security/limits.conf on Linux or /etc/security/limits on AIX.
We can obtain the current resource limits set for a process from the proc file system on Linux. For example, cat /proc/<pid of MQ process>/limits.
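The proc file system lookup can be wrapped in a small helper. The following is a minimal sketch for Linux only; the function name show_limits is hypothetical, and the example call inspects the current shell rather than a real IBM MQ process.

```shell
#!/bin/sh
# Print selected resource limits for a running process (Linux only).
# Usage: show_limits <pid>
# show_limits is a hypothetical helper name, not an IBM MQ command.
show_limits() {
    limits_file="/proc/$1/limits"
    if [ ! -r "$limits_file" ]; then
        echo "Cannot read $limits_file" >&2
        return 1
    fi
    # Keep the header line plus the limits most relevant to this topic.
    grep -E "^Limit|Max processes|Max open files|Max stack size|Max data size|Max file size" "$limits_file"
}

# Example: inspect the current shell; substitute the pid of an IBM MQ process.
show_limits "$$"
```

Because the file reflects the limits that the process actually inherited, it is a more reliable source than running ulimit in an unrelated login shell.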
Basic checks before tuning IBM MQ or kernel parameters
We need to investigate the following:
- Whether the number of active connections is within the expected limit.
For example, suppose that the system is tuned to allow 2000 connections when the number of user processes is no greater than 3000. If the number of connections increases to more than 2000, then either the number of user processes has increased to more than 3000 (because new applications have been added), or there is a connection leak.
To check for these problems use the following commands:
- Number of IBM MQ processes:
ps -elf|egrep "amq|run"|wc -l
- Number of IBM MQ threads:
ps -eLf|egrep "amq|run"|wc -l
- Number of connections:
echo "dis conn(*) all" | runmqsc <qmgr name>|grep EXTCONN|wc -l
- Shared memory usage:
ipcs -ma
- If the number of connections is higher than the expected limit, check the source of the connections.
- If the shared memory usage is very high, check the number of the following:
- Topics
- Open queue handles
- From an IBM MQ perspective, the following resources need to be checked and tuned:
- Maximum number of threads allowed for a given number of user processes.
- Data segment
- Stack segment
- File size
- Open file handles
- Shared memory limits
- Thread limits, for example, threads-max on Linux
- Use the mqconfig command to check the current resource usage.
Notes:
- Some of the resources listed in the preceding text need to be tuned at user level and some at the operating system level.
- The preceding list is not a complete list, but is sufficient for most common resource issues reported by IBM MQ.
- Tuning is required at thread level, as each thread is a lightweight process (LWP).
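The basic checks can be combined into one script that compares the observed counts with the expected limits. This is a sketch only: the limits of 2000 connections and 3000 threads are the illustrative figures from the example in this section, QM1 is a hypothetical queue manager name, and check_limit is a hypothetical helper. The thread count uses grep -Ec, which gives the same count as the egrep and wc -l pipeline.

```shell
#!/bin/sh
# Sketch only: the limits below are the illustrative values from the example
# in this section, and QM1 is a hypothetical queue manager name.
EXPECTED_CONNS=2000
EXPECTED_THREADS=3000
QMGR="${QMGR:-QM1}"

# Print OK or WARN depending on whether a count exceeds its expected limit.
# $1 = label, $2 = actual count, $3 = expected limit
check_limit() {
    if [ "$2" -gt "$3" ]; then
        echo "WARN: $1 = $2 exceeds expected limit $3"
    else
        echo "OK: $1 = $2 (limit $3)"
    fi
}

# Threads (LWPs) of IBM MQ processes. The count is 0 if ps -eLf is unsupported.
threads=$(ps -eLf 2>/dev/null | grep -Ec "amq|run")
check_limit "threads" "$threads" "$EXPECTED_THREADS"

# Connections, checked only if runmqsc is available on this host.
if command -v runmqsc >/dev/null 2>&1; then
    conns=$(echo "dis conn(*) all" | runmqsc "$QMGR" | grep -c EXTCONN)
    check_limit "connections" "$conns" "$EXPECTED_CONNS"
fi
```

A WARN line for connections with an OK line for threads would suggest a connection leak rather than growth in the number of applications.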
Problem in creating threads or processes from IBM MQ or an application
Failure in xcsExecProgram and xcsCreateThread
- Probe IDs, error messages, and components
- XY348010 from xtmStartTimerThread from an IBM MQ process (for example amqzlaa0) or an application
- Resolving the problem on AIX and Linux
- IBM MQ sets the error code xecP_E_PROC_LIMIT when pthread_create or fork fails with EAGAIN.
- EAGAIN
- Review and increase the max user processes and stack size user process resource limits.
- Additional configuration required
- Review and increase the limits for the kernel.pid_max (/proc/sys/kernel/pid_max) and kernel.threads-max (/proc/sys/kernel/threads-max) kernel parameters.
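Before increasing anything, it helps to see the current values. The following sketch, for Linux only, prints the two kernel parameters and the two user limits named above; kernel_param is a hypothetical helper name, and the user limits shown are those of the current shell, which might differ from those of the IBM MQ process.

```shell
#!/bin/sh
# Linux only. Read a kernel parameter from /proc/sys/kernel.
# kernel_param is a hypothetical helper name.
kernel_param() {
    cat "/proc/sys/kernel/$1"
}

echo "kernel.pid_max     = $(kernel_param pid_max)"
echo "kernel.threads-max = $(kernel_param threads-max)"

# User limits inherited from the current shell. For an IBM MQ process,
# read /proc/<pid>/limits for that process instead.
grep -E "Max processes|Max stack size" /proc/self/limits
```

On Linux, a root user can change a kernel parameter temporarily with sysctl -w; a permanent change is normally made in the sysctl configuration files.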
Problems in creating shared memory
Error: shmget fails with error number 28 (ENOSPC)
| Probe Id          :- XY132002                                               |
| Component         :- xstCreateExtent                                        |
| ProjectID         :- 0                                                      |
| Probe Description :- AMQ6119: An internal IBM MQ error has occurred         |
|   (Failed to get memory segment: shmget(0x00000000, 2547712) [rc=-1         |
|   errno=28] No space left on device)                                        |
| FDCSequenceNumber :- 0                                                      |
| Arith1            :- 18446744073709551615 (0xffffffffffffffff)              |
| Arith2            :- 28 (0x1c)                                              |
| Comment1          :- Failed to get memory segment: shmget(0x00000000,       |
|   2547712) [rc=-1 errno=28] No space left on device                         |
| Comment2          :- No space left on device                                |
+-----------------------------------------------------------------------------+
MQM Function Stack
ExecCtrlrMain
xcsAllocateMemBlock
xstExtendSet
xstCreateExtent
xcsFFST
shmget fails with error number 22 (EINVAL)
| Operating System  :- SunOS 5.10                                             |
| Probe Id          :- XY132002                                               |
| Application Name  :- MQM                                                    |
| Component         :- xstCreateExtent                                        |
| Program Name      :- amqzxma0                                               |
| Major Errorcode   :- xecP_E_NO_RESOURCE                                     |
| Probe Description :- AMQ6024: Insufficient resources are available to       |
|   complete a system request.                                                |
| FDCSequenceNumber :- 0                                                      |
| Arith1            :- 18446744073709551615 (0xffffffffffffffff)              |
| Arith2            :- 22 (0x16)                                              |
| Comment1          :- Failed to get memory segment: shmget(0x00000000,       |
|   9904128) [rc=-1 errno=22] Invalid argument                                |
| Comment2          :- Invalid argument                                       |
| Comment3          :- Configure kernel (for example, shmmax) to allow a      |
|   shared memory segment of at least 9904128 bytes                           |
+-----------------------------------------------------------------------------+
MQM Function Stack
ExecCtrlrMain
zxcCreateECResources
zutCreateConfig
xcsInitialize
xcsCreateSharedSubpool
xcsCreateSharedMemSet
xstCreateExtent
xcsFFST
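On Linux, the shared memory limits behind these two errors can be read from the proc file system. The sketch below assumes Linux (the EINVAL example above is from SunOS, where the parameters are configured differently) and uses the segment size of 9904128 bytes from that FDC. segment_fits is a hypothetical helper. With shmget, EINVAL typically indicates a segment larger than shmmax, and ENOSPC typically indicates that a system-wide limit (shmall or shmmni) has been reached.

```shell
#!/bin/sh
# Return success if a segment of $1 bytes fits within a shmmax of $2 bytes.
# awk does the comparison because shmmax can exceed the signed 64-bit
# integer range that shell arithmetic handles.
# segment_fits is a hypothetical helper name.
segment_fits() {
    awk -v need="$1" -v max="$2" 'BEGIN { exit !(need + 0 <= max + 0) }'
}

# Segment size taken from the EINVAL FDC in this section.
needed=9904128

if [ -r /proc/sys/kernel/shmmax ]; then
    shmmax=$(cat /proc/sys/kernel/shmmax)
    if segment_fits "$needed" "$shmmax"; then
        echo "shmmax ($shmmax) allows a segment of $needed bytes"
    else
        echo "shmmax ($shmmax) is too small for a segment of $needed bytes"
    fi
    # For ENOSPC, review the system-wide limits instead.
    echo "shmall = $(cat /proc/sys/kernel/shmall) pages"
    echo "shmmni = $(cat /proc/sys/kernel/shmmni) segments"
fi
```

Compare these limits with the current usage reported by ipcs -ma to see how close the system is to exhausting them.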
Unexpected process termination and queue manager crash, or queue manager crash
Process ending unexpectedly followed by FDCs from amqzxma0
Example FDC:
Date/Time         :- Mon May 02 2016 01:00:58 CEST
Host Name         :- test.ibm.com
LVLS              :- 8.0.0.4
Product Long Name :- IBM MQ for Linux (x86-64 platform)
Probe Id          :- XC723010
Component         :- xprChildTermHandler
Build Date        :- Oct 17 2015
Build Level       :- p800-004-151017
Program Name      :- amqzxma0
Addressing mode   :- 64-bit
Major Errorcode   :- xecP_E_USER_TERM
Minor Errorcode   :- OK
Probe Description :- AMQ6125: An internal IBM MQ error has occurred.
Possible Causes and Solutions
- Check if the user has ended any process.
- Check if the IBM MQ process ended because of a memory exception:
- Did the process end with an FDC of Component :- xehExceptionHandler?
- Apply the fix for known issues corrected in this area.
- Check if the operating system ended the process because of high memory usage by the process:
- Has the IBM MQ process consumed a lot of memory?
- Has the operating system ended the process? Review the operating system log. For example, the OOM-killer on Linux:
Jan 2 01:00:57 ibmtest kernel: amqrmppa invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0)
- Apply the fix for known memory leak issues.
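Searching the operating system log for OOM-killer entries can be scripted. The following sketch reads log lines on standard input and prints the name of each process that invoked the OOM-killer, using the log line above as sample input. oom_invokers is a hypothetical helper, and the process that invoked the OOM-killer is not necessarily the one that was ended.

```shell
#!/bin/sh
# Read kernel log lines on stdin and print the name of each process whose
# memory request invoked the OOM-killer (not necessarily the process ended).
# oom_invokers is a hypothetical helper name.
oom_invokers() {
    sed -n 's/.*kernel: \([^ ]*\) invoked oom-killer.*/\1/p'
}

# Example using the log line shown in this section.
echo 'Jan 2 01:00:57 ibmtest kernel: amqrmppa invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0)' | oom_invokers
```

On a real system, feed the function from the system log file or from the kernel ring buffer; the location of the log file varies by distribution.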
Difference in user limits used by a process against the configured limits
The user limits used by the process might be different from the configured limits. This is likely to happen if the process is started by a different user, by user scripts, or by a high availability script, for example. It is important that you check which user is starting the queue manager, and set the appropriate resource limits for this user.
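One way to spot such a difference on Linux is to compare the limits of the running process with those of a shell started as the user you configured. The following is a sketch under those assumptions; limits_diff is a hypothetical helper, and MQ_PID is a hypothetical variable that the caller sets to the pid of the queue manager process.

```shell
#!/bin/sh
# Linux only. Print the lines of /proc/<pid>/limits that differ between
# two processes. diff returns success when the limits are identical.
# limits_diff is a hypothetical helper name.
limits_diff() {
    diff "/proc/$1/limits" "/proc/$2/limits"
}

# Hypothetical usage: MQ_PID is assumed to be set by the caller to the pid
# of a queue manager process.
if [ -n "${MQ_PID:-}" ]; then
    echo "Process $MQ_PID runs as user: $(ps -o user= -p "$MQ_PID")"
    limits_diff "$$" "$MQ_PID" && echo "Limits match the current shell"
fi
```

Any lines printed by diff identify the limits that the process inherited from whatever started it rather than from the configured values.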