Investigating performance problems
Performance problems can arise from various factors, for example incorrect resource allocation, poor application design, and I/O constraints. Use this topic to investigate some of the possible causes of performance problems.
Performance can be adversely affected by:
- Buffer pools that are an incorrect size
- Lack of real storage
- I/O contention for page sets or logs
- Log buffer thresholds that are set incorrectly
- Incorrect setting of the number of log buffers
- Large messages
- Units of recovery that last a long time, incorporating many messages for each sync point
- Messages that remain on a queue for a long time
- RACF auditing
- Unnecessary security checks
- Inefficient program design
When you analyze performance data, always start by looking at the overall system before you decide that you have a specific IBM MQ problem. Remember that almost all symptoms of reduced performance are magnified when there is contention. For example, if there is contention for DASD, transaction response times can increase. Also, the more transactions there are in the system, the greater the processor usage and the greater the demand for both virtual and real storage.
In such situations, the system shows heavy use of all its resources. However, the system is actually experiencing normal system stress, and this stress might be hiding the cause of a performance reduction. To find the cause of such a loss of performance, you must consider all items that might be affecting your active tasks.
Investigating the overall system
Within IBM MQ, a performance problem shows up as either increased response time or an unexpected and unexplained heavy use of resources. First check factors such as total processor usage, DASD activity, and paging. An IBM tool for checking total processor usage is Resource Measurement Facility (RMF). In general, you must look at the system in some detail to see why tasks are progressing slowly, or why a specific resource is being heavily used.
Start by looking at general task activity, then focus on particular activities, such as specific tasks or a specific time interval.
Another possibility is that the system has limited real storage; therefore, because of paging interrupts, the tasks progress more slowly than expected.
Investigating individual tasks
You can use the accounting trace to gather information about IBM MQ tasks. These trace records tell you a great deal about the activity that the task has performed, and about how much time the task spent suspended, waiting for latches. The trace record also includes information about how much Db2 and coupling facility activity was performed by the task.
Interpreting IBM MQ accounting data is described in Interpreting IBM MQ for z/OS accounting data.
Long-running units of work can be identified by the presence of message CSQR026I in the job log. This message indicates that a task has existed for more than three queue manager checkpoints and that its log records have been shunted. For a description of log record shunting, see The log files.
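As an illustration of checking a job log for this message, the following Python sketch scans text lines for the message ID CSQR026I and reports where it occurs. It assumes that the job log has already been saved as plain text. The function name and the sample lines are hypothetical, and the text after the message ID is a placeholder, not the real message wording.

```python
import re
from typing import Iterable, List, Tuple

# CSQR026I is the message ID named in this topic. Match it as a whole
# token so that longer identifiers containing the same characters are ignored.
MESSAGE_ID = re.compile(r"\bCSQR026I\b")


def find_long_running_ur_messages(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, text) for every line that contains CSQR026I.

    Line numbers start at 1. This is a hypothetical helper written for
    illustration; it is not part of IBM MQ.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        if MESSAGE_ID.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Made-up sample lines; real job log content looks different.
    sample = [
        "first placeholder line\n",
        "CSQR026I <message text>\n",
        "another placeholder line\n",
    ]
    for number, text in find_long_running_ur_messages(sample):
        print(f"line {number}: {text}")
```

Each reported line is a starting point for investigating the task that owns the unit of work, for example by reviewing its accounting trace records.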