Tuning for Oracle9i on UNIX
Overview
Oracle9i is a highly optimizable software product. Frequent tuning optimizes system performance and prevents data bottlenecks. Although this chapter is written from the perspective of single-node computer systems, most of the performance tuning tips provided here are also valid for Oracle9i Real Application Clusters and other Oracle9i features.
Before tuning the system, observe its normal behavior using the tools described in "Operating System Tools".
Types of Performance Bottlenecks
Performance bottlenecks are often caused by the following:
- Memory contention
Memory contention occurs when processes require more memory than is available. When this occurs, the system pages and swaps processes between memory and disk.
- Disk I/O contention
Disk I/O contention is caused by poor memory management, poor distribution of tablespaces and files across disks, or a combination of both.
- CPU contention
Although the UNIX kernel usually allocates CPU resources effectively, many processes compete for CPU cycles and this can cause contention. If you installed Oracle9i in a multiprocessor environment, there might be a different level of contention on each CPU.
- Oracle resources contention
Contention is also common for Oracle resources such as locks and latches.
Operating System Tools
Several operating system tools are available to help you assess database performance and determine database requirements. In addition to providing statistics for Oracle processes, these tools provide statistics for CPU usage, interrupts, swapping, paging, context switching, and I/O for the entire system.
vmstat
Use the vmstat command to view process, virtual memory, disk, trap, and CPU activity, depending on the switches you supply with the command. Enter one of the following commands to display a summary of CPU activity eight times, at five-second intervals:
- HP and Solaris:
$ vmstat -S 5 8
- AIX, Linux, and Tru64:
$ vmstat 5 8
The following example shows sample output from the command on Solaris:
procs           memory          page                       disk           faults       cpu
 r b w    swap   free  si so pi po fr de sr  f0 s0 s1 s3   in  sy  cs  us sy  id
 0 0 0    1892   5864   0  0  0  0  0  0  0   0  0  0  0   90  74  24   0  0  99
 0 0 0   85356   8372   0  0  0  0  0  0  0   0  0  0  0   46  25  21   0  0 100
 0 0 0   85356   8372   0  0  0  0  0  0  0   0  0  0  0   47  20  18   0  0 100
 0 0 0   85356   8372   0  0  0  0  0  0  0   0  0  0  2   53  22  20   0  0 100
 0 0 0   85356   8372   0  0  0  0  0  0  0   0  0  0  0   87  23  21   0  0 100
 0 0 0   85356   8372   0  0  0  0  0  0  0   0  0  0  0   48  41  23   0  0 100
 0 0 0   85356   8372   0  0  0  0  0  0  0   0  0  0  0   44  20  18   0  0 100
 0 0 0   85356   8372   0  0  0  0  0  0  0   0  0  0  0   51  71  24   0  0 100

The w column, under the procs column, shows the number of potential processes that have been swapped out and written to disk. If the value is not zero, swapping is occurring and your system is short of memory. The si and so columns indicate the number of swap-ins and swap-outs per second, respectively. Swap-ins and swap-outs should always be zero.
The output from the vmstat command differs between platforms. See the man page on your platform for information on interpreting the output.
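The swap-activity check described above can be scripted. The following is a rough sketch, not a supported tool: the column positions are assumptions based on the Solaris layout shown earlier (w in the third field, si and so in the sixth and seventh), and a captured sample stands in for a live vmstat run.

```shell
# Sketch: flag swap activity in vmstat-style output. Column positions
# are assumptions (w=$3, si=$6, so=$7); adjust them for your platform.
# The first two lines of the report are headers and are skipped.
check_swapping() {
  awk 'NR > 2 { if ($3 != 0 || $6 != 0 || $7 != 0) found = 1 }
       END { print (found ? "swapping detected" : "no swapping") }'
}

# On a live system you would pipe in `vmstat -S 5 8`; a captured
# sample stands in here.
check_swapping <<'EOF'
 procs     memory            page            faults      cpu
 r b w   swap  free  si  so pi po fr de sr  in  sy  cs us sy id
 0 0 0   1892  5864   0   0  0  0  0  0  0  90  74  24  0  0 99
 0 0 0  85356  8372   0   0  0  0  0  0  0  46  25  21  0  0 100
EOF
```

With the sample above, the script prints "no swapping"; any nonzero w, si, or so value flips the result.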
sar
Use the sar command to display cumulative activity counters in the operating system, depending on the switches that you supply with the command. The following command displays a summary of I/O activity ten times, at ten-second intervals:
$ sar -b 10 10

The following example shows sample output from the command on Solaris:

13:32:45 bread/s lread/s %rcache bwrit/s lwrit/s %wcache pread/s pwrit/s
13:32:55       0      14     100       3      10      69       0       0
13:33:05       0      12     100       4       4       5       0       0
13:33:15       0       1     100       0       0       0       0       0
13:33:25       0       1     100       0       0       0       0       0
13:33:35       0      17     100       5       6       7       0       0
13:33:45       0       1     100       0       0       0       0       0
13:33:55       0       9     100       2       8      80       0       0
13:34:05       0      10     100       4       4       5       0       0
13:34:15       0       7     100       2       2       0       0       0
13:34:25       0       0     100       0       0     100       0       0
Average        0       7     100       2       4      41       0       0
On Tru64 systems, the sar command is available in the UNIX SVID2 compatibility subset, OSFSVID2400.
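The Average line of the sar -b report is often the first thing worth checking. As a hedged sketch (the field positions assume the Solaris output shown above, with %rcache in the fourth field), the read cache hit ratio can be pulled out like this:

```shell
# Sketch: print the read cache hit ratio (%rcache, assumed to be the
# 4th field) from the Average line of sar -b output.
avg_rcache() {
  awk '$1 == "Average" { print $4 }'
}

# A captured sample stands in for a live `sar -b 10 10` run.
avg_rcache <<'EOF'
13:32:45 bread/s lread/s %rcache bwrit/s lwrit/s %wcache pread/s pwrit/s
13:32:55       0      14     100       3      10      69       0       0
Average        0       7     100       2       4      41       0       0
EOF
```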
iostat
Use the iostat command to report terminal and disk activity, depending on the switches you supply with the command. The output from the iostat command does not include disk request queues, but it shows which disks are busy. This information is valuable when balancing I/O loads.
The following command displays terminal and disk activity five times, at five-second intervals:
$ iostat 5 5

The following example shows sample output from the command on Solaris:

tty       fd0            sd0            sd1            sd3           cpu
tin tout  Kps tps serv   Kps tps serv   Kps tps serv   Kps tps serv  us sy wt id
  0    1    0   0    0     0   0   31     0   0   18     3   0   42   0  0  0  99
  0   16    0   0    0     0   0    0     0   0    0     1   0   14   0  0  0 100
  0   16    0   0    0     0   0    0     0   0    0     0   0    0   0  0  0 100
  0   16    0   0    0     0   0    0     0   0    0     0   0    0   0  0  0 100
  0   16    0   0    0     0   0    0     2   0   14    12   2   47   0  0  1  98

Use the iostat command to look for large disk request queues. A request queue shows how long the I/O requests on a particular disk device must wait to be serviced. Request queues are caused by a high volume of I/O requests to that disk or by I/O with long average seek times. Ideally, disk request queues should be at or near zero.
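Spotting overloaded disks can also be scripted. The sketch below assumes an extended iostat layout (such as Solaris iostat -xn) in which the device name is the last field and the busy percentage is next to last; these positions are assumptions, so check your platform's man page before relying on them.

```shell
# Sketch: list disks whose busy percentage exceeds a threshold.
# Assumes device name in the last field and %b next to last
# (iostat -xn style layout); adjust for your platform.
busy_disks() {
  awk 'NR > 1 && $(NF-1) + 0 > 60 { print $NF }'
}

# A captured sample stands in for a live iostat run.
busy_disks <<'EOF'
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.5    3.1    4.0   24.8  0.0  0.1    1.2    8.3   0   5 sd0
   88.2   41.7  705.6  333.6  0.9  2.4    6.9   18.3  12  85 sd3
EOF
```

With the sample above, only sd3 is reported.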
swap, swapinfo, swapon, and lsps
Use the swap, swapinfo, swapon, or lsps command to report information on swap space usage. A shortage of swap space can cause slow response times or even cause the system to stop responding. The following table lists the appropriate command to use for your platform:

Platform   Command
AIX        lsps -a
HP         swapinfo -m
Linux      swapon -s
Solaris    swap -l
Tru64      swapon -s

The following example shows sample output from the swap -l command on Solaris:
swapfile            dev    swaplo  blocks    free
/dev/dsk/c0t3d0s1   32,25       8  197592  162136
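Free swap as a percentage of the total can be computed from this output. The sketch below assumes the Solaris swap -l layout above (blocks in the fourth field, free in the fifth); it is an illustration, not a supported tool.

```shell
# Sketch: report free swap as a percentage of total from swap -l style
# output. Assumes blocks=$4 and free=$5, as in the Solaris layout.
swap_free_pct() {
  awk 'NR > 1 { total += $4; free += $5 }
       END { if (total > 0) printf "%d\n", free * 100 / total }'
}

# A captured sample stands in for a live `swap -l` run.
swap_free_pct <<'EOF'
swapfile            dev    swaplo  blocks    free
/dev/dsk/c0t3d0s1   32,25       8  197592  162136
EOF
```

With the sample above, about 82 percent of the swap space is free.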
Linux Tools
On Linux systems, use the free command to view information on swap space, memory, and buffer usage. A shortage of swap space can result in the system hanging and slow response time.
Solaris Tools
On Solaris systems, use the mpstat command to view statistics for each processor in a multiprocessor system. Each row of the table represents the activity of one processor. The first row summarizes all activity since the last system reboot; each subsequent row summarizes activity for the preceding interval. All values are events per second unless otherwise noted. The mpstat command takes two arguments: the interval between samples and the number of iterations. The following example shows sample output from the mpstat command:
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
  0    0   0    1   71   21  23    0    0    0   0    55   0   0  0  99
  2    0   0    1   71   21  22    0    0    0   0    54   0   0  0  99
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
  0    0   0    0   61   16  25    0    0    0   0    57   0   0  0 100
  2    1   0    0   72   16  24    0    0    0   0    59   0   0  0 100
AIX System Management Interface Tool
The AIX System Management Interface Tool (SMIT) provides a menu-driven interface to various system administrative and performance tools. Using SMIT, you can navigate through large numbers of tools and focus on the jobs you want to execute.
Base Operating System Tools
The AIX Base Operating System (BOS) contains performance tools that are historically part of UNIX systems or that are required to manage the implementation-specific features of AIX. The following table lists the most important BOS tools:
Tool      Description
lsattr    Displays the attributes of devices
lslv      Displays information about a logical volume or the logical volume allocations of a physical volume
netstat   Displays the contents of network-related data structures
nfsstat   Displays statistics about Network File System (NFS) and Remote Procedure Call (RPC) activity
nice      Changes the initial priority of a process
no        Displays or sets network options
ps        Displays the status of one or more processes
reorgvg   Reorganizes the physical-partition allocation within a volume group
time      Prints the elapsed execution, user CPU processing, and system CPU processing time
trace     Records and reports selected system events
AIX Performance Toolbox
The AIX Performance Toolbox (PTX) contains tools for monitoring and tuning system activity locally and remotely. PTX consists of two main components, the PTX Manager and the PTX Agent. The PTX Manager collects and displays data from various systems in the configuration by using the xmperf utility. The PTX Agent collects and transmits data to the PTX Manager by using the xmserd utility. The PTX Agent is also available as a separate product called Performance Aide for AIX.
Both PTX and Performance Aide include the following separate monitoring and tuning tools:
Tool       Description
fdpr       Optimizes an executable program for a particular workload
filemon    Uses the trace facility to monitor and report the activity of the file system
fileplace  Displays the placement of a file's blocks within logical or physical volumes
lockstat   Displays statistics about contention for kernel locks
lvedit     Facilitates interactive placement of logical volumes within a volume group
netpmon    Uses the trace facility to report on network I/O and network-related CPU usage
rmss       Simulates systems with various sizes of memory for performance testing
svmon      Captures and analyzes information about virtual-memory usage
syscalls   Records and counts system calls
tprof      Uses the trace facility to report CPU usage at module and source-code-statement levels
BigFoot    Reports the memory access patterns of processes
stem       Permits subroutine-level entry and exit instrumentation of existing executables
Performance Tuning Tools
The following table lists the tools that you can use for additional performance tuning on HP 9000 Series HP-UX systems:
Tool     Description
gprof    Creates an execution profile for programs
monitor  Monitors the program counter and calls to certain functions
netfmt   Monitors the network
netstat  Reports statistics on network performance
nfsstat  Reports Network File System (NFS) statistics
nettl    Captures network events or packets by logging and tracing
prof     Creates an execution profile of C programs and displays performance statistics, showing where your program spends most of its execution time
profil   Copies program counter information into a buffer
top      Displays the top processes on the system and periodically updates the information
HP Performance Analysis Tools
The following HP-UX performance analysis tools are available:
- GlancePlus/UX
This HP-UX utility is an online diagnostic tool that measures the system's activities. GlancePlus displays how system resources are being used. It displays dynamic information about the system's I/O, CPU, and memory usage in a series of screens. You can also use the utility to monitor how individual processes are using resources.
- HP PAK
HP Programmer's Analysis Kit (HP PAK) currently consists of two tools, Puma and Thread Trace Visualizer (TTV):
- Puma collects performance statistics during a program run. It provides several graphical displays for viewing and analyzing the collected statistics.
- TTV displays trace files produced by the instrumented thread library, libpthread_tr.sl, in a graphical format. It allows you to view how threads are interacting and to find where threads are blocked waiting for resources.
HP PAK is bundled with the HP FORTRAN 77, HP Fortran90, HP C, HP C++, HP ANSI C++, and HP Pascal compilers.
Oracle SQL Tuning Tools
Oracle provides a variety of tools for tuning SQL, including the V$ performance views, the EXPLAIN PLAN command, the SQL TRACE facility, the TKPROF facility, and the Autotrace report.
Tuning Memory Management
Start the memory tuning process by measuring paging and swapping space to determine how much memory is available. After you have determined your system's memory usage, tune the Oracle buffer cache.
The Oracle buffer manager ensures that the more frequently accessed data is cached longer. If you monitor the buffer manager and tune the buffer cache, you can have a significant influence on Oracle9i performance. The optimal Oracle9i buffer size for your system depends on the overall system load and the relative priority of Oracle over other applications.
Allocate Sufficient Swap Space
Try to minimize swapping because it causes significant UNIX overhead. To check for swapping, enter one of the following commands:
- On AIX, Linux, Solaris, and Tru64, use the sar or vmstat commands. See the man pages for information on the appropriate options to use with these commands.
- On HP, use the swapinfo -m command.
If your system is swapping, use the following methods to conserve memory:
- Avoid running unnecessary system daemon processes or application processes.
- Decrease the number of database buffers to free some memory.
- Decrease the number of UNIX file buffers, especially if you are using raw devices.
To determine how much swap space is in use, enter one of the following commands:
Platform   Command
AIX        lsps -a
HP         swapinfo -m
Linux      swapon -s
Solaris    swap -l
Tru64      swapon -s

To add swap space to your system, enter one of the following commands:
Platform   Command
AIX        chps or mkps
HP         swapon
Linux      swapon -a
Solaris    swap -a
Tru64      swapon -a

Set the swap space to between two and four times the system's physical memory. Monitor the use of swap space and increase it as required.
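The two-to-four-times rule translates into simple arithmetic. The following sketch takes physical memory in MB and prints the suggested swap range:

```shell
# Sketch: compute the suggested swap range (2x to 4x physical memory)
# for a given amount of RAM in MB.
swap_range() {
  ram_mb=$1
  echo "minimum: $((ram_mb * 2)) MB, maximum: $((ram_mb * 4)) MB"
}

swap_range 512   # a system with 512 MB of physical memory
```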
Control Paging
Paging might not present as serious a problem as swapping, because an entire program does not have to be stored in memory to run. A small number of page-outs might not noticeably affect the performance of your system.
To detect excessive paging, run measurements during periods of fast response or idle time to compare against measurements from periods of slow response.
Use the vmstat or sar command to monitor paging. See the man pages or your operating system documentation for information on interpreting the results for your platform. The following columns from the output of this command are important on Solaris:
Column   Description
vflt/s   Indicates the number of address translation page faults. Address translation faults occur when a process references a valid page that is not in memory.
rclm/s   Indicates the number of valid pages that have been reclaimed and added to the free list by page-out activity. This value should be zero.

If your system consistently has excessive page-out activity, consider the following solutions:
- Install more memory.
- Move some of the work to another system.
- Configure your kernel to use less memory.
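The rclm/s rule above (the value should be zero) can be checked mechanically. The sketch below locates the rclm/s column by name in a sar-style report, so it does not hard-code a column position; the sample input is illustrative, not from a real system.

```shell
# Sketch: find the rclm/s column by name in the header line, then warn
# if any sample row has a nonzero value (the value should be zero).
check_rclm() {
  awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "rclm/s") col = i; next }
       col && $col + 0 > 0 { bad = 1 }
       END { print (bad ? "page-out reclaims detected" : "ok") }'
}

# Illustrative sar-style sample with no page-out activity.
check_rclm <<'EOF'
12:00:00  vflt/s  rclm/s
12:00:10     3.1     0.0
12:00:20     2.8     0.0
EOF
```

With the sample above, the script prints "ok".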
Adjust Oracle Block Size
A UNIX system reads entire operating system blocks from the disk. If the database block size is smaller than the UNIX file system buffer size, I/O bandwidth is inefficient. If you adjust the Oracle database block size to be a multiple of the operating system block size, you can increase performance by up to five percent.
The DB_BLOCK_SIZE initialization parameter sets the database block size. You can change the block size by recreating the database.
To see the current value of the DB_BLOCK_SIZE parameter, enter the SHOW PARAMETERS command in SQL*Plus.
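A quick sanity check of the multiple-of-block-size rule can be scripted. In this sketch both values are supplied by hand; on a live system the database block size comes from SHOW PARAMETERS in SQL*Plus, and the file system block size comes from your platform's documentation. The values used here are illustrative.

```shell
# Sketch: verify that a database block size is a whole multiple of the
# file system block size. Both arguments are in bytes.
check_block_multiple() {
  db_block=$1; fs_block=$2
  if [ $((db_block % fs_block)) -eq 0 ]; then
    echo "ok: $db_block is a multiple of $fs_block"
  else
    echo "warning: $db_block is not a multiple of $fs_block"
  fi
}

# An 8 KB database block on a 4 KB file system block: a clean multiple.
check_block_multiple 8192 4096
```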
Tuning Disk I/O
Balance I/O evenly across all available disks to reduce disk access times. For smaller databases and those not using RAID, ensure that different datafiles and tablespaces are distributed across the available disks.
Choose the Appropriate File System Type
Depending on the operating system that you use, you can choose from a range of file systems. File systems have different characteristics, and the techniques they use to access data can have a substantial impact on database performance. The following table lists typical file system choices and the platforms on which they are available:
File System  Platform                        Description
S5           AIX, HP, Solaris                UNIX System V file system
UFS          AIX, HP, Solaris, Tru64         Unified file system, derived from BSD UNIX
VxFS         AIX, Solaris                    Veritas file system
Raw          AIX, HP, Linux, Solaris, Tru64  Raw devices/volumes (no file system)
EXT2         Linux                           Extended file system for Linux
AdvFS        Tru64                           POLYCENTER advanced file system
CFS          Tru64                           Cluster file system
JFS          AIX                             Journaled file system

The suitability of a file system for an application is usually undocumented. For example, even different implementations of the Unified file system are hard to compare. Performance differences can range from 0 to 20 percent, depending on the file system you choose.
If you choose to use a file system:
- Make a new file system to ensure that the hard disk is clean and unfragmented.
- Perform a file system check on the partition before using it for database files.
- Distribute disk I/O as evenly as possible.
- Separate log files from database files.
Monitoring Disk Performance
To monitor disk performance, use the sar -b and sar -u commands.
sar -b Output Columns
Column              Description
bread/s, bwrit/s    Blocks read and blocks written per second (important for file system databases)
pread/s, pwrit/s    Partitions read and partitions written per second (important for raw partition database systems)

An important sar -u column for analyzing disk performance is %wio, the percentage of CPU time spent waiting on blocked I/O.
Not all Linux distributions display the %wio column in the output of the sar -u command.
Key indicators are:
- The sum of the bread, bwrit, pread, and pwrit columns indicates the level of activity of the disk I/O subsystem. The higher the sum, the busier the I/O subsystem. The larger the number of physical drives, the higher the sum threshold number can be. A good default value is no more than 40 for two drives and no more than 60 for four to eight drives.
- The %rcache column value should be greater than 90 and the %wcache column value should be greater than 60. Otherwise, the system may be disk I/O bound.
- If the %wio column value is consistently greater than 20, the system is I/O bound.
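The key indicators above can be applied to a sar -b data row mechanically. In the sketch below, the thresholds (a sum of 40, %rcache of 90, %wcache of 60) are the ones quoted above for a two-drive system, and the field positions assume the Solaris sar -b layout shown earlier; adjust both for your configuration.

```shell
# Sketch: apply the key-indicator thresholds to sar -b data rows.
# Assumed field layout after the timestamp: bread/s lread/s %rcache
# bwrit/s lwrit/s %wcache pread/s pwrit/s.
check_sar_b() {
  awk 'NR > 1 {
         sum = $2 + $5 + $8 + $9          # bread + bwrit + pread + pwrit
         if (sum > 40)    print "busy I/O subsystem (sum=" sum ")"
         if ($4 + 0 < 90) print "low %rcache: " $4
         if ($7 + 0 < 60) print "low %wcache: " $7
       }'
}

# A healthy sample row: no warnings are printed.
check_sar_b <<'EOF'
13:32:45 bread/s lread/s %rcache bwrit/s lwrit/s %wcache pread/s pwrit/s
13:32:55       0      14     100       3      10      69       0       0
EOF
```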
Tuning CPU Usage
This section provides information on tuning CPU usage.
Keep All Oracle Users/Processes at the Same Priority
Oracle is designed to operate with all users and background processes operating at the same priority level. Changing priority levels causes unexpected effects on contention and response times.
For example, if the log writer process (LGWR) gets a low priority, it is not executed frequently enough and LGWR becomes a bottleneck. On the other hand, if LGWR has a high priority, user processes may suffer poor response time.
Tuning Oracle Resource Contention and UNIX Kernel Parameters
You can improve performance by keeping the UNIX kernel as small as possible. The UNIX kernel typically pre-allocates physical memory, leaving less memory available for other processes such as the Oracle processes.
Traditionally, kernel parameters such as NBUF, NFILE, and NOFILES were used to adjust kernel size. However, most UNIX implementations dynamically adjust those parameters at run time, even though they are present in the UNIX configuration file.
Look for memory-mapped video drivers, networking drivers, and disk drivers that can be removed, freeing more memory for use by other processes.
Remember to make a backup copy of your UNIX kernel. See your operating system documentation for additional information.
Tuning the Operating System Buffer Cache
To take full advantage of raw devices, adjust the size of the Oracle9i buffer cache and, if memory is limited, the operating system buffer cache.
The operating system buffer cache holds blocks of data in memory while they are being transferred from memory to disk, or from disk to memory.
The Oracle9i buffer cache is the area in memory that stores the Oracle database buffers. Because Oracle9i can use raw devices, it does not need to use the operating system buffer cache.
If you use raw devices, increase the size of the Oracle9i buffer cache. If the amount of memory on the system is limited, make a corresponding decrease in the operating system buffer cache size.
Use the sar command to determine which buffer caches increase or decrease.
Using Raw Devices/Volumes
Guidelines for Using Raw Devices/Volumes
Raw devices/volumes have the following disadvantages when used on UNIX:
- Raw devices/volumes may not solve problems with file size writing limits.
To display current file size limits, enter the following command:
$ ulimit -a
- Small client systems might not be able to use sufficiently large raw device/volume partitions.
- If a particular disk drive has intense I/O activity and performance would benefit from movement of an Oracle datafile to another drive, it is likely that no acceptably sized section exists on a drive with less I/O activity. It might not be possible to move files to other disk drives if you are using raw devices/volumes.
- Raw devices/volumes may be more difficult to administer than datafiles stored on a file system.
In addition to the factors described in this section, consider the following issues when deciding whether to use raw devices/volumes:
- Oracle9i Real Application Clusters installation
Each instance of Oracle9i Real Application Clusters has its own log files. Therefore, in addition to the partitions required for the tablespaces and control files, each instance requires a minimum of three partitions for the log files. All the files must be on disks that can be shared by all nodes of a cluster.
- Raw disk partition availability
Use raw devices/volumes for Oracle files only if your site has at least as many raw disk partitions as Oracle datafiles. If disk space is a consideration and the raw disk partitions are already formatted, match datafile size to partition size as closely as possible to avoid wasting space.
You must also consider the performance implications of using all of the disk space on a few disks as opposed to using less space on more disks.
- Logical volume manager
The logical volume manager manages disk space at a logical level. With logical volumes, you can create logical disks based on raw partition availability. The logical volume manager controls fixed-disk resources by:
- Mapping data between logical and physical storage
- Allowing data to span multiple disks and to be discontiguous, replicated, and dynamically expanded
- Dynamic performance tuning
You can optimize disk performance when the database is online by moving files from disk drives with high activity to disk drives with less activity. Most hardware vendors who provide the logical disk facility also provide a graphical user interface you can use for tuning.
- Mirroring and online disk replacement
You can mirror logical volumes to protect against loss of data. If one copy of a mirror fails, dynamic resynchronization is possible. Some vendors also provide the ability to replace drives online in conjunction with the mirroring facility.
For Oracle9i Real Application Clusters, you can use logical volumes for drives associated with a single UNIX system, as well as those that can be shared with more than one computer of a UNIX cluster. Shared drives allow all files associated with Oracle9i Real Application Clusters to be placed on shared logical volumes.
Raw Device Setup
Keep the following items in mind when creating raw devices:
- When creating the volumes, ensure that the owner is oracle and the group is oinstall.
- The size of an Oracle datafile created in a raw partition must be at least two Oracle block sizes smaller than the size of the raw partition.
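The second rule above is simple arithmetic: the largest datafile that fits in a raw partition is the partition size minus two Oracle blocks. A sketch, with illustrative sizes:

```shell
# Sketch: the largest datafile that fits in a raw partition is the
# partition size minus two Oracle blocks. Both arguments are in bytes.
max_datafile_bytes() {
  partition_bytes=$1; oracle_block=$2
  echo $((partition_bytes - 2 * oracle_block))
}

# A 100 MB raw partition with an 8 KB Oracle block size.
max_datafile_bytes $((100 * 1024 * 1024)) 8192
```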
Using Trace and Alert Files
This section describes the trace (or dump) and alert files that Oracle9i creates to diagnose and resolve operating problems.
Trace Files
Each server and background process can write to an associated trace file. When a process detects an internal error, it writes information on the error to its trace file. The filename format of a trace file is processname_unixpid_sid.trc, where:
processname   Is a three- or four-character abbreviated process name identifying the Oracle9i process that generated the file (for example, pmon, dbwr, ora, or reco)
unixpid       Is the UNIX process ID number
sid           Is the instance system identifier

A sample trace filename is $ORACLE_BASE/admin/TEST/bdump/lgwr_1237_TEST.trc.
All trace files for background processes are written to the destination directory specified by the BACKGROUND_DUMP_DEST initialization parameter. All trace files for server processes are written to the destination directory specified by the USER_DUMP_DEST initialization parameter.
Set the MAX_DUMP_FILE_SIZE initialization parameter to at least 5000 to ensure that the trace file is large enough to store error information.
Alert Files
The alert_sid.log file stores significant database events and messages. Anything that affects the database instance or global database is recorded in this file. This file is associated with a database and is located in the directory specified by the BACKGROUND_DUMP_DEST initialization parameter. If you do not set this initialization parameter, the default directory is $ORACLE_HOME/rdbms/log.
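A common first step when troubleshooting is to scan the alert file for ORA- error lines. The sketch below uses a captured fragment in place of a real alert log; the path and instance name (TEST) are illustrative, as is the log content.

```shell
# Sketch: extract ORA- error lines from an alert log. The path and log
# contents below are illustrative, not from a real instance.
scan_alert() {
  grep 'ORA-' "$1"
}

# A fabricated fragment stands in for a real alert_sid.log.
cat > /tmp/alert_TEST.log <<'EOF'
Thu Mar 14 10:15:22 2002
Completed: ALTER DATABASE OPEN
Thu Mar 14 11:02:07 2002
ORA-00600: internal error code, arguments: [1237]
EOF

scan_alert /tmp/alert_TEST.log
```

Only the ORA-00600 line is printed; routine messages are filtered out.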