GROUP BY clause
The GROUP BY clause allows you to find the characteristics of groups of rows rather than individual rows.
When you specify a GROUP BY clause, SQL divides the selected rows into groups such that the rows of each group have matching values in one or more columns or expressions. Next, SQL processes each group to produce a single-row result for the group. You can specify one or more columns or expressions in the GROUP BY clause to group the rows. The items you specify in the SELECT statement are properties of each group of rows, not properties of individual rows in a table or view.
Without a GROUP BY clause, the application of SQL aggregate functions returns one row. When GROUP BY is used, the function is applied to each group, thereby returning as many rows as there are groups.
For example, the CORPDATA.EMPLOYEE table has several sets of rows, and each set consists of rows describing members of a specific department. To find the average salary of people in each department, you can issue:
SELECT WORKDEPT, DECIMAL (AVG(SALARY),5,0) FROM CORPDATA.EMPLOYEE GROUP BY WORKDEPTThe result is several rows, one for each department.
Notes:
WORKDEPT AVG-SALARY A00 40850 B01 41250 C01 29722 D11 25147 D21 25668 E01 40175 E11 21020 E21 24086
- Grouping the rows does not mean ordering them. Grouping puts each selected row in a group, which SQL then processes to derive characteristics of the group. Ordering the rows puts all the rows in the results table in ascending or descending collating sequence. Depending on the implementation selected by the database manager, the resulting groups might appear to be ordered.
- If there are null values in the column you specify in the GROUP BY clause, a single-row result is produced for the data in the rows with null values.
- If the grouping occurs over character, or UCS-2 or UTF-16 graphic columns, the sort sequence in effect when the query is run is applied to the grouping.
When you use GROUP BY, you list the columns or expressions you want SQL to use to group the rows. For example, suppose that you want a list of the number of people working on each major project described in the CORPDATA.PROJECT table. You can issue:
SELECT SUM(PRSTAFF), MAJPROJ FROM CORPDATA.PROJECT GROUP BY MAJPROJThe result is a list of the company's current major projects and the number of people working on each project.
SUM(PRSTAFF) MAJPROJ 6 AD3100 5 AD3110 10 MA2100 8 MA2110 5 OP1000 4 OP2000 3 OP2010 32.5 ? You can also specify that you want the rows grouped by more than one column or expression. For example, you can issue a select statement to find the average salary for men and women in each department, using the CORPDATA.EMPLOYEE table. To do this, you can issue:
SELECT WORKDEPT, SEX, DECIMAL(AVG(SALARY),5,0) AS AVG_WAGES FROM CORPDATA.EMPLOYEE GROUP BY WORKDEPT, SEXThe result follows.
WORKDEPT SEX AVG_WAGES A00 F 49625 A00 M 35000 B01 M 41250 C01 F 29722 D11 F 25817 D11 M 24764 D21 F 26933 D21 M 24720 E01 M 40175 E11 F 22810 E11 M 16545 E21 F 25370 E21 M 23830 Because you did not include a WHERE clause in this example, SQL examines and processes all rows in the CORPDATA.EMPLOYEE table. The rows are grouped first by department number and next (within each department) by sex before SQL derives the average SALARY value for each group.
Parent topic:
Retrieving data using the SELECT statement
Related concepts
Sort sequences and normalization in SQL
Related reference
ORDER BY clause