Using the UNION keyword to combine subselects

 

Using the UNION keyword, you can combine two or more subselects to form a fullselect.

When SQL encounters the UNION keyword, it processes each subselect to form an interim result table, then it combines the interim result table of each subselect and deletes duplicate rows to form a combined result table. You can use different clauses and techniques when coding select-statements.

You can use UNION to eliminate duplicates when merging lists of values obtained from several tables. For example, you can obtain a combined list of employee numbers that includes:

The combined list is derived from two tables and contains no duplicates. To do this, specify:

SELECT EMPNO    FROM CORPDATA.EMPLOYEE    WHERE WORKDEPT = 'D11'
 UNION
 SELECT EMPNO    FROM CORPDATA.EMPPROJACT    WHERE PROJNO = 'MA2112' OR
      PROJNO = 'MA2113' OR
      PROJNO = 'AD3111'
 ORDER BY EMPNO

To better understand the results from these SQL statements, imagine that SQL goes through the following process:

Step 1. SQL processes the first SELECT statement:

SELECT EMPNO      FROM CORPDATA.EMPLOYEE      WHERE WORKDEPT = 'D11'

The query returns the following interim result table.

EMPNO from CORPDATA.EMPLOYEE
000060
000150
000160
000170
000180
000190
000200
000210
000220
200170
200220

Step 2. SQL processes the second SELECT statement:

SELECT EMPNO      FROM CORPDATA.EMPPROJACT      WHERE PROJNO='MA2112' OR
                 PROJNO= 'MA2113' OR
                 PROJNO= 'AD3111'

The query returns another interim result table.

EMPNO from CORPDATA.EMPPROJACT
000230
000230
000240
000230
000230
000240
000230
000150
000170
000190
000170
000190
000150
000160
000180
000170
000210
000210

Step 3. SQL combines the two interim result tables, removes duplicate rows, and orders the result:

SELECT EMPNO 
     FROM CORPDATA.EMPLOYEE      WHERE WORKDEPT = 'D11'

UNION

SELECT EMPNO FROM CORPDATA.EMPPROJACT WHERE PROJNO='MA2112' OR PROJNO= 'MA2113' OR PROJNO= 'AD3111'

ORDER BY EMPNO

The query returns a combined result table with values in ascending sequence.

EMPNO
000060
000150
000160
000170
000180
000190
000200
000210
000220
000230
000240
200170
200220

When you use UNION:

To identify which subselect each row is from, you can include a constant at the end of the select list of each subselect in the union. When SQL returns your results, the last column contains the constant for the subselect that is the source of that row. For example, you can specify:

  SELECT A, B, 'A1' ... 
  UNION 
  SELECT X, Y, 'B2'...

When a row is returned, it includes a value (either A1 or B2) to indicate the table that is the source of the row's values. If the column names in the union are different, SQL uses the set of column names specified in the first subselect when interactive SQL displays or prints the results, or in the SQLDA resulting from processing an SQL DESCRIBE statement.

Sort sequence is applied after the fields across the UNION pieces are made compatible. The sort sequence is used for the distinct processing that implicitly occurs during UNION processing.

 

Parent topic:

Retrieving data using the SELECT statement

 

Related concepts


Sort sequences and normalization in SQL

 

Related reference


Creating and using views