Directed join with DB2® Multisystem

 

In the directed join, the partitioning keys of at least one of the files are used as the join fields.

The join fields do not match the partitioning keys of the other files. Records of one file are directed to or sent to the nodes of the second file based on the hashing of the join field values using the partition map and node group of the second file. As soon as the records have been moved to the nodes of the second file through a temporary distributed file, a collocated join is used to join the data. This method is valid only for equijoin queries where all fields of the partitioning key are join fields for at least one of the files.

In the following query, join field (WORKDEPT) is the partitioning key for file EMPLOYEE; however, join field (ADMRDEPT) is not the partitioning key for DEPARTMENT. If the join was attempted without moving the data, result records would be missing because record 2 of DEPARTMENT should be joined to records 1 and 2 of EMPLOYEE and these records are stored on different nodes.

SQL statement:

   SELECT DEPTNAME, FIRSTNME, LASTNAME               FROM DEPARTMENT, EMPLOYEE               WHERE ADMRDEPT = WORKDEPT AND JOB = 'Manager'

OPNQRYF command:

   OPNQRYF FILE((DEPARTMENT) (EMPLOYEE))
                   FORMAT(JOINFMT)
                   QRYSLT('JOB *EQ 'Manager')
                   JFLD((ADMRDEPT WORKDEPT *EQ))

The records of DEPARTMENT that are needed to run the query are read, and the data in ADMRDEPT is hashed using the partitioning map and the node group of EMPLOYEE. A temporary file is created and looks like this:

Table 1. A temporary table
Old node New node DEPTNAME ADMRDEPT (New partitioning key)
SYSA SYSA Support services A00
SYSB SYSA Planning A00
SYSC SYSC Accounting B00
SYSA SYSC Programming B00

This temporary table is joined to EMPLOYEE. The join works because ADMRDEPT is partition compatible with WORKDEPT.

Table 2. EMPLOYEE joined table
DEPTNAME FIRSTNME LASTNAME
Support services Christine Haas
Planning Christine Haas
Accounting Michael Thompson
Programming Michael Thompson

 

Parent topic:

Implementation and optimization of join operations with DB2 Multisystem