Implementation and optimization of join operations with DB2® Multisystem

 

In addition to the performance considerations for nondistributed join queries, more performance considerations exist for queries involving distributed files.

Joins can be performed only when the data is partition compatible. The distributed query optimizer generates a plan that makes data partition compatible, which might involve moving data between nodes.

Data is partition compatible when the data in the partitioning keys of both files uses the same node group and hashes to the same node. For example, the same numeric value stored in either a large-integer field or a small-integer field hashes to the same value.

The data types that follow are partition compatible:

Date, time, timestamp, and floating-point numeric data types are not partition compatible because they cannot be partitioning keys.

Joins involving distributed files are classified into four types: collocated, directed, repartitioned, and broadcast. The following sections define the types of joins and give examples of the different join types.

 

Parent topic:

Query design for performance with DB2 Multisystem