NestedLoop Join AlgorithmIn our previous section, we learned about joins and various types of joins. In this section, we will know about the nested loop join algorithm. A nested loop join is a join that contains a pair of nested for loops. To perform the nested loop join i.e., θ on two relations r and s, we use an algorithm known as the Nested loop join algorithm. The computation takes place as: r ⋈ _{θ} s where r is known as the outer relation and s is the inner relation of the join. It is because the for loop of r encloses the for loop of s. NestedLoop Join AlgorithmLet's discuss the algorithm of nestedloop join: for each tuple t_{r} in r do begin for each tuple t_{s} in s do begin test pair (t_{r}, t_{s}) to test if they satisfy the given join condition ? if test satisfied add t_{r} . t_{s} to the result; end inner loop end outer loop In the algorithm, t_{r} and t_{s} are the tuples of relations r and s, respectively. The notation t_{r}. t_{s} is a tuple constructed by concatenating the attribute values of tuples t_{r} and t_{s}. With the help of the algorithm, we understood the following points:
Cost Analysis of Nestedloop Join AlgorithmFor analyzing the cost of the nestedloop join algorithm, consider a number of pairs of tuples as n_{r} * n_{s}. Here, n_{r} specifies the number of tuples in relation r and n_{s} specifies the number of tuples in relation s. For computing the cost, perform a complete scan on relation s. Thus, Total number of block transfers in worst case = n_{r} * b_{s} + b_{r} Total number of seeks required in worst case = n_{r} + b_{r} Here, b_{s} and b_{r} are the number of blocks holding the tuples of relation r and s, respectively. In the best case, both relations r and s have sufficient memory to fit in the memory simultaneously. So, each block will read only once. Thus, Total number of block transfers in best case = b_{r} + b_{s}. Total number of seeks required = 2(n_{r} + b_{r}) In case if any one of the relations given fits entirely into the memory, it is a must to use that relation as the inner relation. It is because we will read the inner relation only once. Thus, Total number of block transfers in such case = b_{r} + b_{s} Total number of seeks required = 2(n_{r} + b_{r}) Block NestedLoop JoinBlock NestedLoop Join is a variant of nestedloop join in which each block of the inner relation is paired with each block of the outer relation. The block nestedloop join saves major block access in a situation where the buffer size is small enough to hold the entire relation into the memory. It does so by processing the relations on the basis of per block rather on the basis of per tuple. Within each pair of blocks, the block nestedloop join pairs each tuple of one block with each tuple in the other block to produce all pairs of tuples. It pairs only those tuples that satisfy the given join condition and them to the result. Block NestedLoop Join AlgorithmThe algorithm that is used to perform the block nestedloop join is known as the Block NestedLoop Join algorithm. We will use the same relations r and s in this algorithm. for each block b_{r} of r do begin for each block b_{s} of s do begin for each tuple t_{r} in b_{r} do begin for each tuple t_{s} in b_{s} do begin test pair (t_{r}, t_{s}) to determine if they pass the given join condition if test passed add t_{r} . t_{s} to the result; end end end end Cost Analysis of Block NestedLoop Join AlgorithmThere is a major difference between the cost of block nestedloop join and nested loopjoin algorithm. In the worst case of block nestedloop join, each block in the inner relation s is read only for one time for each block in the outer relation r. On the other hand, the nestedloop join reads each tuple in the inner relation s for one time for each tuple in the outer relation r. Thus in block nestedloop join, Total number of block transfers in worst case = b_{r} * b_{s}+ b_{r} Total number of seeks required = 2 * b_{r} Here, b_{r} and b_{s} are the number of blocks holding records of the given relation r and s, respectively. Also, each scan of s (inner relation) needs only one seek, and r (outer relation) requires one seek per block. In the best case, the inner relation fits entirely into memory. Thus, Total number of block transfers in best case = b_{r} + b_{s} Total number of seeks required = 2(n_{r} + b_{r}) In the case where none of the given relations r and s fits entirely into the memory, it is efficient to use the inner relation i.e., s as the outer relation. Improving Performance of NestedLoop and Block NestedLoop JoinsAfter understanding both joins, it was evaluated that the performance of both the joins can be further improved:
Next TopicSelection Operation in Query Processing
