External Sort-Merge Algorithm

Till now, we saw that sorting is an important term in any database system. It means arranging the data either in ascending or descending order. We use sorting not only for generating a sequenced output but also for satisfying conditions of various database algorithms. In query processing, the sorting method is used for performing various relational operations such as joins, etc. efficiently. But the need is to provide a sorted input value to the system. For sorting any relation, we have to build an index on the sort key and use that index for reading the relation in sorted order. However, using an index, we sort the relation logically, not physically. Thus, sorting is performed for cases that include:

Case 1: Relations that are having either small or medium size than main memory.

Case 2: Relations having a size larger than the memory size.

In Case 1, the small or medium size relations do not exceed the size of the main memory. So, we can fit them in memory. So, we can use standard sorting methods such as quicksort, merge sort, etc., to do so.

For Case 2, the standard algorithms do not work properly. Thus, for such relations whose size exceeds the memory size, we use the External Sort-Merge algorithm.

The sorting of relations which do not fit in the memory because their size is larger than the memory size. Such type of sorting is known as External Sorting. As a result, the external-sort merge is the most suitable method used for external sorting.

External Sort-Merge Algorithm

Here, we will discuss the external-sort merge algorithm stages in detail:

In the algorithm, M signifies the number of disk blocks available in the main memory buffer for sorting.

Stage 1: Initially, we create a number of sorted runs. Sort each of them. These runs contain only a few records of the relation.

repeat
	read either M blocks or the rest of the relation having a smaller size; 
	sort the in-memory part of the relation;
	write the sorted data to run file Ri;
	i =i+1;

In Stage 1, we can see that we are performing the sorting operation on the disk blocks. After completing the steps of Stage 1, proceed to Stage 2.

Stage 2: In Stage 2, we merge the runs. Consider that total number of runs, i.e., N is less than M. So, we can allocate one block to each run and still have some space left to hold one block of output. We perform the operation as follows:

	read one block of each of N files Ri into a buffer block in memory;
	repeat
	      select the first tuple among all buffer blocks (where selection is made in sorted order);
	     write the tuple to the output, and then delete it from the buffer block;
                    if the buffer block of any run Ri is empty and not EOF(Ri)
		then read the next block of Ri into the buffer block;
	Until all input buffer blocks are empty

After completing Stage 2, we will get a sorted relation as an output. The output file is then buffered for minimizing the disk-write operations. As this algorithm merges N runs, that's why it is known as an N-way merge.

However, if the size of the relation is larger than the memory size, then either M or more runs will be generated in Stage 1. Also, it is not possible to allocate a single block for each run while processing Stage 2. In such a case, the merge operation process in multiple passes. As M-1 input buffer blocks have sufficient memory, each merge can easily hold M-1 runs as its input. So, the initial phase works in the following way:

It merges the first M-1 runs for getting a single run for the next one.
Similarly, it merges the next M-1 runs. This step continues until it processes all the initial runs. Here, the number of runs has a reduced M-1 value. Still, if this reduced value is greater than or equal to M, we need to create another pass. For this new pass, the input will be the runs created by the first pass.
The work of each pass will be to reduce the number of runs by M-1 value. This job repeats as many times as needed until the number of runs is either less than or equal to M.
Thus, a final pass produces the sorted output.

Estimating cost for External Sort-Merge Method

The cost analysis of external sort-merge is performed using the above-discussed stages in the algorithm:

Assume b_r denote number of blocks containing records of relation r.
In the first stage, it reads each block of the relation and writes them back. It takes a total of 2b_r block transfers.
Initially, the value of the number of runs is [b_r/M]. As the number of runs decreases by M-1 in each merge pass, so it needs a total number of [log_M-1(b_r/M)] merge passes.

Every pass read and write each block of the relation only once. But with two exceptions:

The final pass can give a sorted output without writing its result to the disk
There might be chances that some runs may not be read or written during the pass.

Neglecting such small exceptions, the total number of block transfers for external sorting comes out:

b _r (2 Γ log _M-1 (b _r /M) ˥ + 1)

We need to add the disk seek cost because each run needs seeks reading and writing data for them. If in Stage 2, i.e., the merge phase, each run is allocated with b_b buffer blocks or each run reads b_b data at a time, then each merge needs [b_r /b_b] seeks for reading the data. The output is written sequentially, so if it is on the same disk as input runs, the head will need to move between the writes of consecutive blocks. Therefore, add a total of 2[b_r /b_b] seeks for each merge pass and the total number of seeks comes out:

2 Γ b _r /M ˥ + Γb _r /b _b ˥(2Γlog_M-1(b _r /M)˥ - 1)

Thus, we need to calculate the total number of disk seeks for analyzing the cost of the External merge-sort algorithm.

Example of External Merge-sort Algorithm

Let's understand the working of the external merge-sort algorithm and also analyze the cost of the external sorting with the help of an example.

Suppose that for a relation R, we are performing the external sort-merge. In this, assume that only one block can hold one tuple, and the memory can hold at most three blocks. So, while processing Stage 2, i.e., the merge stage, it will use two blocks as input and one block for output.

Next TopicHash Join Algorithm

← prev next →