Pipelining in Query Processing
In the earlier section, we learned about materialization, in which the operations of an expression are evaluated one at a time and each intermediate result is stored in a temporary relation. Its drawback is that it produces a large number of temporary files, and writing and re-reading them makes query evaluation less efficient.
Here, we will discuss another method of evaluating the multiple operations of an expression that is usually more efficient than materialization. This method is known as Pipelining. Pipelining improves the efficiency of query evaluation by reducing the number of temporary files that are produced. It does so by combining several operations into a pipeline: the result of one operation is passed directly to the next operation as its input, and the chain continues until all operations are completed and we get the final output of the expression. This type of evaluation is known as Pipelined Evaluation.
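The idea of passing each result straight to the next operation can be sketched with Python generators. This is only an illustrative sketch, not how any particular database engine implements pipelining; the function names `scan`, `select`, and `project` and the `employees` sample data are assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]

def scan(table: List[Row]) -> Iterator[Row]:
    """Produce the rows of a stored relation one at a time."""
    for row in table:
        yield row

def select(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Pass on only the rows that satisfy the predicate."""
    for row in rows:
        if predicate(row):
            yield row

def project(rows: Iterable[Row], columns: List[str]) -> Iterator[Row]:
    """Keep only the requested columns of each row."""
    for row in rows:
        yield {c: row[c] for c in columns}

# Hypothetical sample relation.
employees = [
    {"name": "Asha", "dept": "HR", "salary": 52000},
    {"name": "Ben", "dept": "IT", "salary": 48000},
    {"name": "Chen", "dept": "IT", "salary": 61000},
]

# The three operations are chained; no intermediate list is built between them.
pipeline = project(
    select(scan(employees), lambda r: r["salary"] > 50000),
    ["name"],
)
result = list(pipeline)
print(result)  # [{'name': 'Asha'}, {'name': 'Chen'}]
```

Each row flows through selection and projection before the next row is read, so no temporary relation is stored between the operations.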
Advantages of Pipelining
Creating a pipeline of operations has the following advantages:
- It reduces the cost of query evaluation by eliminating the cost of writing and reading temporary relations, which materialization has to pay.
- If the root operator of a query-evaluation plan is combined in a pipeline with its inputs, the first query results can be generated quickly. This benefits users, who can view results as soon as they are produced. Otherwise, they may have to wait a long time before seeing any result.
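The second advantage, early results, can be illustrated with a small sketch. It assumes a hypothetical one-million-row input and a made-up `stats` counter that records how many rows the scan has actually produced; neither comes from a real system.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator

# Hypothetical counter for the number of rows the scan has produced so far.
stats: Dict[str, int] = {"rows_read": 0}

def counting_scan(n: int) -> Iterator[int]:
    """Produce the values 0..n-1 lazily, counting each row that is read."""
    for i in range(n):
        stats["rows_read"] += 1
        yield i

def keep_even(rows: Iterable[int]) -> Iterator[int]:
    """Selection step: pass on only the even values."""
    for r in rows:
        if r % 2 == 0:
            yield r

pipeline = keep_even(counting_scan(1_000_000))

# Ask for just the first three results of the pipeline.
first_three = list(islice(pipeline, 3))
print(first_three)         # [0, 2, 4]
print(stats["rows_read"])  # 5
```

Only five input rows are read before the first three results are available. A materialized evaluation would have to finish the whole selection before returning anything.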
Pipelining vs. Materialization
Although both methods are used for evaluating the multiple operations of an expression, there are a few differences between them. The differences are described in the table below:
| Pipelining | Materialization |
|---|---|
| It is a modern approach to evaluating multiple operations. | It is a traditional approach to evaluating multiple operations. |
| It does not use temporary relations for storing the results of the evaluated operations. | It uses temporary relations for storing the results of the evaluated operations, so it needs more temporary files and I/O. |
| It is a more efficient way of query evaluation, as it generates results quickly. | It is less efficient, as it takes more time to generate the query results. |
| It needs a large number of memory buffers to generate output. Insufficient memory buffers will cause thrashing. | It has no particularly high memory-buffer requirements for query evaluation. |
| Performance is poor if thrashing occurs. | No thrashing occurs in materialization, so in such cases materialization performs better. |
| It reduces the cost of query evaluation, as it does not include the cost of reading and writing temporary storage. | The overall cost includes the cost of the operations plus the cost of writing results to and reading them from temporary storage. |
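The cost difference described in the comparison above can be made concrete with a toy model. The sketch below is an assumption-laden illustration: a Python list stands in for a temporary relation, `temp_io` is a made-up counter of tuples written to and read from it, and the `orders` data is hypothetical. A real system measures this cost in disk block transfers, not tuples.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]

def run_materialized(
    table: List[Row], predicate: Callable[[Row], bool], columns: List[str]
) -> Tuple[List[Row], int]:
    """Evaluate selection then projection, storing the selection result."""
    temp_io = 0
    # Selection writes its complete result to a temporary relation.
    temp = [row for row in table if predicate(row)]
    temp_io += len(temp)  # tuples written to temporary storage
    # Projection reads the temporary relation back.
    temp_io += len(temp)  # tuples read from temporary storage
    result = [{c: row[c] for c in columns} for row in temp]
    return result, temp_io

def run_pipelined(
    table: List[Row], predicate: Callable[[Row], bool], columns: List[str]
) -> Tuple[List[Row], int]:
    """Evaluate the same plan, handing each selected row straight to projection."""
    selected = (row for row in table if predicate(row))  # lazy, nothing stored
    result = [{c: row[c] for c in columns} for row in selected]
    return result, 0  # no temporary relation is written or read

# Hypothetical sample relation.
orders = [
    {"id": 1, "amount": 250},
    {"id": 2, "amount": 40},
    {"id": 3, "amount": 100},
    {"id": 4, "amount": 75},
]

def is_large(row: Row) -> bool:
    return row["amount"] >= 100

mat_result, mat_io = run_materialized(orders, is_large, ["id"])
pipe_result, pipe_io = run_pipelined(orders, is_large, ["id"])
print(mat_result == pipe_result)  # True
print(mat_io, pipe_io)            # 4 0
```

Both strategies return the same answer; under this simplified model, only the materialized one pays for writing and re-reading the intermediate result.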