SQL Server ROLLUP
The ROLLUP clause in SQL Server is an extension of the grouping set operator. This article will give a complete overview of the ROLLUP clause to aggregates different groups with subtotals and grand totals.
It is the subclass of the GROUP BY clause that performs an aggregate operation on multiple levels in a hierarchy and prepares summary reports. It allows us to generate multiple grouping sets within a single query, which is impossible with the GROUP BY clause as it aggregates a single group. Thus, we can say that the ROLLUP provides a more detailed analysis by employing a single query to create several grouping sets along the hierarchy of columns.
NOTE: ROLLUP modifier produces the summary output, including extra rows that represent super-aggregate summary operations. It is mainly used to provide support for OLAP (Online Analytical Processing) operations.
The following is the basic syntax that illustrates the ROLLUP clause in SQL Server:
The parameters of the above syntax are described below:
SQL Server also provides another syntax to use the ROLLUP clause as given below:
Both the syntax will return the same output.
SQL Server ROLLUP Example
Let us understand when and how the ROLLUP clause is used in the query practically. We can do this by first creating a table named EMPLOYEE using the below statement:
Next, we will insert some records into this table as below:
Execute the SELECT statement to verify the table data:
In the above table, we can see that we have got employees from three different countries US, UK, and India. Now, based on these table data, we want to write a query to retrieve salaries grouped by country as below:
Executing the query will return the following output:
This query is good if we want to get the total salary based on country. But it's not a better option when we want to get the grand totals of all salaries in the result also because it can only aggregate data at one level, which is subtotal, not grand totals. Hence, we will use the ROLLUP clause along with GROUP BY to examine both the subtotal and the grand totals for a category as it extends the capability of the GROUP BY clause.
The following query is used to retrieve subtotal and grand totals of salaries grouped by country:
We will get the below output:
We can also get the same output using the below syntax:
Here is the result:
In the output, we can see the total salary for each country as well as returns the grand totals of all salaries. The ROLLUP clause will add an extra row in the result set to display the grand totals. Here NULL represents the grand totals. We can also give it a meaningful name using the COALESCE function to replace the NULL value with any provided text.
The below statement will substitute "Grand Total" for the null value:
Here is the result that replaced the NULL value with the given text:
ROLLUP with multiple columns
We already know that the ROLLUP clause performs aggregate operations on multiple levels in a hierarchy and prepare summary reports at each level within a single query.
Let's understand this concept with an example. Suppose there are multiple columns in the GROUP BY clause. In this case, the ROLLUP clause assumes a hierarchy among the columns supplied in the GROUP BY clause. The query adds an extra super-aggregate summary row to the end of the result whenever a column value changes.
For example, we have specified the three columns in the GROUP BY clause as below:
The ROLLUP modifier assumes the hierarchy as below:
And generates the following grouping set:
The following query helps to understand it more clearly:
Since we have not used the ROLLUP clause, a summary of the employee table based on multiple columns supplied in the GROUP BY clause would look like the below output. In this case, we will only receive summary values at the gender/country level of analysis.
Now we will analyze the output with the ROLLUP clause:
Executing the query will display the following output where we can see the several extra rows:
This output generates data at three different levels of analysis, which are described below:
It ensures that we get a different answer if we modify the order of the columns defined in the GROUP BY clause. See the below query:
Here is the result:
GROUPING Function with ROLLUP
The GROUPING() function determines whether NULL in the output is a regular grouped value, a super-aggregate value, or grand totals. It yields one when NULL occurs in a supper-aggregate row. Otherwise, it returns 0. The GROUPING() function can be used in the select list, HAVING clause, and ORDER BY clause.
See the below query:
Execute the query will return the below output:
In the output, we can see that the GROUPING(gender) returns one when NULL in the gender column occurs in a super-aggregate row. Otherwise, it will return zero.
Similarly, the GROUPING(country) returns one when NULL in the country column occurs in a super-aggregate row. Otherwise, it will return zero.
How is ROLLUP different from CUBE?
ROLLUP and CUBE in SQL Server are used for reporting purposes and display the subtotal and grand totals. We generally use these clauses with the GROUP BY clause. However, both are used for similar objectives, but they have some differences also.
ROLLUP returns an output that shows aggregates for a hierarchy of values in the columns you've chosen. On the other hand, CUBE creates a result set that includes aggregates for all possible combinations of values in the columns that have been selected.
Let us take an example to illustrate these differences. Suppose we have three columns c1, c2, and c3. Next, we will perform aggregation at a different level.
If we use these columns with the CUBE clause as CUBE(c1, c2, c3), we will get the eight possible grouping sets:
If we use these columns with the ROLLUP clause as ROLLUP(c1, c2, c3), we will get only four possible grouping sets: