MySQL Window Functions
A window function in MySQL used to do a calculation across a set of rows that are related to the current row. The current row is that row for which function evaluation occurs. Window functions perform a calculation similar to a calculation done by using the aggregate functions. But, unlike aggregate functions that perform operations on an entire table, window functions do not produce a result to be grouped into one row. It means window functions perform operations on a set of rows and produces an aggregated value for each row. Therefore each row maintains the unique identities.
The window functions are the new feature introduced in the release of MySQL version 8 that improves the execution performance of queries. These functions allow us to solve the query related problem more efficiently.
The following are the basic syntax for using a window function:
In the syntax, it can be seen that we have first specified the name of the window functions, which is followed by an expression. Then, we specify the OVER clause that contains three expressions that are partition_definition, order_definition, and frame_definition.
It makes sure that an OVER clause always has an opening and closing parentheses, even it does not have any expression.
Let us see the syntax of each expression used in the OVER clause:
This clause is used to divide or breaks the rows into partitions, and the partition boundary separates these partitions. The window function operates on each partition, and when it crosses the partition boundary, it will be initialized again. The syntax of this clause is given below:
In the partition clause, we can define one or more expressions that are separated by commas.
ORDER BY Clause
This clause is used to specify the order of the rows within a partition. The following are the syntax of ORDER BY clause:
We can also use it to order the rows within a partition on multiple keys where each key specified by an expression. This clause can also define one or more expressions that are separated by commas. Although the ORDER BY clause can work with all window functions, it is recommended to use it with order-sensitive window function.
A frame is the subset of the current partition in window functions. So we use frame clause to define a subset of the current partition. The syntax of creating a subset of the current partition using frame clause is as follows:
We can use the current row to define a Frame that allows moving within a partition with respect to the position of the current row.
In the syntax, the frame_unit that can be ROWS or RANGE is responsible for defining the type of relationship between the frame row and the current row. If the frame_unit is ROWS, then the offset of the frame rows and the current row is row number. While if the frame_unit is RANGE, then the offset is row values.
The frame_start and frame_between expressions are used to specify the frame boundary. The frame_start expression has three things:
UNBOUNDED PRECEDING: Here, the frame starts from the first row of a current partition.
N PRECEDING: Here, N is a literal number or an expression that evaluates in numbers. It is the number of rows before the first current row.
CURRENT ROW: It specifies the row of the recent calculation
The frame_between expression can be written as:
The above expression can have one of the following things:
frame_start: We have already explained it previously.
UNBOUNDED FOLLOWING: It specifies the end of the frame at the final row in the partition.
N FOLLOWING: It is the physical N of rows after the first current row.
If the frame_definition is not specified in the OVER clause, then by default MySQL uses the below frame:
Window Function Concept
In this section, we are going to see how we can use the window function in MySQL. So let us first create a table named "Sales" using the following statement:
Next, we have to add records into the table using INSERT statement as below:
To verify the records into a table, use the SELECT statement:
After execution, we can see that the records are added successfully into the table.
To understand window function, let us first see how an aggregate function works in MySQL. The aggregate function evaluates multiple rows and produces the result set into one row. So, execute the below statement that uses the aggregate function "SUM" and returns the total sales of all employees in the given year:
Again, we use the "SUM" function with the GROUP BY clause that works on the subset of rows. So, execute the below statement that returns the total sales of all products group by particular years:
In both examples, we can see that an aggregate function reduces the number of rows into a single row after the execution of the query.
Similar to aggregate function, window function also works with a subset of rows, but it does not reduce the result set into a single row. It means window functions perform operations on a set of rows and produces an aggregated value for each row. For example, execute the following statement that returns sales for each product along with total sales of the products by the given year:
In the above example, we can see that the window operation uses an OVER clause, which is responsible for partitioning the query rows into groups that processed by the window function. Here, the OVER clause partitions rows by year and produces a sum on every partition. After successful calculation, it produces this sum corresponding to each partitioning row.
Types of Window Function
We can categorize the window functions mainly in three types that are given below:
It is a function that operates on multiple rows and produces the result in a single row. Some of the important aggregate functions are:
COUNT, SUM, AVG, MIN, MAX, and many more.
It is a function that allows us to rank each row of a partition in a given table. Some of the important ranking functions are:
RANK, DENSE_RANK, PERCENT_RANK, ROW_NUMBER, CUME_DIST, etc.
It is a function, which is locally represented by a power series. Some of the important analytical functions are:
NTILE, LEAD, LAG, NTH, FIRST_VALUE, LAST_VALUE, etc.
Example of Analytical Function
Here, we are going to use the NTILE window function. This function takes an integer value as an argument that divides the group into a number of integer values. For example, if we use NTILE(4), then it divides the total records into four groups. When the total record is odd, it adds the odd records in the first row. The following query explains it more clearly.
In the above output, we can see that we have a total of 9 rows. So, the NTILE function divides it into four rows, and one extra row will be added into the first row.
Let us see another example using the "LEAD" function. This function is used to query more than one row in a table without joining the table itself. It means we can access the data of the next row from the current row. It returns the output from the next row. Execute the following statement to understand it more clearly: