Teradata OLAP Functions

OLAP functions are similar to aggregate functions except that the aggregate functions will return only one value. In contrast, the OLAP function will provide the individual rows in addition to the aggregates.

The OLAP functions are built into the Teradata database to provide data mining capabilities and trend analysis.

These functions provide processing not available using the standard aggregation. The OLAP functions give the result of their operation and display the detail data values used in the function.

The detail row data is displayed as part of the answer set since the output is in row format and not in a report format like WITH.

The OLAP functions may be performed on all tables or views and used to populate tables in conjunction with INSERT/SELECT. The most significant difference is that these functions can be used in Queryman, unlike WITH.

Syntax

Following is the syntax of the OLAP function.

<aggregate function> OVER  
([PARTITION BY] [ORDER BY columnname][ROWS BETWEEN 
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)

NOTE: Aggregation functions can be SUM, COUNT, MIN, MAX, and AVG.

Example

Consider the following Salary table of the employees.

Emp_Id	Gross	Deduction	NetPay
202001	40,000	4,000	36,000
202002	80,000	6,000	74,000
202003	90,000	7,000	83,000
202004	75,000	5,000	70,000

Following is an example to find the cumulative sum or running total of NetPay on the Salary table.

Records are sorted by Emp_Id, and the cumulative sum is calculated on the NetPay column.

SELECT  
Emp_Id, NetPay, 
SUM(Netpay) OVER(ORDER BY Emp_Id ROWS  
UNBOUNDED PRECEDING) as TotalSalary 
FROM Salary;

After executing the above query, it produces the following output.

   Emp_Id           NetPay     TotalSalary 
   202001           36000        36000 
   202002           74000        110000 
   202003           83000        193000 
   202004           70000        263000 

QUANTILE Function

A QUANTILE function is used to divide rows into several partitions of roughly the same number of rows in each partition. The percentile is the QUANTILE most commonly used in business.

By default, both the QUANTILE column and the QUANTILE value itself will be output in ascending sequence.

In some cases, the ORDER BY clause may be used to reorder the output for display. Here the order of the output does not change the meaning of the output, unlike a summation where the values are being added together and all need to appear in the proper sequence.

Syntax

Following is the syntax of the QUANTILE function.

SELECT QUANTILE (<partitions>, <column_name> ,<sort_key> [DESC | ASC])
FROM <table_name>
[QUALIFY QUANTILE (<column_name>) {< | > | = | <= | >=} <number of rows>];

RANK Function

RANK function orders the records based on the column provided. The RANK function can also filter the number of records returned based on the Rank.

The RANK function allows a column to be evaluated and compared based on either high or low order, against all other rows to create the output set.

The order will be sorted by default in the descending sequence of the ranking column, which correlates to descending Rank.

The output of the RANK function is the highest or the lowest data values in the column, depending on the sort requested.

Syntax

Following is the syntax to use the RANK function.

SELECT RANK( <column_name> [DESC | ASC] )
FROM <table_name>
[GROUP BY <column_name>, [<column_number> ] ] [{ ASC | DESC } ]
[QUALIFY RANK(<column_name>) {< | <=} <number of rows> ];

Example

Consider the following Employee table.

Emp_Id	First_Name	Last_Name	JoinedDate	Department_No	BirthDate
202001	Mike	Richard	3/27/2008	1	1/5/1980
202002	Robert	Williams	4/25/2013	1	3/5/1983
202003	Peter	Collin	3/21/2010	1	4/1/1983
202004	Alexa	Stuart	1/2/2017	2	11/6/1984
202005	Robert	Peterson	1/4/2015	2	12/1/1984

The following query orders the records of the employee table by Joined Date and assigns the ranking on Joined Date.

SELECT Emp_Id, JoinedDate, RANK() 
OVER(ORDER BY JoinedDate) as Seniority 
FROM Employee;

When the above query is executed, it produces the following output.

Emp_Id        JoinedDate       Seniority  
202001        03/27/2008         1 
202003        03/21/2010         2 
202002        04/25/2013         3 
202005        01/04/2015         4 
202004        01/02/2017         5 

PARTITION BY clause groups the data by the columns defined in the PARTITION BY clause and performs the OLAP function within each group. Following is an example of the query that uses the PARTITION BY clause.

SELECT Emp_Id, JoinedDate, RANK() 
OVER(PARTITION BY DeparmentNo ORDER BY JoinedDate) as Seniority 
FROM Employee;

When the above query is executed, it produces the following output. We can see that the Rank is reset for each Department.

Emp_Id         DepartmentNo       JoinedDate         Seniority  
202001           1                03/27/2008              1 
202003           1                03/21/2010              2 
202002           1                04/25/2013              3 
202005           2                01/04/2015              4 
202004           2                01/02/2017              5

Next TopicTeradata Data Protection

← prev next →