Teradata Performance Tuning

In Teradata, Performance tuning is done to identify all the bottlenecks and then resolve them.

The bottleneck is not a form of error, but it certainly causes a certain amount of delay.

There are the following ways to identifying a bottleneck and resolve them, such as:

EXPLAIN Plan

The first step in performance tuning is the use of EXPLAIN on the query. When a query is preceded with the EXPLAIN command, the execution plan of the Parsing Engine is returned to the user instead of AMPs.

The EXPLAIN plan gives a clear picture of the way optimizer will execute a query. In the Explain plan, check for the keywords such as

Confidence level
Join strategy used
Spool file size
Redistribution, etc.

Collect Statistics

The optimizer uses Data demographics to come up with an effective execution strategy. COLLECT STATISTICS command is used to collect data demographics of the table. Make sure that the statistics collected on the columns are up to date.

Collect statistics on the columns that are used in the WHERE clause and on the columns used in the joining condition.
Collect statistics on the Unique Primary Index columns.
Collect statistics on Non-Unique Secondary Index columns. The optimizer will decide if it can use NUSI or Full Table Scan.
Collect statistics on the Join Index through the statistics on the base table is collected.
Collect statistics on the partitioning columns.

Data Types

Each column in a table is associated with a data type. Data types specify what kind of values will be stored in the column. Every data value belongs to an SQL data type.

Make sure that proper data types are used. This will avoid the use of excessive storage than required.

Conversion

Ensure that the data types of the columns used in join conditions are compatible to avoid explicit data conversions.

Sort

Remove unnecessary ORDER BY clauses unless required.

Spool Space Issue

Spool space error is generated if the query exceeds per AMP spool space limit for that user.

Verify the explain plan and identify the step that consumes more spool space. These common queries can be split and put as separately to build temporary tables.

Primary Index

The primary index is used to specify where the data resides in Teradata. Each table is required to have a primary index defined, so make sure the Primary Index is correctly set for the table.

The primary index column should evenly distribute the data and should be frequently used to access the data.

Partition Primary Index

If Partition Primary Index is defined in tables, try to use it. If we are not using it in the filter condition, it will degrade the performance.

SET Table

If we define a SET table, then the optimizer will check if the record is duplicate for each and every record inserted. To remove the duplicate check condition, we need to define Unique Secondary Index for the table.

UPDATE on Large Table

Updating the large table will be time-consuming. Instead of updating the table, we can delete the records and insert the files with modified rows.

Dropping Temporary Tables

Drop the temporary tables (staging tables) and volatiles if they are no longer needed. This will free up permanent space and spool space.

MULTISET Table

If we confirmed that the input records would not have duplicate files, we could define the target table as a MULTISET table to avoid the duplicate row check used by the SET table.

IN Clause

Avoid the IN clause in filter conditions when there can be a huge number of values in where conditions, then it's better to insert such values in a volatile table and use a volatile table with INNER JOIN in the main query.

UPDATE Clause

Avoid using the UPDATE clause with just SET condition and no WHERE conditioning. Even if the Target and Source have only one row, add WHERE clause for PI column.

Next Topic#

← prev next →