Teradata Performance Tuning
In Teradata, performance tuning is the process of identifying bottlenecks and resolving them.
A bottleneck is not an error, but it does introduce a measurable delay in query execution.
The following techniques can be used to identify and resolve bottlenecks:
EXPLAIN Plan
The first step in performance tuning is to run EXPLAIN on the query. When a query is preceded by the EXPLAIN keyword, the Parsing Engine returns the execution plan to the user instead of dispatching the steps to the AMPs.
The EXPLAIN output gives a clear picture of how the optimizer will execute the query. In the plan, check for keywords such as the confidence level (high, low, or no confidence), the join strategy used, spool file size, redistribution, and duplication to all AMPs.
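As a minimal sketch, assuming a hypothetical employee table, prefixing any query with EXPLAIN returns the optimizer's plan without executing the statement:

```sql
-- Returns the step-by-step execution plan; the query itself is not run.
EXPLAIN
SELECT emp_no, emp_name
FROM   employee
WHERE  dept_no = 10;
```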
Collect Statistics
The optimizer uses data demographics to come up with an effective execution strategy. The COLLECT STATISTICS command gathers the data demographics of a table. Make sure that the statistics collected on join, filter, and index columns are up to date, especially after large data loads.
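For example, statistics can be collected and inspected as follows (table and column names are hypothetical):

```sql
-- Gather demographics on a filter column and on the primary index column.
COLLECT STATISTICS ON employee COLUMN (dept_no);
COLLECT STATISTICS ON employee COLUMN (emp_no);

-- Check which statistics exist and when they were last collected.
HELP STATISTICS employee;
```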
Data Types
Each column in a table is associated with a data type, which specifies what kind of values the column will store.
Make sure that appropriately sized data types are used; oversized types consume more storage and spool space than required.
Ensure that the columns used in join conditions have matching data types, so that no data-type conversion is needed when the tables are joined.
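As a sketch with hypothetical tables, declaring the join columns with the same type avoids a conversion at join time:

```sql
CREATE TABLE department
(
    dept_no   INTEGER NOT NULL,
    dept_name VARCHAR(30)
)
UNIQUE PRIMARY INDEX (dept_no);

CREATE TABLE employee
(
    emp_no   INTEGER NOT NULL,
    emp_name VARCHAR(50),
    dept_no  INTEGER            -- same type as department.dept_no
)
UNIQUE PRIMARY INDEX (emp_no);
```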
Remove ORDER BY clauses unless the sorted output is actually required; sorting consumes additional spool space and CPU.
Spool Space Issue
A spool space error is raised when a query exceeds the per-AMP spool space limit for that user.
Check the EXPLAIN plan and identify the steps that consume the most spool. Such queries can be split into smaller steps, with the intermediate results materialized in temporary tables.
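One common pattern, sketched with hypothetical names, is to materialize an expensive intermediate result in a volatile table and reuse it in later steps:

```sql
CREATE VOLATILE TABLE recent_orders AS
(
    SELECT order_no, cust_no, order_amt
    FROM   orders
    WHERE  order_date >= DATE '2024-01-01'
)
WITH DATA
PRIMARY INDEX (cust_no)
ON COMMIT PRESERVE ROWS;
```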
Primary Index
The primary index determines where each row's data resides in Teradata. Every table gets a primary index, either defined explicitly or assigned by default, so make sure the primary index is chosen deliberately.
The primary index column should distribute the data evenly across the AMPs and should be frequently used to access the data.
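A sketch with a hypothetical customer table: a nearly unique column that is also the usual access path makes a good primary index, while a low-cardinality column would skew the distribution:

```sql
CREATE TABLE customer
(
    cust_no   INTEGER NOT NULL,   -- good PI: unique, frequently used for lookups
    cust_name VARCHAR(60),
    region    CHAR(2)             -- poor PI choice: few distinct values cause skew
)
UNIQUE PRIMARY INDEX (cust_no);
```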
Partitioned Primary Index
If a Partitioned Primary Index (PPI) is defined on a table, include the partitioning column in the filter conditions. If the partitioning column is not used in the filter, every partition has to be scanned, which degrades performance.
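A sketch of a PPI table with hypothetical names; filtering on the partitioning column lets the optimizer eliminate non-matching partitions:

```sql
CREATE TABLE orders
(
    order_no   INTEGER NOT NULL,
    cust_no    INTEGER,
    order_date DATE NOT NULL
)
PRIMARY INDEX (order_no)
PARTITION BY RANGE_N (
    order_date BETWEEN DATE '2024-01-01' AND DATE '2024-12-31'
    EACH INTERVAL '1' MONTH
);

-- The filter on order_date enables partition elimination.
SELECT COUNT(*)
FROM   orders
WHERE  order_date BETWEEN DATE '2024-03-01' AND DATE '2024-03-31';
```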
SET Table
If a table is defined as a SET table, the system checks every inserted record against the existing rows for duplicates, which is expensive. To remove this duplicate check, define a Unique Secondary Index on the table; the index's uniqueness check then catches duplicates instead.
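A sketch of this, with hypothetical names: the Unique Secondary Index on emp_no enforces uniqueness, so the SET table's row-by-row duplicate comparison is avoided:

```sql
CREATE SET TABLE employee_set
(
    emp_no   INTEGER NOT NULL,
    emp_name VARCHAR(50)
)
PRIMARY INDEX (emp_name),
UNIQUE INDEX (emp_no);
```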
UPDATE on Large Table
Updating a large table is time-consuming. Instead of updating the table in place, it can be faster to delete the affected records and insert the modified rows.
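Sketched with hypothetical staging and target tables, the update-heavy step becomes a DELETE followed by an INSERT-SELECT:

```sql
-- Remove the rows that would have been updated ...
DELETE FROM employee
WHERE  dept_no = 10;

-- ... and re-insert their corrected versions from a staging table.
INSERT INTO employee (emp_no, emp_name, dept_no)
SELECT emp_no, emp_name, dept_no
FROM   employee_stage
WHERE  dept_no = 10;
```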
Dropping Temporary Tables
Drop temporary (staging) tables and volatile tables as soon as they are no longer needed. This frees up permanent space and spool space.
MULTISET Table
If it is confirmed that the input records will not contain duplicate rows, define the target table as a MULTISET table to avoid the duplicate-row check that a SET table performs on every insert.
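A minimal sketch, assuming the loading process guarantees duplicate-free input (names are hypothetical):

```sql
-- MULTISET skips the per-row duplicate check that a SET table performs.
CREATE MULTISET TABLE employee_stage
(
    emp_no   INTEGER NOT NULL,
    emp_name VARCHAR(50),
    dept_no  INTEGER
)
PRIMARY INDEX (emp_no);
```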
Avoid the IN clause in filter conditions when the list of values can be very large. Instead, insert the values into a volatile table and INNER JOIN that table in the main query.
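Sketched with hypothetical names, the long IN list is replaced by a join against a volatile table:

```sql
CREATE VOLATILE TABLE wanted_depts
(
    dept_no INTEGER NOT NULL
)
PRIMARY INDEX (dept_no)
ON COMMIT PRESERVE ROWS;

INSERT INTO wanted_depts VALUES (10);
INSERT INTO wanted_depts VALUES (20);
-- ... remaining values ...

SELECT e.emp_no, e.emp_name
FROM   employee e
INNER JOIN wanted_depts d
        ON e.dept_no = d.dept_no;
```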
Avoid an UPDATE statement that has only a SET clause and no WHERE clause, since it modifies every row in the table. Even if the target and source have only one row, add a WHERE clause on the primary index column.
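As a sketch with hypothetical names, restricting the UPDATE by the primary index column keeps it a single-AMP operation instead of a full-table scan:

```sql
UPDATE employee
SET    emp_name = 'REVISED'
WHERE  emp_no = 1001;   -- emp_no is the primary index column
```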