Teradata ExplainEXPLAIN command is the Parsing Engines (PE's) plan to the AMPs. EXPLAIN command returns the execution plan of the parsing engine in English translation. It can be used with any SQL statement except on another EXPLAIN command. When a query is preceded with the EXPLAIN command, the execution plan of the Parsing Engine is returned to the user instead of AMPs. The explain plan can give a clear picture of the way optimizer will execute a query. It is a good practice to analyze the explain plan for a newly developed query before running it in the Teradata system. The explain plan can be obtained in two ways. First, add an "EXPLAIN" keyword before any queries, and the second is simply pressing "F6" from the keyboard. Explain plan is beneficial to analyze the performance issue of a query as it breaks down any complex query into the lowest level. The Explain plan provides a lot of information, such as: - Access path: If data is fetching using full table scan or using any index like primary index path, secondary index path, or any other indexes.
- Confidence level: If the optimizer can use the statistics or any statistics is missing.
- Joins information: What kind of join is going to happen.
- Time estimation: Provides an estimated query completion time.
If we pass an explain command in front of a query statement or Simply Press the F6 key, there are the following estimated confidence messages that the optimizer will relay to the user, such as: - High Confidence: Statistics available on an Index or Column.
- Low Confidence: Random sampling of INDEX, or Statistics available but AND/OR condition exists.
- No Confidence: Random sampling based on AMP row count. No statistics are collected.
Explain Plan KeywordsTo understand the EXPLAIN plan, we should know the following Keywords. Keyword | Explanation |
---|
Locking Pseudo Table | Serial lock on a symbolic table. Every table has one. It is used to prevent deadlocks situations between users. | Locking table for | This indicates that ACCESS, READ, WRITE, or the EXCLUSIVE lock has been placed on the table. | Locking rows for | Indicates that an ACCESS, READ or WRITE, the lock is placed on rows read or written. | Do an ABORT test | Guarantees a transaction is not in progress for this user. | All AMPs retrieve | All AMPs are receiving the AMP steps and are involved in providing the answer set. | By way of an all-rows scan | Rows are read sequentially on all AMPs. | By way of the primary index | Rows are read using the Primary index column(s). | By way of index number | Rows are read using the Secondary index: number from HELP INDEX. | BMSMS | Bit Map Set Manipulation Step, alternative direct access technique when multiple NUSI columns are referenced in the WHERE clause. | Residual conditions | WHERE clause conditions, other than those of a join. | Eliminating duplicate rows | Providing unique values usually result in DISTINCT, GROUP BY, or subquery. | Where unknown comparison will be ignored | This indicates that NULL values will not compare to TRUE or FALSE. Seen in a subquery using NOT IN or NOT = ALL because no rows will be returned on the ignored comparison. | Nested join | The fastest join is possible. It uses UPI to retrieve a single row after using a UPI or a USI in the WHERE to reduce the join to a single row. | Merge join | Rows of one table are matched to the other table on common domain columns after being sorted into the same sequence, normally Row Hash. | Product join | Rows of one table are matched to all rows of another table with no concern for domain match. | ROWID join | A very fast join. It uses ROWID of a UPI to retrieve a single row after using a UPI or a USI in the WHERE to reduce the join to a single row. | Duplicated on all AMPs | Participating rows for the table of a join are duplicated on all AMPS. | Hash redistributed on all AMPs. | Participating rows of a join are hashed on the join column and sent to the same AMP that stores the matching row of the table to join. | SMS | Set Manipulation Step, the result of an INTERSECT, UNION, EXCEPT, or MINUS operation. | Last use | SPOOL file is no longer needed after the step and space are released. | Built locally on the AMPs | As rows are read, they are put into SPOOL on the same AMP. | Aggregate Intermediate Results computed locally. | The aggregation values are all on the same AMP and, therefore, no need to redistribute them to work with rows on other AMPs. | Aggregate Intermediate Results computed globally. | The aggregation values are not all on the same AMP and must be redistributed on one AMP to accompany the same value with the other AMPs. |
How to EXPLAIN Plan WorksThe EXPLAIN request modifier that precedes any SQL request causes Teradata Database to display the execution plan for that request. The request itself is not submitted for execution. - When we perform an EXPLAIN against any SQL request, that request is parsed and optimized.
- The access and join plans generated by the Optimizer are returned in the form of a text file that explains the (possibly parallel) steps used in the execution of the request.
- It also included the relative cost required to complete the request, given the statistics with which the Optimizer had to work.
- If the statistics are not reasonably accurate, the cost estimate may not be accurate.
Benefits of Using EXPLAINHere are some essential benefits of using the Explain Plan, such as: - EXPLAIN helps to evaluate complex queries and to develop alternative, more efficient, processing strategies.
- We can get a better plan by collecting more statistics on more columns, or by defining additional indexes.
- Knowing the actual demographics information may allow identifying row count estimates that seem badly wrong and help pinpoint areas where additional statistics would be helpful.
Examples of EXPLAINConsider the Employee table with the following definition. Full Table Scan (FTS)When no conditions are specified in the SELECT statement, then the optimizer may use the Full Table Scan where each and every row of the table is accessed. Example Following is a query where the optimizer may choose FTS. When the above query is executed, it produces the following output. As can be seen, the optimizer chooses to access all AMPs and all rows within the AMP. 1. First, we lock a distinct TDUSER."pseudo table" for reading on a RowHash to prevent global deadlock for TDUSER.Employee.
2. Next, we lock TDUSER.Employee to read.
3. We do an all-AMPs RETRIEVE step from TDUSER.Employee by way of an all-rows scan with no residual conditions into Spool 1 (group_amps) built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 2 rows (116 bytes). The estimated time for this step is 0.03 seconds.
4. Finally, we send out an END TRANSACTION step to all AMPs involved in processing the request.
//The contents of Spool 1 are sent back to the user as the result of statement 1.
//The total estimated time is 0.03 seconds.
Unique Primary IndexWhen the rows are accessed using Unique Primary Index, then it is one AMP operation. When the above query is executed, it produces the single-AMP retrieval, and the optimizer uses the unique primary index to access the row. 1. First, we do a single-AMP RETRIEVE step from TDUSER.Employee by way of the unique primary index "TDUSER.Employee.Employee_Id = 1001" with no residual conditions.
//The row is sent directly back to the user as the result of statement 1.
//The total estimated time is 0.01 seconds.
Unique Secondary IndexWhen the rows are accessed using Unique Secondary Index, it's a two amp operation. Example Consider the Salary table with the following definition. Consider the following SELECT statement. When the above query is executed, it produces the optimizer retrieves the row in two amp operations using a unique secondary index. 1. First, we do a two-AMP RETRIEVE step from TDUSER.Salary by way of unique index # 4 "TDUSER.Salary.Employee_Id = 1001" with no residual conditions.
//The row is sent directly back to the user as the result of statement 1.
//The total estimated time is 0.01 seconds.
|