Cross Tabulation in SAS

In SAS, cross-tabulation is one of the most useful analytical tools. It is implemented by a dedicated statistical procedure that produces cross tables, also called contingency tables.

Cross-tabulation examines the relationship between two or more categorical variables. It counts the observations that fall into each combination of the variables' values and presents those counts as a cross table.

We can produce a cross table by using PROC FREQ along with the TABLES statement.

To generate one-way to n-way frequency tables, use the TABLES statement. Without a TABLES statement, PROC FREQ generates only one-way tables, one for each variable in the data set.
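
The difference can be seen in a minimal sketch. It assumes the SASHELP.CARS sample data set and its origin, type, and drivetrain variables; the KEEP= data set option is added here only to keep the output short.

```sas
/* No TABLES statement: PROC FREQ prints a one-way table for every
   variable it receives (here only the three kept by KEEP=). */
proc freq data=sashelp.cars(keep=origin type drivetrain);
run;

/* With a TABLES statement: only the requested tables are printed,
   here one one-way table (origin) and one two-way table (origin*type). */
proc freq data=sashelp.cars;
   tables origin origin*type;
run;
```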

The variables used in the TABLES statement can be either numeric or character because PROC FREQ treats every variable as categorical, with each distinct value forming its own level.
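
As an illustration, the following sketch tabulates a numeric variable. It assumes the SASHELP.CARS sample data set, in which cylinders is numeric and origin is character; these variables are chosen here only for illustration.

```sas
/* cylinders is numeric, yet PROC FREQ counts each distinct value
   as a separate category. */
proc freq data=sashelp.cars;
   tables cylinders          /* one-way table of a numeric variable */
          origin*cylinders;  /* character variable crossed with a numeric one */
run;
```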

Syntax:
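
A sketch of the general form, assuming the placeholder names Dataset, Variable_1, and Variable_2 that are explained in the list that follows. They are not real names and must be replaced before the step can run.

```sas
/* Template only: replace Dataset, Variable_1 and Variable_2
   with an actual data set name and two of its variables. */
PROC FREQ DATA = Dataset;
   TABLES Variable_1 * Variable_2;
RUN;
```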

Where,

PROC FREQ: It is the procedure used to compute frequencies.

Dataset: It is the name of the data set whose variables will be used to create a cross table.

Tables: It is the statement used to request cross tables.

Variable_1 and Variable_2: These are the names of the variables whose frequency distribution needs to be calculated.

Example:

Suppose we need to find how many cars of each type are available under each car brand in the cars data set, which is already available in the SASHELP library. We also need the individual frequency values of invoice, horsepower, length, and weight, as well as the sum of these frequency values across both the make and type variables.
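
A minimal sketch of such a request, assuming the table is asked for as make*type on SASHELP.CARS. Note that this sketch by itself reports only the count of each make and type combination together with the row and column totals.

```sas
/* Two-way table: make defines the rows, type defines the columns.
   Each cell holds the count for one make/type combination;
   the Total row and column hold the sums across make and type. */
proc freq data=sashelp.cars;
   tables make*type;
run;
```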

Execute the above code in SAS Studio:

Output:

As we can see in the above output, the cross-tabulation table of the make and type variables has been generated with the frequency values of invoice, horsepower, length, and weight, and their sum.

Cross-tabulation of 3 Variables

In the case of three variables, we can group two of the variables and then pair this group with the third variable.

Example

In the following example, we are going to find the frequency of the model and type variables with respect to the make variable of the cars data set. In addition, the NOCOL, NOROW, NOFREQ, and NOPERCENT options suppress the column percentages, row percentages, cell frequencies, and overall percentages, respectively. These options can be used independently or together in different combinations.

Execute the above code in SAS Studio:
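
A minimal sketch of this request, assuming SASHELP.CARS as the input. The WHERE statement that limits the data to two makes is an assumption added here to keep the make by model table small, and NOFREQ is left out so that the cell counts stay visible.

```sas
/* make*(type model) is shorthand for two tables:
   make*type and make*model. */
proc freq data=sashelp.cars;
   where make in ('Audi', 'BMW');   /* assumed subset, for brevity */
   tables make*(type model) / nocol norow nopercent;
run;
```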

Output:

Frequency on the basis of Type:

Frequency on the basis of Model:

Cross-tabulation of 4 Variables

In the case of four variables arranged as two groups of two, the total number of combinations is 4, because each variable from group 1 is paired with each variable of group 2.

Example:

In the following example, we are going to find the frequency of the length variable of the cars data set for each of the variables make and model, and similarly the frequency of the horsepower variable for make and model.

Execute the above code in SAS Studio:
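
A minimal sketch of this request, assuming SASHELP.CARS as the input. The WHERE statement and the NOCOL, NOROW, and NOPERCENT options are assumptions added here to keep the four tables compact.

```sas
/* (make model)*(length horsepower) requests four tables:
   make*length, make*horsepower, model*length, model*horsepower. */
proc freq data=sashelp.cars;
   where make in ('Audi', 'BMW');   /* assumed subset, for brevity */
   tables (make model)*(length horsepower) / nocol norow nopercent;
run;
```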

Output:

As we can see in the above output, the total number of combinations is 4, and each variable from group 1 has been paired with each variable of group 2.
