Correlation in Microsoft Excel: coefficient, matrix, and graph
In Microsoft Excel, Correlation is considered to be one of the simplest as well as the most basic calculations (statistics) that we can perform. Though it is very simple to use, it is useful in understanding the relations between two or more variables that are used in the Excel sheet for calculation.
And besides all this, Microsoft Excel primarily provides all the necessary tools to run out the correlation analysis. In these tutorials, we will be discovering the below-mentioned topics respectively:
What do you mean by the term correlation in Microsoft Excel?
Correlation in Microsoft Excel is a measure that is used to describe the strength as well as the direction of a relationship that exists between the two variables effectively.
Furthermore, the Correlation functions are the most commonly used in statistics, economics, and the social sciences in order to create budgets, planning of business, etc.
And the efficient methods that are used in Microsoft Excel is to study how closely the variables are related and are termed as correlation analysis.
Now we will be seeing a couple of examples that are related to the strong Correlation:
And here the examples of data that have weak or zero amount of Correlation:
And the most important thing that needs to be remembered about Correlation is that it is only used to show how closely the two variables are related and does not imply any causation.
The fact that changes in one variable are closely associated with changes in the other variable does not mean that one of the respective variables causes the different variables to get change effectively.
What is meant by the term Correlation coefficient in Microsoft Excel?
In Microsoft Excel, the correlation coefficient (r) is the numerical measure of the degree of association between the two continuous variables.
And the given coefficient value usually lies in between of -1 and 1, and this could be effectively used for the purpose of measuring out the strength as well as the direction of the linear relationship lies in between of the variables.
We can assume that, larger the absolute value of the coefficient stronger the relationship will get exits:
In Microsoft Excel, the coefficient sign that is plus or minus used to indicate the direction of the relationship respectively.
And now, in order to get a better insight into the above-used function, we will be looking at the below-mentioned graph as well:
What do you mean by Pearson correlation in Microsoft Excel?
We all know that statistics measure several types of Correlation, which depend on the kind of data with which we work.
The Pearson Correlation in Microsoft Excel stands for the Pearson Product Moment Correlation(PPMC) that is used for the purpose of evaluating linear relationships between the given data when a change in one variable is associated with a proportional change in the other variable respectively.
In simple terminology, the Pearson Correlation also answers the basic question: Can the data be represented on a line?
In statistics, it is the most popular correlation type. If we are dealing with a "correlation coefficient" without further qualification, it is likely to be only Pearson.
And the most commonly used formula that can be effectively used to find the Pearson correlation coefficient in Microsoft Excel is also termed to be Pearson's R:
And at the same time, we may come across two other formulas that can be used to calculate the sample correlation coefficient (r) and the population correlation coefficient (?), respectively.
How can one find Pearson correlation in Microsoft Excel?
Calculating the Pearson correlation coefficient by making use of the hand involves much math. And luckily, Microsoft Excel has made out things much simple. Depending on our data set and our respective goal, we are free to use one of the following techniques as well:
CORRELATION Formula in Microsoft Excel
The correlation formula that is primarily used in Microsoft Excel is as follows:
How to make use of the CORRELATION Function in Microsoft Excel?
The CORRELATION Function in Microsoft Excel is primarily considered as simple to use function. Now after that we will be seeing how one can make use of the CORRELATION function in Excel with the help of the various examples.
# Example 1: For a Set of Positive Variables or Dataset in Excel sheet
With the help of the Correlation function, we are required to find out the correlation coefficient that is effectively exiting in between of the two datasets or the variables as well.
The below table contains the two variables, one in column X and the other one in column Y, in which both the datasets include positive values:
Step 1: Now we will apply the Correlation function in the cell "C14". And then, we will select the cell "C14" in which the function needs to be used.
Step 2: Now in these step, we will be clicking on the insert function button (fx), that is present just under the formula toolbar, and soon after clicking the dialog box will get appears on the screen; then we will typing out the keyword that is "CORREL" in the search for a function box, the CORREL function will get appears in the select function box. We will be double-click on the CORREL function respectively.
Step 3: A dialog will appears on the screen in which we need to fill out the arguments for the CORREL function to carry out.
Now for the Array1 argument, we will be clicking just inside the cell B8 and see the cell selected; then, we will determine the cells until B12. So that column range will be chosen from cell B8 to cell B12 respectively.
Now for the Array2 argument, we will be then clicking inside the cell C8 and will see that the cell gets selected; then after that we will be determining the cells until cell "C12". So that column range will be chosen from cell C8 to cell C12.
i.e. =CORREL (B8:B12, C8:C12) which will gets appear in cell C14.
And after entering both array arguments, we will click on the "ok" option.
Step 4: After that, it will return the output as 0.988104929, where the Correlation coefficient between the two datasets or the variables is 0.98.
The Graphical representation can also be done using the line chart, which is present just under chart options.
We are having two variables that are X and Y, where one is plotted on the particular X-axis and the other on the Y-axis as well.
For this, we will be selecting the table range excluding the header X as well as the Y that is from cell B8 to cell C12 respectively:
After that, we will click on the Insert tab, which is present just under the line options, and select the first one in the line chart option.
As soon we perform all the above steps, it will give the specific chart following the value mentioned:
And the chart elements such as legend series (X, Y) axis title (X and Y axis), chart title (POSITIVE CORRELATION) as well as the data label (Values) need to be updated in the chart as well.
# Example 2: For a Dataset Containing Positive & Negative Values in Microsoft Excel
With the help of the Correlation function, we need to find the correlation coefficient between the two datasets or the variables.
In the below-mentioned example, we have used two variables; one is primarily represented in column x and the other in column Y. The column X datasets usually contain positive values, and the column Y datasets contain negative ones.
Step 1: Now, we will move on by applying the Correlation function in the particular cell, "C29". And we will be selecting the cell "C29," in which the Correlation function needs to be applied.
Step 2: And now after that we will be clicking on the insert function button (fx) that is present just under the formula toolbar; a dialog box will get appear on the screen, and we will then typing the keyword "CORREL" in the search for a function box, and the CORREL function will calls will get appear in the select function box. After that, we will be double clicking on the CORREL function respectively.
Step 3: A dialog box will appear on the screen where we need to fill out the required arguments for the CORREL: =CORREL (array1, array2).
And for the given Array1 argument, we will be then clicking inside cell B23, and we will get encountered with the selected cell; after that we will be selecting the cells till B27. So that column range will be chosen from cell B23 to cell B27.
Similarly, for the Array2 argument, we will click inside cell C23, encounter the selected cell, and select the cells till C27. So that column range will get selected effectively, from cell C23 to cell C27.
And =CORREL (B23:B27, C23:C27) will appear in cell C29.
And after entering both array arguments, we will click on the "ok" option.
Step 4: After that, it will return the output as: -0.988104929, whereas the Correlation coefficient between the two datasets is -0.98, respectively.
Similar to the above example, it can also be represented graphically by using a line chart which is present just under chart options.
We are having two variables, X and Y, in which one is plotted on the respective X-axis and the other on the Y-axis.
Moreover, we can see the negative Correlation, i.e., Variables X and Y values are negatively correlated (Negative linear relationship). In this case, y moves on decreasing when x goes on increasing.
What important things need to be remembered while using the Correlation function in Microsoft Excel?
The important points, as well as things that are required to be remembered by an individual while dealing with the Correlation Function in Microsoft Excel, are as follows: