15 Statistical Hypothesis Tests in Python

There are hundreds of statistical hypothesis tests, but only a handful of them are needed for machine learning projects. In this tutorial, we will see the most important hypothesis tests for anyone who wants to work in fields related to statistical modelling. We will implement these tests in the Python programming language.

Every hypothesis test mentioned below contains the following information related to the test:

  • What is the test called?
  • What are we checking in the test?
  • What are the key assumptions for implementing the test?
  • How to interpret the test results?
  • How to implement the test in Python?

Note that these assumptions are very important. If the assumptions like the expected distribution of the data sample or the size of the sample required are violated, the results of the test will not be accurate. The interpretation based on these results will be highly unreliable. Hence, keeping these assumptions in check before applying the tests is very important.

Data samples often need to be large enough to be representative of the domain and to reveal how they are distributed.

In some circumstances, it is possible to adjust the data so that it conforms to the assumptions. To provide just two instances, this may be done by eliminating outliers from a distribution that is almost normal in order to make it more normal or by adjusting the degrees of freedom in a test when the variance of the given data samples is different.

Finally, there may be several tests available for the same question, such as normality. Statistics does not give precise answers to questions; it gives probabilistic ones. As a result, looking at the same question in several ways can produce different answers. Consequently, more than one test may be needed to answer some of the questions we have about the data.

Normality Tests

In this section, we will see the tests that check whether a given data sample follows a Gaussian distribution. Many statistical modeling techniques assume that the data is Gaussian. Hence, these tests are very important.

Shapiro-Wilk Test

This test checks whether the given data sample follows a Gaussian (normal) distribution.

Assumptions

  • The observations of every sample are independent in nature, and they are identically distributed. The abbreviation of this assumption is IID.

Interpretation

H0: The sample follows a Gaussian distribution

H1: the given sample does not follow a Gaussian distribution.

Code
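A minimal sketch of how this test might be run with SciPy's `shapiro` function. The sample values and the 0.05 significance level are assumptions made for illustration, so the printed numbers will differ from the output reported in this section.

```python
from scipy.stats import shapiro

# Hypothetical sample, made up for illustration only.
data = [0.873, 2.817, 0.121, -0.945, -0.055,
        -1.436, 0.360, -1.478, -1.637, -1.869]

alpha = 0.05  # assumed significance level
stat, p = shapiro(data)
print(f"The statistic value is: {stat}, and the p-value is {p}")

# A p-value above alpha means we fail to reject H0 (Gaussian).
if p > alpha:
    print("The data follows a Gaussian distribution.")
else:
    print("The data does not follow a Gaussian distribution.")
```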

Output:

The statistic value is: 0.9621855020523071, and the p-value is 0.8104783892631531
The data follows a Gaussian distribution.

D'Agostino's K^2 Test

This test tests whether the given data sample is Gaussian or not.

Assumptions

  • The observations of every sample are independent in nature, and they are identically distributed.

Interpretation

H0: The sample follows a Gaussian distribution

H1: the given sample does not follow a Gaussian distribution.

Code
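A minimal sketch using SciPy's `normaltest`, which implements D'Agostino's K^2 test. The seeded random sample and the 0.05 significance level are assumptions for illustration, so the printed numbers will differ from the output reported in this section.

```python
import numpy as np
from scipy.stats import normaltest

# Hypothetical sample: 100 draws from a standard normal distribution,
# seeded so that the run is repeatable.
rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=100)

alpha = 0.05  # assumed significance level
stat, p = normaltest(data)
print(f"The statistic value is: {stat}, and the p-value is {p}")

# A p-value above alpha means we fail to reject H0 (Gaussian).
if p > alpha:
    print("The data follows a Gaussian distribution")
else:
    print("The data does not follow a Gaussian distribution")
```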

Output:

The statistic value is: 1.0653637027947445, and the p-value is 0.5870285334466323
The data follows a Gaussian distribution

Anderson-Darling Test

This test tests whether the given data sample is Gaussian or not.

Assumptions

  • The observations of every sample are independent in nature, and they are identically distributed.

Interpretation

H0: The sample follows a Gaussian distribution

H1: the given sample does not follow a Gaussian distribution.

Code
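A minimal sketch using SciPy's `anderson` function. Instead of a p-value, it returns a statistic together with critical values at several significance levels; H0 is rejected at a level only when the statistic exceeds the critical value for that level. The seeded random sample is an assumption for illustration, so the printed numbers will differ from the output reported in this section.

```python
import numpy as np
from scipy.stats import anderson

# Hypothetical sample: 50 draws from a standard normal distribution.
rng = np.random.default_rng(1)
data = rng.normal(size=50)

result = anderson(data, dist="norm")
print(f"The statistic value is: {result.statistic}")

# Compare the statistic with the critical value at each significance level.
for level, critical in zip(result.significance_level, result.critical_values):
    print(f"\nThe critical value at {level}% is {critical}")
    if result.statistic < critical:
        print(f"The data follows a Gaussian distribution at {level}%")
    else:
        print(f"The data does not follow a Gaussian distribution at {level}%")
```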

Output:

The statistic value is: 0.20692157645671116

The critical value at 15.0% is 0.501
The data follows a Gaussian distribution at 15.0%

The critical value at 10.0% is 0.57
The data follows a Gaussian distribution at 10.0%

The critical value at 5.0% is 0.684
The data follows a Gaussian distribution at 5.0%

The critical value at 2.5% is 0.798
The data follows a Gaussian distribution at 2.5%

The critical value at 1.0% is 0.95
The data follows a Gaussian distribution at 1.0%

Correlation Tests

Now we will see the tests that compare two samples and tell us whether they are related.

Pearson's Correlation Coefficient

This test tests whether the given two data samples have a linear relationship or not.

Assumptions

  • The observations of every sample are independent in nature, and they are identically distributed.
  • The observations of every sample follow the normal distribution.
  • The observations in every sample have the same variance.

Interpretation

H0: the given two samples are not dependent, i.e., they are independent.

H1: there is some sort of dependency between the given samples.

Code
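A minimal sketch using SciPy's `pearsonr`. The two samples and the 0.05 significance level are assumptions for illustration, so the printed numbers will differ from the output reported in this section.

```python
from scipy.stats import pearsonr

# Hypothetical samples, made up for illustration only.
data1 = [0.873, 2.817, 0.121, -0.945, -0.055,
         -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [0.353, 3.517, 0.125, -7.545, -0.555,
         -1.536, 3.350, -1.578, -3.537, -1.579]

alpha = 0.05  # assumed significance level
stat, p = pearsonr(data1, data2)
print(f"The statistic value is: {stat}, and the p-value is {p}")

# A p-value above alpha means we fail to reject H0 (independence).
if p > alpha:
    print("The data samples are independent of each other")
else:
    print("Both data samples are dependent on each other")
```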

Output:

The statistic value is: 0.6135196215696078, and the p-value is 0.05922727627191346
The data samples are independent of each other

Spearman's Rank Correlation

This is a step ahead of the Pearson test. It tests if the given samples have a monotonic relationship. The relationship can be linear or non-linear.

Assumptions

  • The observations of every sample are independent in nature, and they are identically distributed.
  • The observations of both samples can be ranked.

Interpretation

H0: the given two samples are not dependent, i.e., they are independent.

H1: there is some sort of dependency between the given samples.

Code
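A minimal sketch using SciPy's `spearmanr`, which ranks the observations internally before computing the correlation. The two samples and the 0.05 significance level are assumptions for illustration, so the printed numbers will differ from the output reported in this section.

```python
from scipy.stats import spearmanr

# Hypothetical samples, made up for illustration only.
data1 = [0.873, 2.817, 0.121, -0.945, -0.055,
         -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [0.353, 3.517, 0.125, -7.545, -0.555,
         -1.536, 3.350, -1.578, -3.537, -1.579]

alpha = 0.05  # assumed significance level
stat, p = spearmanr(data1, data2)
print(f"The statistic value is: {stat}, and the p-value is {p}")

# A p-value above alpha means we fail to reject H0 (independence).
if p > alpha:
    print("The data samples are independent of each other")
else:
    print("Both data samples are dependent on each other")
```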

Output:

The statistic value is: 0.6969696969696969, and the p-value is 0.02509667588225183
Both data samples are dependent on each other

Kendall's Rank Correlation

This is a step ahead of the Pearson test. It tests if the given samples have a monotonic relationship.

Assumptions

  • The observations of every sample are independent in nature, and they are identically distributed.
  • The observations of both samples can be ranked.

Interpretation

H0: the given two samples are not dependent.

H1: there is some sort of dependency between the given samples.

Code
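A minimal sketch using SciPy's `kendalltau`. The two samples and the 0.05 significance level are assumptions for illustration, so the printed numbers will differ from the output reported in this section.

```python
from scipy.stats import kendalltau

# Hypothetical samples, made up for illustration only.
data1 = [0.873, 2.817, 0.121, -0.945, -0.055,
         -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [0.353, 3.517, 0.125, -7.545, -0.555,
         -1.536, 3.350, -1.578, -3.537, -1.579]

alpha = 0.05  # assumed significance level
stat, p = kendalltau(data1, data2)
print(f"The statistic value is: {stat}, and the p-value is {p}")

# A p-value above alpha means we fail to reject H0 (independence).
if p > alpha:
    print("The data samples are independent of each other")
else:
    print("Both data samples are dependent on each other")
```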

Output:

The statistic value is: 0.5111111111111111, and the p-value is 0.04662257495590829
Both data samples are dependent on each other

Chi-Squared Test

Pearson's test can only be used with numerical values. Spearman's and Kendall's rank correlation tests can also be used for ordinal data, that is, categorical data with a natural order. But for nominal data (categorical data with no order), these tests cannot be used. To test the dependency or relationship between two nominal variables, we use the Chi-Squared test.

Assumptions

  • The observations which will be used for the calculation of the contingency table should be independent.
  • Each cell of the contingency table should contain more than 25 observations.

Interpretation

H0: the given two samples are not dependent.

H1: there is some sort of dependency between the given samples.

Code
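A minimal sketch using SciPy's `chi2_contingency`, which takes a contingency table of observed counts and returns the statistic, the p-value, the degrees of freedom, and the expected frequencies under independence. The 2x5 table and the 0.05 significance level are assumptions for illustration, so the printed numbers will differ from the output reported in this section.

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x5 contingency table of observed counts
# (rows: categories of one variable, columns: categories of the other).
table = [[30, 25, 35, 28, 32],
         [25, 29, 30, 32, 31]]

alpha = 0.05  # assumed significance level
stat, p, dof, expected = chi2_contingency(table)
print(f"Expected Frequencies are {expected}")
print(f"The statistic value is: {stat}, and the p-value is {p}")

# A p-value above alpha means we fail to reject H0 (independence).
if p > alpha:
    print("The data samples are independent of each other")
else:
    print("Both data samples are dependent on each other")
```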

Output:

Expected Frequencies are [[27.03703704 26.54545455 31.95286195 29.49494949 30.96969697]
 [27.96296296 27.45454545 33.04713805 30.50505051 32.03030303]]
The statistic value is: 1.8882030380034551, and the p-value is 0.7563117707680647
The data samples are independent of each other

Stationarity Tests

Time series is a very important topic. Many time series models require the data to be stationary. Therefore, before applying such a model, we first need to check whether the time series is stationary. Now we will see tests that check the stationarity of the data.

Augmented Dickey-Fuller Unit Root Test

Through this test, we check whether the given time series has a unit root. The test fits an autoregressive model to the series: if that model has a unit root, the series is not stationary; otherwise, the series is treated as stationary.

Assumptions

  • The observations should be in a temporal order.

Interpretation

H0: the time series has a unit root (the series is not stationary).

H1: The unit modulus root is not present (the series is stationary).

Code

Output:

The order of the autoregressive model is 1
The statistic value is: -10.232070586545865, and the p-value is 4.998574442108246e-18
The given time series is stationary

Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test

This test checks whether the given time series is trend-stationary. A trend-stationary series is stationary around a deterministic trend, so it becomes stationary once that trend is removed.

Assumptions

  • The observations should be in temporal order.

Interpretation

H0: the given time series has a stationary trend.

H1: the given time series does not have a stationary trend.

Code

Output:

The order of the autoregressive model is 0
The statistic value is: 0.09930151338766009, and the p-value is 0.1
The given time series is stationary

Parametric Statistical Hypothesis Tests

Now we will see the parametric tests. In these tests, we test if a certain parameter of one or more samples is equal to or different from a value or from each other.

Student's t-test

In this test, the parameter is the mean of the given samples. We check whether the means of two independent samples are significantly different from each other.

Assumptions

  • The observations of every sample are independent in nature, and they are identically distributed.
  • The observations of both samples follow the normal distribution.
  • The observations of both the samples have the same variance.

Interpretation

H0: the mean values of the given samples are equal.

H1: the mean values of the given samples are not equal.

Code
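A minimal sketch using SciPy's `ttest_ind`, which by default assumes equal variances in the two samples. The samples and the 0.05 significance level are assumptions for illustration, so the printed numbers will differ from the output reported in this section.

```python
from scipy.stats import ttest_ind

# Hypothetical independent samples, made up for illustration only.
data1 = [0.873, 2.817, 0.121, -0.945, -0.055,
         -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846,
         -0.157, 0.500, 1.183, -1.075, -0.169]

alpha = 0.05  # assumed significance level
stat, p = ttest_ind(data1, data2)
print(f"The statistic value is: {stat}, and the p-value is {p}")

# A p-value above alpha means we fail to reject H0 (equal means).
if p > alpha:
    print("The given samples have equal mean values")
else:
    print("The given samples have different mean values")
```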

Output:

The statistic value is: 0.6713796580759667, and the p-value is 0.5105037120903526
The given samples have equal mean values

Paired Student's t-test

In this test too, the parameter is the mean. However, this test is used when the two samples are paired. Two samples are said to be paired if both sets of values are observed on the same subjects, for example before and after a certain treatment.

Assumptions

  • The observations of every sample are independent in nature, and they are identically distributed.
  • The observations of both samples follow the normal distribution.
  • The observations of both samples have the same variance.
  • The observations are paired for each sample.

Interpretation

H0: the mean values of the paired samples are equal.

H1: the mean values of the paired samples are not equal.

Code
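A minimal sketch using SciPy's `ttest_rel`, which expects two samples of equal length where the i-th values form a pair. The names `before` and `after`, their values, and the 0.05 significance level are assumptions for illustration, so the printed numbers will differ from the output reported in this section.

```python
from scipy.stats import ttest_rel

# Hypothetical paired measurements on the same ten subjects.
before = [10.1, 12.3, 9.8, 11.5, 13.0, 10.9, 12.7, 11.1, 9.5, 12.0]
after = [10.6, 12.1, 10.9, 11.9, 13.8, 10.2, 13.9, 11.4, 10.5, 12.9]

alpha = 0.05  # assumed significance level
stat, p = ttest_rel(before, after)
print(f"The statistic value is: {stat}, and the p-value is {p}")

# A p-value above alpha means we fail to reject H0 (equal means).
if p > alpha:
    print("The paired samples have equal mean values")
else:
    print("The paired samples have different mean values")
```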

Output:

The statistic value is: 0.9502747511161275, and the p-value is 0.36679175997294733
The paired samples have equal mean values

Analysis of Variance Test (ANOVA)

In this test, we compare the variance between the samples with the variance within them to determine whether the means of two or more independent samples are significantly different.

Assumptions

  • The observations of every sample are independent in nature, and they are identically distributed.
  • The observations of every sample follow the normal distribution.
  • The observations of every sample have the same variance.

Interpretation

H0: the mean values of the given samples are equal.

H1: at least one of the sample means is not equal to the others.

Code
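A minimal sketch using SciPy's `f_oneway`, which performs a one-way ANOVA on two or more samples. The three samples and the 0.05 significance level are assumptions for illustration, so the printed numbers will differ from the output reported in this section.

```python
from scipy.stats import f_oneway

# Hypothetical independent samples, made up for illustration only.
data1 = [0.873, 2.817, 0.121, -0.945, -0.055,
         -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846,
         -0.157, 0.500, 1.183, -1.075, -0.169]
data3 = [-0.208, 0.696, 0.928, -1.148, -0.213,
         0.229, 0.137, 0.269, -0.870, -1.204]

alpha = 0.05  # assumed significance level
stat, p = f_oneway(data1, data2, data3)
print(f"The statistic value is: {stat}, and the p-value is {p}")

# A p-value above alpha means we fail to reject H0 (equal means).
if p > alpha:
    print("The samples have equal mean values")
else:
    print("At least one sample has a different mean value")
```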

Output:

The statistic value is: 0.3557581063875854, and the p-value is 0.7038772383760818
The samples have equal mean values

Nonparametric Statistical Hypothesis Tests

Mann-Whitney U Test

This test checks whether the distributions of two independent samples are equal.

Assumptions

  • The observations of every sample are independent in nature, and they are identically distributed.
  • The observations of both samples can be ranked.

Interpretation

H0: the distributions underlying the independent samples are equal.

H1: the distributions underlying the independent samples are not equal.

Code
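A minimal sketch using SciPy's `mannwhitneyu`. The `alternative="two-sided"` argument is passed explicitly so that the test matches the hypotheses above. The samples and the 0.05 significance level are assumptions for illustration, so the printed numbers will differ from the output reported in this section.

```python
from scipy.stats import mannwhitneyu

# Hypothetical independent samples, made up for illustration only.
data1 = [0.873, 2.817, 0.121, -0.945, -0.055,
         -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846,
         -0.157, 0.500, 1.183, -1.075, -0.169]

alpha = 0.05  # assumed significance level
stat, p = mannwhitneyu(data1, data2, alternative="two-sided")
print(f"The statistic value is: {stat}, and the p-value is {p}")

# A p-value above alpha means we fail to reject H0 (equal distributions).
if p > alpha:
    print("The samples have the same distributions")
else:
    print("The samples have different distributions")
```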

Output:

The statistic value is: 60.0, and the p-value is 0.47267559351158717
The samples have the same distributions

Wilcoxon Signed-Rank Test

This test checks whether the distributions of two paired samples are equal.

Assumptions

  • The observations of every sample are independent in nature, and they are identically distributed.
  • The observations of both samples can be ranked.
  • Observations of each sample are paired.

Interpretation

H0: the distributions underlying the paired samples are equal.

H1: the distributions underlying the paired samples are not equal.

Code
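A minimal sketch using SciPy's `wilcoxon`, which works on the differences between paired observations. The names `before` and `after`, their values, and the 0.05 significance level are assumptions for illustration, so the printed numbers will differ from the output reported in this section.

```python
from scipy.stats import wilcoxon

# Hypothetical paired measurements on the same ten subjects.
# The values are chosen so that no difference is zero or tied.
before = [0.873, 2.817, 0.121, -0.945, -0.055,
          -1.436, 0.360, -1.478, -1.637, -1.869]
after = [1.142, -0.432, -0.938, -0.729, -0.846,
         -0.157, 0.500, 1.183, -1.075, -0.169]

alpha = 0.05  # assumed significance level
stat, p = wilcoxon(before, after)
print(f"The statistic value is: {stat}, and the p-value is {p}")

# A p-value above alpha means we fail to reject H0 (equal distributions).
if p > alpha:
    print("The samples have the same distributions")
else:
    print("The samples have different distributions")
```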

Output:

The statistic value is: 15.0, and the p-value is 0.232421875
The samples have the same distributions

Kruskal-Wallis H Test

This test checks whether the distributions of two or more independent samples are equal.

Assumptions

  • The observations of every sample are independent in nature, and they are identically distributed.
  • The observations of every sample can be ranked.

Interpretation

H0: the distributions underlying the independent samples are equal.

H1: the distributions underlying the independent samples are not equal.

Code
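A minimal sketch using SciPy's `kruskal`, which accepts two or more samples. The three samples and the 0.05 significance level are assumptions for illustration, so the printed numbers will differ from the output reported in this section.

```python
from scipy.stats import kruskal

# Hypothetical independent samples, made up for illustration only.
data1 = [0.873, 2.817, 0.121, -0.945, -0.055,
         -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846,
         -0.157, 0.500, 1.183, -1.075, -0.169]
data3 = [-0.208, 0.696, 0.928, -1.148, -0.213,
         0.229, 0.137, 0.269, -0.870, -1.204]

alpha = 0.05  # assumed significance level
stat, p = kruskal(data1, data2, data3)
print(f"The statistic value is: {stat}, and the p-value is {p}")

# A p-value above alpha means we fail to reject H0 (equal distributions).
if p > alpha:
    print("The samples have the same distributions")
else:
    print("The samples have different distributions")
```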

Output:

The statistic value is: 0.5714285714285694, and the p-value is 0.4496917979688917
The samples have the same distributions

Friedman Test

This test checks whether the distributions of two or more paired samples are equal.

Assumptions

  • The observations of every sample are independent in nature, and they are identically distributed.
  • The observations of every sample can be ranked.
  • Observations of each sample are paired.

Interpretation

H0: the distributions underlying the paired samples are equal.

H1: the distributions underlying the paired samples are not equal.

Code
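A minimal sketch using SciPy's `friedmanchisquare`, which needs at least three samples of equal length, where the i-th values of all samples belong to the same subject. The three samples and the 0.05 significance level are assumptions for illustration, so the printed numbers will differ from the output reported in this section.

```python
from scipy.stats import friedmanchisquare

# Hypothetical repeated measurements on the same ten subjects
# under three conditions.
data1 = [0.873, 2.817, 0.121, -0.945, -0.055,
         -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846,
         -0.157, 0.500, 1.183, -1.075, -0.169]
data3 = [-0.208, 0.696, 0.928, -1.148, -0.213,
         0.229, 0.137, 0.269, -0.870, -1.204]

alpha = 0.05  # assumed significance level
stat, p = friedmanchisquare(data1, data2, data3)
print(f"The statistic value is: {stat}, and the p-value is {p}")

# A p-value above alpha means we fail to reject H0 (equal distributions).
if p > alpha:
    print("The samples have the same distributions")
else:
    print("The samples have different distributions")
```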

Output:

The statistic value is: 2.4000000000000057, and the p-value is 0.3011942119122012
The samples have the same distributions

Summary

You learned about the primary hypothesis tests in this tutorial that you may apply in a machine learning project.

In particular, you discovered:

  • The types of tests to use in different situations, including normality tests, correlation tests between variables, and tests for paired and independent samples.
  • The main assumptions underlying each test, as well as how to interpret the results.
  • How to use the Python API to run each test.