{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "
Introduction to hypothesis testing
\n", "\n", "An important part of the scientific process is to make hypotheses about the world or about the results of experiments. These hypotheses then need to be checked by collecting evidence and making comparisons. Hypothesis testing is a step in this process where statistical tools are used to test hypotheses using data.\n", "\n", "**This notebook is designed for you to learn**:\n", "* How to distinguish between \"population\" datasets and \"sample\" datasets when dealing with experimental data\n", "* How to compare a sample to a population, test a hypothesis using a statistical test called the \"t-test\" and interpret its results\n", "* How to use Python scripts to perform statistical analyses on a dataset\n", "\n", "In the following, we will use an example dataset containing a series of measurements on a type of flower called Iris." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction\n", "\n", "shift + enter
.3*4
in it below the comment (#). Once this is done press shift + enter
or ► to execute the cellYou can check your answer by clicking on the \"...\" below.
\n", "3*4
and upon executing the cell the notebook should display the result, the number 12.shift + enter
You can check your answer by clicking on the \"...\" below.
\n", "mu
and sets its value to 5.552
.mu
.You can check your answer by clicking on the \"...\" below.
\n", "You can check your answer by clicking on the \"...\" below.
\n", "You can check your answer by clicking on the \"...\" below.
\n", "alpha
= (e.g. 0.01, 0.10, 0.50) and see how the zones change. You can remove the means=means
argument to see only the theoretical distribution without the histogram of our 5000 samples.\n",
" You can check your answer by clicking on the \"...\" below.
\n", "my_sem
.\n",
" You can check your answer by clicking on the \"...\" below.
\n", "\n", "# Compute the sem for your sample\n", "my_sem = my_sample_std / np.sqrt(my_sample_size)\n", "\n", "# Display the standard error of the mean\n", "my_sem\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### t value\n", "Now that we have an estimate for the standard deviation of the means $\\sigma_{\\overline{X}}$ we can compute the **t value** for our samples.\n", "\n", "$\n", "\\begin{align}\n", "t = \\frac{m - \\mu}{SEM}\n", "\\end{align}\n", "$\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Compute the t value:\n", "vuillerens_t = (vuillerens_sample_mean - mu) / vuillerens_sem\n", "\n", "# Display t\n", "vuillerens_t" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
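As a minimal sketch of the two formulas above, the standard error of the mean and the t value can be computed with the Python standard library alone. The sample values below are hypothetical and only serve as an illustration; `mu` is the population mean used in this notebook.

```python
import math
import statistics

# Hypothetical sample of measurements (illustration only, not the notebook data)
sample = [5.1, 5.4, 5.0, 5.6, 5.3, 5.2]
mu = 5.552  # population mean used in this notebook

sample_mean = statistics.mean(sample)
sample_std = statistics.stdev(sample)  # sample standard deviation (n - 1 in the denominator)
sem = sample_std / math.sqrt(len(sample))

# t value: distance between the sample mean and mu, expressed in units of SEM
t_value = (sample_mean - mu) / sem
print('SEM = {:.3f}, t = {:.3f}'.format(sem, t_value))
```

Because the hypothetical sample mean is smaller than `mu`, the numerator and therefore the t value are negative.
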
my_t
.my_t
?\n",
" You can check your answer by clicking on the \"...\" below.
\n", "my_t
) is negative because the mean of your sample is smaller than $\\mu$, hence the numerator for the t-value is negative. \n", "# Compute my_t\n", "my_t = (my_sample_mean - mu) / (my_sample_std / np.sqrt(my_sample_size))\n", "my_t\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Cut-off point and $\\alpha$\n", "With a normal distribution, for example, we know that the most extreme 5% of observations lie more than 1.96 standard deviations above or below the mean. In our case, because our sample size is less than 130 (it is 50), our distribution is close to normal but not quite normal.\n", "\n", "In this case, it is possible to find the relevant cutoff point by [looking it up in statistical tables](https://en.wikipedia.org/wiki/Student%27s_t-distribution#Table_of_selected_values) for a Student's t distribution. The corresponding t distribution has a different shape for different sample sizes. The parameter used to determine the shape of the t distribution is called *degrees of freedom* and is equal to $n-1$, in our case 50-1 = 49.\n", "\n", "The most extreme 5% of cases lie more than approximately 2.01 standard deviations above or below the mean. Because there are both positive and negative extreme cases, the cutoff points we are looking for are $t_{\\frac{\\alpha}{2}=0.025} = -2.01$ for the 2.5% negative extremes and $t_{1-\\frac{\\alpha}{2}=0.975} = 2.01$ for the 2.5% positive cases. 
The cutoff point 2.01 corresponds to the most extreme 5% of possible values of t (positive and negative).\n", "\n", "The good news is that **Python automatically gives us the value of the cutoff point** based on the chosen significance level $\\alpha$ and the sample size, thanks to the `stats` library, which offers useful functions related to many statistical distributions such as Student's t.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Define alpha and sample size\n", "alpha = 0.05\n", "sample_size = 50\n", "\n", "# Get the two-tailed cutoff point for alpha at 0.05 and sample size of 50\n", "# (alpha is split between both tails; degrees of freedom = sample_size - 1)\n", "cutoff05 = stats.t.ppf(1 - alpha / 2, sample_size-1)\n", "\n", "# Print the cutoff point\n", "print(\"\\ncutoff05 is the value of t at the quantile 1 - ({:.3f} / 2) => {:.3f}\\n\".format(alpha, cutoff05))\n", "\n", "# Plot the t distribution with cutoff points\n", "plot_t_distribution(df=49, alpha=alpha);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
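To make the link between the cutoff points and the two tails more concrete, here is a small sketch, assuming that `stats` refers to `scipy.stats`. It computes both cutoff points for 49 degrees of freedom and compares them with the cutoff of the normal distribution. The variable names are chosen for this illustration only.

```python
from scipy import stats

alpha = 0.05
df = 49  # degrees of freedom for a sample of 50

upper = stats.t.ppf(1 - alpha / 2, df)  # leaves alpha / 2 in the upper tail
lower = stats.t.ppf(alpha / 2, df)      # leaves alpha / 2 in the lower tail
normal_cutoff = stats.norm.ppf(1 - alpha / 2)

print('t cutoffs: {:.3f} and {:.3f}'.format(lower, upper))
print('normal cutoff: {:.3f}'.format(normal_cutoff))
```

The two t cutoffs are symmetric around zero, and the t cutoff is slightly larger than the normal one because the t distribution has heavier tails for small samples.
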
You can check your answer by clicking on the \"...\" below.
\n", "alpha = 0.01
.alpha=0.2
, the cutoff point is smaller and the area in red is larger. \n", "alpha = 0.01\n", "# isf is the inverse survival function: isf(q) gives the same value as ppf(1 - q)\n", "cutoff01 = stats.t.isf(alpha / 2, sample_size-1)\n", "\n", "# Display cutoff\n", "print(\"\\ncutoff01 is the value of t that leaves ({:.3f} / 2) in the upper tail => {:.3f}\\n\".format(alpha, cutoff01))\n", "\n", "plot_t_distribution(df=49, alpha=alpha)\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## t-test\n", "\n", "We can now restate our question: **\"is our sample mean in the most extreme 5% of samples that would be drawn from a population with the same mean as Anderson’s population?\"** in terms of a t-test: **\"is the difference between our sample mean and the population mean greater than 2.01 times the standard error of the mean?\"**. \n", "\n", "This is equivalent to comparing the **t value**\n", "$\n", "\\begin{align}\n", "t = \\frac{m - \\mu}{\\sigma_{\\overline{X}}}\n", "\\end{align}\n", "$\n", "to the cutoff point 2.01 (or -2.01).\n", "\n", "One issue here is that **when $m$ is smaller than $\\mu$, the value of $t$ is negative**. This is because, just like the Normal distribution, Student's t-distribution is symmetrical and centred on zero, zero meaning there is no difference between the mean of the sample and the mean of the population. \n", "\n", "So when comparing $t$ to the cutoff point, either we take its absolute value $|t|$, which is what we do below, or if $t$ is negative we compare it to the negative value of the cutoff point (i.e. 
-2.01 for a significance level of 0.05).\n", "\n", "If $|t| > $ cutoff$_\\alpha$ we say:\n", "* the t-test is statistically significant at the level $\\alpha$\n", "* we can reject $H_0: m = \\mu$ and accept $H_a: m \\neq \\mu$\n", "* the mean from our sample is significantly different from the population mean $\\mu$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Compare t with our cutoff point\n", "if abs(vuillerens_t) > cutoff05: \n", "    print(\"The difference IS statistically significant \"+\n", "          \"because the t value |{:.3f}| > {:.3f}\".format(vuillerens_t, cutoff05))\n", "else: \n", "    print(\"The difference is NOT statistically significant \"+\n", "          \"because the t value |{:.3f}| <= {:.3f}\".format(vuillerens_t, cutoff05))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see in the results above that for our Vuillerens sample $|t| > 2.01$, therefore the difference between the two means is greater than 2.01 times the standard error. In other words, **our sample mean IS in the most extreme 5%** of samples that would be drawn from a population with the same mean as Anderson's population. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualization of *t*\n", "\n", "Using Python we can visualize what the t-test means graphically by plotting the t-distribution of all the possible sample means that would be drawn from a population with the same mean as Anderson's population and showing where `t` is in the distribution compared to the zone defined by our $\\alpha$ of 5%. \n", "\n", "### Rejection zones\n", "\n", "* If the *t* value falls outside of the rejection zone defined by $\\alpha$, then that means that the difference between our sample mean and the population mean is **not statistically significant**. 
\n", "* If it falls into the rejection zone, then the difference is **statistically significant** and the sample should not be considered as coming from the Anderson population under the significance level we have chosen.\n", "\n", "The cell below uses an external library to generate a graphical visualization of the result of the t-test for the two samples we have used so far." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plot_t_distribution(df=49, alpha=0.05)\n", "\n", "# In green the t-value for the Vuillerens sample\n", "plt.axvline(x=vuillerens_t, color='green', linestyle='-.', linewidth=1, label=\"t value for Vuillerens\")\n", "\n", "# In blue your own sample\n", "plt.axvline(x=my_t, color='blue', linestyle='-.', linewidth=1, label=\"t value for my sample\")\n", "\n", "plt.legend();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Testing for your own sample\n", "Let's now check for your own sample whether the t value falls inside the rejection zone." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Compare your sample t (the second sample t) with the cutoff point\n", "if abs(my_t) > cutoff05: \n", "    print(\"The difference IS statistically significant because \" + \n", "          \"the t value |{:.3f}| > {:.3f}\".format(my_t, cutoff05))\n", "else: \n", "    print(\"The difference is NOT statistically significant because \" +\n", "          \"the t value |{:.3f}| <= {:.3f}\".format(my_t, cutoff05))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
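The comparison made in the cells above follows a single rule, which can be sketched as a small helper. The function name `two_tailed_decision` and the t values are hypothetical and only used for illustration; the cutoff 2.01 is the one discussed above for a significance level of 0.05 and 49 degrees of freedom.

```python
def two_tailed_decision(t_value, cutoff):
    '''Return True when |t| exceeds the cutoff, i.e. t lies in a rejection zone.'''
    return abs(t_value) > cutoff

# Hypothetical t values compared with the cutoff for alpha = 0.05 and 49 degrees of freedom
cutoff = 2.01
for t_value in (-2.5, 1.2):
    verdict = 'significant' if two_tailed_decision(t_value, cutoff) else 'not significant'
    print('t = {:+.2f}: {}'.format(t_value, verdict))
```

Taking the absolute value means that a strongly negative t is treated in the same way as a strongly positive one, which is what a two-tailed test requires.
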
You can check your answer by clicking on the \"...\" below.
\n", "cutoff05
in the code above by the variable cutoff01
we have defined earlier with the appropriate value for the cutoff point. See the solution code below.\n", "if abs(my_t) > cutoff01: \n", "    print(\"The difference IS statistically significant because \" + \n", "          \"the t value |{:.3f}| > {:.3f}\".format(my_t, cutoff01))\n", "else: \n", "    print(\"The difference is NOT statistically significant because \" +\n", "          \"the t value |{:.3f}| <= {:.3f}\".format(my_t, cutoff01))\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The statistical test we have just performed here, where we compare our sample mean to the mean of a population, is called a **one-sample t-test**: *one-sample* because we compare a single sample to the mean of a population, and *t-test* because the t value computed from all the possible samples of the population follows a distribution called *Student's t-distribution*. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusion\n", "\n", "What can we conclude from this? What the one-sample t-test tells us is that when we have evidence (an absolute t value greater than the cutoff value) which would lead us to think that the sample doesn't come from an Anderson-like population we **can reject our hypothesis $H_0$** and accept the **alternative hypothesis** $H_a$. $H_a$ states that the sample does not come from an Anderson-like population. 
\n", "\n", "$H_0: m = \\mu$\n", "\n", "$H_a: m \\neq \\mu$\n", "\n", "\n", "There are some limitations to keep in mind when using the one-sample t-test, which we will explore in the section below.\n", "\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Influence of the sample size\n", "\n", "Above, we have seen that $t = \\frac{m - \\mu}{SEM}$, with the standard error of the mean $SEM = \\frac{s}{\\sqrt{n}}$.\n", "\n", "Therefore we can rewrite the *t* statistic as:\n", "\n", "$\n", "\\begin{align}\n", "t = \\frac{m - \\mu}{\\frac{s}{\\sqrt{n}}}\n", "\\end{align}\n", "$\n", "\n", "This means that *t* is actually:\n", "\n", "$\n", "\\begin{align}\n", "t = \\frac{m - \\mu}{s}\\sqrt{n}\n", "\\end{align}\n", "$\n", "\n", "From there, we see that the **sample size $n$ influences the value of $t$**: all else being equal (i.e. sample mean, sample standard deviation and population mean), **a larger sample would result in a larger value of $|t|$** and therefore a greater chance of finding a significant result for the t-test.\n", "\n", "The shape of the t distribution varies a bit depending on the sample size (for small sample sizes), hence the cutoff point also depends on the sample size. To simplify, we will use a cutoff value of 2.00 to illustrate the relationship between t and n. What sample size would make the value of $|t|$ reach 2.00, all else being equal (i.e. with identical sample mean, sample standard deviation and population mean)?
You can check your answer by clicking on the \"...\" below.
\n", "sample_mean
and sample_std
in the code above by the variables my_sample_mean
and my_sample_std
. See the solution code below.\n", "n = ((2.0 * my_sample_std) / (my_sample_mean - mu)) ** 2 \n", "n\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
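The solution above can be checked by plugging the resulting sample size back into the formula for t. The sketch below does this with hypothetical sample statistics (they are not taken from the notebook data) and rounds the result up, since a sample size has to be a whole number.

```python
import math

# Hypothetical sample statistics (illustration only)
sample_mean = 5.40
sample_std = 0.50
mu = 5.552
target_t = 2.0

# Solve |t| = |m - mu| / s * sqrt(n) for n
n_exact = ((target_t * sample_std) / (sample_mean - mu)) ** 2
n_needed = math.ceil(n_exact)  # smallest whole sample size that reaches the target

# Plug n back into the formula to check that |t| reaches the target
t_at_n = abs(sample_mean - mu) / sample_std * math.sqrt(n_needed)
print('n = {}, |t| = {:.3f}'.format(n_needed, t_at_n))
```

With one flower fewer than `n_needed`, the value of |t| would fall just below the target, all else being equal.
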
You can check your answer by clicking on the \"...\" below.
\n", "cutoff05
in the code above by the variable cutoff01
. See the solution code below.\n", "n = ((cutoff01 * my_sample_std) / (my_sample_mean - mu)) ** 2 \n", "n\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting t against sample size\n", "We now investigate the relationship between the sample size and the corresponding |t| values with a plot that shows the sample size on the x-axis and the corresponding |t| values on the y-axis. The cell below illustrates this for the Vuillerens sample.\n", "\n", "The function `plot_n_and_t` increases `n` from `from_n` to `to_n` in increments of `step_n`. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plot_n_and_t(vuillerens_sample_mean, vuillerens_sample_std, mu, from_n=10, to_n=60, step_n=5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The plot above illustrates how the t-value changes as a function of the sample size. In the Vuillerens sample, we need **n=41** flowers in the sample to reach a t-value equal to the cutoff point for $\\alpha=0.05$: $t[49]=2.01$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
to_n
in the cell above.\n",
" You can check your answer by clicking on the \"...\" below.
\n", "\n", "plot_n_and_t(vuillerens_sample_mean, vuillerens_sample_std, mu, from_n=10, to_n=80, step_n=5)\n", "\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
to_n
) to find the smallest possible sample size that would be needed to reach a cutoff value of |2.00| and |2.66|.\n",
" You can check your answer by clicking on the \"...\" below.
\n", "t
compare to the cutoff value (2.01)?p
compare to $\\alpha$ (0.05)?You can check your answer by clicking on the \"...\" below.
\n", "You can check your answer by clicking on the \"...\" below.
\n", "\n", "# Compute the t-test for your own sample (my_sample).\n", "my_t, my_p = stats.ttest_1samp(my_sample, mu)\n", "\n", "# Display the result\n", "print(\"t = {:.3f}\".format(my_t))\n", "print(\"p = {:.3f}\".format(my_p))\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualization of the p-value\n", "\n", "Using Python we can visualize the t-test graphically by plotting the t-distribution of all the possible sample means that would be drawn from a population with the same mean as Anderson's population and showing where the `t values` from our samples are in the distribution compared to the zone defined by our $\\alpha$ of 5%.\n", "\n", "In addition to displaying the value of *t*, the visualization below also **shows the *p-value*** (represented by the hatched zones left and right), which is the **area under the curve of the t-distribution** representing the probability of getting a sample mean at least as extreme as the one we observe if $H_0$ is true. When this area is larger than the area of the rejection zone defined by the $\\alpha$ we have chosen, the difference between the sample mean and the population mean is not statistically significant." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Visualize graphically the result of the t-test and the p-value with alpha at 0.05\n", "plot_t_test(vuillerens_sample, mu, alpha=0.05)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see that:\n", "* p < $\\alpha$ \n", "* the t value is larger than the cutoff value for $\\alpha=0.05$\n", "* the hatched zone is included in the red zone. \n", "* the test is statistically **significant**. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Importance of the choice of $\\alpha$\n", "\n", "In the cell below we illustrate the **influence of the choice of $\\alpha$** by looking at the rejection zones and the cutoff values for $\\alpha=0.01$." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Visualize graphically the result of the t-test and the p-value with alpha at 0.01\n", "plot_t_test(vuillerens_sample, mu, alpha=0.01)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see that:\n", "* p > $\\alpha$ \n", "* the t value is smaller than the cutoff value for $\\alpha=0.01$\n", "* the hatched zone is not included in the red zone. \n", "* the test is **not statistically significant**. \n", "\n", "It is striking to see that the same Vuillerens sample leads to a t-test which is significant (at $\\alpha=0.05$) or not significant (at $\\alpha=0.01$) depending on the choice of $\\alpha$. Depending on \"how certain we want to be that the sample mean is different from the population mean $\\mu$\", the conclusion of the test is different. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Your own sample\n", "Let's now visualise the t-test for your own sample with $\\alpha=0.05$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plot_t_test(my_sample, mu, alpha=0.05)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
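The hatched area in the plots above is the two-tailed p-value. As a rough sketch, assuming that `stats` refers to `scipy.stats`, the same number can be obtained by hand from the t value and compared with the output of `stats.ttest_1samp`. The sample below is randomly generated for illustration and is not the notebook data.

```python
import numpy as np
from scipy import stats

# Hypothetical sample of 50 measurements (illustration only)
rng = np.random.default_rng(0)
sample = rng.normal(loc=5.4, scale=0.5, size=50)
mu = 5.552  # population mean used in this notebook

# Result given by the library
t_value, p_value = stats.ttest_1samp(sample, mu)

# Same computation by hand
df = len(sample) - 1
t_manual = (sample.mean() - mu) / (sample.std(ddof=1) / np.sqrt(len(sample)))
p_manual = 2 * stats.t.sf(abs(t_manual), df)  # area in both tails beyond |t|

print('t: {:.3f} (library) vs {:.3f} (by hand)'.format(t_value, t_manual))
print('p: {:.3f} (library) vs {:.3f} (by hand)'.format(p_value, p_manual))
```

The factor 2 accounts for both tails: `stats.t.sf` gives the area above |t|, and the symmetric area below -|t| has the same size.
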
my_sample
for which you had computed my_t
.You can check your answer by clicking on the \"...\" below.
\n", "\n", "if my_t < -onesided_cutoff05: \n", "    print(\"The difference IS statistically significant because \" + \n", "          \"the t value {:.3f} < {:.3f}\".format(my_t, -onesided_cutoff05))\n", "else: \n", "    print(\"The difference is NOT statistically significant because \" +\n", "          \"the t value {:.3f} >= {:.3f}\".format(my_t, -onesided_cutoff05))\n", " \n", "plot_t_test(my_sample, mu, alpha=0.05, tail=\"lower\")\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " \n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Summary\n", "\n", "In this notebook, you have seen how to compare a sample to a population using an approach called **hypothesis testing** and using a statistical test called a **one-sample t-test**.\n", "\n", "To summarize, to compare the mean of a sample to a reference value from a population, you have to proceed in five main steps:\n", "1. Look at descriptive statistics and visualizations of the sample you have to get an idea about how it compares to the population\n", "1. Formulate the hypotheses you want to test: \n", " * For two-tailed tests the null hypothesis $H_0: m = \\mu$ and its alternative $H_a: m \\neq \\mu$\n", " * For upper-tailed tests the null hypothesis $H_0: m \\leq \\mu$ and its alternative $H_a: m > \\mu$\n", " * For lower-tailed tests the null hypothesis $H_0: m \\geq \\mu$ and its alternative $H_a: m < \\mu$\n", "1. Choose a significance level, usually $\\alpha = 0.05$ or $\\alpha = 0.01$, or even $\\alpha = 0.001$ \n", "1. Determine the cutoff value for your given $\\alpha$ level.\n", "1. Compute the result of the t-test and interpret the result\n", " * if the |t| value is *larger* than the cutoff value for the given $\\alpha$ level, then $H_0$ should probably be rejected. \n", " * if the p-value is *below* the significance level you have chosen, $p \\lt \\alpha$, then it means $H_0$ should probably be rejected." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ " \n", "\n", "---\n", "\n", "