Student's T-Test

We are going to learn about a common inferential statistic called the Student’s T-Test (it was published under the pseudonym ‘Student’). It is a so-called parametric test, because it assumes your data follows a normal distribution.

Watch the video and then answer the questions below.

Thirty four-minute video

You can also view this video on YouTube


Key Points

  • The t-test calculates the T statistic
  • The T statistic can be converted to a p-value by comparing it to the t-distribution
  • The t-test assumes our data is normally distributed
  • T-tests can compare only up to two groups

One-sample and paired t-test

If we want to compare a group mean against a known value (e.g. the population mean \( \mu \)), or the mean of a set of paired differences against a known value (e.g. \( \mu = 0 \)), we use the following formula:

\[ t = \frac{\overline{x} - \mu}{\frac{s}{\sqrt{n}}} \]

We use the t-test (as opposed to the z-test) when we don’t know the population standard deviation \( \sigma \), so we use the sample standard deviation \( s \). Because of the uncertainty in calculating \( s \), we get a t-statistic instead of a z-statistic, and have to compare it on a t-distribution.
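As a rough illustration of the formula above, the following minimal sketch computes the one-sample t statistic using only Python's standard library. The function name `one_sample_t` and all the numbers are made up for this example. The paired case reuses the same function on the per-participant differences with \( \mu = 0 \).

```python
import math
import statistics


def one_sample_t(values, mu):
    """t = (mean - mu) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample SD (divides by n - 1)
    return (mean - mu) / (s / math.sqrt(n))


# One-sample case: hypothetical scores compared against a known value of 6.
scores = [5.0, 6.0, 7.0, 8.0, 9.0]
t_one = one_sample_t(scores, mu=6.0)
print(round(t_one, 3))  # 1.414

# Paired case: hypothetical before/after measurements for the same participants.
before = [70, 72, 68, 75, 71]
after = [74, 75, 70, 80, 73]
differences = [a - b for a, b in zip(after, before)]
t_paired = one_sample_t(differences, mu=0.0)
print(round(t_paired, 2))
```

With \( n \) observations (or \( n \) pairs), the resulting t is compared against a t-distribution with \( n - 1 \) degrees of freedom.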

Two-sample t-test

If we want to compare a difference between group means (\(\overline{x}_1 - \overline{x}_2\)) against an expected difference (e.g. 0, as in the formula below), we use the following formula:

\[ t = \frac{(\overline{x}_1 - \overline{x}_2) - 0 }{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \]

Where \( \overline{x}_1 \) and \( \overline{x}_2 \) are the means of the two groups, \( n_1 \) and \( n_2 \) are the number of observations in the two groups, and \( s_1 \) and \( s_2 \) are the standard deviations of the two groups.
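The two-sample formula can be sketched in the same way. This is a minimal illustration using only the standard library. The function name `two_sample_t`, the `expected_diff` parameter, and the data are assumptions made for the example and are not the exercise data.

```python
import math
import statistics


def two_sample_t(group1, group2, expected_diff=0.0):
    """t = ((mean1 - mean2) - expected_diff) / sqrt(s1^2/n1 + s2^2/n2)."""
    n1, n2 = len(group1), len(group2)
    mean_diff = statistics.mean(group1) - statistics.mean(group2)
    # statistics.variance returns the sample variance s^2 (divides by n - 1).
    standard_error = math.sqrt(
        statistics.variance(group1) / n1 + statistics.variance(group2) / n2
    )
    return (mean_diff - expected_diff) / standard_error


# Hypothetical readings for two small groups.
group_a = [2.0, 4.0, 6.0, 8.0]
group_b = [1.0, 2.0, 3.0, 4.0]
t_two = two_sample_t(group_a, group_b)
print(round(t_two, 3))  # 1.732
```

Swapping the order of the two groups only flips the sign of t, which does not matter for a two-tailed test.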


Questions

1. Check your understanding

Pick the appropriate statistical test formula for each scenario. The candidate expressions are:

  • \( z = \frac{x - \mu}{\sigma} \)
  • \( z = \frac{\overline{X} - \mu}{\frac{\sigma}{\sqrt{n}}} \)
  • \( t = \frac{\overline{x} - \mu}{\frac{s}{\sqrt{n}}} \)
  • \( t = \frac{(\overline{x}_1 - \overline{x}_2) - 0 }{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \)

1. I compare the average height of two groups
2. I compare a group’s performance in a puzzle against a theoretical mean that assumes completely random behaviour
3. I assess the IQ of a group assuming \( \mu = 100, \sigma = 15 \)
4. I investigate if drinking coffee increases a participant’s heart rate compared to a resting value
5. I run a counterbalanced game enjoyment study. Each participant plays two games and rates each of them. I want to see if one game is more enjoyable than the other


2. Calculate the t statistic

I collect 10 sensor readings each from 2 sensors. I want to see if there is a difference between the means of their readings.

Group 1 Group 2
0.4 6.3
3.6 -1.2
3.3 -11.3
1.5 -6.3
-1.7 -5
0.1 -3.4
4.2 2.4
-1.8 14.7
1.9 -2.9
-3.6 9.9

We should use a:

We get a t statistic of:

(2 decimal places)

Here there are 20 data points. Because we “spend” one of them on each group's mean (2 in total), we are left with 18 degrees of freedom (df). We do a two-tailed test at \( \alpha = 0.05 \). Look up the critical value for our df and \( \alpha \) in a table of t-statistics. If the absolute value of our t is larger than the listed critical value, the result is significant.
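The table lookup can be sketched as a simple comparison. In this minimal example the dictionary `CRITICAL_T_05` holds a few two-tailed critical values at \( \alpha = 0.05 \) copied from a standard t-table. The dictionary, the function name `is_significant`, and the t values passed in are all hypothetical and deliberately do not use the exercise data.

```python
# Two-tailed critical t values at alpha = 0.05, keyed by degrees of freedom.
# Only a handful of rows from a standard t-table are included here.
CRITICAL_T_05 = {10: 2.228, 18: 2.101, 20: 2.086, 30: 2.042}


def is_significant(t, df, table=CRITICAL_T_05):
    """Two-tailed decision: significant if |t| exceeds the critical value."""
    return abs(t) > table[df]


# Hypothetical t statistics, each tested with 18 degrees of freedom.
print(is_significant(2.5, 18))   # True
print(is_significant(-1.2, 18))  # False
```

Taking the absolute value is what makes the test two-tailed: a large negative t counts as evidence just as much as a large positive one.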

Is our result significant?



Summary

In this section we have learned about the Student’s t-test: the one-sample, paired and two-sample forms, and how to compare the resulting t statistic against the t-distribution. Once you’ve completed the questions, you can move on to the inferential statistics challenges.