Z Scores

In this section we will look at what an inferential statistic is and the form it usually takes. While there are many different formulas for different tests, they share the same general approach.

Watch the video and then answer the questions below.

Sixteen-minute video

You can also view this video on YouTube


Key Points

The general form of an inferential statistic is usually like this:

\[ \frac{\textrm{observed} - \textrm{expected}}{\textrm{typical variation}} \]

This gives us a number that expresses the difference from what we expected, relative to the variation we would typically expect.

Different test statistics, such as the z-statistic and the t-statistic, are defined in this way. To interpret the likelihood of obtaining a particular value, you need to look at where that value falls on the probability distribution for that statistic.

The z-distribution is a normal distribution with a mean of 0 and a standard deviation of 1. So a z-statistic of 2 is equivalent to being 2 standard deviations away from the mean, which puts it in roughly the most extreme 5% of the distribution (counting both tails).
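To make the "most extreme 5%" claim concrete, here is a minimal sketch that estimates the two-tailed probability for a given z. The Java standard library has no normal CDF, so this sketch assumes a well-known polynomial approximation to the error function (Abramowitz and Stegun, formula 7.1.26). The class and method names are hypothetical and are not part of the course material.

```java
// Illustrative sketch only: the class and method names are hypothetical.
class ZTailSketch {
    // Coefficients of the Abramowitz and Stegun approximation 7.1.26 to erf(x).
    private static final double P = 0.3275911;
    private static final double[] A = {
        0.254829592, -0.284496736, 1.421413741, -1.453152027, 1.061405429
    };

    /** Approximate probability that a standard normal value is at least as far from 0 as z (both tails). */
    static double twoTailedP(double z) {
        double x = Math.abs(z) / Math.sqrt(2.0);
        double t = 1.0 / (1.0 + P * x);
        // Evaluate a1*t + a2*t^2 + ... + a5*t^5 with Horner's rule.
        double poly = 0.0;
        for (int i = A.length - 1; i >= 0; i--) {
            poly = (poly + A[i]) * t;
        }
        // erfc(x) = 1 - erf(x) is approximated by poly * exp(-x^2),
        // and P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
        return poly * Math.exp(-x * x);
    }

    public static void main(String[] args) {
        System.out.printf("z = 1.96 -> p ~ %.4f%n", twoTailedP(1.96));
        System.out.printf("z = 2.00 -> p ~ %.4f%n", twoTailedP(2.0));
    }
}
```

Under this approximation, z = 1.96 gives a two-tailed probability of about 0.05 and z = 2 gives about 0.046, which is why a z-statistic of 2 is commonly treated as falling just inside the most extreme 5%.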

Picking a single value from a normal distribution

If you’ve caught a fish (a single value) and you want to compare its size to a population mean, you can use the following form of the formula:

\[ z = \frac{x - \mu}{\sigma} \]

Here \( x \) is the size of the fish caught (observed), and \( \mu \) is the population mean (expected). Our sampling distribution is the same as the population distribution (our observed value represents 1 fish, and our population is a population of fish). Therefore, the typical variation we expect is equal to \( \sigma \), the population standard deviation.

This gives a z-statistic which is interpreted using the z-distribution.
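
As a worked example with made-up numbers (they are illustrative assumptions, not values from the video): suppose the population of fish has a mean size of \( \mu = 30 \) cm and a standard deviation of \( \sigma = 5 \) cm, and the fish you caught measures \( x = 42 \) cm. Then:

\[ z = \frac{x - \mu}{\sigma} = \frac{42 - 30}{5} = 2.4 \]

The fish is 2.4 standard deviations above the population mean.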

Picking a group mean from a distribution of group means

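Say we catch 100 fish, find the average size, and compare that average to the population mean. When we average over a group of fish, we are no longer sampling directly from the population of fish. Instead, our sampling distribution is a distribution of means of groups of fish. Group means vary less than individual fish do: their standard deviation (the standard error) is \( \sigma / \sqrt{n} \), where \( n \) is our group size. This is the typical variation we divide by: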

\[ z = \frac{\overline{X} - \mu}{\frac{\sigma}{\sqrt{n}}} \]

This gives a z-statistic which is interpreted using the z-distribution.
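
The two forms of the formula can be compared side by side in a short sketch. All names and numbers below are hypothetical and chosen only for illustration. The sketch assumes the population mean and standard deviation are known and that the sample is not empty.

```java
// Illustrative sketch only: names and numbers are hypothetical.
class ZStatisticSketch {
    /** z for one observation x against a population with mean mu and standard deviation sigma. */
    static double zForSingleValue(double x, double mu, double sigma) {
        return (x - mu) / sigma;
    }

    /** z for the mean of a non-empty sample, using the standard error sigma / sqrt(n). */
    static double zForGroupMean(double[] sample, double mu, double sigma) {
        double sum = 0.0;
        for (double value : sample) {
            sum += value;
        }
        double mean = sum / sample.length;
        double standardError = sigma / Math.sqrt(sample.length);
        return (mean - mu) / standardError;
    }

    public static void main(String[] args) {
        double mu = 30.0;    // made-up population mean (cm)
        double sigma = 5.0;  // made-up population standard deviation (cm)
        double[] catchSizes = {32.0, 35.0, 31.0, 34.0}; // made-up sample with mean 33

        System.out.println("single fish of 33 cm: z = " + zForSingleValue(33.0, mu, sigma));
        System.out.println("mean of 4 fish:       z = " + zForGroupMean(catchSizes, mu, sigma));
    }
}
```

With these numbers, a single 33 cm fish gives z = 0.6, while a group of four fish averaging 33 cm gives z = 1.2. The same 3 cm difference is more surprising for a group mean, because group means vary less than individual fish.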


Questions

1. Check your understanding

1. What is your sampling distribution?
  For each expression, choose one of: the population distribution, group means, or differences in group means.
1. You compare your IQ to the population
2. You compare the average height of children in a class against an estimate for the population
3. You catch a big fish and compare it against a population mean
4. A factory produces pipes. You select one pipe from that day’s production and compare its thickness against a benchmark value
5. A factory produces pipes. You compare the average thickness for the day’s production against a benchmark value
6. You measure the average number of frogs per square meter in two different wetland habitats and compare the results


2. Pick the appropriate statistical test formula
  For each expression, choose one of: \( z = \frac{x - \mu}{\sigma} \) or \( z = \frac{\overline{X} - \mu}{\frac{\sigma}{\sqrt{n}}} \).
1. Is my score on a standardised test higher than average?
2. Are the scores of a class on the same test higher than average?
3. I generate numbers from a normal distribution. I get 100 surprisingly low values and think there is a bug in my code
4. I think a particular person has an unusually high IQ


3. Is this significant?

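Assume a significance level of 5% (\( \alpha = 0.05 \), two-tailed). That is, a result counts as significant if one at least this extreme would occur by chance less than 5% of the time.

  For each expression, choose Yes or No.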
1. z = 0.5
2. p = 0.03
3. z = 2
4. z = 25
5. p = 2


2. Implementing the Z-Test

Implement one of the following functions to perform a z-test and display the result:

  1. Take an array of numbers, a population mean, and a population standard deviation.
  2. Take a single measurement, a population mean, and a population standard deviation.

Your function should print to System.out:

  1. The calculated z statistic
  2. Whether or not the result is significant
  3. The effective alpha level used in the test

Summary

In this section we learned the basics of how inferential statistics work by looking at the z-test.

  • You should be able to identify when the z-test would be used
  • You should know the general form that inferential statistics commonly take
  • You should be able to identify your sampling distribution

Once you’ve completed the questions, you can move on to the next section about a very commonly used statistical test, called the t-test.