Measures of Central Tendency

In this section, we’re going to learn about measures of central tendency, which are very common descriptive statistics used to summarise datasets.


Key Points

A measure of central tendency tells you where the middle of your data is. There are all sorts of things you could consider the “middle” of your dataset, so there is more than one measure of central tendency.

The following measures all work for numeric data.

  • Mean - sum all the values and divide by the number of values (\frac{\sum_{i=1}^{n} X_i}{n})
  • Median - order the values and find the one in the middle. If there is an even number of values, find the mean of the two middle values.
  • Mode - the most common value in the dataset
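
As a quick worked example, take the six values 2, 3, 3, 5, 7, 10. These numbers are made up purely for illustration. The three measures come out as follows:

```latex
\begin{align*}
\text{Mean}   &= \frac{2 + 3 + 3 + 5 + 7 + 10}{6} = \frac{30}{6} = 5 \\
\text{Median} &= \frac{3 + 5}{2} = 4 \quad \text{(mean of the two middle values of } 2, 3, 3, 5, 7, 10\text{)} \\
\text{Mode}   &= 3 \quad \text{(the only value that appears more than once)}
\end{align*}
```

Notice that the three measures give three different answers for the same data, which is why it matters which one you report.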

Why are they used?

  • Give an impression of what a typical value in the data looks like
  • Compare two datasets to see if one is larger or smaller on average than the other
  • Check whether a value is above or below the centre

Limitations

  • By themselves, measures of central tendency don’t say anything about how the data are distributed
  • One dataset being larger on average than another is not, on its own, enough to conclude that there is a meaningful difference.
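
The first limitation can be seen in a minimal Java sketch. The two arrays below are made-up data, and the class and method names (CentreVsSpread, mean, smallest, largest) are illustrative rather than part of the course materials. Both datasets have the same mean, yet one is far more spread out than the other.

```java
import java.util.Arrays;

class CentreVsSpread {
    // Arithmetic mean using the standard DoubleStream API; NaN for an empty array.
    public static double mean(double[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    // Smallest value in the array; NaN for an empty array.
    public static double smallest(double[] values) {
        return Arrays.stream(values).min().orElse(Double.NaN);
    }

    // Largest value in the array; NaN for an empty array.
    public static double largest(double[] values) {
        return Arrays.stream(values).max().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        double[] narrow = {4, 5, 5, 5, 6};   // values clustered tightly around 5
        double[] wide = {0, 5, 5, 5, 10};    // same centre, much wider spread

        // prints: narrow: mean = 5.0, min = 4.0, max = 6.0
        System.out.println("narrow: mean = " + mean(narrow)
                + ", min = " + smallest(narrow) + ", max = " + largest(narrow));
        // prints: wide: mean = 5.0, min = 0.0, max = 10.0
        System.out.println("wide: mean = " + mean(wide)
                + ", min = " + smallest(wide) + ", max = " + largest(wide));
    }
}
```

Looking only at the mean, the two datasets are indistinguishable, even though their values are distributed quite differently.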

Sorting in Java

To calculate the median, you need sorted data. You can use the Arrays.sort() method from the java.util.Arrays class, which sorts an array in place in ascending order.
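
Here is a short sketch of how sorting might look in practice. Arrays.sort() rearranges the array it is given, so this example sorts a copy to leave the caller's data untouched. The class name SortDemo and the helper sortedCopy are hypothetical names chosen for this illustration.

```java
import java.util.Arrays;

class SortDemo {
    // Returns a sorted copy so that the caller's array keeps its original order.
    public static double[] sortedCopy(double[] values) {
        double[] copy = values.clone();
        Arrays.sort(copy);   // sorts the copy in place, ascending
        return copy;
    }

    public static void main(String[] args) {
        double[] data = {7.0, 2.0, 9.0, 4.0, 5.0};
        double[] sorted = sortedCopy(data);

        System.out.println(Arrays.toString(data));   // [7.0, 2.0, 9.0, 4.0, 5.0]
        System.out.println(Arrays.toString(sorted)); // [2.0, 4.0, 5.0, 7.0, 9.0]

        // With an odd number of values, the middle element sits at index length / 2.
        System.out.println(sorted[sorted.length / 2]); // 5.0
    }
}
```

Whether to sort a copy or the original is a design choice: copying costs a little extra memory, but it avoids surprising any code that still relies on the original order.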


Questions

1. Check your understanding

2. Implementing in Code

A Replit project is available in Python and model answers for Python are also available.

1. Central tendency functions

Write the functions double mean(double[] arr), double median(double[] arr), and double mode(double[] arr) to calculate the mean, median, and mode of an array of numbers. You may use Java’s inbuilt sorting.

2. Weighted averages

A weighted average can be calculated by multiplying each number by its weight and summing the results. Write a function double weightedAverage(double[] arr, double[] weights) that takes an array of numbers and an array of weights and calculates the weighted average. You can assume the weights sum to 1.
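
To make the calculation concrete, here is a small worked example with made-up numbers. The symbols X_i for the values and w_i for the weights are notation assumed for this illustration. Take the values 80 and 90 with weights 0.25 and 0.75:

```latex
\begin{align*}
\bar{X}_w &= \sum_{i=1}^{n} w_i X_i \qquad \text{where } \sum_{i=1}^{n} w_i = 1 \\
          &= 0.25 \times 80 + 0.75 \times 90 \\
          &= 20 + 67.5 = 87.5
\end{align*}
```

The result sits closer to 90 than to 80 because 90 carries the larger weight. With equal weights of 0.5 each, the result would be the ordinary mean, 85.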

3. Calculating mean again, but differently

Write a function to calculate the mean of an array of numbers without making reference to the length of the array. Use a for loop to iterate through the numbers in the array one by one.

4. Harmonic Mean

The harmonic mean is a measure of central tendency that is less affected by unusually large values than the ‘normal’ mean (also known as the arithmetic mean). It is calculated as the number of values divided by the sum of their reciprocals. Write a function which calculates the harmonic mean of an array of numbers.
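
In symbols, for n positive values X_1, ..., X_n, the standard definition of the harmonic mean H is given below, followed by a worked example on the made-up values 1, 2, 4. The definition assumes no value is zero, since a zero has no reciprocal.

```latex
\begin{align*}
H &= \frac{n}{\sum_{i=1}^{n} \frac{1}{X_i}} \\
  &= \frac{3}{\frac{1}{1} + \frac{1}{2} + \frac{1}{4}} = \frac{3}{1.75} = \frac{12}{7} \approx 1.71
\end{align*}
```

For comparison, the arithmetic mean of the same three values is 7/3, or about 2.33, so the harmonic mean is pulled less towards the largest value.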

Summary

In this section we have learned about measures of central tendency.

  • You should be able to calculate mean, median, and mode.
  • You should understand why measures of central tendency are used.

In the next section we will learn about measures of spread.