Contributor: Lynn Ellis. Lesson ID: 13745
In this lesson, you will learn to divide data into 4 groups with an equal number of data points, called quartiles. From the quartiles, you will calculate the interquartile range, a measure of spread.
Sometimes, it is extremely helpful to divide data into four equal pieces.
Fortunately, statistics gives us tools that make it easier to divide a data set into four equal pieces than it is to divide a random ink blob.
NBA players are typically very tall men and, as such, they have big feet. Here are the shoe sizes of 14 current NBA players:
15, 13, 13.5, 16, 14, 16, 14, 17, 17, 18, 14, 12, 15, 15.
Let's try to organize these shoe sizes into four equal chunks of data.
You may have already noticed that 14 is not divisible by 4. If there were 16, we could think about taking sets of four.
Their original order is pretty random.
Think about those questions for a minute. Jot down what you think.
You may already know what the median is and how to calculate it. The median is the middle number when all the numbers are placed in order.
I'll start by putting the numbers in order:
12, 13, 13.5, 14, 14, 14, 15, 15, 15, 16, 16, 17, 17, 18
Hmm, with 14 numbers there is no single number in the middle. But that's okay. We just take the middle two numbers (the 7th and 8th) and average them.
The middle two numbers are both 15, so the median of this data set is 15.
That tells us that half of the NBA players in this group wear shoes that are size 15 or less, and half of the NBA players in this group wear shoes that are size 15 or bigger.
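If you like to check your work with a little code, here is a minimal sketch of the median steps we just walked through: put the numbers in order, then take the middle one (or average the middle two). The function name `median` and the variable `shoe_sizes` are names chosen just for this illustration.

```python
def median(values):
    """Return the middle value, averaging the middle two when the count is even."""
    ordered = sorted(values)          # least to greatest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: one middle number
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle two


# The 14 shoe sizes from this lesson, in their original order
shoe_sizes = [15, 13, 13.5, 16, 14, 16, 14, 17, 17, 18, 14, 12, 15, 15]

print(sorted(shoe_sizes))
print(median(shoe_sizes))  # 15.0
```

With 14 values, the two middle numbers are the 7th and 8th in the sorted list, and both are 15, so the average is 15.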
Imagine that the median draws a line between the bottom half of the data and the top half of the data. It looks something like this (the letter M represents the median):
12, 13, 13.5, 14, 14, 14, 15 | 15, 15, 16, 16, 17, 17, 18
M = 15
Now look at the lower half of the data. We can find the middle of that in the same way that we found the middle of the entire data set.
The numbers are already in order, and the middle number in that lower half is 14. We call the middle number of the lower half of the data Quartile 1, and we abbreviate it Q1.
I'll place that into our data:
12, 13, 13.5, 14, 14, 14, 15 | 15, 15, 16, 16, 17, 17, 18
Q1 = 14
M = 15
Note that 14 is a part of our data set, but it also acts as the line dividing the lower fourth (quarter) of the data from the second fourth (quarter) of the data.
We can do the same thing with the upper half of the data set. You try that.
Did you get 16? Sweet -- so did I. We call this number Quartile 3, and we abbreviate it Q3.
I'll add that into our data:
12, 13, 13.5, 14, 14, 14, 15 | 15, 15, 16, 16, 17, 17, 18
Q1 = 14
M = 15
Q3 = 16
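The same process can be sketched in code: split the ordered data at the median, then find the middle of each half. This is only one way to do it. The names `median` and `quartiles` are made up for this example, and how the odd-count case is handled (leaving the median out of both halves) is an assumption, since our data set has an even count and the lesson does not cover that case.

```python
def median(ordered):
    """Middle value of an already-sorted list (average of the middle two if even)."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, median, Q3) by taking the middle of each half of the data."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: with an odd count, the median itself is left out of both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)


shoe_sizes = [15, 13, 13.5, 16, 14, 16, 14, 17, 17, 18, 14, 12, 15, 15]
q1, m, q3 = quartiles(shoe_sizes)
print(q1, m, q3)  # 14 15.0 16
```

Be aware that calculators and software packages sometimes use slightly different quartile rules, so their answers may not always match the by-hand method used here.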
We now have our data organized so that we know where the dividing lines are for the first quarter of data, the second quarter of data, the third quarter of data, and the fourth quarter of data.
The distance covered by the middle half of the data, from Q1 to Q3, is called the Interquartile Range (IQR), aptly named because it is the range between Q1 and Q3. We find it by subtracting:
IQR = Q3 - Q1
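For the shoe sizes in this lesson, that subtraction is 16 - 14 = 2. As a quick sketch in code, using the quartile values found above (the variable names are just for illustration):

```python
q1 = 14  # Quartile 1 from the shoe-size data
q3 = 16  # Quartile 3 from the shoe-size data

iqr = q3 - q1  # IQR = Q3 - Q1
print(iqr)  # 2
```

So the middle half of these shoe sizes spans 2 sizes.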
Here is our vocabulary from this lesson:
» quartile 1 (Q1) - the middle of the lower half of the data (when written least to greatest)
» median - the middle of the entire data set (when written least to greatest)
» quartile 3 (Q3) - the middle of the upper half of the data (when written least to greatest)
» interquartile range (IQR) - the spread of the middle half of the data; calculated by subtracting Q1 from Q3
Continue on to the Got It? section to practice finding Q1, Q3, and the IQR.