used to describe data in some situations, and knowing how to interpret them is beneficial.
Key Terms
o Range
o Measure of position
o Percentile
o Quartile
Objectives
While measures of central tendency, dispersion, and skewness are used often in statistics,
there are other commonly used methods of characterizing data distributions or portions of
them. We will examine several of these statistical measures, some of which you may already
know or have seen elsewhere.
Range
The range of a data set is simply the difference between the maximum and minimum values
of the set. (This measure is typically considered a measure of dispersion, since it is a simple
description of how far the data extends.) Thus, if a data set such as {x1, x2, x3, ..., xN} is
provided in increasing order so that xi ≤ xi+1, then the range of the data set is simply xN – x1. If
the data set is not ordered, then you must determine the maximum and minimum values by
inspection.
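As a minimal sketch of the two approaches just described (inspection versus ordering), the following Python snippet computes the range both ways. The function name data_range and the sample values are illustrative assumptions, not part of the lesson.

```python
def data_range(values):
    """Return the difference between the maximum and minimum values."""
    if not values:
        raise ValueError("the range is undefined for an empty data set")
    return max(values) - min(values)


# Hypothetical, unordered data set (not taken from the lesson).
sample = [7, 3, 9, 1, 4]

# Approach 1: find the maximum and minimum directly ("by inspection").
range_by_inspection = data_range(sample)

# Approach 2: order the set, then subtract the first element from the last.
ordered = sorted(sample)
range_by_sorting = ordered[-1] - ordered[0]

print(range_by_inspection, range_by_sorting)  # both are 9 - 1 = 8
```

Both approaches give the same result; ordering costs a little more work but leaves the data ready for other calculations.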
Solution: We can find the range either by simply looking for the maximum and minimum
values or by arranging the set in increasing order and then subtracting the first element from
the last. Although the latter approach is a bit more time consuming, it can be beneficial in
cases where you need to perform other calculations. So, let's order the data set for the sake of
completeness.
Document1 Page 1 of 3
Quartiles and Percentiles
Let's consider a number p, where p is a whole number between 0 and 100. Assume that the
number p describes the percentage of values less than or equal to some data value Np.
Consequently, 100 – p is the percentage of values greater than Np. This number Np is
the pth percentile. Thus, to say that some data value x is the 75th percentile is to say that 75%
of all the values in the data set are less than or equal to x, and that 25% of the data values are
greater than x. Note that the percentile of a data value can also be understood as 100 times the
cumulative relative frequency of that value. (Recall that the cumulative relative frequency of
a value x is the relative frequency of all values less than or equal to x.) So, a student who gets
a test score in the 90th percentile, for instance, hasn't (necessarily) scored 90/100 correct--he
simply has a score that is at least as good as 90% of the other students. Although such a
description isn't necessarily very satisfying for the student (who is probably more interested
in finding out his percentage of correct answers), it is statistically helpful in certain situations.
Typically, the 0th and 100th percentiles are not discussed, because these values are simply
the minimum and maximum (respectively) of the data set.
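The relationship between a percentile and the cumulative relative frequency can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name percentile_rank and the list of test scores are hypothetical and do not come from the lesson.

```python
def percentile_rank(values, x):
    """Return 100 times the cumulative relative frequency of x.

    That is, the percentage of values that are less than or equal to x.
    """
    if not values:
        raise ValueError("the data set must not be empty")
    count_at_or_below = sum(1 for v in values if v <= x)
    return 100 * count_at_or_below / len(values)


# Hypothetical test scores for ten students (illustrative only).
scores = [55, 60, 62, 70, 71, 75, 80, 84, 88, 93]

# Nine of the ten scores are less than or equal to 88.
print(percentile_rank(scores, 88))  # 90.0
```

Here a score of 88 is the 90th percentile even though it is not a score of 90: the percentile describes position within the data set, not the score itself.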
Practice Problem: For the data set below, which value is in the 75th percentile?
Solution: We want to find the data value Np for which 75% of the data set is less than or equal
to Np. Note that there are a total of 16 values in the set; thus, 75% of the data set is 12 values.
Because the data set is ordered, we need simply find the 12th data value; then, 75% (12 out of
16 values) of the data set will be less than or equal to this value. The number 10 is the 75th
percentile: 75% of the values in the set are less than or equal to 10.
Practice Problem: Which of the following data values is the 50th percentile?
{1.52, 5.36, 6.79, 5.21, 0.28, 6.36, 8.47, 5.52, 6.26, 5.97}
Solution: The 50th percentile is that value N for which 50% of the values in the set are less
than or equal to N. To help us find this value, let's first order the data set.
{0.28, 1.52, 5.21, 5.36, 5.52, 5.97, 6.26, 6.36, 6.79, 8.47}
The data set has 10 values; thus, the 50th percentile is the fifth data value, 5.52. Exactly half
(50%) of the data values are less than or equal to 5.52, and the remaining half are greater than
5.52.
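The procedure used in the two solutions above (order the data, then count up to p% of the values) can be sketched in Python. The function name percentile_value is an assumption made for this example, and the counting rule it uses is sometimes called the nearest-rank method; other conventions for computing percentiles exist and can give slightly different answers.

```python
import math


def percentile_value(values, p):
    """Return the smallest data value with at least p% of values <= it."""
    if not values:
        raise ValueError("the data set must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be greater than 0 and at most 100")
    ordered = sorted(values)
    # 1-based position of the percentile in the ordered data set.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Data set from the practice problem above.
data = [1.52, 5.36, 6.79, 5.21, 0.28, 6.36, 8.47, 5.52, 6.26, 5.97]

print(percentile_value(data, 50))  # fifth ordered value: 5.52
```

For a set of 16 values, the same rule selects the 12th ordered value as the 75th percentile, matching the earlier solution.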
Another measure of position is the quartile, which is similar to the percentile except that it
divides data into quarters (segments of 25% each) instead of hundredths. Thus, the nth
quartile is the value x for which (25n)% of the values are less than or equal to x. Three
quartiles are defined: Q1, Q2, and Q3. The quartile Q1 corresponds to the 25th percentile, Q2
to the 50th percentile, and Q3 to the 75th percentile.
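As a rough illustration of the correspondence between quartiles and percentiles, the Python sketch below computes Q1, Q2, and Q3 as the 25th, 50th, and 75th percentiles using the same counting rule as in the practice problems. The function name quartiles and the eight-value data set are hypothetical choices made for this example.

```python
import math


def quartiles(values):
    """Return (Q1, Q2, Q3) as the 25th, 50th, and 75th percentiles."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the data set must not be empty")

    def nth_quartile(k):
        # (25k)% of the n values must be less than or equal to the result.
        rank = math.ceil(25 * k * n / 100)
        return ordered[rank - 1]

    return nth_quartile(1), nth_quartile(2), nth_quartile(3)


# Hypothetical data set with eight members (illustrative only).
eight_values = [12, 15, 18, 21, 24, 27, 30, 33]

q1, q2, q3 = quartiles(eight_values)
print(q1, q2, q3)  # second, fourth, and sixth ordered values: 15 21 27
```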
Q2 and the 50th percentile are sometimes said to correspond to the median of a data set.
Given our definition of a median, this is true when there are an odd number of data values; it
is not strictly true for an even number of data values (see the practice problem above)--the
median, according to our definition, would actually be the mean of 5.52 and 5.97. We could,
however, say that this median value (5.745) is the 50th percentile for the data set: technically,
half the values in the data set are below this value, and half are above. Thus, we can still
maintain our definition of the median if we appropriately define percentiles and quartiles. In
addition, we can also note that Q1 is the median of the first half of the values, and Q3 is the
median of the second half of the values. (Our above considerations on the definition of the
median apply here as well.)
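The median-based view of the quartiles can be sketched with Python's standard statistics module, using the ten-value data set from the practice problem above. How to split the data when the number of values is odd is a matter of convention; leaving the middle value out of both halves, as done here, is an assumption of this sketch rather than something the lesson specifies.

```python
from statistics import median

# Data set from the practice problem above, placed in increasing order.
data = sorted([1.52, 5.36, 6.79, 5.21, 0.28, 6.36, 8.47, 5.52, 6.26, 5.97])

half = len(data) // 2
lower_half = data[:half]
# Assumption: for an odd number of values, the middle value is left out
# of both halves. For an even number, the data simply split in two.
upper_half = data[half:] if len(data) % 2 == 0 else data[half + 1:]

q1 = median(lower_half)  # median of the first half of the values
q2 = median(data)        # mean of the two middle values, about 5.745
q3 = median(upper_half)  # median of the second half of the values

print(q1, q2, q3)
```

With this data set, Q1 is 5.21 and Q3 is 6.36, while Q2 falls between the fifth and sixth values rather than on a member of the set.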
Solution: Q3 is the value x for which 75% (three out of four) of the data values are at most x.
Since there are eight members in the data set, and 75% of 8 is 6, the sixth value is Q3: 75.
This value is also the 75th percentile.