Describing Data
Lesson 5: Measuring Central Tendency
TIME FRAME: 1 hour session
OVERVIEW OF LESSON
The lesson begins with students engaging in a review of various measures of central tendency.
Following the review, students are given cases where these measures are calculated. Students are
also asked to examine both strengths and limitations of these measures. Some time will be
devoted to having students discuss questions with a partner before reporting to the class.
Students will be assessed on their ability to calculate these measures, and also to get
an overall sense of whether they recognize how these measures respond to changes in data
values.
LEARNING OUTCOME(S): At the end of the lesson, the learner is able to
LESSON OUTLINE:
1. Introduction/Warm Up
2. Case Studies
3. Analysis and Comments on Cases
DEVELOPMENT OF THE LESSON
(A) Introduction/Warm Up
For 10 minutes, let students recall that data have variation, then ask them what ways there
are to describe the center of a data set. Three commonly used measures of the center are:
Mean
Median
Mode
Inform students that the most widely used measure of the center is the (arithmetic) mean.
The mean of a data set is the sum of the data values divided by
the number of data values.
All the data contribute equally in its calculation. That is, the weight of each of the data
items in the list is the reciprocal of the number N of data, i.e. 1/N.
Mention to students that the mean represents the center of gravity. That is, if the values in
a list were to be put on a dot scale, the mean acts as the balancing point where smaller
observations will balance the larger ones.
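As a small illustration of the two points above, the sketch below uses made-up values (hypothetical, not taken from any case in this lesson). It shows that the mean is the same as a sum in which every value carries the weight 1/N, and that the deviations from the mean cancel out, which is what "balancing point" means.

```python
# Hypothetical data values, chosen only for illustration.
values = [2, 4, 4, 5, 10]
n = len(values)

# Mean: the sum of the data values divided by the number of data values.
mean = sum(values) / n

# The same mean as a weighted sum in which every value has weight 1/N.
weighted_mean = sum(x * (1 / n) for x in values)

# Balancing point: deviations below the mean cancel deviations above it.
deviations = [x - mean for x in values]

print(mean)             # 5.0
print(sum(deviations))  # 0.0
```

Students can change one value in the list and rerun the sketch to see the balancing point move.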
Special Note: One measure of economic performance is the Gross Domestic Product
(GDP), which represents the value of all goods and services produced within the domestic
territory for a specific period of time. The GDP can also be related to the goods and
services that go to consumption and to investments, including those that go to exports less the
country's imports. When GDP is divided by total population, we have some average measure
of income or expenditure in the domestic territory. When a country's economic production
and growth (as measured by the GDP) is healthy, we expect to see low unemployment and
increases in incomes as businesses demand more labor to meet the growing economy. An
abrupt change in the GDP also has effects on the stock market. A healthy economy which
indicates high consumption and production would translate into higher profits for
companies, which in turn, would increase stock prices.
When there are extremes in a set of data, the mean may not be a good measure of the center.
One alternative measure of the center is the median, the cut off where the data are split
evenly into lows and highs.
Inform students that the median is fairly easy to calculate particularly when the size of the
data is rather small. However, for moderate and large data sets, the median may be tedious to
compute, as sorting the data would be cumbersome. With available computing tools such as
spreadsheet applications (like Excel), we may be able to readily obtain the sorted data (and
even the median itself) but we would still have to encode the data into our software package.
Tell students that another alternative measure of central tendency is the mode, that value of a
variable that occurs most frequently in a distribution. It is also sometimes referred to as the
nominal average. In a given data set, the mode can easily be picked out by ocular inspection,
especially if the data are not too many. In some data sets, the mode may not be unique. The
list is said to be unimodal if there is a unique mode, bimodal if there are two modes, and
multimodal if there are more than two modes. For continuous data, the mode is not very
useful since here, measurements (to the most precise significant digit) would theoretically
occur only once.
The mode is a more helpful measure for discrete data and for qualitative data recorded with
numeric codes than for other types of data. In fact, in the case of qualitative data recorded
with numeric codes, the mean and median are not meaningful. When no value occurs more
often than the others, we say there is no mode.
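The classification of a list by its number of modes can be sketched as follows. The function name `describe_modes` and the sample lists are hypothetical. The sketch also assumes a non-empty list, and it assumes the convention that a list in which no value repeats has no mode.

```python
from collections import Counter

def describe_modes(data):
    """Return (modes, label) for a non-empty list of values."""
    counts = Counter(data)          # frequency of each distinct value
    top = max(counts.values())      # highest frequency in the list
    if top == 1:
        return [], "no mode"        # assumed convention: nothing repeats
    modes = sorted(v for v, c in counts.items() if c == top)
    if len(modes) == 1:
        label = "unimodal"
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label

# Hypothetical lists, for illustration only.
print(describe_modes([1, 2, 2, 3]))            # ([2], 'unimodal')
print(describe_modes([1, 1, 2, 2, 3]))         # ([1, 2], 'bimodal')
print(describe_modes(["red", "blue", "red"]))  # (['red'], 'unimodal')
print(describe_modes([1.2, 3.4, 5.6]))         # ([], 'no mode')
```

The third call shows that the mode can still be found for qualitative data, where a mean or median would not make sense.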
Divide class into groups of three to five learners. Let some groups work on case 1, others on
case 2, and others on case 3. Give groups 20 minutes to work on their cases. Randomly
select a group to present their group work for 5 to 10 minutes, with the remaining groups
asked to make comments for 3 to 5 minutes on the presentations.
Last week, a mansion at the end of the street was completed, and the family
of Manny Pacquiao decided to move in! Suppose that the monthly income of the Pacquiaos is
13.67 million pesos.
Ask students to make a histogram to represent the new household incomes for their street.
From the histogram, ask them to estimate where the center is.
Tell them to calculate the Mean, Median, and Mode for the income data, with and without the
Pacquiao family.
Instruct students to write a paragraph explaining what the best choice for the measure of
central tendency is, and why.
(i) Inform students that the Pacquiao income is called an outlier, i.e. a value in a
dataset that is not very typical (in relation to the rest of the data). These outliers
seriously affect the mean, but not the median nor the mode.
(ii) Mean, Median, and Mode without Pacquiao are 30,536.76471, 32,250, and 32,250,
respectively; while Mean, Median, and Mode with Pacquiao are 420,235.7143,
32,250, and 32,250, respectively. Thus, students should indicate that the median is the
best choice for an average, when we consider income distribution. The mean gets
easily affected by the presence of the extreme observation (the high income of the
Pacquiao family), increasing the average from about 31 thousand pesos to over 400
thousand pesos.
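The jump in the mean can be checked without retyping every income, as in the rough sketch below. The count of 34 households before the move is an assumption: it is inferred from the two reported means and is not stated in the text above. The variable names are likewise illustrative.

```python
# Figures reported in the answer above.
mean_without = 30_536.76471     # mean monthly income without the Pacquiaos
pacquiao_income = 13_670_000    # 13.67 million pesos

# ASSUMPTION: 34 households before the move (inferred, not stated above).
n_without = 34

# Recover the total income, add the new household, and average again.
total_without = mean_without * n_without
mean_with = (total_without + pacquiao_income) / (n_without + 1)

print(round(mean_with, 2))  # 420235.71
```

Under that assumption, a single added household raises the mean more than tenfold. The median cannot be rechecked this way, because it depends on the sorted list rather than on the total.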
Case 2: Color for the Senior High School Dance
For the senior high school dance, there is a debate going on among students regarding the
color that will be featured prominently. Votes were sent by students via SMS, and the results
are as follows:
Red 300 votes Yellow 220
Ask students to make a Pie Graph showing the outcome of the election.
Instruct students to find the Mean, Median, and Mode for the colors, if possible (not for the
number of votes!)
Tell them to write a paragraph explaining why they could or could not find each measure of
the center. Which measure of center will determine the color to be prominently used during
the senior high school dance?
10 correct 8 students
9 correct 12 students
8 correct 6 students
7 correct 5 students
Ask them what the mean, median, and mode are for this set of data.
Tell students to imagine that the teacher said, "Everyone in the class will be getting either the
mean, median, or mode for their official score."
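[Group report sheet, one copy each for Case 2 and Case 3: "Summarize the scenario in one sentence:" and "What measure of center did the group think was best?"]
[Chart for choosing a measure of center. Legible labels: Small, Large; Outliers: best to use Median; No Outliers: best to use Mean; best to use Mean or Median; best to use Mode.]
[Figure: histogram for item 1, with the horizontal axis marked 1 to 8 and the vertical axis marked 0 to 10.]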
ANSWER:
There is an outlier. Use either the mode (2), or the median (3).
2) The average age of 10 full time guidance counselors is 35. Two new full time guidance
counselors, aged 28 and 30, are hired. Five years from now, what would be the average age of
these twelve guidance counselors?
ANSWER:
Sum of ages is 350 for 10 counselors; with the two new counselors, the sum is now 408, thus yielding an average age
currently at 34 years. Five years from now, the average will go up to 39 years for the 12 guidance
counselors.
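The arithmetic in this answer can be verified with a short sketch. The variable names are illustrative only.

```python
n_current = 10        # current full-time guidance counselors
avg_current = 35      # their average age
new_ages = [28, 30]   # ages of the two new counselors

# Total of all ages: 10 * 35 = 350, plus 28 + 30 = 58.
total_age = n_current * avg_current + sum(new_ages)
n_total = n_current + len(new_ages)

avg_now = total_age / n_total

# Everyone ages five years, so the average also rises by five.
avg_in_five_years = avg_now + 5

print(total_age, n_total)  # 408 12
print(avg_now)             # 34.0
print(avg_in_five_years)   # 39.0
```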
3) Houses in a certain area in Makati have a mean price of P4,000,000, but a median price of
only P2,500,000. How might you explain this best?
ANSWER:
There is an outlier (an extremely expensive house) in the prices of the houses.
4) Five persons are asked the number of hours they spend watching television in a week. Their
responses are: 5, 7, 3, 38, and 7.
b. If another person were to be asked the same question and he/she responded 200 hours,
how would this affect the mean, median and mode?
ANSWER:
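One way to check how the extra response affects each measure is to compute all three with and without it. The sketch below uses Python's standard `statistics` module, and the variable names are illustrative.

```python
from statistics import mean, median, mode

hours = [5, 7, 3, 38, 7]          # the five original responses
hours_with_extra = hours + [200]  # the sixth person's response added

for label, data in [("original", hours), ("with 200", hours_with_extra)]:
    print(label, round(mean(data), 2), median(data), mode(data))
```

The mean rises from 12 hours to about 43.33 hours, while the median and the mode both stay at 7 hours.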
Explanatory Note:
Teachers have the option to just ask this assessment orally to the entire class, or to group
students and ask them to identify answers, or to give this as homework, or to use some
questions for a chapter examination.
WHAT is
The mean: represents the value that each data point would take on if the total of the data values were
redistributed equally.
The median: the middlemost score (or the average of the two middle scores when the number of data values is even).
HOW
To calculate the mean (often called the average):
1) Add up all the values in the data set.
2) Divide the sum by how many values there are in the data set.
TECHNICAL NOTES
By convention, we represent a list of n data as $x_1, x_2, \ldots, x_n$ and denote its sum through the
summation notation $\sum_{i=1}^{n} x_i$ or, when there is no confusion in the values of the indices, as $\sum x_i$, so
that the mean is written as $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$ or $\bar{x} = \frac{\sum x_i}{n}$.
To compute for the median:
1) Sort the data from lowest to highest (or highest to lowest).
2) For an odd number of data, the median of a data set is the middle observation. When the
number of data is even, the median is the average of the two middle scores.
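The two steps for the median can be sketched as a short function. The name `median_by_sorting` and the sample lists are hypothetical, and the sketch assumes a non-empty list of numbers.

```python
def median_by_sorting(data):
    """Median of a non-empty list, following the two steps above."""
    ordered = sorted(data)   # step 1: sort from lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd count: the middle observation
    # even count: the average of the two middle scores
    return (ordered[mid - 1] + ordered[mid]) / 2

# Hypothetical lists, for illustration only.
print(median_by_sorting([9, 2, 7]))     # 7
print(median_by_sorting([9, 2, 7, 4]))  # 5.5
```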
Mode
Getting a frequency distribution:
Value Frequency
1 1
2 0
3 1
4 1
5 2
6 2
7 1
8 1
9 1
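Reading the mode off a frequency distribution such as the one above can be sketched as follows. The dictionary simply transcribes the table, and the variable names are illustrative.

```python
# Frequency distribution from the table above: value -> frequency.
frequency = {1: 1, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2, 7: 1, 8: 1, 9: 1}

# The mode is every value whose frequency equals the highest frequency.
highest = max(frequency.values())
modes = [value for value, count in frequency.items() if count == highest]

print(highest)  # 2
print(modes)    # [5, 6]
```

Two values share the highest frequency here, so this list is bimodal.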