
Chapter 1:

Describing Data
Lesson 5: Measuring Central Tendency
TIME FRAME: 1 hour session
OVERVIEW OF LESSON
The lesson begins with students engaging in a review of various measures of central tendency.
Following the review, students are given cases where these measures are calculated. Students are
also asked to examine both strengths and limitations of these measures. Some time will be
devoted to having students discuss questions with a partner before reporting to the class.
Students will be assessed on their ability to calculate these measures, and on whether they
recognize how these measures respond to changes in data values.
LEARNING OUTCOME(S): At the end of the lesson, the learner is able to

calculate commonly used measures of central tendency,
provide a sound interpretation of these summary measures, and
discuss the limitations of these measures.

LESSON OUTLINE:
1. Introduction/Warm Up
2. Case Studies
3. Analysis and Comments on Cases
DEVELOPMENT OF THE LESSON
(A) Introduction/Warm Up
For 10 minutes, have students recall that data vary, then ask them how the center of a
data set might be described. Three commonly used measures of the center are:

Mean
Median
Mode

Inform students that the most widely used measure of the center is the (arithmetic) mean.

The mean of a data set is the sum of the data values divided by
the number of data values.

Chapter 1 Describing Data Lesson 5 Page 1


A basic feature of the mean, also called the average, is its ease of calculation.

All the data contribute equally in its calculation. That is, the weight of each of the data
items in the list is the reciprocal of the number N of data, i.e. 1/N.

Mention to students that the mean represents the center of gravity. That is, if the values in
a list were to be put on a dot scale, the mean acts as the balancing point where smaller
observations will balance the larger ones.
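
The balancing-point idea can be checked numerically. The following is a minimal Python sketch; the list of values is made up for illustration and is not data from this lesson. It shows that giving every value the weight 1/N reproduces the mean, and that the deviations from the mean cancel out.

```python
# Illustrative values only; this list is an assumption, not lesson data.
values = [2, 4, 4, 7, 13]

n = len(values)
mean = sum(values) / n                       # sum of the data divided by the count

# Equivalent view: every value contributes with the same weight 1/N.
weighted = sum(v * (1 / n) for v in values)

# Balance point: deviations below the mean cancel those above it.
deviations = [v - mean for v in values]
balance = sum(deviations)

print(mean, weighted, balance)
```

A zero sum of deviations is what "center of gravity" means here: the total distance of the smaller values below the mean equals the total distance of the larger values above it.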

Special Note: A measure of economic performance is the Gross Domestic Product
(GDP), which represents the value of all goods and services produced within the domestic
territory for a specific period of time. The GDP can also be related to the goods and
services that go to consumption and to investments, including those that go to exports less the
country's imports. When GDP is divided by total population, we have an average measure
of income or expenditure in the domestic territory. When a country's economic production
and growth (as measured by the GDP) are healthy, we expect to see low unemployment and
increases in incomes as businesses demand more labor to meet the growing economy. An
abrupt change in the GDP also affects the stock market. A healthy economy, which
indicates high consumption and production, would translate into higher profits for
companies, which in turn would increase stock prices.

When there are extremes in a set of data, the mean may not be a good measure of the center.
One alternative measure of the center is the median, the cut off where the data are split
evenly into lows and highs.

The median of a data set is the middle observation when the
data set is sorted (in either increasing or decreasing order). Note
that when the size n is even, the median is the average of the two
middle scores.
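
The definition above translates directly into a few lines of code. This is a minimal Python sketch, not a prescribed implementation; the function name `median` is chosen here for illustration.

```python
def median(data):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # odd n: single middle observation
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the two middle ones

print(median([5, 7, 3, 38, 7]))   # odd number of values
print(median([1, 2, 3, 4]))       # even number of values
```

Sorting is the only costly step, which is why the median is tedious by hand for large data sets but quick with software.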

Inform students that the median is fairly easy to calculate particularly when the size of the
data is rather small. However, for moderate and large data sets, the median may be tedious to
compute, as sorting the data would be cumbersome. With available computing tools such as
spreadsheet applications (like Excel), we may be able to readily obtain the sorted data (and
even the median itself) but we would still have to encode the data into our software package.

Tell students that another alternative measure of central tendency is the mode, that value of a
variable that occurs most frequently in a distribution. It is also sometimes referred to as the
nominal average. In a given data set, the mode can easily be picked out by ocular inspection,
especially if the data are not too many. In some data sets, the mode may not be unique. The
list is said to be unimodal if there is a unique mode, bimodal if there are two modes, and
multimodal if there are more than two modes. For continuous data, the mode is not very
useful since here, measurements (to the most precise significant digit) would theoretically
occur only once. Here, we say there is no mode.
The mode is a more helpful measure for discrete data and for qualitative data recorded as
numeric codes than for other types of data. In fact, in the case of qualitative data recorded
as numeric codes, the mean and median are not meaningful.
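
The terms unimodal, bimodal, and multimodal can be illustrated with a short Python sketch. The function names `modes` and `describe` are hypothetical helpers, and the sketch assumes one common convention: a list in which no value repeats is reported as having no mode.

```python
from collections import Counter

def modes(data):
    """Return every value tied for the highest frequency.

    Convention assumed here: if no value repeats (and there is more than
    one distinct value), the list is reported as having no mode.
    """
    counts = Counter(data)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1 and len(counts) > 1:
        return []
    return sorted(value for value, count in counts.items() if count == top)

def describe(data):
    """Label a list as no mode, unimodal, bimodal, or multimodal."""
    labels = {0: "no mode", 1: "unimodal", 2: "bimodal"}
    return labels.get(len(modes(data)), "multimodal")

print(describe([1, 2, 2, 3]))        # one mode (2)
print(describe([1, 1, 2, 2, 3]))     # two modes (1 and 2)
print(describe([1.2, 3.4, 5.6]))     # every value occurs once
```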

(B) Case Studies

Divide class into groups of three to five learners. Let some groups work on case 1, others on
case 2, and others on case 3. Give groups 20 minutes to work on their cases. Randomly
select a group to present their group work for 5 to 10 minutes, with the remaining groups
asked to make comments for 3 to 5 minutes on the presentations.

Case 1: Averaging Incomes


There are 34 families living in your neighborhood. The monthly household incomes (in pesos)
are given in the following table:

Number of families    Monthly income (pesos)
2                     40,000
5                     36,000
3                     20,000
4                     24,000
9                     32,250
2                     60,000
8                     25,000
1                     12,000

Last week, construction of a mansion at the end of the street was completed, and the family
of Manny Pacquiao decided to move in! Suppose that the monthly income of the Pacquiaos is
13.67 million pesos.

Ask students to make a histogram to represent the new household incomes for their street.

From the histogram, ask them to estimate where the center is.

Tell them to calculate the Mean, Median, and Mode for the income data, with and without the
Pacquiao family.

Instruct students to write a paragraph explaining what the best choice for the measure of
central tendency is, and why.



Special Notes:

(i) Inform students that the Pacquiao income is called an outlier, i.e. a value in a
dataset that is not very typical (in relation to the rest of the data). Outliers
seriously affect the mean, but not the median or the mode.
(ii) Mean, Median, and Mode without Pacquiao are 30,536.76, 32,250, and 32,250,
respectively; while Mean, Median, and Mode with Pacquiao are 420,235.71,
32,250, and 32,250, respectively. Thus, students should indicate that the median is the
best choice for an average when we consider income distribution. The mean is
easily affected by the presence of the extreme observation (the high income of the
Pacquiao family), which increases the average from about 31 thousand pesos to over 420
thousand pesos.
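
For teachers who want to check the groups' answers for Case 1, the following Python sketch recomputes the three measures with and without the outlier. It uses the standard `statistics` module; the variable and function names are illustrative choices, not part of the lesson.

```python
import statistics

# (number of families, monthly income in pesos) from the Case 1 table
income_table = [
    (2, 40_000), (5, 36_000), (3, 20_000), (4, 24_000),
    (9, 32_250), (2, 60_000), (8, 25_000), (1, 12_000),
]
pacquiao_income = 13_670_000  # 13.67 million pesos, as stated in the case

# Expand the frequency table into one income per family.
without = [income for count, income in income_table for _ in range(count)]
with_outlier = without + [pacquiao_income]

def summarize(incomes):
    """Return (mean, median, mode) of a list of incomes."""
    return (
        statistics.mean(incomes),
        statistics.median(incomes),
        statistics.mode(incomes),
    )

summary_without = summarize(without)
summary_with = summarize(with_outlier)
print("without outlier:", summary_without)
print("with outlier:   ", summary_with)
```

Only the first entry of each summary (the mean) changes noticeably when the single extreme income is added; the median and mode stay at 32,250.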
Case 2: Color for the Senior High School Dance
For the senior high school dance, there is a debate going on among students regarding the
color that will be featured prominently. Votes were sent by students via SMS, and the results
are as follows:
Color     Votes
Red       300
Yellow    220
Green     550
Blue      710
Orange    70
Brown     35
White     130
Purple    5

Ask students to make a Pie Graph showing the outcome of the election.

Tell students to identify if there is a clear winner on the choice of color.

Instruct students to find the Mean, Median, and Mode for the colors, if possible (not for the
number of votes!).

Tell them to write a paragraph explaining why they could or could not find each measure of
the center. Which measure of center will determine the color to be prominently used during
the senior high school dance?
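
Because colors are labels with no numeric value or natural order, only the mode can be computed for Case 2. A minimal Python sketch, with illustrative variable names, finds the most frequent category and its share of the votes:

```python
# Votes per color, from the Case 2 results
votes = {
    "Red": 300, "Yellow": 220, "Green": 550, "Blue": 710,
    "Orange": 70, "Brown": 35, "White": 130, "Purple": 5,
}

total = sum(votes.values())
winner = max(votes, key=votes.get)   # the mode: the most frequent category
share = votes[winner] / total        # fraction of all votes cast for the winner

print(winner, total, f"{share:.1%}")
```

The winning color has the most votes but less than half of them, which is a useful point when students discuss whether there is a "clear" winner.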

Case 3: Results of Quiz in Statistics and Probability Course


Everyone studied very hard for the quiz in the Statistics and Probability Course. There were
10 questions on the test, and the scores are distributed as follows:

Number correct    Number of students
10                8
9                 12
8                 6
7                 5
6                 3
5                 2
4                 0
3                 1
2                 1
1                 0
0                 2

Suggest that students create a bar graph for the data.

Ask them for the mean, median, and mode of this set of data.

Tell students to imagine that the teacher said, "Everyone in the class will be getting either the
mean, median, or mode for their official score."

a) Which would students want to receive (the mean, median, or mode)?
b) Which would students want to receive the least (the mean, median, or mode)?
c) What would be the fairest score to receive? (Ask students to explain their answers.)
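
To check the groups' answers for Case 3, the measures can be computed directly from the frequency table. The Python sketch below is one way to do it, with illustrative names; it uses a frequency-weighted mean and expands the table into one score per student for the median.

```python
import statistics

# score -> number of students, from the Case 3 table
score_counts = {10: 8, 9: 12, 8: 6, 7: 5, 6: 3, 5: 2, 4: 0, 3: 1, 2: 1, 1: 0, 0: 2}

n_students = sum(score_counts.values())

# Weighted mean: each score counts once per student who earned it.
mean_score = sum(score * count for score, count in score_counts.items()) / n_students

# Expand into one score per student, then take the median of the sorted list.
scores = sorted(s for s, c in score_counts.items() for _ in range(c))
median_score = statistics.median(scores)

# Mode: the score earned by the largest number of students.
mode_score = max(score_counts, key=score_counts.get)

print(n_students, mean_score, median_score, mode_score)
```

The few very low scores pull the mean below both the median and the mode, which is what questions (a) to (c) are meant to bring out.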

(C) Analysis and Comments on Cases


The first case introduces the idea that means are sensitive to the presence of outliers (here the
outlier is the Pacquiao income). The mean income increased tremendously with the presence
of Pacquiao. The median (and the mode) would be less sensitive to the presence of the
outlier.
The second case involves categorical data with a nominal scale, where the mode is the best
measure of central tendency.
The third case involves interval data, where the mean is the best measure to use, especially if
there are no outliers.
The context of the data suggests what would be a good measure of central tendency.

[Flowchart: choosing a measure of center. Decision nodes: "What is the context of the data?" (Nominal; Ordinal/Interval/Ratio), "What is the size of the data?" (Small; Large), "Are there outliers?" (Outliers; No Outliers). Outcomes: "Best to use Mode", "Best to use Mean or Median", "Best to use Median", "Best to use Mean".]

Activity Sheet 1-05.
PRESENTATION COMMENTS

Case 1

Summarize the scenario in one sentence:

What measure of center did the group think was best?

Case 2

Summarize the scenario in one sentence:

What measure of center did the group think was best?

Case 3

Summarize the scenario in one sentence:

What measure of center did the group think was best?



REFERENCES

Many materials here were adapted from:

Deciding Which Measure of Center to Use. http://www.sharemylesson.com/teaching-resource/deciding-which-measure-of-center-to-use-50013703/
Albert, J. R. G. (2008). Basic Statistics for the Tertiary Level (ed. Roberto Padua, Welfredo
Patungan, Nelia Marquez). Rex Bookstore.



ASSESSMENT
1) Thirty people were asked, "How many people do you consider your best friend?" The graph
below shows their responses. What measure of center would you use to find the center for the
number of best friends people have? Explain your answer.

[Figure: bar graph titled "Number of Best Friends"; horizontal axis 1 to 8, vertical axis 0 to 12]

ANSWER:

There is an outlier. Use either the mode (2), or the median (3).

2) The average age of 10 full time guidance counselors is 35. Two new full time guidance
counselors, aged 28 and 30, are hired. Five years from now, what would be the average age of
these twelve guidance counselors?

ANSWER:

The sum of ages is 350 for the 10 counselors; with the two new hires, the sum is now 408, yielding a
current average age of 34 years. Five years from now, the average will go up to 39 years for the 12
guidance counselors.
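
The arithmetic behind this answer can be written out as a short worked equation. The notation (a bar over x for the mean, with descriptive subscripts) is a choice made here for illustration:

```latex
\[
\bar{x}_{\text{now}} = \frac{10 \times 35 + 28 + 30}{12} = \frac{408}{12} = 34,
\qquad
\bar{x}_{\text{in 5 years}} = \frac{408 + 12 \times 5}{12} = 34 + 5 = 39.
\]
```

Adding the same constant (here, 5 years) to every value adds that constant to the mean, so the sum does not need to be recomputed.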

3) Houses in a certain area in Makati have a mean price of P4,000,000, but a median price of
only P2,500,000. How might you explain this best?
ANSWER:

There is an outlier (an extremely expensive house) in the prices of the houses.

4) Five persons are asked the number of hours they spend watching television in a week. Their
responses are: 5, 7, 3, 38, and 7.



a. Obtain the mean, median and mode.

b. If another person were to be asked the same question and he/she responded 200 hours,
how would this affect the mean, median and mode?

ANSWER:

a. The mean is 12; median is 7, mode is 7.

b. Median and mode unchanged; mean increases to 43.3

Explanatory Note:

Teachers have the option to just ask this assessment orally to the entire class, or to group
students and ask them to identify answers, or to give this as homework, or to use some
questions for a chapter examination.



HANDOUT FOR STUDENTS

WHAT is
The mean: represents the value that each data point would take on if the total of the data values were
redistributed equally.

The median: the middlemost score (or the average of the two middle scores when the number of data is even).

The mode: the most frequently occurring value in a list of data.

HOW
To calculate the mean (often called the average):

1) Add up all of the values in the data set.

2) Divide the sum by how many values there are in the data set.

TECHNICAL NOTES
By convention, we represent a list of n data as $x_1, x_2, \ldots, x_n$, and denote its sum through the
summation notation $\sum_{i=1}^{n} x_i$ or, when there is no confusion in the values of the indices, as
$\sum x_i$, so that the mean is written as $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$ or $\bar{x} = \frac{\sum x_i}{n}$.

To compute the median:

1) Sort the data from lowest to highest (or highest to lowest).
to highest (or

2) For an odd number of data, the median of a data set is the middle observation. When the
number of data is even, the median is the average of the two middle scores.

To find the mode:

1) Obtain a frequency distribution of the distinct values of the data.


2) The mode is the most frequently occurring data (if there is one).



Example: A group of Senior High School students answered the question: "How many SMS
messages did you send yesterday?" The table shows their data. Obtain the mean,
median and mode.

Number of SMS messages sent yesterday

Student     Number
Joseph      3
Ethel       1
Angeles     0
Michael     5
Carlo       4
Paula       8
Josefina    4
Martin      6
Beverly     5
Alan        2

Mean
Add the numbers:
3 + 1 + ... + 2 = 38
Divide by how many students (there are 10 students):
Mean = 38 / 10 = 3.8

Median
A sorting of the data:
0, 1, 2, 3, 4, 4, 5, 5, 6, 8
yields two 4s as the two middle scores, whose average, 4, is the median.

Mode
Getting a frequency distribution:

Value    Frequency
0        1
1        1
2        1
3        1
4        2
5        2
6        1
8        1

We find two modes, 4 and 5, both occurring twice.
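
The same example can be reproduced with Python's standard `statistics` module, as a quick check on the hand computation. This is a sketch with illustrative variable names; `statistics.multimode` (available from Python 3.8) returns every value tied for the highest frequency.

```python
import statistics

# Number of SMS messages each student sent yesterday (from the table above)
sms_sent = {
    "Joseph": 3, "Ethel": 1, "Angeles": 0, "Michael": 5, "Carlo": 4,
    "Paula": 8, "Josefina": 4, "Martin": 6, "Beverly": 5, "Alan": 2,
}
counts = list(sms_sent.values())

mean_sms = statistics.mean(counts)                 # 38 divided by 10
median_sms = statistics.median(counts)             # average of the two middle values
modes_sms = sorted(statistics.multimode(counts))   # all most-frequent values

print(mean_sms, median_sms, modes_sms)
```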
