
Lesson 3 - 5

The Five-Number Summary and Boxplots

Objectives
Compute the five-number summary
Draw and interpret boxplots

Vocabulary
Five-number Summary: the minimum data value, Q1, median, Q3, and the maximum data value
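The five values can be computed directly from a list of data. The sketch below is one way to do it, assuming the median-of-halves convention for quartiles: Q1 is the median of the lower half and Q3 the median of the upper half, with the middle value left out of both halves when the count is odd. Other quartile conventions exist and can give slightly different values. The function name `five_number_summary` and the sample data are made up for illustration.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using median-of-halves quartiles."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]           # values below the median position
    upper = xs[half + n % 2:]   # values above it (middle value skipped when n is odd)
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

# Made-up data with an odd number of values
print(five_number_summary([1, 3, 4, 7, 8, 10, 12]))  # (1, 3, 7, 10, 12)
```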

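Five-number summary
Min (smallest value), Q1, M, Q3, Max (largest value)
Q1, M and Q3 are the First, Second and Third Quartiles (Second Quartile is the Median, M)

Boxplot
[Figure: labeled boxplot showing the Lower Fence, the Upper Fence, the box from Q1 to Q3 with the median M, and an Outlier marked beyond a fence]
Left whisker ends at the Smallest Data Value > Lower Fence (Min unless min is an outlier)
Right whisker ends at the Largest Data Value < Upper Fence (Max unless max is an outlier)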
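The fences decide where the whiskers stop and which points are plotted as outliers. The sketch below assumes the fences are Q1 - 1.5 × IQR and Q3 + 1.5 × IQR, which agrees with the LF and UF values in the example answers of this lesson, and it assumes median-of-halves quartiles. The function name `boxplot_stats` and the sample data are hypothetical.

```python
from statistics import median

def boxplot_stats(data):
    """Fences, whisker ends and outliers (median-of-halves quartiles assumed)."""
    xs = sorted(data)
    half = len(xs) // 2
    q1 = median(xs[:half])
    q3 = median(xs[half + len(xs) % 2:])
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Assumption: a value lying exactly on a fence is treated as inside.
    inside = [x for x in xs if lower_fence <= x <= upper_fence]
    outliers = [x for x in xs if x < lower_fence or x > upper_fence]
    return {
        "fences": (lower_fence, upper_fence),
        "whiskers": (inside[0], inside[-1]),  # where the two whiskers end
        "outliers": outliers,
    }

# Made-up data with one unusually large value
print(boxplot_stats([2, 4, 5, 6, 7, 8, 9, 30]))
```

For this made-up data the whiskers run from 2 to 9, and 30 is flagged as an outlier because it lies above the upper fence of 14.5.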

Distribution Shape Based on Boxplots:


If the median is near the center of the box and each horizontal line is of approximately equal length, then the distribution is roughly symmetric.
If the median is to the left of the center of the box or the right line is substantially longer than the left line, then the distribution is skewed right.
If the median is to the right of the center of the box or the left line is substantially longer than the right line, then the distribution is skewed left.

Why Use a Boxplot?


A boxplot provides an alternative to a histogram, a dotplot, and a stem-and-leaf plot.
Among the advantages of a boxplot over a histogram are ease of construction and convenient handling of outliers.
In addition, the construction of a boxplot does not involve subjective judgements, as does a histogram. That is, two individuals will construct the same boxplot for a given set of data, which is not necessarily true of a histogram, because the number of classes and the class endpoints must be chosen.
On the other hand, the boxplot lacks the details the histogram provides.
Dotplots and stemplots retain the identity of the individual observations; a boxplot does not.
Many sets of data are more suitable for display as boxplots than as stemplots.
Both boxplots and stemplots are useful for making side-by-side comparisons.
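To see why a histogram depends on the choices made by the person drawing it, the sketch below counts the calorie data from Example 1 into classes twice, once with a class width of 50 and once with a width of 100. The helper name `class_counts`, the starting point of 100 and both widths are arbitrary choices made for illustration.

```python
from collections import Counter

def class_counts(data, start, width):
    """Count values per class [start + k*width, start + (k+1)*width)."""
    counts = Counter((x - start) // width for x in data)
    return [counts[k] for k in range(max(counts) + 1)]

calories = [342, 377, 209, 377, 182, 147, 319, 310, 190, 353, 439,
            151, 295, 111, 131, 234, 201, 151, 294, 182, 286, 197]

print(class_counts(calories, start=100, width=50))   # [3, 6, 3, 3, 3, 3, 1]
print(class_counts(calories, start=100, width=100))  # [9, 6, 6, 1]
```

The same 22 values give two histograms with quite different outlines, while the five-number summary, and so the boxplot, stays the same.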

Example 1
Consumer Reports did a study of ice cream bars (sigh, only vanilla flavored) in their August 1989 issue. Twenty-seven bars having a taste-test rating of at least fair were listed, and calories per bar was included. Calories vary quite a bit partly because bars are not of uniform size. Just how many calories should an ice cream bar contain?
342 377 209 377 182 147 319 310 190 353 439 151 295 111 131 234 201 151 294 182 286 197

Construct a boxplot for the data above.

Example 1 - Answer
Min = 111    Q1 = 182    Q2 = 221.5   Q3 = 319    Max = 439
Range = 328  IQR = 137   LF = -23.5   UF = 524.5

[Boxplot of Calories; axis from 100 to 500]
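A rough text rendering of this boxplot can be produced from the Example 1 answer values. This is only a sketch: the function name `ascii_boxplot`, the plotting symbols and the 61-character width are arbitrary choices, and because neither Min nor Max lies beyond a fence, the whiskers are assumed to end at Min and Max.

```python
def ascii_boxplot(low, q1, med, q3, high, axis_min, axis_max, width=61):
    """Draw a one-line text boxplot; low and high are the whisker ends."""
    def col(value):
        # Map a data value to a character column on the axis.
        return round((value - axis_min) / (axis_max - axis_min) * (width - 1))

    row = [" "] * width
    for c in range(col(low), col(high) + 1):
        row[c] = "-"                       # whiskers
    for c in range(col(q1), col(q3) + 1):
        row[c] = "#"                       # box from Q1 to Q3
    row[col(low)] = row[col(high)] = "|"   # whisker end caps
    row[col(med)] = "M"                    # median marker
    return "".join(row)

# Values from the Example 1 answer, drawn on an axis from 100 to 500
print(ascii_boxplot(111, 182, 221.5, 319, 439, axis_min=100, axis_max=500))
```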

Example 2
The weights of 20 randomly selected juniors at MSHS are recorded below:
121 125 126 128 130 131 132 133 143 135 137 139 141 141 144 147 148 153 205 213

a) Construct a boxplot of the data

b) Determine if there are any mild or extreme outliers.

Example 2 - Answer
Min = 121    Q1 = 130.5   Q2 = 138    Q3 = 145.5   Max = 213
Range = 92   IQR = 15     LF = 108    UF = 168

[Boxplot of Weight; axis from 100 to 260, with two points marked * beyond the upper fence]
Extreme Outliers (> 3 IQR from Q3): 205 and 213, since Q3 + 3 × IQR = 145.5 + 45 = 190.5
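The outlier check for the weight data can be sketched in code. It assumes the common convention that a mild outlier lies more than 1.5 IQR but at most 3 IQR outside the box and an extreme outlier lies more than 3 IQR outside it; quartiles use the median-of-halves rule. The helper name `classify` is hypothetical.

```python
from statistics import median

weights = [121, 125, 126, 128, 130, 131, 132, 133, 143, 135,
           137, 139, 141, 141, 144, 147, 148, 153, 205, 213]

xs = sorted(weights)
half = len(xs) // 2
q1 = median(xs[:half])                 # 130.5
q3 = median(xs[half + len(xs) % 2:])   # 145.5
iqr = q3 - q1                          # 15.0

def classify(x):
    """Label a value by how far it lies outside the box, in IQR units."""
    distance = max(q1 - x, x - q3)     # zero or negative when x is inside the box
    if distance > 3 * iqr:
        return "extreme"
    if distance > 1.5 * iqr:
        return "mild"
    return "none"

mild = [x for x in xs if classify(x) == "mild"]
extreme = [x for x in xs if classify(x) == "extreme"]
print(mild, extreme)  # [] [205, 213]
```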

Example 3
The following are the scores of 12 members of a women's golf team in tournament play:
89 111 90 108 87 83 95 88 86 91 81 79

a) Construct a boxplot of the data.

b) Are there any mild or extreme outliers?
c) Find the mean and standard deviation.
d) Based on the mean and median, describe the distribution.

Example 3 - Answer
Min = 79     Q1 = 84.5    Q2 = 88.5    Q3 = 93      Max = 111
Range = 32   IQR = 8.5    LF = 71.75   UF = 105.75

[Boxplot of Golf Scores; axis from 78 to 126]

Mild outliers: 108 and 111 (both above UF = 105.75 but less than 3 IQR above Q3)
Mean = 90.67
St Dev = 9.85
Distribution appears to be skewed right (mean > median and the outliers lie on the high side)
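The mean, standard deviation and median for parts c) and d) can be checked with Python's standard `statistics` module. This sketch assumes the sample standard deviation (divisor n - 1), which is what `statistics.stdev` computes; the variable names are made up.

```python
from statistics import mean, median, stdev

scores = [89, 111, 90, 108, 87, 83, 95, 88, 86, 91, 81, 79]

avg = mean(scores)    # arithmetic mean
sd = stdev(scores)    # sample standard deviation (divides by n - 1)
med = median(scores)  # middle of the sorted scores

print(round(avg, 2), round(sd, 2), med)  # 90.67 9.85 88.5
```

The mean sits above the median because the two highest scores pull it upward, which is the pattern expected for a right-skewed distribution.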

Example 4
Comparative Boxplots: The scores of 18 first-year college women on the Survey of Study Habits and Attitudes (this psychological test measures motivation, study habits and attitudes toward school) are given below:
154 103 109 126 137 126 115 137 152 165 140 165 154 129 178 200 101 148

The college also administered the test to 20 first-year college men. Their scores are also given:
108 109 140 132 114 75 91 88 180 113 115 151 126 70 92 115 169 187 146 104

Compare the two distributions by constructing boxplots. Are there any outliers in either group? Are there any noticeable differences or similarities between the two groups?

Example 4 - Answer
         Women    Men
Min      101      70
Q1       126      98
Q2       138.5    114.5
Q3       154      143
Max      200      187
Range    99       117
IQR      28       45
LF       84       30.5
UF       196      210.5

[Comparative boxplots: Comparing Men and Women Study Habits and Attitudes; one boxplot each for Women and Men; axis from 60 to 220]

The women's median is greater, and their scores have less variability (spread). The women's distribution is more symmetric, while the men's is skewed right. The women have an outlier (200); the men do not.
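The comparison can be reproduced numerically with one summary per group. This is a sketch under the same assumptions used for the other examples: median-of-halves quartiles and fences at Q1 - 1.5 × IQR and Q3 + 1.5 × IQR. The function name `summarize` is hypothetical.

```python
from statistics import median

def summarize(data):
    """Median, IQR, fences and outliers for one group."""
    xs = sorted(data)
    half = len(xs) // 2
    q1 = median(xs[:half])
    q3 = median(xs[half + len(xs) % 2:])
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in xs if x < low_fence or x > high_fence]
    return {"median": median(xs), "iqr": iqr,
            "fences": (low_fence, high_fence), "outliers": outliers}

women = [154, 103, 109, 126, 137, 126, 115, 137, 152,
         165, 140, 165, 154, 129, 178, 200, 101, 148]
men = [108, 109, 140, 132, 114, 75, 91, 88, 180, 113,
       115, 151, 126, 70, 92, 115, 169, 187, 146, 104]

for label, group in (("Women", women), ("Men", men)):
    print(label, summarize(group))
```

Under these assumptions the women's fences come out as 84 and 196, so 200 is their only outlier, and the men's fences of 30.5 and 210.5 contain every score.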

Summary and Homework


Summary
Boxplots are used for checking for outliers
Use comparative boxplots for two datasets
Constructing a boxplot is not subjective
Identifying a distribution from boxplots or histograms is subjective!

Homework: pg 181-183: 5-7, 15
