
Chapter 04 - Descriptive Statistics

Chapter 4
Descriptive Statistics
4.1 a. Mean = 2.83, Median = 1.5, Mode = 0. Any one of these would be a useful measure of
center. Data appears skewed right with many values equal to zero so one might choose
median or mode over mean.
b. Mean = 68.33, Median = 72, Mode = 40. This is continuous numerical data so either the
mean or the median would be best.
c. Mean = 3.04, Median = 3.03, No mode. This is continuous numerical data so either the
mean or the median would be best.
Learning Objective: 04-1
Learning Objective: 04-3
Learning Objective: 04-11

4.2 a. This is attribute data so mode is the only measure of central tendency possible. Mode =
M, which occurs 9 out of 12 observations.
b. This is discrete data with a small range so mode could be an appropriate measure of
central tendency. Mode = 18, which occurs 6 out of 10 observations.
c. There is no mode in this case because there is no value that occurs more than once.
Therefore, either mean or median would be more appropriate. Mean = 29.5 and median
= 29.
Learning Objective: 04-1
Learning Objective: 04-3
Learning Objective: 04-11

4.3 a. Continuous data, skewed right, no mode. Median is the best choice. Median = 26.2 mpg
and mean = 28.1 mpg.
b. Mostly 1 rider therefore the median or the mode would be the best choice. Median =
mode = 1.
c. Symmetric distribution. The mean and median are the same and there are two modes.
The mean or the median are the best choice. Mean = median = 3. (Modes = 2 and 4.)
Learning Objective: 04-1
Learning Objective: 04-3
Learning Objective: 04-11

4.4 a. Mean = 56.22, Median = 47, Mode = 44.


b. The distribution is most likely skewed right because the mean is larger than the median
and mode.
c. The mode is not a useful measure of central tendency because there are few values that
repeat. The value 44 appears only three times out of 36 observations.
Learning Objective: 04-1
Learning Objective: 04-3
Learning Objective: 04-11

4.5 a. Mean = 75.5, Median = 80.5, Mode = 93.


b. The distribution is most likely skewed left because the mean is less than the median and
mode.
c. The mode is not a useful measure of central tendency because there are few values that
repeat. The value 93 appears only three times out of 24 observations.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-11

4.6 a.

          Quiz 1   Quiz 2   Quiz 3   Quiz 4
Count     10       10       10       10
Mean      72.00    72.00    76.00    76.00
Median    72.00    72.00    72.00    86.50

Note that both quiz 2 and quiz 3 have multiple modes. Quiz 2 has modes at 65 and 79
(4 each.) Quiz 3 has modes at 72 and 74 (2 each.)
b. No, they don’t agree for all quizzes. The mean and the median are the same for quiz 1
and quiz 2, and the mode and the median are the same for quiz 3.
c. The mode is an unreliable measure of central tendency for quantitative data. Where the
mean and median disagree, one should look at the shape of the distribution to see which
measure is more appropriate.
d. Quiz 1 and Quiz 2 have a symmetric distribution. Even though the mode is not equal to
the mean and median it is sufficient to say they are symmetrical given that the mean
equals the median. Quiz 3 is skewed right because the mean is to the right of (or greater
than) the median. Quiz 4 is skewed left because the mean is to the left of (or less than)
the median.
e. Students on average did better on quizzes 3 and 4. None of the students received the
same score on Quiz 4.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-11

4.7 a.

Descriptive Statistic Data


count 32
mean 27.34
mode 26.00
median 26.00

b. The mode and median are the same value, but the mean is greater than the median.

c.

d. One might say data is slightly skewed to the right because the mean is greater than the
median. It is difficult to conclude shape by looking at the graphs.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-11

4.8 a.

Descriptive Statistics Data
count 28
mean 107.25
median 106.00
mode 95.00

b. The mean and median are close in value but the mode is more than $10 lower.
c.

d. When looking at the dot plot one might say data is slightly skewed to the right. This
would also be logical because the mean is greater than the median. It is difficult to
conclude shape by looking at the graphs.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-11

4.9 b.

Descriptive Statistics Data


count 65
mean 4.48
median 2.00
mode 1.00

c. No. The mean is more than twice the median and the median is twice the mode.
d. The data are skewed to the right with a few high outliers at 20, 26, and 29 minutes.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-11

4.10 a. Median = 68, Midrange = 68, Geometric Mean = 67.37


b. Median = 3.03, Midrange = 2.96, Geometric Mean = 3.01
c. Median = 1.5, Midrange = 7.5, Geometric Mean is undefined.
The midrange is not very robust because it is sensitive to extreme data such as the 94 for
one exam score and the 15 for number of absences. The median and geometric mean
appear to be close in value for the first two data sets. However, we might want to rely
on the median because the data do not appear to have a steep increasing trend and the
median is a more easily understood and recognized measure of central tendency.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-11

4.11 a. =TRIMMEAN(A1:A50,0.20)
b. There are 50 observations. Ten percent of 50 equals five. So five values will be dropped
from each tail.
c. If five values are dropped from each tail this is a total of 10 values.
Learning Objective: 04-2
Learning Objective: 04-3

4.12 Excel’s TRIMMEAN function splits the trimmed percentage evenly between the two tails,
multiplies each tail's share by the number of observations, and then truncates to the next
lower integer. For the function TRIMMEAN(Data, .10) Excel multiplies n by .10/2 = .05.
a. For n = 41, .05×41 = 2.05. Excel would trim 2 values from each tail.
b. For n = 66, .05×66 = 3.3. Excel would trim 3 values from each tail.
c. For n = 83, .05×83 = 4.15. Excel would trim 4 values from each tail.
Learning Objective: 04-2
Learning Objective: 04-3
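
The trimming counts in 4.11 and 4.12 can be checked with a short Python sketch. The function name trimmed_per_tail is hypothetical, and the rule it encodes (truncate n × percent/2) is the one described in this solution, not a call into Excel itself.

```python
import math


def trimmed_per_tail(n: int, percent: float) -> int:
    """Number of values dropped from EACH tail under the rule in 4.12.

    percent is the total trimmed fraction (both tails combined), so each
    tail gets percent / 2, and the product with n is truncated downward.
    """
    return math.floor(n * percent / 2)


# 4.11: fifty observations with a 20% total trim
print(50, trimmed_per_tail(50, 0.20))

# 4.12: a 10% total trim for three sample sizes
for n in (41, 66, 83):
    print(n, trimmed_per_tail(n, 0.10))
```

The per-tail count times two gives the total number of values excluded from the trimmed mean.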

4.13 a. Mean = 100, Median = 0, Mode = 0, Midrange = 325. The geometric mean is undefined
because there are values equal to 0.
b. Choose the mean. If planning for expected work-related medical expenses of $100 for
eight officers the budget should be fairly close. Using the midrange would overestimate
total expected expenses by quite a bit.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-11

4.14 a.

Mon Tue Wed Thu


1 1 0 1
1 1 0 1
1 3 0 1
1 3 0 1
5 4 4 1
5 6 6 1
5 6 6 1
6 7 6 1
6 7 6 1
9 9 10 10

b.
Descriptive Statistics Mon Tue Wed Thu
mean 4 4.7 3.8 1.9
median 5 5 5 1
mode 1 1 0 1
midrange 5 5 5 5.5
geometric mean 2.89 3.76 NA 1.26
trimmed mean 3.75 4.625 3.5 1

Tuesday has multiple modes at 1, 3, 6 and 7 and Wednesday has multiple modes at 6 and 0.
c. The geometric mean and mode are very different from the other measures. The mean,
median, and midrange are close in value.
d. The mean or median are better measures of central tendency for this type of data. The
midrange is sensitive to extreme values and ignores most of the data. The geometric
mean is less familiar to most people and the trimmed mean may exclude relevant data
values.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-11
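
For readers who want to reproduce the table in 4.14(b) outside Excel, the following is a minimal Python sketch using the standard statistics module. The helper names midrange and trimmed_mean are illustrative, and the trimmed mean is assumed to drop one value from each tail (10% per tail for n = 10), which is the choice that matches the table.

```python
from statistics import geometric_mean, mean, median, multimode

# Columns of the table in part (a)
data = {
    "Mon": [1, 1, 1, 1, 5, 5, 5, 6, 6, 9],
    "Tue": [1, 1, 3, 3, 4, 6, 6, 7, 7, 9],
    "Wed": [0, 0, 0, 0, 4, 6, 6, 6, 6, 10],
    "Thu": [1, 1, 1, 1, 1, 1, 1, 1, 1, 10],
}


def midrange(xs):
    """Average of the smallest and largest value."""
    return (min(xs) + max(xs)) / 2


def trimmed_mean(xs, k=1):
    """Mean after dropping the k smallest and k largest values (assumed k = 1)."""
    s = sorted(xs)
    return mean(s[k:len(s) - k])


summary = {}
for day, xs in data.items():
    summary[day] = {
        "mean": mean(xs),
        "median": median(xs),
        "modes": multimode(xs),  # every value tied for most frequent
        "midrange": midrange(xs),
        # The geometric mean is undefined when any value is zero (Wednesday)
        "geometric mean": geometric_mean(xs) if min(xs) > 0 else None,
        "trimmed mean": trimmed_mean(xs),
    }

for day, stats in summary.items():
    print(day, stats)
```

Note that multimode returns all tied modes, so Tuesday and Wednesday show the multiple modes mentioned under the table.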

4.15 a.

mean 27.34
midrange 25.50
geometric mean 26.08
trim mean 27.46

b. The mean and the trimmed mean are similar in value and they are both greater than the
midrange and the geometric mean.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3

4.16 a.

mean 107.25
midrange 114
geometric mean 102.48
trimmed mean 106

b. The mean and trimmed mean are similar and fall between the geometric mean and
midrange.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3

4.17 a.

mean 4.48
median 2.00
mode 1.00
midrange 15.00
geometric mean 2.60
trimmed mean 3.13

b. The data are skewed to the right. The mean is greater than the median. The midrange is
much greater than the mean.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-11

4.18 7.15%. See calculation below.

$GR = \sqrt[n-1]{\dfrac{x_n}{x_1}} - 1 = \sqrt[11-1]{\dfrac{156.6}{78.5}} - 1 = .0715$ or 7.15%.
Learning Objective: 04-3
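
The growth rate in 4.18 can be verified numerically. This is a small sketch; the function name geometric_growth_rate is hypothetical, and only the first value, last value, and number of periods from the solution are used.

```python
def geometric_growth_rate(first: float, last: float, n: int) -> float:
    """Average growth rate per period over n observations: (x_n / x_1)^(1/(n-1)) - 1."""
    return (last / first) ** (1 / (n - 1)) - 1


gr = geometric_growth_rate(first=78.5, last=156.6, n=11)
print(f"{gr:.4f}")
```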

4.19 a.

                            Sample A   Sample B   Sample C
Mean                        7          62         1001
Sample Standard Deviation   1          1          1

b. The midpoint of each sample is the mean. The other 2 data points are exactly 1 standard
deviation from the mean. The idea is to illustrate that the standard deviation is not a
function of the value of the mean.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-4

4.20

                                    Data Set A   Data Set B   Data Set C
a. Mean                             7.0000       7.0000       7.0000
b. Sample Standard Deviation        1.0000       2.1602       3.8944
c. Population Standard Deviation    0.8165       2.0000       3.7417

d. The sample standard deviation is larger than the population standard deviation for the
same data set. This makes sense because in the formula for sample standard deviation
we divide by n-1 instead of n. This exercise shows us that samples can have similar
means, but different standard deviations. We cannot get a sense of what the standard
deviation is just by looking at the mean.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-4
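
The difference between dividing by n-1 and by n can be illustrated with Python's statistics module. The data sets themselves are not reproduced in this solution, so the values below are an assumption: consecutive integers centered on 7, chosen only because they reproduce the means and standard deviations in the table above.

```python
from statistics import mean, pstdev, stdev

# ASSUMED data (not given in this solution): consecutive integers centered at 7
data_sets = {
    "A": [6, 7, 8],
    "B": list(range(4, 11)),   # 4, 5, ..., 10
    "C": list(range(1, 14)),   # 1, 2, ..., 13
}

results = {}
for name, xs in data_sets.items():
    results[name] = {
        "mean": mean(xs),
        "sample sd": stdev(xs),       # divides the sum of squares by n - 1
        "population sd": pstdev(xs),  # divides the sum of squares by n
    }

for name, r in results.items():
    print(name, r["mean"], round(r["sample sd"], 4), round(r["population sd"], 4))
```

All three sets share the same mean, while the spread grows from A to C, and the sample standard deviation exceeds the population standard deviation in every case.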

4.21 The coefficient of variation for the hybrid vehicle is 2.2/43.2 = .051 or 5.1%. The
coefficient of variation for the gasoline vehicle is 1.9/27.2 = .070 or 7%. The hybrid
had more consistent gas mileage relative to the mean than the gasoline vehicle.
Learning Objective: 04-4

4.22
$MAD = \dfrac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$.
Using $\bar{x} = 20$,
$MAD = \dfrac{|12-20| + |18-20| + |21-20| + |22-20| + |27-20|}{5} = \dfrac{8 + 2 + 1 + 2 + 7}{5} = \dfrac{20}{5} = 4$.
Learning Objective: 04-4
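
The MAD calculation in 4.22 can be sketched in a few lines of Python. The function name mean_absolute_deviation is illustrative; the five data values are the ones that appear in the calculation.

```python
def mean_absolute_deviation(xs):
    """Average absolute distance of each value from the mean."""
    m = sum(xs) / len(xs)
    return sum(abs(x - m) for x in xs) / len(xs)


values = [12, 18, 21, 22, 27]
mad = mean_absolute_deviation(values)
print(mad)
```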

4.23 a.
Stock      s/x̄             CV
Stock A    5.25/24.50      21.43%
Stock B    12.25/147.25    8.32%
Stock C    2.08/5.75       36.17%

b. Stock C, the one with the smallest standard deviation and smallest mean, has the greatest
relative variation.
c. The stocks have different average values therefore directly comparing the standard
deviations is not a good comparison of risk. The variation relative to the mean value is
more appropriate.
Learning Objective: 04-4
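
The coefficient of variation comparison in 4.23 can be reproduced as follows. This is a sketch; cv_percent is a hypothetical helper, and the standard deviations and means are taken from the table in part (a).

```python
def cv_percent(s: float, mean: float) -> float:
    """Coefficient of variation as a percentage: 100 * s / mean."""
    return 100 * s / mean


# (standard deviation, mean) for each stock, from part (a)
stocks = {
    "Stock A": (5.25, 24.50),
    "Stock B": (12.25, 147.25),
    "Stock C": (2.08, 5.75),
}

cvs = {name: cv_percent(s, m) for name, (s, m) in stocks.items()}
most_variable = max(cvs, key=cvs.get)

for name, cv in cvs.items():
    print(f"{name}: {cv:.2f}%")
print("Greatest relative variation:", most_variable)
```

The same helper applies to the other CV exercises in this chapter, since each one only needs a standard deviation and a mean.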

4.24 a.

                                Quiz 1   Quiz 2   Quiz 3   Quiz 4
Count                           10       10       10       10
Mean                            72.00    72.00    76.00    76.00
Sample standard deviation       13.23    6.67     11.41    27.43
Coefficient of variation (CV)   18.38%   9.26%    15.02%   36.09%

b. Central tendency falls at 72 or 76 and the dispersion is greatest for Quiz 4.


c. Scores, on average, are higher for Quiz 3 and Quiz 4. Quiz 2 has the least
relative variation and Quiz 4 has the most.
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-4

4.25 a. x̄A = 6.86, sA = 1.497, x̄B = 7.24, sB = 1.209


b. CVA = 1.497/6.86 = 0.218 or 21.8%, CVB = 1.209/7.24 = 0.167 or 16.7%.
c. Consumers preferred sauce B. Sauce B also had more consistent ratings relative to the
average.
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-4

4.26 a. Chebychev's theorem states that at least 75% of the data will fall within ± 2 standard
deviations so: .75×200 = 150.
b. The Empirical Rule states that 95.44% would fall in this range so: .9544×200 = 191.
Learning Objective: 04-5
Learning Objective: 04-6

4.27 90-70 = 70-50 = 20. 20/10 = 2. Chebychev’s theorem states that at least 75% of the data will
fall within ± 2 standard deviations so: .75×400 = 300.
Learning Objective: 04-5
Learning Objective: 04-6
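
The Chebychev bound used in 4.26 and 4.27 follows from 1 - 1/k², where k is the number of standard deviations. A minimal Python sketch (the function name chebyshev_lower_bound is hypothetical):

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum fraction of any data set within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebychev's theorem is only informative for k > 1")
    return 1 - 1 / k**2


# 4.26(a): k = 2 with n = 200
at_least_426 = chebyshev_lower_bound(2) * 200

# 4.27: the interval 50 to 90 is (90 - 70) / 10 = 2 standard deviations wide on each side
k_427 = (90 - 70) / 10
at_least_427 = chebyshev_lower_bound(k_427) * 400

print(at_least_426, at_least_427)
```

Unlike the Empirical Rule, this bound makes no assumption about the shape of the distribution, which is why it gives a smaller guaranteed count.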

4.28 a. z = (1325-875)/219 = 2.055.


b. Because John’s standardized score is greater than 2 but less than 3 we would consider his
rent unusual but not an outlier.
c. John’s rent would be considered an outlier if |John’s z score| > 3. Set z = (rent - 875)/219
equal to 3 and solve for rent: rent = 875 + 3(219) = $1532. John’s rent would have to be $1532
or higher to be considered an outlier.
Learning Objective: 04-5
Learning Objective: 04-6
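
The standardizing steps in 4.28 can be sketched in Python. The names z_score and classify are illustrative, and the cutoffs (beyond 2 is unusual, beyond 3 is an outlier) are the ones used in this solution.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


def classify(z: float) -> str:
    """Label a standardized value using the 2 and 3 standard deviation cutoffs."""
    if abs(z) > 3:
        return "outlier"
    if abs(z) > 2:
        return "unusual"
    return "typical"


mu, sigma = 875, 219
z_john = z_score(1325, mu, sigma)
label = classify(z_john)

# Rent at exactly 3 standard deviations above the mean (part c)
outlier_threshold = mu + 3 * sigma

print(round(z_john, 3), label, outlier_threshold)
```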

4.29 a. z = (91-79)/5 = 2.4. John’s score of 91 was 2.4 standard deviations greater than the
mean of 79.
b. z = (3.18-2.87)/.31 = 1. Mary’s GPA of 3.18 was 1 standard deviation above the mean of
2.87.
c. z = (18-15)/5 = .6. Jamie’s weekly study hours of 18 hours was .6 standard deviations
above the mean of 15 hours.
Learning Objective: 04-5
Learning Objective: 04-6

4.30 The Empirical Rule states that 99.73% of the data will fall within ± 3 standard deviations.
Therefore, we know that almost the entire range of data falls within 6 standard
deviations. We use the formula:

σ ≈ R/6 = (126.2 - 109.7)/6 = 16.5/6 = 2.75

and conclude that our estimate of σ is 2.75.


Learning Objective: 04-5
Learning Objective: 04-6
4.31 For each part below use the formula z = (x - μ)/σ. Plug in the values for z, μ, and σ and solve
for x.
a. Bob’s GPA: 1.71 = (x - 2.98)/.36, x = 1.71(.36) + 2.98 = 3.596.
b. Sarah’s weekly work hours: 1.18 = (x - 21.6)/7.1, x = 1.18(7.1) + 21.6 = 29.978.
c. Dave’s bowling score: -1.35 = (x - 150)/40, x = -1.35(40) + 150 = 96.
Learning Objective: 04-5
Learning Objective: 04-6
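
Solving the z formula for x, as in 4.31, amounts to x = μ + zσ. A short Python sketch (the function name x_from_z is hypothetical):

```python
def x_from_z(z: float, mu: float, sigma: float) -> float:
    """Invert z = (x - mu) / sigma to recover the raw value x."""
    return mu + z * sigma


bob_gpa = x_from_z(1.71, 2.98, 0.36)
sarah_hours = x_from_z(1.18, 21.6, 7.1)
dave_score = x_from_z(-1.35, 150, 40)

print(round(bob_gpa, 3), round(sarah_hours, 3), round(dave_score, 1))
```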

4.32 a. The z scores are in the table below:


Standardized Value for Number of customers during the noon hour (n = 30 days).
-0.2985 1.2298 -0.5532 -0.1711 0.3383 1.6119 -0.2985 -0.1711
1.4845 0.5930 -0.8079 -0.1711 -1.0626 -0.0438 0.5930 -0.2985
-1.1900 -0.1711 0.8477 -1.1900 0.4657 0.9751 -0.8079 0.7204
0.7204 -2.3362 -1.4447 0.5930 0.9751 1.8666 -1.5721 -0.4259
b. From MegaStat:

empirical rule
mean - 1s 19.49
mean + 1s 35.20
percent in interval (68.26%) 68.8%
mean - 2s 11.64
mean + 2s 43.05
percent in interval (95.44%) 96.9%
mean - 3s 3.79
mean + 3s 50.90
percent in interval (99.73%) 100.0%

There are no standardized values greater than 3 or less than -3 so no official outliers. There
is one standardized value less than -2. Observation 9 has a z score = -2.3362 so we
would consider this an unusual value.
c. 68.8% of the data values fall within 1σ, more than 96.9% fall within 2σ, and 100% fall
within 3σ. These percentages are slightly greater than the percentages from the
Empirical rule but not too different. A normal distribution seems reasonable.
Learning Objective: 04-5
Learning Objective: 04-6
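
The interval percentages reported for 4.32 can be recomputed directly from the standardized values in part (a). This is a sketch in Python rather than MegaStat; share_within is a hypothetical helper that counts the values with |z| no greater than k.

```python
# Standardized values copied from the table in 4.32(a)
z_scores = [
    -0.2985, 1.2298, -0.5532, -0.1711, 0.3383, 1.6119, -0.2985, -0.1711,
    1.4845, 0.5930, -0.8079, -0.1711, -1.0626, -0.0438, 0.5930, -0.2985,
    -1.1900, -0.1711, 0.8477, -1.1900, 0.4657, 0.9751, -0.8079, 0.7204,
    0.7204, -2.3362, -1.4447, 0.5930, 0.9751, 1.8666, -1.5721, -0.4259,
]


def share_within(zs, k):
    """Fraction of standardized values lying within k standard deviations."""
    return sum(abs(z) <= k for z in zs) / len(zs)


shares = {k: share_within(z_scores, k) for k in (1, 2, 3)}
unusual = [z for z in z_scores if 2 < abs(z) <= 3]

for k, share in shares.items():
    print(f"within {k} sd: {share:.1%}")
print("unusual values:", unusual)
```

Comparing these shares with 68.26%, 95.44%, and 99.73% is the informal normality check described in part (c).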

4.33 a. The z scores are in the table below:


Standardized Value of Lengths of 65 calls initiated during the last week of July
-0.5922 -0.4219 0.9407 0.0891 -0.2515 -0.2515 -0.4219 2.6439 -0.5922 -0.5922
0.2594 -0.2515 1.4517 -0.4219 -0.4219 -0.5922 3.6658 -0.2515 -0.5922 -0.2515
-0.5922 -0.4219 -0.5922 0.4297 -0.5922 -0.4219 -0.2515 -0.5922 -0.4219 1.2813
-0.5922 -0.0812 -0.4219 -0.4219 4.1768 -0.5922 -0.5922 -0.5922 0.6001 0.0891
-0.5922 -0.0812 -0.4219 -0.5922 -0.5922 -0.5922 -0.5922 0.2594 -0.5922 -0.4219
-0.2515 -0.2515 0.2594 -0.5922 -0.2515 -0.5922 -0.5922 0.0891 -0.5922 2.3033
-0.4219 1.4517 1.4517 -0.5922 0.2594
b. From MegaStat:

empirical rule
mean - 1s -1.39
mean + 1s 10.35
percent in interval (68.26%) 87.7%
mean - 2s -7.27
mean + 2s 16.22
percent in interval (95.44%) 93.8%
mean - 3s -13.14
mean + 3s 22.09
percent in interval (99.73%) 96.9%
There are two observations that would be considered outliers according to the Empirical
Rule. The values 26 (z = 3.67) and 29 (z = 4.18) are both more than 3 standard
deviations beyond the mean. The values 18 (z = 2.30) and 20 (z = 2.64) are both more
than 2 standard deviations beyond the mean. These would be considered unusual
values.
c. There are more observations within 1σ of the mean than the empirical rule would
indicate, 87.7% vs. 68.26%. There are fewer observations within 2σ of the mean than the
empirical rule would indicate, 93.8% vs. 95.44%. The data do not seem to be from a
normal distribution.
Learning Objective: 04-5
Learning Objective: 04-6

4.34 a.

b. The long left whisker suggests left-skewness.


Learning Objective: 04-7
Learning Objective: 04-8
Learning Objective: 04-11

4.35 a.

b. Strongly skewed right.


Learning Objective: 04-7
Learning Objective: 04-8
Learning Objective: 04-11

4.36 a. Estimate Q1 = 32, Q2 = 38, and Q3 = 46 customers.


b. Approximately 64 customers were served on the busiest day and 20 customers on the
slowest day.
c. The distribution appears fairly symmetric because the boxes are approximately equal in
width and the whiskers are approximately equal in length.
Learning Objective: 04-7
Learning Objective: 04-11

4.37 a. Estimate Q1 = 3300, Q2 = 3900, and Q3 = 4300 vehicles per day.
b. xmin ≈ 2400 and xmax ≈ 4800.
c. The distribution appears left skewed because the left whisker is much longer than the
right whisker.
Learning Objective: 04-7
Learning Objective: 04-11

4.38 a.
Data
count 32
1st quartile 21.50
median 26.00
3rd quartile 33.00
interquartile range 11.50

The first quartile tells us that 25% of the data is at or below 21.50 and the third quartile
tells us that 75% of the data is at or below 33.00. The median is the same as the 2nd
quartile which tells us that 50% of our data is at or below 26.00.
b. The midhinge = (Q1 + Q3)/2 = (21.5 + 33.0)/2 = 27.25. This midhinge value indicates we have
right skewed data because it is greater than the median.
c.

This boxplot shows us that our range of data is from 9 to 42 and that the median
number of customers is 26. Days with 22 or fewer customers are in the bottom quartile.
Days with 33 or more customers are in the upper quartile. The box plot indicates that
there are no extreme values or outliers and that our data is right skewed. The longer
whisker on the right indicates right-skewness just as the midhinge value showed. Note
that the boxplot was created on MegaStat and the quartiles in part (a) were calculated
using Excel 2010 functions. There may be slight discrepancies in quartile values.
Learning Objective: 04-2
Learning Objective: 04-7
Learning Objective: 04-8
Learning Objective: 04-11
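
The midhinge comparison in 4.38(b) can be sketched in Python. The quartiles are taken as given in part (a) rather than recomputed, since different software uses different quartile methods; the names midhinge and skew_hint are illustrative.

```python
def midhinge(q1: float, q3: float) -> float:
    """Average of the first and third quartiles."""
    return (q1 + q3) / 2


q1, med, q3 = 21.50, 26.00, 33.00  # values reported in part (a)
iqr = q3 - q1
mh = midhinge(q1, q3)

# A midhinge above the median hints at right skew; below hints at left skew.
if mh > med:
    skew_hint = "right"
elif mh < med:
    skew_hint = "left"
else:
    skew_hint = "symmetric"

print(iqr, mh, skew_hint)
```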

4.39 a.
Data
count 65
1st quartile 1
median 2
3rd quartile 5
interquartile range 4

The first quartile tells us that 25% of the data is at or below 1 minute and the third quartile
tells us that 75% of the data is at or below 5 minutes.

b. The midhinge = (Q1 + Q3)/2 = (1 + 5)/2 = 3. The midhinge indicates the data could be skewed to
the right because it is greater than the median.

c. The box plot confirms the quartile calculations and reveals that there are 6 calls of
unusually large duration, 4 of them extreme in length.
Learning Objective: 04-2
Learning Objective: 04-7
Learning Objective: 04-8
Learning Objective: 04-11

4.40 a.

To calculate the sample correlation coefficient we can use the formula

$r = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$

or use the =CORREL(XData, YData) function in Excel. This gives us r = -.8841. There appears to be a strong,
negative, linear relationship.

b.

Again we can use the equation or Excel and the result is r = .90875. There appears
to be a strong, positive, linear relationship.

c.

Again, we can use the formula or Excel and the result is r = .1704. There appears
to be a weak, positive, linear relationship.
Learning Objective: 04-9

4.41 a.

b. There is a strong, positive linear association between Speed and Power.


c. Using Excel’s correlation function, =CORREL(Xdata, Ydata), r = .9620.
Learning Objective: 04-9
4.42 a. r = sxy/(sx sy) = 48.724/(11.724 × 8.244) = .5041
b. The relationship between X and Y is a moderate positive linear relationship.
c. The correlation coefficient is unit-free and therefore is easier to interpret than the
covariance. The covariance magnitude is dependent on the magnitude of the variables
so you cannot compare covariance of different pairs of variables.
Learning Objective: 04-9
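
The relationship between covariance and correlation in 4.42 can be sketched in Python. The helper names are illustrative, and the small x and y lists are made-up demonstration data, not data from any exercise. The last line evaluates the ratio using the covariance and standard deviations quoted in 4.42(a).

```python
import math


def sample_cov(xs, ys):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def sample_sd(xs):
    """Sample standard deviation, i.e. the square root of cov(x, x)."""
    return math.sqrt(sample_cov(xs, xs))


def correlation(xs, ys):
    """Correlation as covariance scaled by both standard deviations (unit-free)."""
    return sample_cov(xs, ys) / (sample_sd(xs) * sample_sd(ys))


# Hypothetical demonstration data (NOT from the textbook)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r_demo = correlation(xs, ys)

# Same ratio using the summary values quoted in 4.42(a)
r_442 = 48.724 / (11.724 * 8.244)

print(round(r_demo, 4), round(r_442, 4))
```

Rescaling either variable changes the covariance but leaves the correlation unchanged, which is the point made in part (c).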

4.43 a.

b. Using Excel’s correlation function, =CORREL(Xdata, Ydata), r=.8338.


c. There appears to be a strong, positive, linear relationship.
Learning Objective: 04-9
4.44 a. 100m dash times:

Mean = (9.87 + 9.98 + 10.02 + 10.15 + 10.36 + 10.36)/6 = 10.12
Median = (10.02 + 10.15)/2 = 10.085
Mode = 10.36

Number of children:

Mean = (0 + 1 + 1 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 2 + 6)/13 = 2
Median = 2
Mode = 2

Number of cars in driveway:

Mean = (0 + 0 + 1 + 1 + 2 + 2 + 3 + 5)/8 = 1.75
Median = (1 + 2)/2 = 1.5
Mode = 0, 1, and 2

b. 100m dash times: The mode is the weakest because all of the values fall at or below
10.36. We also know that the mode should be used for a small range of discrete data or
attribute data. This is a small range of continuous data.

Number of children: All measures of central tendency in this case have the same value
of 2 which is a strong indicator of a "typical" data value.

Number of cars in driveway: The mode is the weakest because there are three different
values for mode and only five unique values in the entire data set.
Learning Objective: 04-1
Learning Objective: 04-3
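
The three sets of measures in 4.44(a) can be checked with Python's statistics module. This is a sketch using the data values that appear in the calculations above; multimode is used so that all tied modes are reported.

```python
from statistics import mean, median, multimode

dash_times = [9.87, 9.98, 10.02, 10.15, 10.36, 10.36]
children = [0, 1, 1] + [2] * 9 + [6]   # 13 observations
cars = [0, 0, 1, 1, 2, 2, 3, 5]

center = {
    name: {"mean": mean(xs), "median": median(xs), "modes": multimode(xs)}
    for name, xs in {
        "100m dash times": dash_times,
        "Number of children": children,
        "Number of cars in driveway": cars,
    }.items()
}

for name, stats in center.items():
    print(name, stats)
```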

4.45 Using Chebychev’s theorem: 1 − (1/3²) = .8889. At least 88.89% of the data lie within 3 standard
deviations, which is the range from 71 to 119.
Learning Objective: 04-6

4.46 The mean of 396 is 10 gm away from the upper and lower range values. Because the
standard deviation is 5 gm, we are looking for the percent falling within 2 standard
deviations of the mean (10/5 = 2). Chebychev’s theorem: 1 – (1/2²) = .75, .75 × 200 =
150. At least 150 bags will weigh between 386 and 406 gm.
Learning Objective: 04-6

4.47 a. z = (0.2761 – 0.2731)/0.000959 = 3.128.
b. This shipment would be considered an outlier because the standardized score is greater
than 3.
Learning Objective: 04-6

4.48 a. Bob's standardized z score is z = (1430 - 1340)/90 = 1.
b. No, his score is not unusual because it is within 2 standard deviations of the mean. The
Empirical Rule states that a z score would have to be outside of 2 standard deviations
from the mean to be unusual.
Learning Objective: 04-5
Learning Objective: 04-6
4.49 For each part below use the formula z = (x - μ)/σ. Plug in the values for z, μ, and σ and solve
for x.

a. x = 74 + (2.30) × 7 = 90.1
b. x = 53 + (-1.45) × 12 = $35.60
c. x = 4 + (-0.79) × 1.15 = 3.09 hours
Learning Objective: 04-5

4.50 a. Nolan's standardized z score is z = (48 - 30)/7 = 2.57.
b. Yes, his time is unusual because it falls above 2 standard deviations when standardized.
Learning Objective: 04-5
Learning Objective: 04-6

4.51 a. To estimate sigma use (xmax – xmin)/6 = (30-18)/6 = 2.


b. Assume a normal distribution.
Learning Objective: 04-5
Learning Objective: 04-6

4.52 a. Mean = 724.67, Median = 720, Mode = 730


b. Yes, the measures do tend to agree. The mean falls between the median and mode and
there is only a $5 difference between mean and median and between mean and mode.
c. Using the =STDEV.S function in Excel, we get a standard deviation of 114.28.
d.
Standardized scores for monthly rents
0.046668 0.046668 0.046668 1.796735 -0.21584 -1.35338
-0.30334 2.671768 0.134172 -0.91587 -0.04083 -0.47835
-1.44089 0.134172 -0.65336 -0.56585 1.096708 1.796735
-1.09087 -0.91587 0.309178 -0.30334 -0.12834 -1.96591
0.046668 0.659192 0.834198 1.009205 -0.04083 -0.21584

e. There is one unusual value. The observation 1030 has a z score = 2.67.

f. It is possible that the data are normally distributed based on the empirical rule because
close to 68% of the data fall within ± 1 standard deviation, close to 95.44% fall within
± 2 standard deviations, and close to 99.73% of the data fall within ± 3 standard
deviations.

empirical rule
mean - 1s 610.39
mean + 1s 838.95
percent in interval (68.26%) 70.0%
mean - 2s 496.10
mean + 2s 953.23
percent in interval (95.44%) 96.7%
mean - 3s 381.82
mean + 3s 1,067.51
percent in interval (99.73%) 100.0%
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-4
Learning Objective: 04-5
Learning Objective: 04-6

4.53 a. Mean = 26.71. Median = 14.5. Mode = 11. Midrange = 124.5.


b. Q1 = 7.25, Q3 = 20.75, Midhinge = 14.
c. The geometric mean is only valid for data greater than zero.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-7

4.54 a. From Excel: mean = 33.31, median = 26, and mode = 17.
b. The mean is greater than the median by 7 minutes.
c. No the mode is not a good measure of central tendency because this is continuous
numerical data. There are only 3 power outages out of 26 that lasted 17 minutes.
d. This distribution is skewed right because the mean is greater than the median.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-11

4.55 a. From Excel: mean = 66.85, median = 69.5, and mode = 86.

b. The mean and median are fairly close in value.


c. No the mode is not a good measure of central tendency because this is continuous
numerical data. There are only 2 packages out of 20 that weighed 86 ounces.
d. It is difficult to describe the shape of this distribution based solely on the values of the
mean, median, and mode, however, because the mean and median are close in value
one might hypothesize that the shape is somewhat symmetric.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-11
4.56 With a skewness coefficient greater than zero (0.773 > 0), we would describe the
distribution as right skewed. With a kurtosis coefficient greater than one (1.277 > 1) we
would describe the distribution as sharply peaked or leptokurtic.
Learning Objective: 04-11

4.57 a. Stock funds: x̄ = 1.329, median = 1.22. Bond funds: x̄ = 0.875, median = 0.85.
b. Stock funds: s = 0.5933, CV = (0.5933/1.329)×100 = 44.65%. Bond funds: s = 0.4489,
CV = (0.4489/0.875)×100 = 51.32%.
c. The bond funds have more variability relative to the mean because the CV is greater than
for stock funds.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-4

4.58 a. From Excel: Mean = 34.54. Median = 33.0. Mode = 23, 33 and 36. Midrange = 42.
b. The mean or median would be an appropriate measure of central tendency because the
data is fairly symmetric. Mode would not be appropriate because this data is not
attribute or discrete.
c. s = 10.31.
d. The highest value, 61, is not an outlier because it does not fall above 65.47 which is the
mean + 3 standard deviations (34.54 + 10.31 + 10.31 + 10.31 = 65.47).
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-4
Learning Objective: 04-6

4.59

a. From Excel: Mean = 6,807. Median = 6,646.


b. The data is slightly skewed to the right because the mean is greater than the median.

c. Mode is not useful because data is numerical and there are no repeat observations.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-11

4.60 a. From Excel: Mean = 95.1, Median = 90.0. There is no mode.


b. The median is the best measure of central tendency because the data set has two high
outliers. The mode is the worst because there is no mode so it doesn't tell us anything.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-6

4.61 a. From Excel: Mean = 3,012.44, Median = 2,550.50.


b. Use the median to describe the center because the data is skewed right. The typical cricket
club’s income is approximately £2.5 million.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-11
4.62 The coefficient of variation for plumbing supplier’s vinyl washers is: 100 × (6,053/24,212) = 25%.
The coefficient of variation for steam boilers is 100 × (1.7/6.8) = 25%.
The demand patterns exhibit similar relative variation, even though the standard deviations
are very different.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-4

4.63 a.

b. The data is skewed to the right.


c. From Excel: Mean = 20.12, Standard Deviation = 7.64.
d. There are two unusual values. Sales of 36 have a z score of 2.08: z = (36 – 20.12)/7.64 =
2.08. Sales of 37 have a z score of 2.21: z = (37 – 20.12)/7.64 = 2.21. There is one
outlier value. Sales of 49 have a z score of 3.78: z = (49 – 20.12)/7.64 = 3.78.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-4
Learning Objective: 04-5
Learning Objective: 04-6
4.64 a. See table below for CV values. CV values are calculated using CV = 100 × (s/x̄).

Comparative Returns on Four Types of Investments

Investment                  Mean Return   Standard Deviation   Coefficient of Variation
Venture funds (adjusted)    19.2          14.0                 72.92%
All common stocks           15.6          14.0                 89.74%
Real estate                 11.5          16.8                 146.09%
Federal short term paper    6.7           1.9                  28.36%

b. The standard deviations are “absolute” not relative measures of dispersion. It is best to
use the CV when comparing across variables that have different means.
c. The risk and returns are captured by the CV. Federal short term paper has the lowest CV
and hence lowest risk, real estate the greatest risk. Venture funds have lower risk and
greater return than common stocks based on the CV. In other words, there is more risk
when there is more variation in the returns.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-4

4.65 a. CV Tuition Plans = 100%×(2.7/6.3) = 42.86%. CV SP500 = 100%×(15.8/12.9) =
122.48%. The tuition plans have lower returns than the SP 500, but less risk as
measured by the CV. This is not surprising because the goal of a tuition plan is to
ensure that a minimum amount of money is available at the time the plan matures, thus
parents and students are willing to take a lower return in exchange for lower risk.
b. We use the CV to compare the risk relative to the average return. The standard deviation
alone cannot be used to compare distributions because the means are different.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-4

4.66 a. Midrange = (180+60)/2 = 120.


b. σ ≈ (xmax - xmin)/6 = (180 - 60)/6 = 20
c. Assuming normality is important because a normal distribution is symmetric and this
allows us to estimate the mean with the midrange and to estimate the standard deviation
using the assumption that the range will be approximately 6 standard deviations.
d. Caffeine levels in brewed coffee are dependent on many factors including brand of
coffee, grind of coffee beans, and brew time. Causes of variation in caffeine level are
unpredictable so a well-behaved bell-shaped curve may not be a reasonable assumption.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-4
Learning Objective: 04-5
Learning Objective: 04-6

4.67 a. Midrange = (.92+.79)/2 = .855


b. σ ≈ (xmax - xmin)/6 = (0.92 - 0.79)/6 = 0.0217
c. A normal distribution is plausible here because there are likely to be controls on the level
of chlorine added to the water. There will be some variation around the mean but it will
be predictable.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-4
Learning Objective: 04-5
Learning Objective: 04-6

4.68 a. The distribution should be skewed to the right because the mean is greater than the
median and the mode.
b. Most ATM transaction times will tend to be low in value but a few will be of longer
duration.
Learning Objective: 04-1
Learning Objective: 04-11

4.69 a. The distribution should be skewed to the right because the mean is greater than the
median and the mode.
b. Most patrons keep books out for a week or so. There will be a few patrons that keep a
book out much longer.
Learning Objective: 04-1
Learning Objective: 04-3
Learning Objective: 04-11

4.70 a. The distribution should be skewed to the left because the mean is less than the median
and the mode.
b. It appears that most students scored a C or higher but there were a few students that may
not have studied for the exam which pulled the mean down.
Learning Objective: 04-1
Learning Objective: 04-3
Learning Objective: 04-11
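
The reasoning in 4.68 through 4.70 compares the mean with the median to judge the direction of skew. A minimal sketch of that rule of thumb follows. The function `skew_direction` and both data sets are hypothetical illustrations, not values from the exercises, and the comparison is a heuristic rather than a formal test of skewness.

```python
from statistics import mean, median


def skew_direction(data, tol=1e-9):
    """Guess the skew direction by comparing the mean with the median."""
    m, med = mean(data), median(data)
    if m > med + tol:
        return "right"
    if m < med - tol:
        return "left"
    return "roughly symmetric"


# Hypothetical transaction times (seconds): mostly short, a few very long.
times = [20, 22, 25, 25, 27, 30, 32, 35, 90, 150]
# Hypothetical exam scores: mostly high, a few very low.
scores = [35, 48, 72, 75, 78, 80, 82, 85, 88, 90]

print(skew_direction(times), mean(times), median(times))
print(skew_direction(scores), mean(scores), median(scores))
```

In the first list the few long times pull the mean above the median; in the second the few low scores pull the mean below it.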

4.71 a. One would expect the mean to be close in value to the median, or slightly higher.
b. In general, the life span would have a normal distribution. If skewed, the distribution is
more likely skewed right than left. Life span is bounded below by zero but is
unbounded in the positive direction.
Learning Objective: 04-1
Learning Objective: 04-3
Learning Objective: 04-11

4.72 The mean would be greater than the median. There are likely to be a few waiting times that
are extremely long.
Learning Objective: 04-1
Learning Objective: 04-3
Learning Objective: 04-11

4.73 a. It is the midrange, not the median.
b. The midrange is influenced by outliers. Because salaries tend to be skewed to the right,
the midrange will be greater than the median. The community should base its charges on
the median.
Learning Objective: 04-1
Learning Objective: 04-3
Learning Objective: 04-11
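
To illustrate why the midrange is a poor choice for right-skewed salaries, here is a small sketch. The salary figures are hypothetical (in thousands of dollars) and are not taken from the exercise.

```python
from statistics import median

# Hypothetical salaries in thousands of dollars, with one very high earner.
salaries = [32, 35, 38, 40, 42, 45, 48, 55, 70, 250]

# The midrange uses only the two extreme values.
salary_midrange = (min(salaries) + max(salaries)) / 2
salary_median = median(salaries)

print(f"midrange = {salary_midrange}, median = {salary_median}")
```

A single extreme salary moves the midrange far above what a typical resident earns, while the median barely changes.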

4.74 a. The distribution would be skewed right.


b. Switching from the mean to the median would trigger a penalty sooner because the
median is less than the mean.
c. The union would oppose this change because they would probably have to pay more
penalties.
Learning Objective: 04-1
Learning Objective: 04-3
Learning Objective: 04-11

4.75 a. and c.

                             Week 1   Week 2   Week 3   Week 4
mean                          50.00    50.00    50.00    50.00
sample standard deviation     10.61    10.61    10.61    10.61
median                        50.00    52.00    56.00    47.00

b. Based on the mean and standard deviation, the distributions appear to be the same.
One might conclude that the weekly occupancies are the same.
c. See the table above for median values.
d.

e. Based on the medians and dot plots, the distributions are quite different.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-4

4.76 Answers will vary.


Learning Objective: 04-1
Learning Objective: 04-2

4.77 a. and b.

From   To (not incl)   frequency (f)   midpoint (m)   f×m    f(m − x̄)²
0      2               7               1              7      500.764
2      4               42              3              126    1751.642
4      8               33              6              198    394.606
8      16              21              12             252    135.697
16     32              11              24             264    2326.167
32     64              6               48             288    8912.915
Total                  120                            1135   14021.791

Average = 9.458 Standard Deviation = 10.855 CV = 114.77%

c. No, the distribution appears to be skewed right. The frequencies of the lower ranges are
much greater than those of the higher ranges. One would expect the distribution within each
interval to also be skewed right, i.e., more values close to the lower end of the range.
This would make a difference in the estimate for the mean because the method used
with grouped data assumes the average value within each range is the midpoint. If the
values cluster near the lower end of each interval, the true interval averages fall below
the midpoints, so our estimate of the mean could be too high.
d. Unequal bin widths ensure that every bin has a frequency greater than 0. Because the
times are skewed right, equal-width bins would leave many bins at the higher end of the
range with frequencies equal to zero.
Learning Objective: 04-10
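
The grouped-data calculation in 4.77 can be sketched as follows. The function name `grouped_stats` is an illustrative choice. It assumes each class is represented by its midpoint and uses the sample (n − 1) divisor, which matches the totals in the table. The same function applies to the tables in 4.78 through 4.80.

```python
from math import sqrt


def grouped_stats(classes):
    """Estimate mean, sample std dev, and CV (%) from (lower, upper, frequency) classes.

    Every observation in a class is treated as if it sat at the class midpoint.
    """
    n = sum(f for _, _, f in classes)
    mids = [((lo + hi) / 2, f) for lo, hi, f in classes]
    mean = sum(f * m for m, f in mids) / n
    sum_sq = sum(f * (m - mean) ** 2 for m, f in mids)
    std_dev = sqrt(sum_sq / (n - 1))
    return mean, std_dev, 100 * std_dev / mean


# Class limits and frequencies from the table in 4.77.
classes_4_77 = [(0, 2, 7), (2, 4, 42), (4, 8, 33),
                (8, 16, 21), (16, 32, 11), (32, 64, 6)]

mean, std_dev, cv = grouped_stats(classes_4_77)
print(f"mean = {mean:.3f}, s = {std_dev:.3f}, CV = {cv:.2f}%")
```

The sketch uses the unrounded mean, so individual squared-deviation terms can differ from the table in the last decimal place while the summary values agree after rounding. Unequal class widths need no special handling because only the midpoints and frequencies enter the formulas.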

4.78 a.
From   To    frequency (f)   midpoint (m)   f×m      f(m − x̄)²
119    120   2               119.5          239      23.63
120    121   4               120.5          482      23.77
121    122   18              121.5          2187     37.20
122    123   25              122.5          3062.5   4.79
123    124   12              123.5          1482     3.80
124    125   9               124.5          1120.5   21.97
125    126   5               125.5          627.5    32.83
126    127   3               126.5          379.5    38.07
127    128   2               127.5          255      41.63
Total        80                             9835     227.69

Average = 122.94 Standard Deviation = 1.698 CV = 1.38%

b. The raw data would show us the years when the winning times were much longer than
the average.
c. Because the overall distribution of times is slightly skewed right, it is possible that the
times within an interval are also skewed right. The grouped method treats every time in an
interval as equal to the midpoint, so if times cluster near the lower end of each interval,
our estimate of the mean could be too high.
Learning Objective: 04-10

4.79 a.

From   To    frequency (f)   midpoint (m)   f×m     f(m − x̄)²
40     50    12              45             540     2771.050
50     60    116             55             6380    3131.911
60     80    74              70             5180    7112.649
80     100   2               90             180     1776.547
Total        204                            12280   14792.157

Average = 60.196 Standard Deviation = 8.536 CV = 14.2%

b. No, the unequal class widths do not hamper the calculations. Class widths are unequal to
ensure that no class has a frequency of zero.
Learning Objective: 04-10

4.80 a.

From   To    frequency (f)   midpoint (m)   f×m    f(m − x̄)²
140    150   1               145            145    275.892
150    160   25              155            3875   1092.303
160    170   24              165            3960   275.810
170    180   4               175            700    717.168
180    190   2               185            370    1094.184
Total        56                             9050   3455.358

Average = 161.61 Standard Deviation = 7.93 CV = 4.9%

b. Grouped estimates came pretty close.
c. The distribution of times within the class 150-160 may be skewed left with more
observations closer to 160 (which is close to the mean) and the distribution of times
within the class 160-170 may be skewed right with more observations closer to 160 as
well. This observation is based on the apparent shape of the distribution of the entire
data set: somewhat bell-shaped with a peak at approximately 160 and a possible slight
right skew.
Learning Objective: 04-10

4.81 Answers will vary.


Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-4
Learning Objective: 04-5
Learning Objective: 04-6
Learning Objective: 04-7
Learning Objective: 04-8

4.82 Answers will vary.


Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-4
Learning Objective: 04-5
Learning Objective: 04-6
Learning Objective: 04-7
Learning Objective: 04-8

4.83 a.

b. r = .8332.
c. The graph shows that the assault rates in these two years are strongly correlated. A state
with a high assault rate in 1990 also tended to have a high rate in 2004.
d. 1990: Mean = 331.92. Median = 307. Standard Deviation = 172.914
2004: Mean = 256.6. Median = 232. Standard Deviation = 131.259
The comparison shows that the summary measures for 1990 were greater than the summary
measures for 2004.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-4
Learning Objective: 04-9

4.84 a.

b. r=.9459
c. The graph and the correlation coefficient both show that there is a strong, positive, linear
relationship between airspeed and cockpit noise. This relationship most likely
exists because the plane becomes louder as it goes faster.

Optional:

Noise = 64.23 + 0.0765(Airspeed). As airspeed increases, noise level increases at a rate
of 0.0765 per knot.
Learning Objective: 04-1
Learning Objective: 04-2
Learning Objective: 04-3
Learning Objective: 04-4
Learning Objective: 04-9
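
A correlation coefficient and a fitted line of the kind reported in 4.84 can be computed as in the sketch below. The airspeed and noise values are hypothetical, chosen only for illustration, so the results will not match the r and regression coefficients quoted above. The function name `pearson_r_and_line` is also an assumption.

```python
from math import sqrt


def pearson_r_and_line(x, y):
    """Return (r, slope, intercept) for the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical data, NOT the exercise data set.
airspeed = [250, 300, 350, 400, 450]   # knots
noise = [83, 88, 90, 95, 99]           # noise level

r, slope, intercept = pearson_r_and_line(airspeed, noise)
print(f"r = {r:.4f}, Noise = {intercept:.2f} + {slope:.4f}(Airspeed)")
```

The slope is read the same way as in the solution: the estimated change in noise level for each additional knot of airspeed.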
