Chap 001

Chapter 01 - Introduction and Descriptive Statistics
CHAPTER 1
INTRODUCTION AND DESCRIPTIVE STATISTICS
1-1.
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
1-2.
Data are based on numeric measurements of some variable, either from a data set
comprising an entire population of interest, or else obtained from only a sample (subset)
of the full population. Instead of doing the measurements ourselves, we may sometimes
obtain data from previous results in published form.
1-3.
The weakest is the Nominal Scale, in which categories of data are grouped by qualitative
differences and assigned numbers simply as labels, not usable in numeric comparisons.
Next in strength is the Ordinal Scale: data are ordered (ranked) according to relative size
or quality, but the numbers themselves don't imply specific numeric relationships.
Stronger than this is the Interval Scale: the ordered data points have meaningful distances
between any two of them, measured in units. The strongest is the Ratio Scale, which is like an
Interval Scale but also has a true zero point, so the ratio of any two data values is
meaningful when comparing values.
1-4.
quantitative/ratio
qualitative/nominal
quantitative/ratio
qualitative/nominal
quantitative/ratio
quantitative/interval
quantitative/ratio
quantitative/ratio
quantitative/ratio
quantitative/ratio
quantitative/ordinal
Name: Qualitative
Wealth: Quantitative
Age: Quantitative
Industry: Qualitative
Country of Citizenship: Qualitative
1-5.
Ordinal.
1-6.
A qualitative variable describes different categories or qualities of the members of a data set,
which have no numeric relationships to each other, even when the categories happen to be coded
as numbers for convenience. A quantitative variable gives numerically meaningful information, in
terms of ranking, differences, or ratios between individual values.
1-7.
The people from one particular neighborhood constitute a non-random sample (drawn
from the larger town population). The group of 100 people would be a random sample.
1-8.
A sample is a subset of the full population of interest, from which statistical inferences
are drawn about the population, which is usually too large to permit the variables to be
measured for all the members.
1-9.
A random sample is a sample drawn from a population in a way that is not a priori biased
with respect to the kinds of variables being measured. It attempts to give a representative
cross-section of the population.
1-10. Nationality: qualitative. Length of intended stay: quantitative.

1-11. Ordinal. The colors are ranked, but no units of difference between any two of them are
defined.
1-12. Income: quantitative, ratio
Number of dependents: quantitative, ratio
Filing singly/jointly: qualitative, nominal
Itemized or not: qualitative, nominal
Local taxes: quantitative, ratio
1-13. Lower quartile = 25th percentile = data point in position (n + 1)(25/100) =
34(25/100) = position 8.5. (Here n = 33.) Let us order our observations:
109, 110, 114, 116, 118, 119, 120, 121, 121, 123, 123, 125, 125, 127, 128, 128, 128, 128,
129, 129, 130, 131, 132, 132, 133, 134, 134, 134, 134, 136, 136, 136, 136.
[Using the formula given in the text: (n+1)(p/100)]
Lower quartile = 121
Middle quartile is in position: 34(50/100) = 17. Point is 128.
Upper quartile is in position: 34(75/100) = 25.5. Point is 133.5.
10th percentile is in position: 34(10/100) = 3.4. Point is 114.8.
IQR = 133.5 - 121 = 12.5.
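The position rule used in Problem 1-13 can be sketched in a few lines of Python. This is only an illustration: the function name `percentile_position` is hypothetical, and the linear interpolation between neighboring points is an assumption inferred from the worked values (for example 114.8 at position 3.4).

```python
def percentile_position(sorted_data, p):
    """Return the p-th percentile using the (n + 1)(p / 100) position rule."""
    n = len(sorted_data)
    pos = (n + 1) * p / 100      # 1-based position, possibly fractional
    lower = int(pos)             # whole part of the position
    frac = pos - lower           # fractional part, used for interpolation
    if lower < 1:                # position falls before the first point
        return sorted_data[0]
    if lower >= n:               # position falls at or after the last point
        return sorted_data[-1]
    a, b = sorted_data[lower - 1], sorted_data[lower]
    return a + frac * (b - a)

# Ordered observations from Problem 1-13 (n = 33).
data = [109, 110, 114, 116, 118, 119, 120, 121, 121, 123, 123, 125, 125,
        127, 128, 128, 128, 128, 129, 129, 130, 131, 132, 132, 133, 134,
        134, 134, 134, 136, 136, 136, 136]

q1 = percentile_position(data, 25)
median = percentile_position(data, 50)
q3 = percentile_position(data, 75)
p10 = percentile_position(data, 10)
print(q1, median, q3, round(p10, 1), q3 - q1)
```

Run on the ordered data, the sketch should give the same quartiles, 10th percentile, and IQR as the hand calculation.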
[Using the Excel Template: Basic Statistics.xls]
Percentile and Percentile Rank Calculations
x      x-th Percentile
10     116.4
15     118.8
65     130.8

y      Percentile rank of y
116.4  10
118.8  15
130.8  65

Quartiles
1st Quartile   121
Median         128
3rd Quartile   133
IQR            12
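The template and the text formula give slightly different quartiles for Problem 1-13 (133 versus 133.5 for the upper quartile). A hedged way to see why is Python's `statistics.quantiles`, which offers two position conventions. It is an assumption, based only on the matching numbers, that the template follows the "inclusive" convention; the "exclusive" convention corresponds to the (n + 1)(p/100) rule.

```python
import statistics

# Ordered observations from Problem 1-13 (n = 33).
data = [109, 110, 114, 116, 118, 119, 120, 121, 121, 123, 123, 125, 125,
        127, 128, 128, 128, 128, 129, 129, 130, 131, 132, 132, 133, 134,
        134, 134, 134, 136, 136, 136, 136]

# "exclusive": positions at (n + 1) * p, as in the text formula.
formula_quartiles = statistics.quantiles(data, n=4, method="exclusive")

# "inclusive": positions at (n - 1) * p + 1, which appears to match the template.
template_quartiles = statistics.quantiles(data, n=4, method="inclusive")

print(formula_quartiles)   # quartiles under the text formula
print(template_quartiles)  # quartiles under the assumed template convention
```

Neither convention is wrong; they simply place the quartile at a different fractional position in the ordered data.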
1-14. First, order the data:

-1.2, 3.9, 8.3, 9, 9.5, 10, 11, 11.6, 12.5, 13, 14.8, 15.5, 16.2, 16.7, 18
The median, or 50th percentile, is the point in position 16(50/100) = 8. The point is 11.6.
First quartile is in position 16(25/100) = 4. Point is 9.
Third quartile is in position 16(75/100) = 12. Point is 15.5.
55th percentile is in position 16(55/100) = 8.8. Point is 12.32.
1-15.
Order the data:

-1.3, -0.7, -0.7, -0.5, -0.4, 0.1, 0.2, 0.7, 0.8, 1.6
Median is in position 11(50/100) = 5.5. Point is -0.15.

x      x-th Percentile
20     -0.7
30     -0.56
60     0.14
90     0.88

Quartiles
1st Quartile   -0.65
Median         -0.15
3rd Quartile   0.575
IQR            1.225
1-16.
Order the data: 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7.

Lower quartile is the 25th percentile, in position 16(25/100) = 4. Point is 2.
The median is in position 16(50/100) = 8. The point is 3.
Upper quartile is in position 16(75/100) = 12. Point is 5.
IQR = 5 - 2 = 3.
60th percentile is in position 16(60/100) = 9.6. Point is 4.

x      x-th Percentile
60     4

Quartiles
1st Quartile   2
Median         3
3rd Quartile   5
IQR            3
1-17.
The data are already ordered; there are 16 data points.

The median is the point in position 17(50/100) = 8.5. It is 51.
Lower quartile is in position 17(25/100) = 4.25. It is 30.5.
Upper quartile is in position 17(75/100) = 12.75. It is 194.25.
IQR = 194.25 - 30.5 = 163.75.

x      x-th Percentile
45     43

Quartiles
1st Quartile   31.5
Median         51
3rd Quartile   162.75
IQR            131.25
1-18.
The mean is a central point that summarizes all the information in the data. It is sensitive to
extreme observations. The median is a point "in the middle" of the data set and does not contain
all the information in the set. It is resistant to extreme observations. The mode is a value that
occurs most frequently.
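As a small illustration of the three measures, the following sketch uses Python's standard `statistics` module on the ordered data of Problem 1-16. The variable names are illustrative only.

```python
import statistics

# Ordered observations from Problem 1-16.
data = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7]

mean = statistics.mean(data)        # uses every observation; sensitive to extremes
median = statistics.median(data)    # middle point; resistant to extremes
modes = statistics.multimode(data)  # every value tied for the highest frequency

print(round(mean, 4), median, modes)
```

`multimode` is used rather than `mode` because this data set has two values tied for the highest frequency.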
1-19.
Mean, median, mode(s) of the observations in Problem 1-13:

Mean = $\bar{x} = \sum x_i / n = 126.64$
Median = 128
Modes = 128, 134, 136 (all have 4 points)

Measures of Central tendency
Mean     126.63636
Median   128
Mode     128
1-20.
For the data of Problem 1-14:

Mean = 11.2533
Median = 11.6
Mode: none
1-21. For the data of Problem 1-15:

Mean = 66.955
Median = 70
Mode = 45

Mean     66.954545
Median   70
Mode     45
1-22.
For the data of Problem 1-16:

Mean = 3.467
Median = 3
Mode = 1 and 2

Mean     3.4666667
1-23.
For the data of Problem 1-17:

Mean = 199.875
Median = 51
Mode: none

Mean     199.875
Median   51
Mode     #N/A
1-24.
For the data of Example 1-1:

Mean = 163,260
Median = 166,800
Mode: none
1-25. (Using the template: Basic Statistics.xls, enter the data in column K.)
Basic Statistics from Raw Data
Mean 21.75
Median 13
Mode 12
1-26. (Using the template: Basic Statistics.xls)

[Bar chart omitted: company names on the horizontal axis; vertical axis from -10 to 50]
Mean = 17.571
Median = 16.9
Outliers: -6.9, 46.5
1-27. [Using the Excel Template: Basic Statistics.xls]

Mean
18.35
Median
19.1
Mode
#N/A
1-28.
Measures of variability tell us about the spread of our observations.
1-29.
The most important measures of variability are the variance and its square root, the standard
deviation. Both reflect all the information in the data set.
1-30.
For a sample, we divide the sum of squared deviations from the mean by n - 1, rather than by n.
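The difference between the two divisors can be illustrated with Python's `statistics` module, here on the data of Problem 1-16. This is a sketch for illustration; `variance` divides by n - 1 and `pvariance` divides by n.

```python
import statistics

# Ordered observations from Problem 1-16 (n = 15).
data = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7]

sample_var = statistics.variance(data)       # sum of squared deviations / (n - 1)
population_var = statistics.pvariance(data)  # sum of squared deviations / n

print(round(sample_var, 8), round(population_var, 8))
```

The sample figure is always the larger of the two, and the gap shrinks as n grows.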
1-31.
For the data of Problem 1-13, assumed a sample: Range = 136 - 109 = 27
Variance = 57.74
Standard deviation = 7.5986
If the data is of a: Sample | Population
Variance   57.7386364
St. Dev.   7.59859437
1-32.
For the data of Problem 1-14: Range = 18 - (-1.2) = 19.2

Variance = 25.90 Standard deviation = 5.0896
1-33.
For the data of Problem 1-15: Range = 98 - 38 = 60

If the data is of a: Sample | Population
Variance   321.378788
St. Dev.   17.9270407
1-34.
For the data of Problem 1-16: Range = 7 - 1 = 6

If the data is of a: Sample | Population
Variance   3.98095238
St. Dev.   1.99523241
1-35.
For the data of Problem 1-17: Range = 1,209 - 23 = 1,186

Variance = 110,287.45 Standard deviation = 332.096
If the data is of a: Sample | Population
Variance   110287.45
St. Dev.   332.095543
1-36.
$n = 33$, $\bar{x} = 126.64$, $s = 7.60$, so $\bar{x} \pm 2s = [111.44, 141.84]$; this captures 31/33 of the
data points, which is at least the $1 - 1/2^2 = 3/4$ that Chebyshev's theorem guarantees within two
standard deviations, so the theorem holds. The data set is not mound-shaped, so the empirical
rule does not apply.
1-37.
$n = 15$, $\bar{x} = 11.253$, $s = 5.090$, so $\bar{x} \pm 2s = [1.073, 21.433]$; this captures 14/15 of the
data points, so Chebyshev's theorem holds. The data set is not mound-shaped, so the empirical
rule does not apply.
1-38.
$n = 22$, $\bar{x} = 66.95$, $s = 17.93$, so $\bar{x} \pm 2s = [31.09, 102.81]$; this captures all the data
points, so Chebyshev's theorem holds. The data set is not mound-shaped, so the empirical rule
does not apply.
1-39.
$n = 15$, $\bar{x} = 3.467$, $s = 1.995$, so $\bar{x} \pm 2s = [-0.523, 7.457]$; this captures all the data
points, so Chebyshev's theorem holds. The data set is not mound-shaped, so the empirical rule
does not apply.
1-40.
$n = 16$, $\bar{x} = 199.9$, $s = 332.1$, so $\bar{x} \pm 2s = [-464.3, 864.1]$; this captures 15/16 of the data points,
so Chebyshev's theorem holds. The data set is not mound-shaped, so the empirical rule does not
apply.
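The Chebyshev checks in Problems 1-36 through 1-40 all follow the same pattern, which can be sketched as follows for the data of Problem 1-13. The variable names are illustrative, and k = 2 is chosen to match the two-standard-deviation interval used in the answers.

```python
import statistics

# Ordered observations from Problem 1-13 (n = 33).
data = [109, 110, 114, 116, 118, 119, 120, 121, 121, 123, 123, 125, 125,
        127, 128, 128, 128, 128, 129, 129, 130, 131, 132, 132, 133, 134,
        134, 134, 134, 136, 136, 136, 136]

k = 2                               # number of standard deviations
mean = statistics.mean(data)
s = statistics.stdev(data)          # sample standard deviation (n - 1 divisor)
low, high = mean - k * s, mean + k * s

inside = sum(low <= x <= high for x in data)  # points within k standard deviations
share = inside / len(data)
bound = 1 - 1 / k**2                # Chebyshev's guaranteed minimum share

print(inside, len(data), share >= bound)
```

The theorem "holds" whenever the observed share is at least the bound, which it must be for any data set.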
1-41.
[Chart omitted; categories: Electrolux, GE, Matsushita, Whirlpool, B-S, Philips, Maytag]
1-42.
[Bar chart omitted; categories: Stock 1 through Stock 5]
1-43.
[Bar chart omitted: "Endowments ($ billions)"; horizontal axis: University (Harvard, Texas,
Princeton, Yale, Stanford, Columbia, Texas A&M); vertical axis: $ billions]
1-44.

Mean     24.13
Median   23.65
Measures of Dispersion
If the data is of a: Sample | Population
Variance   70.6312222
St. Dev.   8.40423835
[Chart omitted: "Top Private Equity Deals"; vertical axis: $ (billions), from 0 to 45]
1-45.

Mean
13.333333
Median
12.5
[Chart omitted: "Credit Default Swap Values"]
1-46.
1-47.
Using MINITAB
       Stem  Leaves
  4    5     5688
  8    6     0123
 14    6     677789
 (9)   7     002223334
 11    7     55667889
  3    8     224
[Histogram omitted: frequencies (0 to 8) by Sales ($) class: 0<5, 5<10, 10<15, 15<20, 20<25]
1-48.
[Box and Whisker Plot omitted: variable C1, axis from 5.5 to 8.5; 34 cases]
There are no outliers. Distribution is skewed to the left.
1-49.
A stem-and-leaf display is a quickly drawn type of histogram useful in analyzing data. A box plot
is a more advanced display useful in identifying outliers and the shape of the distribution of the
data.
1-50.
       Stem  Leaves
  1    0     5
  1    1
  1    2
  7    3     234578
 (13)  4     2234567788899
 11    5     012235678
  2    6     3
  1    7     8
1-51.
The data are narrowly and symmetrically concentrated near the median (IQR and the whisker
lengths are small), not counting the two extreme outliers.

[Box plot omitted: variable C1, axis from 0 to 80; 31 cases]
1-52.
Wider dispersion in data set #2. Not much difference in the lower whiskers or lower hinges of the
two data sets. The high value, 24, in data set #2 has a significant impact on the median, upper
hinge and upper whisker values for data set #2 with respect to data set #1.
1-53.
Mean = 127
Var = 137
sd = 11.705
mode = 127
outliers: TWA, Lufthansa
1-54.
Stem-and-leaf of C2    N = 45
Leaf Unit = 1.0
  f    Stem  Leaves
 13    1     0011111223444
 18    1     55689
 (6)   2     022333
 21    2     567789
 15    3     0122234
  8    3     78
  6    4     012
  3    4     7
  2    5     23
1-55.
Outliers are detected by looking at the data set, constructing a box plot or stem-and-leaf display.
An outlier should be analyzed for information content and not merely eliminated.
1-56.
The median is the line inside the box. The hinges are the upper and lower quartiles. The inner
fences are the two points at a distance of 1.5 (IQR) from the upper and lower quartiles. Outer
fences are similar to the inner fences but at a distance of 3 (IQR). The box itself represents 50%
of the data.
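The fence definitions above reduce to simple arithmetic on the two hinges. The sketch below uses a hypothetical helper named `fences` and, as an example input, the quartiles found in Problem 1-13 (121 and 133.5).

```python
def fences(q1, q3):
    """Return inner and outer fences from the lower and upper quartiles."""
    iqr = q3 - q1
    return {
        "inner": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),  # beyond these: suspected outliers
        "outer": (q1 - 3.0 * iqr, q3 + 3.0 * iqr),  # beyond these: extreme outliers
    }

# Quartiles from Problem 1-13: lower = 121, upper = 133.5, so IQR = 12.5.
result = fences(121, 133.5)
print(result["inner"], result["outer"])
```

Any observation outside the inner fences would be flagged as a suspected outlier, and any outside the outer fences as an extreme one.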
1-57.
Mine A:
  f    Stem  Leaves
  2    3     24
  4    3     57
  7    4     123
 (5)   4     55689
  7    5     123
  4    5
  4    6     0
  3    7     36
  1    8     5
Mine B:
  f    Stem  Leaves
  2    2     34
  4    2     89
  6    3     24
  9    3     578
 (3)   4     034
  7    4     789
  4    5     012
  1    5     9
Values for Mine A are smaller than for Mine B, right-skewed, and there are three outliers. Values
for Mine B are larger and the distribution is almost symmetric. There is larger variance in B.
1-58.
No. One needs to use descriptive statistics and/or statistical inference.
1-59.
[Using the template: Box Plot.xls]
Box Plot
Daily Percentage Change in Stock Prices

Lower Whisker   Lower Hinge   Median   Upper Hinge   Upper Whisker
-0.3            0.275         0.6      1.15          1.6
1-60.

Mean     4.88
Median   4.9
The two measures are virtually equivalent.
[Using the template: BoxPlot.xls]
Box Plot
0 to 60 times
Lower Whisker   Lower Hinge   Median   Upper Hinge   Upper Whisker
4.2             4.725         4.9      5.1           5.3
1-61.
Answers will vary.

a. If we add the value 5 to all the data points, then the average, median, mode, first quartile,
third quartile and 80th percentile values will change by 5. There is no change in the
variance, standard deviation, skewness, kurtosis, range and interquartile range values.
b. Average: if we add 5 to all the data points, then the sum of all the numbers will increase by
5*n, where n is the number of data points. The sum is divided by n to get the average. So
5*n / n = 5: the average will increase by 5.
Median: if we add 5 to all the data points, the median will still be the midway point
in the ordered array, and its value will also increase by 5.
Mode: adding 5 to all the data points increases the most frequently occurring value by 5.
First Quartile: adding 5 to all the data points does not change the location of the first
quartile in the ordered array of numbers, which is: (.25)(n+1) where n is the number of data
points. Whether the first quartile falls on a specific data point or between two data points, the
resulting value will have been increased by 5.
Third Quartile: adding 5 to all the data points does not change the location of the third
quartile in the ordered array of numbers, which is: (.75)(n+1) where n is the number of data
points. Whether the third quartile falls on a specific data point or between two data points,
the resulting value will have been increased by 5.
80th percentile: adding 5 to all the data points has the same effect as in the calculation of
the first or third quartile. The value will be increased by 5.
Range: adding 5 to all the data points has no effect on the range. Since both the highest
value and the lowest value have been increased by the same amount, subtracting the lowest
value from the highest value still yields the same range.
Variance: adding 5 to all the data points has no effect on the variance. Since each data
point is increased by 5 and the average has also been shown to increase by the same amount,
the differences between each new data point and the new average do not change. Squaring
those differences, summing them, and dividing by n (population) or n - 1 (sample) therefore
gives the same result.
Standard Deviation: since the variance is not affected by adding 5 to each data point,
neither is the standard deviation.
Skewness: since each data point is increased by 5 and the average has also been shown to
increase by the same amount, the differences between each individual new data point and the
new average will not change. Therefore, the numerator in the formula for skewness is not
affected. Since the standard deviation (the denominator) is not affected either, there is no
change in the value for skewness.
Kurtosis: since each data point is increased by 5 and the average has also been shown to
increase by the same amount, the differences between each individual new data point and the
new average will not change. Therefore, the numerator in the formula for kurtosis is not
affected. Since the standard deviation (the denominator) is not affected either, there is no
change in the value for kurtosis.
Interquartile Range: given that both the first quartile and the third quartile increased by the
same amount, 5, the difference between the two values remains the same.
c. Multiplying each data point by a factor 3 results in the following changes. The mean,
median, mode, first quartile, third quartile and 80th percentile values will be increased by the
same factor 3. In addition, the standard deviation and the range will also increase by the
same factor 3. The variance will increase by the factor squared, and the skewness and
kurtosis values will remain unchanged.
d. Multiplying all data points by a factor 3 and adding a value 5 to each data point has the
following results. The order of operation is first to multiply each data point and then add a
value to each data point. Each data point is first multiplied by the factor 3 and then the
value 5 is added to each newly multiplied data point. Multiplying each data point by the
factor 3 yields the results listed in c). Adding a value 5 to the newly multiplied data points
yields the results listed in a).
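The effects described in parts a through d can be checked numerically. The sketch below uses a small made-up data set (the values are hypothetical and are not taken from any problem) together with Python's `statistics` module.

```python
import statistics

# Hypothetical data, chosen only to illustrate the effect of shifting and scaling.
data = [2, 4, 4, 5, 7, 9]

shifted = [x + 5 for x in data]      # part a: add 5 to every point
scaled = [3 * x for x in data]       # part c: multiply every point by 3
both = [3 * x + 5 for x in data]     # part d: multiply by 3, then add 5

for name, values in [("original", data), ("shifted", shifted),
                     ("scaled", scaled), ("both", both)]:
    print(name,
          round(statistics.mean(values), 3),      # location measure
          round(statistics.stdev(values), 3),     # spread measure
          max(values) - min(values))              # range
```

The printed rows should show the mean moving with both the shift and the scale, while the standard deviation and range respond only to the scale factor.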
1-62. [Using the template: Basic Statistics.xls]
Mean     41.01
Median   23.8
If the data is of a: Sample | Population
Variance   1136.941
St. Dev.   33.7185557
1-63. Mean = 504.688
Standard deviation = 94.547

Mean     504.6875
Median   501.5
Mode     #N/A
Range    346
IQR      149.5
If the data is of a: Sample | Population
Variance   8939.15234
St. Dev.   94.5470906
1-64.
Step 1: Enter the data from problem 1-63 into cells Y4:Y35 of the template: Histogram.xls from Chapter
1. The template will order the data automatically.
Step 2: We need to select a starting point for the first class, an ending point for the last class, and
a class interval width. The starting point of the first class should be a value less than the
smallest value in the data set. The smallest value in the data set is 344, so you would
want to set the first class to start with a value smaller than 344. Let's use 320. We also
selected 710 as the ending value of the last class, and selected 50 as the interval width.
The data input column and the histogram output from the template are presented below.
The end-point for each class is included in that class; i.e., the first class of data goes from
more than 320 up to and including 370, the second class starts with more than 370 up to
and including 420, etc.
1-65.
Range: 690 - 344 = 346

90th percentile lies in position: 33(90/100) = 29.7. It is 632.7.
First quartile lies in position: 33(25/100) = 8.25. It is 419.25.
Median lies in position: 33(50/100) = 16.5. It is 501.5.
Third quartile lies in position: 33(75/100) = 24.75. It is 585.75.
1-66.
[Charts omitted: histogram of TV sets (freq) with class midpoints 7.5 through 47.5, and
"Ogive: TV Sets" (cum freq)]
1-67.
       Stem  Leaves
  2    1     24
  7    1     56789
 (3)   2     023
  6    2     55
  4    3     24
  2    3
  2    4     01
1-68.
[Box plot omitted: variable C2, axis from 12 to 42]
The data is skewed to the right.

1-69.
       Stem  Leaves
  3    1     012
  4    1     9
 12    2     1122334
 (9)   2     556677889
  6    3     024
  3    3     57
  1    4
  1    4
  1    5
  1    5
  1    6     2
The data is skewed to the right with one extreme outlier (62) and three suspected outliers
(10, 11, 12).

1-70.
[Box plot omitted: variable C1, axis from 0 to 80]
1-71.
[Using the template: Basic Statistics.xls]

Mean
8.0666667
Median
Mode
10
Based on just these three measures, cheap wine appears to work well in cooking.
1-72.

Mean     20.3
Median   20.2
If the data is of a: Sample | Population
Variance   0.10909091
St. Dev.   0.33028913
[Time plot omitted: "Motorola's Stock Prices"; vertical axis from 19.2 to 21]
Box Plot
Motorola's Stock Prices
Lower Whisker   Lower Hinge   Median   Upper Hinge   Upper Whisker
19.8            20.075        20.2     20.525        20.8
1-73.
Mean = 33.271
sd = 16.945
var = 287.15
QL = 25.41
Med = 26.71
QU = 35
Outliers: Morgan Stanley (91.36%)

[Box plot omitted: variable C1, axis from 20 to 100; 15 cases]
1-74.
Mean = 3.18
sd = 1.348
var = 1.817
QL = 1.975
Med = 2.95
QU = 3.675
Outliers: 8.70

[Box plot omitted: variable C1, axis from 1 to 9; 20 cases]
1-75.
a. IQR = 3.5
b. data is right-skewed
c. 9.5 is more likely to be the mode, since the data is right-skewed
d. Will not affect the plot.
1-76.
Minitab output:
Descriptive Statistics: change in bad loans, change in Provisions

Variable            Mean    StDev   Median
change bad loans    56.28   42.73   43.40
change Provisions   8.12    12.21   8.60
While the average of the Change in Provisions is close to the 4.1 average for all banks, the
average of the Change in Bad Loans is considerably higher than the industry average of 11.00.
The box plot for change in Bad Loans does not show any outliers.
[Boxplot of change in bad loans omitted; axis from 0 to 160]
The box plot for change in Provisions does show one possible outlier for W Holding at 37.3:
[Boxplot of change in Provisions omitted; axis from -10 to 40]
1-77.
The Minitab output:
Descriptive Statistics: bank assets

Variable      Mean    StDev   Median
bank assets   186.7   355.6   56.2
The average for the bank assets of the 19 lending institutions is larger than the industry average of
149.30.
The box plot of bank assets shows three possible outliers: Bank of America (1459), Wachovia
(707.1), and Wells Fargo (481.9).
[Boxplot of bank assets omitted; axis from 0 to 1600]
1-78.
[Using the template: Basic Statistics.xls]

Mean     56.266667
Median   57
If the data is of a   Sample       Population
Variance              164.780952   153.795556
St. Dev.              12.8367033   12.4014336
The mean and median for the 15 selected countries are higher than the overall mean approval
rating of 53%.
1-79.
The chart indicates that there is a significantly large difference between the annual sales per
square foot for Apple Stores relative to the other four companies listed.
Mean     1720.2
Median   930
If the data is of a: Sample | Population
Variance   1987680.96
St. Dev.   1409.8514
1-80.
Mean = 99.039
sd = .4366
var = .1907
Median = 99.155
1-81.
Mean = 17.587
sd = .466
var = .2172
Mean     17.5875
Median   17.5
Mode     18.3
Range    1.4
IQR      0.75
If the data is of a   Sample       Population
Variance              0.21716667   0.20359375
St. Dev.              0.46601144   0.45121364
1-82.
Mean = 259.82
sd = 357.24
(Using the template: Basic Statistics.xls)

Mean     259.82
Median   9.5
If the data is of a: Sample | Population
Variance   127622.462
St. Dev.   357.242861
1-83.
Mean = 37.17
sd = 13.128
Median = 34
Mean     37.166667
Median   34
If the data is of a: Sample | Population
Variance   172.333333
St. Dev.   13.1275791
1-84. Stock Prices for period: April, 2001 through June, 2001 [Answers will vary due to dates
used.]
a). Mean and Standard Deviation for Wal-Mart
Stock Prices: Wal-Mart

Mean     51.041478
Median   51.1266
Mode     50.158
Range    6.1911
IQR      1.9613
If the data is of a    Sample       Population
Variance               2.25711298   2.22128579
St. Dev.               1.50236912   1.49039786
Higher Moments
If the data is of a    Sample       Population
Skewness               0.07083784   0.06913994
(Relative) Kurtosis    -0.711512    -0.7500338
b). Mean and Standard Deviation for K-Mart
Stock Prices: K-Mart

Mean     10.450952
Median   10.66
Mode     11.8
Range    3.51
IQR      1.955
If the data is of a    Sample       Population
Variance               0.9852023    0.96956417
St. Dev.               0.99257358   0.9846645
Higher Moments
If the data is of a    Sample       Population
Skewness               -0.4070262   -0.3972703
(Relative) Kurtosis    -1.132009    -1.1378913
c). Coefficient of variation:

CV = std. dev / mean
For Wal-Mart:
considering the data as a population:
CV = 1.49039786 / 51.041478 = 0.0292
considering the data as a sample:
CV = 1.50236912 / 51.041478 = 0.02943
For K-Mart:
considering the data as a population:
CV = 0.9846645 / 10.450952 = 0.0942
considering the data as a sample:
CV = 0.99257358 / 10.450952 = 0.09497
d). There is a greater degree of risk in the stock prices for K-Mart than for Wal-Mart over
this three month period.
e). For DJIA
considering the data as a population:

CV = 427.913791 / 10681.11 = 0.04006
considering the data as a sample:
CV = 431.350905 / 10681.11 = 0.04038
Wal-Mart stocks provided a less risky return for this time period relative to DJIA and K-Mart.
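The coefficient of variation comparison in parts c) through e) can be reproduced with a short sketch. The helper name `coefficient_of_variation` is hypothetical; the means and population standard deviations are the template figures quoted in this answer.

```python
def coefficient_of_variation(std_dev, mean):
    """Relative variability: standard deviation as a fraction of the mean."""
    return std_dev / mean

# (population standard deviation, mean) as reported above for each series.
series = {
    "Wal-Mart": (1.49039786, 51.041478),
    "K-Mart": (0.9846645, 10.450952),
    "DJIA": (427.913791, 10681.11),
}

cv = {name: coefficient_of_variation(sd, m) for name, (sd, m) in series.items()}
ranked = sorted(cv, key=cv.get)   # least to most relative variability
for name in ranked:
    print(name, round(cv[name], 4))
```

Because the CV is unit-free, it allows the stock prices and the index to be compared even though their levels differ by orders of magnitude.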
f). 100 Shares of Wal-Mart stocks purchased April 2, 2001:
Price = $50.5674 Cost = $5056.74
Mean of holding 100 shares: $5104.15
Std dev of holding 100 shares: $149.04 (rounded: if data considered a population)
$150.24 (rounded: if data considered a sample)
(Multiplying every price by 100 multiplies both the mean and the standard deviation by 100.)
1-85.
a). for a process mean = 2004

VARP = Average SSD2004 + offset^2
VARP = 3.5 + offset^2
where offset = target - process
b). if target = process, then offset = 0
substituting: VARP = 3.5 + offset^2 = 3.5 + 0^2 = 3.5
1-86.
a) & b): CPI and Gas prices for period: June 97 through May 01. (Non-seasonally
adjusted series.)
CPI index converted (by 100) in order to compare both series on same chart. There is no
seasonal pattern present in the CPI index. Steady trend present in CPI; considerable
variability in gas prices. Gas prices increased considerably more than the overall CPI for
the same time period.
1-87.
a). Pie Chart: AIDS cases by Age groups

Age Group           No.       %
Under 5             6812      0.90%
Ages 5 to 12        1992      0.26%
Ages 13 to 19       3865      0.51%
Ages 20 to 24       26518     3.52%
Ages 25 to 29       99587     13.21%
Ages 30 to 34       168723    22.38%
Ages 35 to 39       168778    22.39%
Ages 40 to 44       124398    16.50%
Ages 45 to 49       72128     9.57%
Ages 50 to 54       38118     5.06%
Ages 55 to 59       20971     2.78%
Ages 60 to 64       11636     1.54%
Ages 65 or older    10378     1.38%
[Pie chart omitted: "AIDS cases by age", with the percentages listed above]
b). Pie Chart: AIDS cases by Race

Race                            No.       %
White, not Hispanic             324822    43.09%
Black, not Hispanic             282720    37.50%
Hispanic                        137575    18.25%
Asian/Pacific Islander          5546      0.74%
American Indian/Alaska Native   2234      0.30%
Race/ethnicity unknown          1010      0.13%
[Pie chart omitted: "AIDS cases by Race", with the percentages listed above]
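The pie-chart percentages are each category's count divided by the total count. A minimal sketch of that arithmetic for part b), using the counts given in this answer (the variable names are illustrative):

```python
# Counts by race/ethnicity, as listed in part b).
counts = {
    "White, not Hispanic": 324822,
    "Black, not Hispanic": 282720,
    "Hispanic": 137575,
    "Asian/Pacific Islander": 5546,
    "American Indian/Alaska Native": 2234,
    "Race/ethnicity unknown": 1010,
}

total = sum(counts.values())
# Share of the total for each category, as a percentage rounded to two decimals.
shares = {name: round(100 * n / total, 2) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```

The same calculation applied to the age-group counts in part a) should give the percentages shown there.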
1-88. (Using the template: Box Plot 2.xls)

Comparing two data sets using Box Plots
Salaries 2004
            Lower Whisker   Lower Hinge   Median    Upper Hinge   Upper Whisker
Cubs        300000          650000        1550000   5750000       9500000
White Sox   301000          340000        775000    3875000       8000000
Outliers: Cubs: Sosa's salary of $16M

White Sox: Ordonez's salary of $14M
Furthermore, the median salary of the Cubs is twice the median salary of the White Sox. There
are some players on both teams making the league minimum salary.
Somewhat lower salary range for the White Sox relative to the Cubs due to the fact that only
seven (7) players on the Cubs were paid $500,000 or less while eleven (11) players earned less
than that amount on the White Sox.
1-89.
[Using the Basic Statistics.xls]

Mean     5.1477778
Median   5.35
If the data is of a: Sample | Population
Variance   0.36249444
St. Dev.   0.60207512
Case 1: NASDAQ Volatility

1) NASDAQ Combined Composite Index for 2000
[Using template: Time Plot 2.xls]
[Time plot omitted: "NASDAQ Composite Index", Jan through Dec; vertical axis from 0 to 5000]
NASDAQ Composite Index
Month   2000
Jan     3940.35
Feb     4696.69
Mar     4572.83
Apr     3860.66
May     3400.91
Jun     3966.11
Jul     3766.99
Aug     4206.35
Sep     3672.82
Oct     3369.63
Nov     2597.93
Dec     2470.52
2) Compare 2006 with 2007. [Please note: at the time of printing, data for 2007 was available only
through close on 5/25/07.]
Plots suggest there may be more volatility in 2006.
Standard deviation for 2006 = 105.3317
Standard deviation for 2007 = 82.3060

NASDAQ Composite Index
Month   2006      2007
Jan     2305.82   2463.93
Feb     2281.39   2416.15
Mar     2339.79   2421.64
Apr     2322.57   2525.09
May     2178.88   2604.52
Jun     2172.09   2588.96
Jul     2091.47
Aug     2183.75
Sep     2258.43
Oct     2366.71
Nov     2431.77
Dec     2415.29
3) Comparison of NASDAQ with S&P 500 Index for 2007

Comparison using Time Plot NASDAQ vs S&P for 2007
Month   S&P       NASDAQ
Jan     1438.24   2463.93
Feb     1406.82   2416.15
Mar     1420.86   2421.64
Apr     1482.37   2525.09
May     1530.62   2604.52
Jun     1502.56   2588.96
There was more volatility in the NASDAQ Index in 2007 than in the S&P 500 Index in 2007.
Standard deviation for NASDAQ in 2007 = 82.3060
Standard deviation for S&P 500 in 2007 = 49.1033
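The two standard deviations quoted for 2007 can be reproduced from the six monthly values listed for each index. The sketch below assumes the figures are sample standard deviations (n - 1 divisor), which is what matches the reported numbers.

```python
import statistics

# Monthly 2007 values as listed above (six observations per index).
nasdaq_2007 = [2463.93, 2416.15, 2421.64, 2525.09, 2604.52, 2588.96]
sp500_2007 = [1438.24, 1406.82, 1420.86, 1482.37, 1530.62, 1502.56]

sd_nasdaq = statistics.stdev(nasdaq_2007)   # sample standard deviation
sd_sp500 = statistics.stdev(sp500_2007)

# Dividing by the mean gives a scale-free comparison of the two indexes.
cv_nasdaq = sd_nasdaq / statistics.mean(nasdaq_2007)
cv_sp500 = sd_sp500 / statistics.mean(sp500_2007)

print(round(sd_nasdaq, 4), round(sd_sp500, 4))
print(round(cv_nasdaq, 4), round(cv_sp500, 4))
```

Since the two indexes trade at very different levels, the coefficient of variation is printed as well; comparing raw standard deviations alone can overstate the difference in volatility.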
4) Comparison of the NASDAQ with DJIA for 2000
Comparison using Time Plot NASDAQ vs DJI for 2007
Month   DJI        NASDAQ
Jan     12621.69   2463.93
Feb     12268.63   2416.15
Mar     12354.35   2421.64
Apr     13062.91   2525.09
May     13627.64   2604.52
Jun     13360.26   2588.96
There was more volatility in the DJI Index in 2007 than in the NASDAQ Index.
Standard deviation for NASDAQ in 2007 = 82.3060
Standard deviation for DJIA in 2007 = 554.948
5). Answers will vary given date of assignment.