Chapter 01 - Introduction and Descriptive Statistics

CHAPTER 1
INTRODUCTION AND DESCRIPTIVE STATISTICS
1-1.

1. quantitative/ratio
2. qualitative/nominal
3. quantitative/ratio
4. qualitative/nominal
5. quantitative/ratio
6. quantitative/interval
7. quantitative/ratio
8. quantitative/ratio
9. quantitative/ratio
10. quantitative/ratio
11. quantitative/ordinal

1-2.

Data are based on numeric measurements of some variable, either from a data set
comprising an entire population of interest, or else obtained from only a sample (subset)
of the full population. Instead of doing the measurements ourselves, we may sometimes
obtain data from previous results in published form.

1-3.

The weakest is the Nominal Scale, in which categories of data are grouped by qualitative
differences and assigned numbers simply as labels, not usable in numeric comparisons.
Next in strength is the Ordinal Scale: data are ordered (ranked) according to relative size
or quality, but the numbers themselves don't imply specific numeric relationships.
Stronger than this is the Interval Scale: the ordered data points have meaningful distances
between any two of them, measured in units. Strongest is the Ratio Scale, which is like an
Interval Scale but where the ratio of any two data values is also meaningful, so values can
be compared as multiples of one another.

1-4.

Name: Qualitative
Wealth: Quantitative
Age: Quantitative
Industry: Qualitative
Country of Citizenship: Qualitative

1-5.

Ordinal.

1-6.

A qualitative variable describes different categories or qualities of the members of a data set,
which have no numeric relationships to each other, even when the categories happen to be coded
as numbers for convenience. A quantitative variable gives numerically meaningful information, in
terms of ranking, differences, or ratios between individual values.

1-7.

The people from one particular neighborhood constitute a non-random sample (drawn
from the larger town population). The group of 100 people would be a random sample.

1-8.

A sample is a subset of the full population of interest, from which statistical inferences
are drawn about the population, which is usually too large to permit the variables to be
measured for all the members.

1-9.

A random sample is a sample drawn from a population in a way that is not a priori biased
with respect to the kinds of variables being measured. It attempts to give a representative
cross-section of the population.

1-10. Nationality: qualitative. Length of intended stay: quantitative.


1-11. Ordinal. The colors are ranked, but no units of difference between any two of them are
defined.
1-12. Income: quantitative, ratio
Number of dependents: quantitative, ratio
Filing singly/jointly: qualitative, nominal
Itemized or not: qualitative, nominal
Local taxes: quantitative, ratio
1-13. [Using the formula given in the text: (n+1)(p/100)]
Order the observations (n = 33): 109, 110, 114, 116, 118, 119, 120, 121, 121, 123, 123, 125, 125,
127, 128, 128, 128, 128, 129, 129, 130, 131, 132, 132, 133, 134, 134, 134, 134, 136, 136, 136, 136.
Lower quartile = 25th percentile = data point in position (n + 1)(25/100) = 34(25/100) = 8.5. Point is 121.
Middle quartile is in position: 34(50/100) = 17. Point is 128.
Upper quartile is in position: 34(75/100) = 25.5. Point is 133.5.
10th percentile is in position: 34(10/100) = 3.4. Point is 114.8.
15th percentile is in position: 34(15/100) = 5.1. Point is 118.1.
65th percentile is in position: 34(65/100) = 22.1. Point is 131.1.
IQR = 133.5 - 121 = 12.5.
[Using the Excel Template: Basic Statistics.xls]

Percentile and Percentile Rank Calculations
x     x-th Percentile     y        Percentile rank of y
10    116.4               116.4    10
15    118.8               118.8    15
65    130.8               130.8    65

Quartiles
1st Quartile   121
Median         128
3rd Quartile   133
IQR            12

1-14. First, order the data:


-1.2, 3.9, 8.3, 9, 9.5, 10, 11, 11.6, 12.5, 13, 14.8, 15.5, 16.2, 16.7, 18
[Using the formula given in the text: (n+1)(p/100)]
The median, or 50th percentile, is the point in position 16(50/100) = 8. The point is 11.6.
First quartile is in position 16(25/100) = 4. Point is 9.
Third quartile is in position 16(75/100) = 12. Point is 15.5.
55th percentile is in position 16(55/100) = 8.8. Point is 12.32.
85th percentile is in position 16(85/100) = 13.6. Point is 16.5.
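
The position rule used in Problems 1-13 through 1-17 can be sketched in a few lines of Python. This is a minimal illustration, not the Excel template's method (the template interpolates differently, which is why its percentiles differ slightly). The function name `percentile` and the handling of positions that fall outside the data are assumptions made for this sketch.

```python
def percentile(data, p):
    """Return the p-th percentile using the (n + 1)(p / 100) position rule."""
    values = sorted(data)
    n = len(values)
    position = (n + 1) * p / 100   # 1-based position in the ordered data
    lower = int(position)          # whole part of the position
    fraction = position - lower    # fractional part, used to interpolate
    if lower < 1:                  # assumption: clamp to the smallest point
        return values[0]
    if lower >= n:                 # assumption: clamp to the largest point
        return values[-1]
    below, above = values[lower - 1], values[lower]
    return below + fraction * (above - below)


# Ordered data of Problem 1-14
data = [-1.2, 3.9, 8.3, 9, 9.5, 10, 11, 11.6, 12.5, 13, 14.8, 15.5, 16.2, 16.7, 18]
for p in (25, 50, 55, 75, 85):
    print(p, round(percentile(data, p), 2))
```

With the data of Problem 1-14, the sketch reproduces the positions and points worked out above (for example, position 8.8 for the 55th percentile).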
1-15.

Order the data:


-1.3, -0.7, -0.7, -0.5, -0.4, 0.1, 0.2, 0.7, 0.8, 1.6
[Using the formula given in the text: (n+1)(p/100)]
Median is in position 11(50/100) = 5.5. Point is -0.15.
20th percentile is in position 11(20/100) = 2.2. Point is -0.7.
30th percentile is in position 11(30/100) = 3.3. Point is -0.64.
60th percentile is in position 11(60/100) = 6.6. Point is 0.16.
90th percentile is in position 11(90/100) = 9.9. Point is 1.52.

[Using the Excel Template: Basic Statistics.xls]

Percentile and Percentile Rank Calculations
x     x-th Percentile
20    -0.7
30    -0.56
60    0.14
90    0.88

Quartiles
1st Quartile   -0.65
Median         -0.15
3rd Quartile   0.575
IQR            1.225

1-16.

Order the data: 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7.


Lower quartile is the 25th percentile, in position 16(25/100) = 4. Point is 2.
The median is in position 16(50/100) = 8. The point is 3.
Upper quartile is in position 16(75/100) = 12. Point is 5.
IQR = 5 - 2 = 3.
60th percentile is in position 16(60/100) = 9.6. Point is 4.

[Using the Excel Template: Basic Statistics.xls]

Percentile and Percentile Rank Calculations
x     x-th Percentile
60    4

Quartiles
1st Quartile   2
Median         3
3rd Quartile   5
IQR            3

1-17.

The data are already ordered; there are 16 data points.


[Using the formula given in the text: (n+1)(p/100)]
The median is the point in position 17(50/100) = 8.5. It is 51.
Lower quartile is in position 17(25/100) = 4.25. It is 30.5.
Upper quartile is in position 17(75/100) = 12.75. It is 194.25.
IQR = 194.25 - 30.5 = 163.75.
45th percentile is in position 17(45/100) = 7.65. Point is 42.2.

[Using the Excel Template: Basic Statistics.xls]

Percentile and Percentile Rank Calculations
x     x-th Percentile
45    43

Quartiles
1st Quartile   31.5
Median         51
3rd Quartile   162.75
IQR            131.25

1-18.

The mean is a central point that summarizes all the information in the data. It is sensitive to
extreme observations. The median is a point "in the middle" of the data set and does not contain
all the information in the set. It is resistant to extreme observations. The mode is a value that
occurs most frequently.

1-19.

Mean, median, mode(s) of the observations in Problem 1-13:

Mean = x̄ = (Σ xi)/n = 126.64
Median = 128
Modes = 128, 134, 136 (all have 4 points)

[Using the Excel Template: Basic Statistics.xls]

Measures of Central tendency
Mean 126.63636    Median 128    Mode 128

1-20.

For the data of Problem 1-14:


Mean = 11.2533
Median = 11.6
Mode: none
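
As a hedged illustration of the three measures of central tendency, the sketch below recomputes them for the data listed in Problems 1-14 and 1-16. The helper `modes` is a name introduced here, and the convention that a data set with no repeated value has no mode is an assumption matching the "Mode: none" answers in this chapter.

```python
from collections import Counter
from statistics import mean, median


def modes(data):
    """Return every value tied for the highest frequency, or [] if nothing repeats."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:               # assumption: all-distinct data has no mode
        return []
    return sorted(value for value, count in counts.items() if count == top)


problem_1_14 = [-1.2, 3.9, 8.3, 9, 9.5, 10, 11, 11.6, 12.5, 13, 14.8, 15.5, 16.2, 16.7, 18]
problem_1_16 = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7]

for name, data in (("1-14", problem_1_14), ("1-16", problem_1_16)):
    print(name, round(mean(data), 4), median(data), modes(data))
```

Returning a list of modes keeps ties visible, which matters for data such as Problem 1-16 where two values share the highest count.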

1-21. For the data of Problem 1-15:


Mean = 66.955
Median = 70
Mode = 45

[Using the Excel Template: Basic Statistics.xls]

Measures of Central tendency
Mean 66.954545    Median 70    Mode 45

1-22.

For the data of Problem 1-16:


Mean = 3.467
Median = 3
Mode = 1 and 2

[Using the Excel Template: Basic Statistics.xls]

Measures of Central tendency
Mean 3.4666667

1-23.

For the data of Problem 1-17:


Mean = 199.875
Median = 51
Mode: none

[Using the Excel Template: Basic Statistics.xls]

Measures of Central tendency
Mean 199.875    Median 51    Mode #N/A

1-24.

For the data of Example 1-1:


Mean = 163,260
Median = 166,800
Mode: none

1-25. (Using the template: Basic Statistics.xls, enter the data in column K.)
Basic Statistics from Raw Data
Measures of Central tendency
Mean 21.75

Median 13

Mode 12

1-26. (Using the template: Basic Statistics.xls)


[Bar chart; vertical axis from -10 to 50; category labels: Citigroup, Pfizer, Microsoft, ExxonMobil, GE, AT&T, Intel]

Mean = 17.571
Median = 16.9
Outliers: -6.9, 46.5

1-27. [Using the Excel Template: Basic Statistics.xls]


Measures of Central tendency
Mean

18.35

Median

19.1

Mode

#N/A

1-28.

Measures of variability tell us about the spread of our observations.

1-29.

The most important measures of variability are the variance and its square root, the standard
deviation. Both reflect all the information in the data set.

1-30.

For a sample, we divide the sum of squared deviations from the mean by n - 1, rather than by n.

1-31.

For the data of Problem 1-13, assumed a sample: Range = 136 - 109 = 27
Variance = 57.74
Standard deviation = 7.5986

If the data is of a: Sample / Population
Variance 57.7386364
St. Dev. 7.59859437

1-32.

For the data of Problem 1-14: Range = 18 - (-1.2) = 19.2


Variance = 25.90 Standard deviation = 5.0896
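
The difference between dividing by n - 1 (sample) and by n (population) can be checked directly. The sketch below is an illustration on the data of Problem 1-14; the function name `variance_from_deviations` and its `sample` flag are introduced here for the example and are not part of the Excel template.

```python
from math import sqrt
from statistics import pvariance, variance


def variance_from_deviations(data, sample=True):
    """Sum of squared deviations from the mean, divided by n - 1 or by n."""
    n = len(data)
    x_bar = sum(data) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in data)
    return squared_deviations / (n - 1 if sample else n)


problem_1_14 = [-1.2, 3.9, 8.3, 9, 9.5, 10, 11, 11.6, 12.5, 13, 14.8, 15.5, 16.2, 16.7, 18]

sample_var = variance_from_deviations(problem_1_14, sample=True)
population_var = variance_from_deviations(problem_1_14, sample=False)
print(round(sample_var, 2), round(sqrt(sample_var), 4))   # sample variance and standard deviation
print(round(population_var, 2))                           # population variance is smaller
```

The standard library functions `statistics.variance` and `statistics.pvariance` apply the same two divisors, so they can serve as a cross-check.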

1-33.

For the data of Problem 1-15: Range = 98 - 38 = 60

Variance = 321.38 Standard deviation = 17.927

If the data is of a: Sample
Variance 321.378788
St. Dev. 17.9270407

1-34.

For the data of Problem 1-16: Range = 7 - 1 = 6

Variance = 3.98 Standard deviation = 1.995

If the data is of a: Sample / Population
Variance 3.98095238
St. Dev. 1.99523241

1-35.

For the data of Problem 1-17: Range = 1,209 - 23 = 1,186

Variance = 110,287.45 Standard deviation = 332.096

If the data is of a: Sample / Population
Variance 110287.45
St. Dev. 332.095543

1-36.

n = 33, x̄ = 126.64, s = 7.60, so x̄ ± 2s = [111.44, 141.84]; this captures 31/33 of the
data points, so Chebyshev's theorem holds. The data set is not mound-shaped, so the empirical
rule does not apply.

1-37.

n = 15, x̄ = 11.253, s = 5.090, so x̄ ± 2s = [1.073, 21.433]; this captures 14/15 of the
data points, so Chebyshev's theorem holds. The data set is not mound-shaped, so the empirical
rule does not apply.

1-38.

n = 22, x̄ = 66.95, s = 17.93, so x̄ ± 2s = [31.09, 102.81]; this captures all the data
points, so Chebyshev's theorem holds. The data set is not mound-shaped, so the empirical rule
does not apply.

1-39.

n = 15, x̄ = 3.467, s = 1.995, so x̄ ± 2s = [-0.523, 7.457]; this captures all the data
points, so Chebyshev's theorem holds. The data set is not mound-shaped, so the empirical rule
does not apply.

1-40.

n = 16, x̄ = 199.9, s = 332.1, so x̄ ± 2s = [-464.3, 864.1]; this captures 15/16 of the data points,
so Chebyshev's theorem holds. The data set is not mound-shaped, so the empirical rule does not
apply.
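
The two-standard-deviation checks in Problems 1-36 through 1-40 follow one pattern: build the interval x̄ ± 2s, count the points inside it, and compare the share with Chebyshev's lower bound of 1 - 1/k² (75% for k = 2). A minimal sketch for the data of Problem 1-13 follows; the variable names are chosen here for illustration.

```python
from statistics import mean, stdev

# Ordered observations of Problem 1-13 (n = 33), treated as a sample
data = [109, 110, 114, 116, 118, 119, 120, 121, 121, 123, 123, 125, 125,
        127, 128, 128, 128, 128, 129, 129, 130, 131, 132, 132, 133, 134,
        134, 134, 134, 136, 136, 136, 136]

k = 2
x_bar = mean(data)
s = stdev(data)                          # sample standard deviation (n - 1 divisor)
low, high = x_bar - k * s, x_bar + k * s

inside = sum(low <= x <= high for x in data)   # points captured by the interval
share = inside / len(data)
chebyshev_bound = 1 - 1 / k ** 2               # at least this share must be inside

print(round(low, 2), round(high, 2), inside, len(data))
print(share >= chebyshev_bound)
```

Chebyshev's theorem only guarantees a minimum share, so a captured fraction well above 75%, as here, is consistent with it.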

1-41.

[Chart; category labels: Electrolux, GE, Matsushita, Whirlpool, B-S, Philips, Maytag]

1-42.

[Horizontal bar chart; categories: Stock 1 through Stock 5; horizontal axis from 0 to 20]


1-43.

[Bar chart: Endowments ($ billions), by university; labels recovered: Texas A&M, Columbia, Stanford, Yale, Princeton, Texas, Harvard]

1-44.

[Using the Excel Template: Basic Statistics.xls]

Measures of Central tendency
Mean 24.13    Median 23.65

Measures of Dispersion
If the data is of a: Sample / Population
Variance 70.6312222
St. Dev. 8.40423835

[Chart: Top Private Equity Deals; vertical axis $ (billions), 0 to 45; horizontal axis 1 to 10]


1-45.

[Using the Excel Template: Basic Statistics.xls]

Measures of Central tendency
Mean 13.333333    Median 12.5

[Chart: Credit Default Swap Values]

1-46.

[Histogram: Sales ($); classes 0<5, 5<10, 10<15, 15<20, 20<25; vertical axis: frequencies, 0 to 8]

1-47.

Using MINITAB

      Stem  Leaves
  4   5     5688
  8   6     0123
 14   6     677789
 (9)  7     002223334
 11   7     55667889
  3   8     224


1-48.

[Box and whisker plot: C1; axis from 5.5 to 8.5; 34 cases]
There are no outliers. Distribution is skewed to the left.

1-49.

A stem-and-leaf display is a quickly drawn type of histogram useful in analyzing data. A box plot
is a more advanced display useful in identifying outliers and the shape of the distribution of the
data.

1-50.

      Stem  Leaves
  1   0     5
  1   1
  1   2
  7   3     234578
 (13) 4     2234567788899
 11   5     012235678
  2   6     3
  1   7     8

1-51.

The data are narrowly and symmetrically concentrated near the median (IQR and the whisker
lengths are small), not counting the two extreme outliers.

[Box and whisker plot: C1; axis from 0 to 80; 31 cases]

1-52.

Wider dispersion in data set #2. Not much difference in the lower whiskers or lower hinges of the
two data sets. The high value, 24, in data set #2 has a significant impact on the median, upper
hinge and upper whisker values for data set #2 with respect to data set #1.

1-53.

Mean = 127
Var = 137
sd = 11.705
mode = 127
outliers: TWA, Lufthansa

[Plot for Problem 1-53; axis from 100 to 160]

1-54.

Stem-and-leaf of C2    N = 45
Leaf Unit = 1.0

  f   Stem  Leaves
 13   1     0011111223444
 18   1     55689
 (6)  2     022333
 21   2     567789
 15   3     0122234
  8   3     78
  6   4     012
  3   4     7
  2   5     23

1-55.

Outliers are detected by looking at the data set, constructing a box plot or stem-and-leaf display.
An outlier should be analyzed for information content and not merely eliminated.

1-56.

The median is the line inside the box. The hinges are the upper and lower quartiles. The inner
fences are the two points at a distance of 1.5 (IQR) from the upper and lower quartiles. Outer
fences are similar to the inner fences but at a distance of 3 (IQR). The box itself represents 50%
of the data.
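
The fence definitions can be turned into a small classifier. The sketch below is illustrative: the function names are invented here, and the labels "suspected outlier" and "extreme outlier" are borrowed from the wording used elsewhere in this chapter. The example values are the quartiles and the flagged observation reported in Problem 1-74.

```python
def fences(lower_quartile, upper_quartile):
    """Return the inner and outer fences as (low, high) pairs."""
    iqr = upper_quartile - lower_quartile
    inner = (lower_quartile - 1.5 * iqr, upper_quartile + 1.5 * iqr)
    outer = (lower_quartile - 3 * iqr, upper_quartile + 3 * iqr)
    return inner, outer


def classify(x, lower_quartile, upper_quartile):
    """Label a value by where it falls relative to the fences."""
    inner, outer = fences(lower_quartile, upper_quartile)
    if x < outer[0] or x > outer[1]:
        return "extreme outlier"
    if x < inner[0] or x > inner[1]:
        return "suspected outlier"
    return "not an outlier"


# Quartiles and flagged value reported in Problem 1-74
q_lower, q_upper = 1.975, 3.675
print(fences(q_lower, q_upper))
print(classify(8.70, q_lower, q_upper))
print(classify(2.95, q_lower, q_upper))
```

Keeping the fences in a separate function makes it easy to print them next to a box plot when checking which points the plot marks individually.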

1-57.

Mine A:
  f   Stem  Leaves
  2   3     24
  4   3     57
  7   4     123
 (5)  4     55689
  7   5     123
  4   5
  4   6     0
  3   7     36
  1   8     5

Mine B:
  f   Stem  Leaves
  2   2     34
  4   2     89
  6   3     24
  9   3     578
 (3)  4     034
  7   4     789
  4   5     012
  1   5     9

Values for Mine A are smaller than for Mine B, right-skewed, and there are three outliers. Values
for Mine B are larger and the distribution is almost symmetric. There is larger variance in B.
1-58.

No. One needs to use descriptive statistics and/or statistical inference.

1-59.

[Using the template: Box Plot.xls]

Box Plot: Daily Percentage Change in Stock Prices

Lower Whisker   Lower Hinge   Median   Upper Hinge   Upper Whisker
-0.3            0.275         0.6      1.15          1.6

1-60.

[Using the Excel Template: Basic Statistics.xls]

Measures of Central tendency
Mean 4.88    Median 4.9

The two measures are virtually equivalent.

[Using the template: BoxPlot.xls]

Box Plot: 0 to 60 times

Lower Whisker   Lower Hinge   Median   Upper Hinge   Upper Whisker
4.2             4.725         4.9      5.1           5.3

1-61.

Answers will vary.


a. If we add the value 5 to all the data points, then the average, median, mode, first quartile,
third quartile and 80th percentile values will change by 5. There is no change in the
variance, standard deviation, skewness, kurtosis, range and interquartile range values.
b. Average: if we add 5 to all the data points, then the sum of all the numbers will increase by
5*n, where n is the number of data points. The sum is divided by n to get the average. So
5*n / n = 5: the average will increase by 5.
Median: If we add 5 to all the data points, the median value will still be the midway point
in the ordered array. Its value will also increase by 5
Mode: Adding 5 to all the data points changes the number that occurs most frequently by
5
First Quartile: adding 5 to all the data points does not change the location of the first
quartile in the ordered array of numbers, which is: (.25)(n+1) where n is the number of data
points. Whether the first quartile falls on a specific data point or between two data points, the
resulting value will have been increased by 5.
Third Quartile: adding 5 to all the data points does not change the location of the third
quartile in the ordered array of numbers, which is: (.75)(n+1) where n is the number of data
points. Whether the third quartile falls on a specific data point or between two data points,
the resulting value will have been increased by 5.
80th percentile: adding 5 to all the data points has the same effect as in the calculation of
the first or third quartile. The value will be increased by 5
Range: adding 5 to all the data points will have no effect on the calculation of the
range. Since both the highest value and the lowest value have been increased by the same
number, the subtraction of the lowest value from the highest value still yields the same value
for the range.
Variance: adding 5 to all the data points has no effect on the calculation of the variance.
Since each data point is increased by 5 and the average has also been shown to increase by
the same amount, the differences between each individual new data point and the new average
will not change, and neither will the results of squaring the differences, summing the squared
differences and dividing by the number of data points.
Standard Deviation: since the variance is not affected by adding 5 to each data point,
neither is the standard deviation.
Skewness: Since each data point is increased by 5 and the average has also been shown to
increase by the same amount, the differences between each individual new data point and the
new average will not change. Therefore, the numerator in the formula for skewness is not
affected. Since the standard deviation is not affected as well (the denominator), there is no
change in the value for skewness.
Kurtosis: Since each data point is increased by 5 and the average has also been shown to
increase by the same amount, the differences between each individual new data point and the
new average will not change. Therefore, the numerator in the formula for kurtosis is not
affected. Since the standard deviation is not affected as well (the denominator), there is no
change in the value for kurtosis.
Interquartile Range: given that both the first quartile and the third quartile increased by the
same amount, 5, the difference between the two values remains the same.
c. Multiplying each data point by a factor 3 results in the following changes. The mean,
median, mode, first quartile, third quartile and 80th percentile values will be increased by the
same factor 3. In addition, the standard deviation and the range will also increase by the
same factor 3. The variance will increase by the factor squared, and the skewness and
kurtosis values will remain unchanged.
d. Multiplying all data points by a factor 3 and adding a value 5 to each data point has the
following results. The order of operation is first to multiply each data point and then add a
value to each data point. Each data point is first multiplied by the factor 3 and then the
value 5 is added to each newly multiplied data point. Multiplying each data point by the
factor 3 yields the results listed in c). Adding a value 5 to the newly multiplied data points
yields the results listed in a).
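
The claims in parts a through d can be checked numerically. The sketch below is only an illustration: it borrows the data of Problem 1-16 as a convenient example set (any data set would do), and the `summary` helper is a name introduced here. It treats the data as a population, as the variance explanation above does.

```python
from math import isclose
from statistics import mean, median, pstdev, pvariance


def summary(values):
    """Collect the measures discussed in parts a through d."""
    return {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
        "variance": pvariance(values),   # population variance (divide by n)
        "st_dev": pstdev(values),
    }


data = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7]   # example data (Problem 1-16)

base = summary(data)
shifted = summary([x + 5 for x in data])        # part a: add 5
scaled = summary([3 * x for x in data])         # part c: multiply by 3
both = summary([3 * x + 5 for x in data])       # part d: multiply by 3, then add 5

for label, result in (("base", base), ("shifted", shifted), ("scaled", scaled), ("both", both)):
    print(label, result)
```

Comparing the four printed summaries shows location measures moving with the shift and the scale, while spread measures respond only to the scale.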
1-62. [Using the template: Basic Statistics.xls]

Measures of Central tendency
Mean 41.01    Median 23.8

Measures of Dispersion
If the data is of a: Sample / Population
Variance 1136.941
St. Dev. 33.7185557


1-63. Mean = 504.688

St. Dev. = 94.547

Measures of Central tendency
Mean 504.6875    Median 501.5    Mode #N/A

Range 346
IQR 149.5

Measures of Dispersion
If the data is of a: Sample / Population
Variance 8939.15234
St. Dev. 94.5470906

1-64.
Step 1: Enter the data from problem 1-63 into cells Y4:Y35 of the template: Histogram.xls from Chapter
1. The template will order the data automatically.

Step 2: We need to select a starting point for the first class, an ending point for the last class, and
a class interval width. The starting point of the first class should be a value less than the
smallest value in the data set. The smallest value in the data set is 344, so you would
want to set the first class to start with a value smaller than 344. Let's use 320. We also
selected 710 as the ending value of the last class, and selected 50 as the interval width.
The data input column and the histogram output from the template are presented below.
The end-point for each class is included in that class; i.e., the first class of data goes from
more than 320 up to and including 370, the second class starts with more than 370 up to
and including 420, etc.
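
The class rule in Step 2 (each class excludes its lower end-point and includes its upper end-point) can be sketched as follows. This is not the Histogram.xls template's code. The function name, the choice of eight classes (width 50 starting at 320 reaches 720), and the six data values are all assumptions made for illustration; only 344 and 690 are taken from the problem as the smallest and largest observations.

```python
import math


def class_counts(data, start, width, n_classes):
    """Count values in classes (start, start + width], (start + width, start + 2*width], ..."""
    counts = [0] * n_classes
    for x in data:
        index = math.ceil((x - start) / width) - 1   # upper end-point stays in its class
        if not 0 <= index < n_classes:
            raise ValueError(f"{x} falls outside the classes")
        counts[index] += 1
    return [((start + i * width, start + (i + 1) * width), count)
            for i, count in enumerate(counts)]


# Hypothetical values chosen to show the end-point rule
example = [344, 370, 371, 420, 501, 690]
for (low, high), count in class_counts(example, start=320, width=50, n_classes=8):
    print(f"more than {low} up to and including {high}: {count}")
```

In this sketch 370 is counted in the first class and 371 in the second, which mirrors the "up to and including" wording above.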

1-65.

Range: 690 - 344 = 346

90th percentile lies in position: 33(90/100) = 29.7. It is 632.7.
First quartile lies in position: 33(25/100) = 8.25. It is 419.25.
Median lies in position: 33(50/100) = 16.5. It is 501.5.
Third quartile lies in position: 33(75/100) = 24.75. It is 585.75.

1-66.
[Histogram: TV sets; vertical axis: freq, 0 to 6]
[Ogive: TV Sets; vertical axis: cum freq, 0 to 20]

1-67.
      Stem  Leaves
  2   1     24
  7   1     56789
 (3)  2     023
  6   2     55
  4   3     24
  2   3
  2   4     01

1-68.

[Box and whisker plot: C2; axis from 12 to 42]

The data is skewed to the right.


1-69.

      Stem  Leaves
  3   1     012
  4   1     9
 12   2     1122334
 (9)  2     556677889
  6   3     024
  3   3     57
  1   4
  1   4
  1   5
  1   5
  1   6     2
The data is skewed to the right with one extreme outlier (62) and three suspected outliers
(10, 11, 12).

1-70.

[Box and whisker plot: C1; axis from 0 to 80]

1-71.

[Using the template: Basic Statistics.xls]

Measures of Central tendency


Mean

8.0666667

Median

Mode

10

Based on just these three measures, cheap wine appears to work well in cooking

1-72.

[Using the template: Basic Statistics.xls]

Measures of Central tendency
Mean 20.3    Median 20.2

Measures of Dispersion
If the data is of a: Sample / Population
Variance 0.10909091
St. Dev. 0.33028913

[Chart: Motorola's Stock Prices; vertical axis from 19.2 to 21; horizontal axis from 1 to 12]

Box Plot: Motorola's Stock Prices

Lower Whisker   Lower Hinge   Median   Upper Hinge   Upper Whisker
19.8            20.075        20.2     20.525        20.8


1-73.

Mean = 33.271
sd = 16.945
var = 287.15
QL = 25.41
Med = 26.71
QU = 35
Outliers: Morgan Stanley (91.36%)

[Box and whisker plot: C1; axis from 20 to 100; 15 cases]

1-74.

Mean = 3.18
sd = 1.348
var = 1.817
QL = 1.975
Med = 2.95
QU = 3.675
Outliers: 8.70

[Box and whisker plot: C1; axis from 1 to 9; 20 cases]


1-75.
a. IQR = 3.5
b. The data are right-skewed.
c. 9.5 is more likely to be the mode, since the data are right-skewed.
d. It will not affect the plot.

1-76.

Minitab output:

Descriptive Statistics: change in bad loans, change in Provisions

Variable             Mean    StDev   Median
change bad loans     56.28   42.73   43.40
change Provisions     8.12   12.21    8.60

While the average of the Change in Provisions is close to the 4.1 average for all banks, the
average of the Change in Bad Loans is considerably higher than the industry average of 11.00.
The box plot for change in Bad Loans does not show any outliers.

[Box plot: change in bad loans; vertical axis from 0 to 160]

The box plot for change in Provisions does show one possible outlier for W Holding at 37.3:

[Box plot: change in Provisions; vertical axis from -10 to 40]

1-77.

The Minitab output:

Descriptive Statistics: bank assets

Variable      Mean    StDev   Median
bank assets   186.7   355.6   56.2

The average for the bank assets of the 19 lending institutions is larger than the industry average of
149.30.

The box plot of bank assets shows three possible outliers for Bank of America (1459), Wachovia
(707.1), and Wells Fargo (481.9).

[Box plot: bank assets; vertical axis from 0 to 1600]

1-78.

[Using the template: Basic Statistics.xls]

Measures of Central tendency
Mean 56.266667    Median 57

Measures of Dispersion
If the data is of a    Sample        Population
Variance               164.780952    153.795556
St. Dev.               12.8367033    12.4014336

The mean and median for the 15 selected countries are higher than the overall mean approval
rating of 53%.

1-79.

The chart indicates that there is a significantly large difference between the annual sales per
square foot for Apple Stores relative to the other four companies listed.

[Using the template: Basic Statistics.xls]

Measures of Central tendency
Mean 1720.2    Median 930

Measures of Dispersion
If the data is of a: Sample / Population
Variance 1987680.96
St. Dev. 1409.8514
1-80.

Mean = 99.039
sd = .4366
var = .1907
Median = 99.155

1-81.

Mean = 17.587
sd = .466
var = .2172
Measures of Central tendency
Mean 17.5875    Median 17.5    Mode 18.3

Range 1.4
IQR 0.75

Measures of Dispersion
If the data is of a    Sample        Population
Variance               0.21716667    0.20359375
St. Dev.               0.46601144    0.45121364

1-82.

Mean = 259.82
sd = 357.24

(Using the template: Basic Statistics.xls)

Measures of Central tendency
Mean 259.82    Median 9.5

Measures of Dispersion
If the data is of a: Sample / Population
Variance 127622.462
St. Dev. 357.242861


1-83.

Mean = 37.17
sd = 13.128
Median = 34
Measures of Central tendency
Mean 37.166667    Median 34

Measures of Dispersion
If the data is of a: Sample / Population
Variance 172.333333
St. Dev. 13.1275791

1-84. Stock Prices for period: April, 2001 through June, 2001 [Answers will vary due to dates
used.]
a). Mean and Standard Deviation for Wal-Mart
Basic Statistics from Raw Data

Stock Prices: Wal-Mart

Measures of Central tendency
Mean 51.041478    Median 51.1266    Mode 50.158

Range 6.1911
IQR 1.9613

Measures of Dispersion
If the data is of a    Sample        Population
Variance               2.25711298    2.22128579
St. Dev.               1.50236912    1.49039786

Higher Moments
If the data is of a    Sample        Population
Skewness               0.07083784    0.06913994
(Relative) Kurtosis    -0.711512     -0.7500338

b). Mean and Standard Deviation for K-Mart

Basic Statistics from Raw Data

Stock Prices: K-Mart

Measures of Central tendency
Mean 10.450952    Median 10.66    Mode 11.8

Range 3.51
IQR 1.955

Measures of Dispersion
If the data is of a    Sample        Population
Variance               0.9852023     0.96956417
St. Dev.               0.99257358    0.9846645

Higher Moments
If the data is of a    Sample        Population
Skewness               -0.4070262    -0.3972703
(Relative) Kurtosis    -1.132009     -1.1378913

c). Coefficient of variation:


CV = std. dev. / mean
For Wal-Mart:
considering the data as a population:
CV = 1.49039786 / 51.041478 = 0.0292
considering the data as a sample:
CV = 1.50236912 / 51.041478 = 0.02943

For K-Mart:
considering the data as a population:
CV = 0.9846645 / 10.450952 = 0.0942
considering the data as a sample:
CV = 0.99257358 / 10.450952 = 0.09497

d). There is a greater degree of risk in the stock prices for K-Mart than for Wal-Mart over
this three month period.
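
The coefficient of variation in part c is a single division, so the comparison in part d is easy to reproduce. The sketch below uses the sample standard deviations and means reported in this problem (the DJIA figures appear in part e); the function and dictionary names are introduced here for illustration.

```python
def coefficient_of_variation(std_dev, mean_value):
    """Standard deviation expressed as a fraction of the mean."""
    return std_dev / mean_value


# (sample standard deviation, mean) as reported in Problem 1-84
reported = {
    "Wal-Mart": (1.50236912, 51.041478),
    "K-Mart": (0.99257358, 10.450952),
    "DJIA": (431.350905, 10681.11),
}

cv = {name: coefficient_of_variation(sd, m) for name, (sd, m) in reported.items()}
for name in sorted(cv, key=cv.get):            # least to most relative variability
    print(name, round(cv[name], 5))
```

Because the CV is unit-free, it allows a stock priced near $10 and an index near 10,000 to be compared on the same footing.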
e). For DJIA

considering the data as a population:


CV = 427.913791 / 10681.11 = 0.04006
considering the data as a sample:
CV = 431.350905 / 10681.11 = 0.04038

Wal-Mart stocks provided a less risky return for this time period relative to DJIA and K-Mart.
f). 100 Shares of Wal-Mart stocks purchased April 2, 2001:
Price = $50.5674 Cost = $5056.74
Mean of holding 100 shares: $5104.15
Std dev of holding 100 shares: $149.04 (rounded: if data considered a population)
$150.24 (rounded: if data considered a sample)
1-85.

a). for a process mean = 2004

VARP = Average SSD2004 + offset^2
VARP = 3.5 + offset^2
where offset = target - process
b). if target = process, then offset = 0
substituting: VARP = 3.5 + offset^2 = 3.5 + 0^2 = 3.5
1-86.

a) & b): CPI and Gas prices for period: June 1997 through May 2001. (Non-seasonally
adjusted series.)

CPI index converted (divided by 100) in order to compare both series on the same chart. There is no
seasonal pattern present in the CPI index. Steady trend present in CPI; considerable
variability in gas prices. Gas prices increased considerably more than the overall CPI for
the same time period.


1-87.

a). Pie Chart: AIDS cases by Age groups


Age Group            No.       %
Under 5              6812      0.90%
Ages 5 to 12         1992      0.26%
Ages 13 to 19        3865      0.51%
Ages 20 to 24        26518     3.52%
Ages 25 to 29        99587     13.21%
Ages 30 to 34        168723    22.38%
Ages 35 to 39        168778    22.39%
Ages 40 to 44        124398    16.50%
Ages 45 to 49        72128     9.57%
Ages 50 to 54        38118     5.06%
Ages 55 to 59        20971     2.78%
Ages 60 to 64        11636     1.54%
Ages 65 or older     10378     1.38%

[Pie chart: AIDS cases by age; slices labeled with the age groups and percentages above]

b). Pie Chart: AIDS cases by Race


Race                             No.       %
White, not Hispanic              324822    43.09%
Black, not Hispanic              282720    37.50%
Hispanic                         137575    18.25%
Asian/Pacific Islander           5546      0.74%
American Indian/Alaska Native    2234      0.30%
Race/ethnicity unknown           1010      0.13%

[Pie chart: AIDS cases by Race; slices labeled with the groups and percentages above]

1-88. (Using the template: Box Plot 2.xls)


Comparing two data sets using Box Plots: Salaries 2004

            Lower Whisker   Lower Hinge   Median    Upper Hinge   Upper Whisker
Cubs        300000          650000        1550000   5750000       9500000
White Sox   301000          340000        775000    3875000       8000000

Outliers: Cubs: Sosa's salary of $16M
White Sox: Ordonez's salary of $14M
Furthermore, the median salary of the Cubs is twice the median salary of the White Sox. There
are some players on both teams making the league minimum salary.
The salary range is somewhat lower for the White Sox than for the Cubs: only seven (7) players
on the Cubs were paid $500,000 or less, while eleven (11) players earned less than that amount
on the White Sox.


1-89.

[Using the Basic Statistics.xls]

Measures of Central tendency
Mean 5.1477778    Median 5.35

Measures of Dispersion
If the data is of a: Sample / Population
Variance 0.36249444
St. Dev. 0.60207512

Case 1: NASDAQ Volatility


1) NASDAQ Combined Composite Index for 2000
[Using template: Time Plot 2.xls]
NASDAQ Composite Index

Month   2000
Jan     3940.35
Feb     4696.69
Mar     4572.83
Apr     3860.66
May     3400.91
Jun     3966.11
Jul     3766.99
Aug     4206.35
Sep     3672.82
Oct     3369.63
Nov     2597.93
Dec     2470.52

[Time plot: NASDAQ Composite Index, 2000; vertical axis from 0 to 5000; horizontal axis Jan through Dec]


2) Compare 2006 with 2007. [Please note: at the time of printing, data for 2007 was available only
through close on 5/25/07.]
Plots suggest there may be more volatility in 2006.
Standard deviation for 2006 = 105.3317
Standard deviation for 2007 = 82.3060

[Using template: Time Plot 2.xls]

NASDAQ Composite Index

Month   2006      2007
Jan     2305.82   2463.93
Feb     2281.39   2416.15
Mar     2339.79   2421.64
Apr     2322.57   2525.09
May     2178.88   2604.52
Jun     2172.09   2588.96
Jul     2091.47
Aug     2183.75
Sep     2258.43
Oct     2366.71
Nov     2431.77
Dec     2415.29

3) Comparison of NASDAQ with S&P 500 Index for 2007


[Using template: Time Plot 2.xls]

Comparison using Time Plot NASDAQ vs S&P for 2007

Month   S&P       NASDAQ
Jan     1438.24   2463.93
Feb     1406.82   2416.15
Mar     1420.86   2421.64
Apr     1482.37   2525.09
May     1530.62   2604.52
Jun     1502.56   2588.96

There was more volatility in the NASDAQ Index in 2007 than in the S&P 500 Index in 2007.
Standard deviation for NASDAQ in 2007 = 82.3060
Standard deviation for S&P 500 in 2007 = 49.1033
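
The two standard deviations quoted for 2007 can be reproduced from the six monthly values listed in the case. The sketch below assumes the sample formula (n - 1 divisor), which is what matches the reported figures; the variable names are chosen here for illustration.

```python
from statistics import stdev

# Monthly index values for 2007 (Jan through Jun) as listed in the case
nasdaq_2007 = [2463.93, 2416.15, 2421.64, 2525.09, 2604.52, 2588.96]
sp500_2007 = [1438.24, 1406.82, 1420.86, 1482.37, 1530.62, 1502.56]

nasdaq_sd = stdev(nasdaq_2007)   # sample standard deviation
sp500_sd = stdev(sp500_2007)

print(round(nasdaq_sd, 4))
print(round(sp500_sd, 4))
```

These are standard deviations in index points, so they depend on the level of each index as well as on how much it moved.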
4) Comparison of the NASDAQ with DJIA for 2000

[Using template: Time Plot 2.xls]

Comparison using Time Plot NASDAQ vs DJI for 2007

Month   DJI        NASDAQ
Jan     12621.69   2463.93
Feb     12268.63   2416.15
Mar     12354.35   2421.64
Apr     13062.91   2525.09
May     13627.64   2604.52
Jun     13360.26   2588.96

There was more volatility in the DJI Index in 2007 than in the NASDAQ Index.
Standard deviation for NASDAQ in 2007 = 82.3060
Standard deviation for DJIA in 2007 = 554.948
5). Answers will vary given date of assignment.
