
Unit 2: Basics: Data Description

Leo's Data Mine


Below is an outline of the topics covered in Unit 2: Basics: Data Description.

There are three homework problems associated with this unit, one for each major area. There is one tab for each problem in this workbook. Once you have completed the unit online, you should attempt these problems and submit them to your coach for assessment and feedback.

Describing and Summarizing Data
Working with Data
Histograms
Outliers
Summary
Creating Histograms in Excel
Central Values for Data
The Mean
The Median
The Mode
Summary
Finding The Mean In Excel
Finding The Median In Excel
Finding The Mode In Excel
Variability
The Standard Deviation
Calculating
Interpreting
Summary
Finding in Excel
The Coefficient of Variation
Summary
Applying Data Analysis
Pricing the Scuba Schools
Exercise 1: VA Linux Stock Bonanza
Exercise 2: Employee Turnover
Exercise 3: Honidew Internship
Exercise 4: Scuba Regulations
Exercise 5: Fluctuations in Energy Prices
Exercise 6: Big Mart Personal Care Products
Summary
Relationships Between Variables
Two Variables
Variable and Time
False Relationships
Hidden Variables
Summary
Creating Scatter Diagrams in Excel
Correlation
Influence of Outliers
Summary
Finding in Excel
Occupancy and Arrivals
Exercise 1: The Effectiveness of Search Engines
Exercise 2: Education and Income
Bin Frequency
99 4
109 3
119 4
129 4
139 5
149 7
159 6
169 5
179 4
189 3
199 2
209 2
219 1
More 0

[Histogram: July 2009 Electricity Cost, One Bedroom Apartment in Hialeah; x-axis: Utility Charge in $; y-axis: Number of Apartments (Frequency)]

The largest percentage of observations (7 of 50, or 14%) lie in the bin with the upper boundary $149, that is, above $139 and up to $149.

These data represent the cost of electricity during July 2009 for a random sample of 50 one-bedroom apartments in Hialeah.

Form a frequency distribution (Histogram) that has class intervals with the upper class boundaries of $99, $109, $119, and so on.

Q-1a: Following the procedure above, between what two amounts do the largest percentage of observations lie?

Utility Charge Bin Range
96 99
171 109
202 119
178 129
147 139
102 149
153 159
197 169
127 179
82 189
157 199
185 209
90 219
116
172
111
148
213
130
165
141
149
206
175
123
128
144
168
109
167
95
163
150
154
130
143
187
166
139
149
108
119
183
151
114
135
191
137
129
158
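
The frequency table for Q-1a can also be cross-checked outside Excel. The following is a minimal Python sketch, assuming the same rule as Excel's Histogram tool (each value is counted in the first bin whose upper boundary is greater than or equal to it, with anything above the last boundary going to "More"). The names `charges`, `bins`, and `bin_frequencies` are illustrative and not part of the workbook.

```python
from bisect import bisect_left

# Utility charges in $ for the 50 sampled apartments, copied from the workbook column.
charges = [
    96, 171, 202, 178, 147, 102, 153, 197, 127, 82,
    157, 185, 90, 116, 172, 111, 148, 213, 130, 165,
    141, 149, 206, 175, 123, 128, 144, 168, 109, 167,
    95, 163, 150, 154, 130, 143, 187, 166, 139, 149,
    108, 119, 183, 151, 114, 135, 191, 137, 129, 158,
]

# Upper class boundaries from the Bin Range column: 99, 109, ..., 219.
bins = list(range(99, 220, 10))


def bin_frequencies(values, upper_bounds):
    """Count values per bin; the extra last slot plays the role of Excel's "More" row."""
    counts = [0] * (len(upper_bounds) + 1)
    for value in values:
        # bisect_left returns the index of the first boundary >= value.
        counts[bisect_left(upper_bounds, value)] += 1
    return counts


frequencies = bin_frequencies(charges, bins)

for label, count in zip([str(b) for b in bins] + ["More"], frequencies):
    print(f"{label:>4} {count}")

# Index of the bin holding the most observations (ignoring the "More" slot).
top = max(range(len(bins)), key=lambda i: frequencies[i])
lower = bins[top - 1] if top > 0 else None
print(f"Most observations fall above {lower} and up to {bins[top]}")
```

Under these assumptions the counts should match the Bin/Frequency table in this tab, which makes it easy to see which class interval holds the largest share of the sample.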
ID Num  Gender  Age  Height  Class  Major  GPA  Expected Salary  Employment Status  Number of Affiliations  Satisfaction Advisement
ID01 m 19 69 so mr 3.19 40 un 0 2
ID02 m 21 67 sr m 3.11 50 pt 0 2
ID03 m 20 68 jr ef 3.02 50 pt 0 5
ID04 m 18 79 fr ef 4.00 50 pt 0 5
ID05 m 19 67 so m 2.75 40 pt 1 1
ID06 m 21 70 jr a 3.24 60 pt 2 5
ID07 m 20 68 jr ef 2.93 50 un 0 4
ID08 m 21 71 jr m 3.26 40 pt 0 1
ID09 f 20 62 so mr 3.21 45 pt 0 4
ID10 m 19 70 so a 3.23 50 pt 0 6
ID11 m 36 67 so a 3.77 60 pt 1 4
ID12 f 19 65 so a 3.71 40 un 0 5
ID13 f 20 65 jr a 3.20 45 pt 3 5
ID14 f 21 65 jr mr 2.94 40 pt 0 4
ID15 f 19 66 so mr 3.22 40 pt 0 3
ID16 m 20 69 jr un 3.34 60 pt 0 5
ID17 f 19 64 fr ib 3.09 40 un 1 4
ID18 m 20 67 jr mr 3.72 50 pt 2 4
ID19 m 23 70 jr ef 2.50 50 un 0 2
ID20 m 20 70 so ef 2.74 60 un 0 4
ID21 f 20 63 so mr 3.55 60 pt 2 5
ID22 f 19 67 so m 3.00 45 pt 0 3
ID23 f 19 65 so mr 3.62 40 pt 0 3
ID24 f 20 63 jr m 2.60 40 pt 1 3
ID25 f 22 63 sr ef 3.63 50 pt 3 6
ID26 f 21 65 sr o 2.38 40 pt 2 4
ID27 m 21 73 jr m 2.45 40 pt 0 2
ID28 m 30 71 jr m 3.28 50 pt 0 5
ID29 f 20 66 so ib 3.18 50 un 1 5
ID30 m 24 62 so a 3.33 55 pt 0 4
ID31 f 19 69 so mr 2.87 30 pt 0 3
ID32 f 33 67 sr a 3.14 45 ft 0 5
ID33 f 19 64 fr ib 3.44 45 pt 1 6
ID34 m 20 72 so m 3.85 60 pt 1 1
ID35 f 22 61 jr o 3.50 45 un 0 7
ID36 m 21 69 so ef 2.92 55 pt 0 5
ID37 f 19 60 fr a 2.80 55 pt 0 3
ID38 f 21 66 jr mr 2.67 40 pt 0 3
ID39 m 20 69 so a 2.65 45 un 0 3
ID40 f 20 63 so is 2.88 50 un 1 4
ID41 f 19 65 so ef 3.43 50 pt 0 3
ID42 f 21 63 jr ef 3.48 60 pt 0 5
ID43 m 20 68 so a 2.91 45 pt 1 4
ID44 m 19 72 so a 2.75 50 pt 0 5
ID45 m 22 69 jr is 3.62 55 un 2 4
ID46 m 21 68 jr m 2.42 35 pt 1 3
ID47 f 22 66 jr mr 2.76 40 pt 0 3
ID48 m 19 69 fr un 3.10 45 pt 0 4
ID49 m 20 68 so is 2.61 40 pt 1 3
ID50 f 20 66 so is 3.13 45 pt 0 2
Spending on Textbooks (column L of the survey table, one value per student, ID01 to ID50)
550
400
450
360
500
650
500
500
350
300
200
550
425
600
600
400
250
350
400
400
500
600
400
500
1000
300
450
550
600
400
700
500
350
450
600
400
450
800
400
375
400
500
350
525
400
450
500
400
450
500

Fifty undergraduate students answered a survey. The results are shown here.

Q-2a: What is the average expected salary of these undergraduates upon graduation? (Round to two decimals)
Q-2b: What is the median amount spent on textbooks? (Round to two decimals)

Q-2c: What is the mode with respect to the number of affiliations to clubs and other organizations of these undergraduates?

Q-2d: What is the standard deviation of these students' GPAs? (Round to two decimals)

Q-2e: Using coefficients of variation, tell me which has more variation, the age of the students in this sample or the amounts that they spent on textbooks.

Q2-a Average expected salary of recent undergraduates: $47.30 =AVERAGE(H2:H51)
Q2-b Median amount spent on textbooks: $450.00 =MEDIAN(L2:L51)
Q2-c Mode with respect to the number of affiliations to clubs and other organizations of these undergraduates: 0 =MODE(J2:J51)
Q2-d Standard deviation of students' GPAs: 0.40 =STDEV(G2:G51)
Q2-e The amount spent on textbooks has a larger Coefficient of Variation

                          Age of Students    Amount Spent on Textbooks
Standard Deviation        3.34               135.98
Mean                      20.96              470.70
Coefficient of Variation  0.16               0.29

Formulas used:
                          Age of Students    Amount Spent on Textbooks
Standard Deviation        =STDEV(C2:C51)     =STDEV(L2:L51)
Mean                      =AVERAGE(C2:C51)   =AVERAGE(L2:L51)
Coefficient of Variation  =P37/P38           =Q37/Q38
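
For readers who want to check these figures without Excel, here is a minimal Python sketch using the standard `statistics` module. It assumes that `statistics.stdev` (sample standard deviation) corresponds to Excel's STDEV, and that the coefficient of variation is the standard deviation divided by the mean, as in the formulas above. The list names and the helper `coefficient_of_variation` are illustrative only; the values are copied from the Expected Salary, Number of Affiliations, Age, and Spending on Textbooks columns.

```python
import statistics

# Columns copied from the survey table (50 students each).
expected_salary = [
    40, 50, 50, 50, 40, 60, 50, 40, 45, 50, 60, 40, 45, 40, 40, 60, 40,
    50, 50, 60, 60, 45, 40, 40, 50, 40, 40, 50, 50, 55, 30, 45, 45, 60,
    45, 55, 55, 40, 45, 50, 50, 60, 45, 50, 55, 35, 40, 45, 40, 45,
]
affiliations = [
    0, 0, 0, 0, 1, 2, 0, 0, 0, 0, 1, 0, 3, 0, 0, 0, 1, 2, 0, 0, 2, 0, 0, 1, 3,
    2, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 2, 1, 0, 0, 1, 0,
]
ages = [
    19, 21, 20, 18, 19, 21, 20, 21, 20, 19, 36, 19, 20, 21, 19, 20, 19,
    20, 23, 20, 20, 19, 19, 20, 22, 21, 21, 30, 20, 24, 19, 33, 19, 20,
    22, 21, 19, 21, 20, 20, 19, 21, 20, 19, 22, 21, 22, 19, 20, 20,
]
textbooks = [
    550, 400, 450, 360, 500, 650, 500, 500, 350, 300, 200, 550, 425,
    600, 600, 400, 250, 350, 400, 400, 500, 600, 400, 500, 1000, 300,
    450, 550, 600, 400, 700, 500, 350, 450, 600, 400, 450, 800, 400,
    375, 400, 500, 350, 525, 400, 450, 500, 400, 450, 500,
]

mean_salary = statistics.mean(expected_salary)     # counterpart of AVERAGE
median_textbooks = statistics.median(textbooks)    # counterpart of MEDIAN
mode_affiliations = statistics.mode(affiliations)  # counterpart of MODE


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    return statistics.stdev(values) / statistics.mean(values)


cv_age = coefficient_of_variation(ages)
cv_textbooks = coefficient_of_variation(textbooks)

print(f"Mean expected salary:  {mean_salary:.2f}")
print(f"Median textbook spend: {median_textbooks:.2f}")
print(f"Mode of affiliations:  {mode_affiliations}")
print(f"CV age: {cv_age:.2f}, CV textbooks: {cv_textbooks:.2f}")
```

Because the coefficient of variation is unitless, it allows ages (in years) and textbook spending (in dollars) to be compared directly, which is the point of Q-2e.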
City Overall Cost Rent Coffee Hamburger Dry Cleaning Toothpaste Movie Tickets
Tokyo 130.7 4536 4.76 5.99 10.48 2.02 32.66
London 119.0 3019 3.11 7.62 13.30 3.07 28.41
New York 100.0 3500 3.30 5.75 8.80 3.12 20.00
Sydney 91.8 1381 2.42 4.45 8.28 2.73 20.71
Chicago 84.5 2300 2.10 4.99 9.99 3.23 18.00
San Francisco 84.3 2100 3.52 5.29 6.50 2.48 19.50
Boston 76.4 1750 2.90 4.39 5.25 2.05 18.00
Atlanta 72.9 1250 1.71 3.70 5.95 2.24 16.00
Toronto 71.8 1383 2.11 4.62 7.37 1.66 18.05
Rio de Janeiro 59.3 1366 0.94 2.99 6.32 1.38 9.90

Q-3a

[Scatter diagram: Cost Comparison of Hamburger to Movie Tickets in Select Cities; axis titles as labeled in the chart: Hamburgers (vertical axis, 0.00 to 35.00) and Movie Tickets (horizontal axis, 2.00 to 8.00)]

There is a strong positive, upward-sloping and near-linear relationship between the cost of a hamburger and the cost of a movie ticket.
Correlation coefficient: 0.83 =CORREL(E2:E11,H2:H11)

Q-3b Correlation coefficient between the cost of dry cleaning a man's suit and the cost of coffee in these ten cities

Correlation coefficient: 0.42 =CORREL(F2:F11,D2:D11)

There exists a moderately positive, upward-sloping and semi-linear relationship between the cost of dry cleaning a man's suit and the cost of coffee.

In this data set, we have the cost of various items in ten different cities around the world.

Q-3a: Create a scatter diagram relating the cost of a Hamburger Meal to the cost of a couple of Movie Tickets in these cities. Put Hamburgers on the x-axis. Is the relationship positive, negative, strong, moderately strong, moderately weak, or simply weak?

Q-3b: What is the correlation coefficient between the cost of dry cleaning a man's suit and the cost of coffee in these ten cities?
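
The two correlation coefficients can be recomputed from the city table with a short Python sketch. It assumes the Pearson product-moment correlation, which is what Excel's CORREL computes; the function name `pearson_r` and the list names are illustrative only, and the values are copied from the Coffee, Hamburger, Dry Cleaning, and Movie Tickets columns in city order (Tokyo to Rio de Janeiro).

```python
import math

# Columns copied from the ten-city table, in the same row order.
coffee = [4.76, 3.11, 3.30, 2.42, 2.10, 3.52, 2.90, 1.71, 2.11, 0.94]
hamburger = [5.99, 7.62, 5.75, 4.45, 4.99, 5.29, 4.39, 3.70, 4.62, 2.99]
dry_cleaning = [10.48, 13.30, 8.80, 8.28, 9.99, 6.50, 5.25, 5.95, 7.37, 6.32]
movie_tickets = [32.66, 28.41, 20.00, 20.71, 18.00, 19.50, 18.00, 16.00, 18.05, 9.90]


def pearson_r(xs, ys):
    """Pearson correlation: co-deviation sum over the root of the product of squared-deviation sums."""
    if len(xs) != len(ys):
        raise ValueError("both series must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sq_dev_x = sum((x - mean_x) ** 2 for x in xs)
    sq_dev_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(sq_dev_x * sq_dev_y)


r_hamburger_movie = pearson_r(hamburger, movie_tickets)  # Q-3a pairing
r_dryclean_coffee = pearson_r(dry_cleaning, coffee)      # Q-3b pairing

print(f"Hamburger vs. movie tickets: {r_hamburger_movie:.2f}")
print(f"Dry cleaning vs. coffee:     {r_dryclean_coffee:.2f}")
```

The results should agree with the CORREL values reported in the answers. Note that correlation is symmetric, so swapping the order of the two columns does not change the coefficient.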
