
Math 533 Final Exam Review

1) Given the data set below

Bowling scores from the Sunday evening league are listed below:

78 103

67 156

112 86

109 190

78 255

97 76

112 130

45 180

43 112

112 115

117 112

125 145

179 180

a) Find the mean, median, mode and standard deviation.
b) Find the first (Q1), second (Q2) and third (Q3) quartiles.
c) In the context of this situation, interpret the median, Q1 and Q3.
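As a cross-check for parts a) and b), the summary statistics can be sketched with Python's standard statistics module. This is a minimal sketch, not the course's prescribed method: it assumes the 26 scores are read row by row from the listing above and that the sample (n - 1) standard deviation is wanted. The quartile rule used here (position (n + 1)p, `method="exclusive"`) is also an assumption; other software may use a different convention and give slightly different Q1 and Q3.

```python
import statistics

# Bowling scores, read row by row from the listing above (26 values).
scores = [78, 103, 67, 156, 112, 86, 109, 190, 78, 255, 97, 76, 112,
          130, 45, 180, 43, 112, 112, 115, 117, 112, 125, 145, 179, 180]

mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)
stdev = statistics.stdev(scores)  # sample standard deviation (divides by n - 1)

# Quartiles using the (n + 1) * p position rule; other conventions exist.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="exclusive")

print(f"mean={mean:.2f} median={median} mode={mode} stdev={stdev:.2f}")
print(f"Q1={q1} Q2={q2} Q3={q3}")
```

For part c), Q1 is the score below which roughly a quarter of the league's scores fall, and Q3 the score below which roughly three quarters fall.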

2) Carter blood mobile tested 335 people for their blood type and Rh factor. The following table represents their findings:

              Type A   Type B   Type AB   Type O   Total
Rh negative       20       30         5       75     130
Rh positive       45       25        15      120     205
Total             65       55        20      195     335

a) What is the probability that a person will have type O blood? ____________________________
b) What is the probability that a person will have a negative Rh factor? ________________________
c) What is the probability of having type B blood and a positive Rh factor? ______________
d) What is the probability of having type AB or type B? _______________
e) What is the probability that, given a person has type B blood, their Rh factor will be negative? _______
f) What is the probability that, given a person has a positive Rh factor, they will be type A? ____________
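The six probabilities can be read directly off the contingency table. The following is a minimal sketch, assuming the table layout reconstructed from the totals (130 Rh negative, 205 Rh positive, 335 overall); the dictionary and helper names are illustrative only.

```python
from fractions import Fraction

# Counts from the table: counts[rh_factor][blood_type].
counts = {
    "neg": {"A": 20, "B": 30, "AB": 5, "O": 75},
    "pos": {"A": 45, "B": 25, "AB": 15, "O": 120},
}
total = sum(sum(row.values()) for row in counts.values())

def type_total(blood_type):
    """Column total: everyone with the given blood type."""
    return counts["neg"][blood_type] + counts["pos"][blood_type]

def rh_total(rh):
    """Row total: everyone with the given Rh factor."""
    return sum(counts[rh].values())

p_o = Fraction(type_total("O"), total)                             # a) marginal
p_neg = Fraction(rh_total("neg"), total)                           # b) marginal
p_b_and_pos = Fraction(counts["pos"]["B"], total)                  # c) joint
p_ab_or_b = Fraction(type_total("AB") + type_total("B"), total)    # d) disjoint union
p_neg_given_b = Fraction(counts["neg"]["B"], type_total("B"))      # e) conditional
p_a_given_pos = Fraction(counts["pos"]["A"], rh_total("pos"))      # f) conditional

for name, value in [("P(O)", p_o), ("P(Rh-)", p_neg), ("P(B and Rh+)", p_b_and_pos),
                    ("P(AB or B)", p_ab_or_b), ("P(Rh-|B)", p_neg_given_b),
                    ("P(A|Rh+)", p_a_given_pos)]:
    print(f"{name} = {value} = {float(value):.4f}")
```

Note that the conditional probabilities in e) and f) divide by the column or row total of the given condition, not by 335.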

3) The 2011 Youth and Money Survey, sponsored by the American Savings Education Council, talked to 1000 students, ages 16-22, about personal finance. The survey found that 33% of the students have their own credit card. If you ask 10 high school students if they have a credit card, what is the probability that
a) at least 6 have a credit card _______________
b) exactly 6 have a credit card .054652
c) less than 6 have a credit card _____________

 x       p(x)
 0   0.018228
 1   0.089782
 2   0.198993
 3   0.261365
 4   0.225281
 5   0.133151
 6   0.054652
 7   0.015382
 8   0.002841
 9   0.000311
10   0.000015
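The p(x) column is a binomial distribution with n = 10 and p = 0.33, so parts a) and c) follow by summing the relevant rows. A minimal sketch, assuming the 10 answers are independent with the same 33% chance each:

```python
from math import comb

n, p = 10, 0.33  # 10 students asked, each has a card with probability 0.33

def pmf(x):
    """Binomial probability of exactly x students having a card."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

p_exactly_6 = pmf(6)
p_at_least_6 = sum(pmf(x) for x in range(6, n + 1))  # x = 6, 7, 8, 9, 10
p_less_than_6 = 1 - p_at_least_6                     # complement: x = 0, ..., 5

print(f"P(X = 6)  = {p_exactly_6:.6f}")
print(f"P(X >= 6) = {p_at_least_6:.6f}")
print(f"P(X < 6)  = {p_less_than_6:.6f}")
```

Parts a) and c) are complements, so their answers must add to 1.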

4) The mean amount spent per child on back-to-school clothes in August 2011 was $727. Assume the standard deviation is $160 and that the amount spent approximates a normal distribution.
a) Find the probability that the amount spent exceeds 850 dollars = 0.2210
b) Find the probability that the amount spent is between 500 and 700 dollars = 0.3550
c) How much would a parent have to spend so that they are in the bottom 20%? = 592.3
[Distribution plot for part a): Normal, Mean = 727, StDev = 160. Horizontal axis: X; vertical axis: Density. The shaded area to the right of X = 850 is 0.2210.]
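The three answers for problem 4 can be reproduced with the normal CDF and its inverse. A minimal sketch using the standard library's NormalDist, assuming the stated mean of 727 and standard deviation of 160:

```python
from statistics import NormalDist

spend = NormalDist(mu=727, sigma=160)

p_above_850 = 1 - spend.cdf(850)                # a) right-tail area
p_500_to_700 = spend.cdf(700) - spend.cdf(500)  # b) area between two values
bottom_20_cutoff = spend.inv_cdf(0.20)          # c) 20th percentile

print(f"P(X > 850)       = {p_above_850:.4f}")
print(f"P(500 < X < 700) = {p_500_to_700:.4f}")
print(f"20th percentile  = {bottom_20_cutoff:.1f}")
```

Part c) runs the calculation in reverse: it starts from an area (0.20) and returns the dollar amount that cuts off that area on the left.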

5) Firestone claims that their tires will last a very long time due to the high quality of the manufacturing process. The sample below represents the population of tires at the Michigan plant.
Sample size = 75
Sample mean = 32000 miles
Sample standard deviation = 5500 miles

a) Construct a 95% confidence interval for the average life of the tires.

One-Sample Z
The assumed standard deviation = 5500

 N   Mean  SE Mean          95% CI
75  32000      635  (30755, 33245)

b) Interpret the interval.
I am 95% confident that the true mean tire life for the population falls between 30755 and 33245 miles.

c) How large a sample will need to be taken if we wish to have a 95% confidence interval with a margin of error of 2000 miles?

Results
Margin of Error  Sample Size
           2000           32
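The interval in part a) is the sample mean plus or minus z times the standard error, and part c) inverts the margin-of-error formula for n. The sketch below assumes the standard deviation of 5500 is treated as known (a z interval). Under that assumption the textbook formula n = (z * sigma / E)^2 rounds up to 30, not the 32 shown in the results above; a t-based calculation would give a somewhat larger n, which may be what the software did, but that is an assumption.

```python
from math import ceil, sqrt
from statistics import NormalDist

n, xbar, sigma = 75, 32000, 5500

z = NormalDist().inv_cdf(0.975)   # about 1.96 for a 95% interval
se = sigma / sqrt(n)              # standard error of the mean
margin = z * se
ci = (xbar - margin, xbar + margin)

# Sample size for a target margin of error, z-based formula (sigma assumed known).
target_margin = 2000
n_needed_z = ceil((z * sigma / target_margin) ** 2)

print(f"SE = {se:.0f}, 95% CI = ({ci[0]:.0f}, {ci[1]:.0f})")
print(f"z-based sample size for E = {target_margin}: {n_needed_z}")
```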

6) There are 3000 students at Richardson High School. You survey a sample of them to find out how many plan to go to college. Out of a sample of 150 students, 78 say they plan to go to college.
Proportion = 78/150 = .52

a) Construct a 99% confidence interval for the percentage of students that plan to go to college.

Test and CI for One Proportion

Sample   X    N  Sample p                95% CI
1       78  150  0.520000  (0.437018, 0.602177)

b) Interpret the interval.
We are 95% confident that the percentage of students that plan to go to college is between 43.7% and 60.2%.

c) How many students should be sampled in order to be 99% confident of being within 2% of the actual population percentage of students that want to go to college?

Results
Margin of Error  Sample Size
           0.02         4196
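Part a) asks for a 99% interval, while the output shown is labeled 95%, so it can help to compute both levels side by side. This is a minimal sketch using the normal-approximation (Wald) interval, which is an assumption: the bounds in the software output differ slightly from it, and the output's sample size of 4196 is a little larger than the approximation's, which suggests a different (possibly exact) method was used there.

```python
from math import ceil, sqrt
from statistics import NormalDist

x, n = 78, 150
p_hat = x / n

def wald_ci(level):
    """Normal-approximation confidence interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

ci_95 = wald_ci(0.95)
ci_99 = wald_ci(0.99)

# Sample size for a 2% margin at 99% confidence, using the conservative p = 0.5.
z99 = NormalDist().inv_cdf(0.995)
n_needed = ceil(z99**2 * 0.25 / 0.02**2)

print(f"95% CI: ({ci_95[0]:.4f}, {ci_95[1]:.4f})")
print(f"99% CI: ({ci_99[0]:.4f}, {ci_99[1]:.4f})")
print(f"Sample size (normal approximation, p = 0.5): {n_needed}")
```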

7) A baseball manufacturer has a quality control standard that requires that no more than 20% of all balls have defects. The next shipment of 200 balls is tested and there are 46 defective balls. Does the sample data provide evidence to conclude that the percentage of defective balls is more than 20%? Use α = .01. Use the hypothesis testing procedure outlined below to make your decision.
a) Formulate the null and alternative hypothesis.
b) State the level of significance.
c) Find the critical value (or values) and clearly show the rejection and non-rejection regions.
d) Compute the test statistic.
e) Decide whether you can reject Ho and accept Ha or not.
f) Explain and interpret your conclusion in part e. What does this mean?
g) Find the observed p-value for the hypothesis test and interpret this value. What does it mean?
h) Does the sample data provide evidence with α = .01 that the percentage of defective balls is more than 20%?

Test and CI for One Proportion

Test of p = 0.2 vs p < 0.2

Sample   X    N  Sample p  99% Upper Bound  Z-Value  P-Value
1       46  200  0.230000         0.299226     1.06    0.856

Using the normal approximation.
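The question asks whether the defect rate is more than 20%, which is a right-tailed test (Ha: p > 0.2). The output above was run with the alternative p < 0.2, so its p-value of 0.856 is the left-tail area. The sketch below works the right-tailed version with the same z statistic; it assumes the normal approximation is adequate for n = 200.

```python
from math import sqrt
from statistics import NormalDist

x, n, p0, alpha = 46, 200, 0.20, 0.01
p_hat = x / n

# Test statistic uses the null value p0 in the standard error.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha)  # right-tail critical value
p_value = 1 - std_normal.cdf(z)         # right-tail area, for Ha: p > 0.2
reject_h0 = z > z_crit

print(f"z = {z:.2f}, critical value = {z_crit:.3f}")
print(f"right-tailed p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

The right-tailed p-value is 1 minus the 0.856 in the output. Since z does not exceed the critical value, H0 is not rejected at α = .01.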

8) A car repair shop says that the mean repair cost for a damaged bumper is less than $350. You work for this shop and want to test this claim. You randomly select 15 cars and find the mean cost to be $315 with a standard deviation of $35. At a significance level of .05 (α = .05), does the sample data provide sufficient evidence to conclude that the mean cost of repair is less than $350? Use the hypothesis testing procedure outlined below to make your decision.
a) Formulate the null and alternative hypothesis.
b) State the level of significance.
c) Find the critical value (or values) and clearly show the rejection and non-rejection regions.
d) Compute the test statistic.
e) Decide whether you can reject Ho and accept Ha or not.
f) Explain and interpret your conclusion in part e. What does this mean?
g) Find the observed p-value for the hypothesis test and interpret this value. What does it mean?
h) Does the sample data provide evidence with α = .05 that the average repair cost is less than $350?

One-Sample T
Test of mu = 350 vs < 350

 N    Mean  StDev  SE Mean  95% Upper Bound      T      P
15  315.00  35.00     9.04           330.92  -3.87  0.001
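The T and P values in the output can be traced back to the summary statistics. A minimal sketch, with two assumptions stated plainly: the critical value -1.761 is taken from a t table (α = .05, 14 degrees of freedom), and the p-value is obtained by numerically integrating the t density with Simpson's rule, since the standard library has no t distribution.

```python
from math import gamma, pi, sqrt

n, xbar, s, mu0 = 15, 315.0, 35.0, 350.0
df = n - 1

se = s / sqrt(n)
t_stat = (xbar - mu0) / se

T_CRIT = -1.761  # t table: left tail, alpha = .05, df = 14

def t_pdf(t, df):
    """Density of Student's t distribution."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """CDF via Simpson's rule on [0, |t|], using symmetry about 0."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

p_value = t_cdf(t_stat, df)   # left-tail area for Ha: mu < 350
reject_h0 = t_stat < T_CRIT

print(f"SE = {se:.2f}, t = {t_stat:.2f}, p-value = {p_value:.4f}")
print(f"reject H0: {reject_h0}")
```

Because t falls well inside the rejection region, H0 is rejected at α = .05.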

9) American Express believes that people who travel use their card more. A research firm selected a random sample of 14 cardholders and found out how much they traveled and their credit card charges for the month.
Customer  Miles traveled per year  Dollars charged
       1                     3000              500
       2                     7000             6700
       3                      500              150
       4                     2500              450
       5                     4600              600
       6                     9000             1000
       7                    25000             2500
       8                     4800              670
       9                     9500              980
      10                    12500              700
      11                    14600              800
      12                     4000              200
      13                     5600              650
      14                    15000             1200
[Figure: Scatterplot of Dollars charged vs Miles traveled per year]

Regression Analysis: Dollars charged versus Miles traveled per year

The regression equation is
Dollars charged = 686 + 0.0637 Miles traveled per year

Predictor                   Coef  SE Coef     T      P
Constant                   686.1    751.3  0.91  0.379
Miles traveled per year  0.06373  0.07147  0.89  0.390

S = 1689.69   R-Sq = 6.2%   R-Sq(adj) = 0.0%

Analysis of Variance
Source          DF        SS       MS     F      P
Regression       1   2269717  2269717  0.79  0.390
Residual Error  12  34260654  2855055
Total           13  36530371

Unusual Observations
     Miles traveled  Dollars
Obs        per year  charged   Fit  SE Fit  Residual  St Resid
  2            7000     6700  1132     463      5568     3.43R
  7           25000     2500  2279    1270       221     0.20 X

Predicted Values for New Observations
New Obs  Fit  SE Fit         95% CI          95% PI
      1  845     618  (-501, 2192)  (-3074, 4765)

Values of Predictors for New Observations
New Obs  Miles traveled per year
      1                     2500

_________________________________________________________________________________________________

a. Analyze the above output to determine the regression equation.
Y = 686 + 0.0637 x1
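The coefficients in the output can be reproduced from the 14 data points with the usual least-squares formulas. A minimal sketch in plain Python; the variable names are illustrative and the data are copied from the customer table in this problem.

```python
from math import sqrt

miles = [3000, 7000, 500, 2500, 4600, 9000, 25000,
         4800, 9500, 12500, 14600, 4000, 5600, 15000]
dollars = [500, 6700, 150, 450, 600, 1000, 2500,
           670, 980, 700, 800, 200, 650, 1200]

n = len(miles)
x_bar = sum(miles) / n
y_bar = sum(dollars) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - x_bar) ** 2 for x in miles)
syy = sum((y - y_bar) ** 2 for y in dollars)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(miles, dollars))

b1 = sxy / sxx               # slope
b0 = y_bar - b1 * x_bar      # intercept
r_squared = sxy**2 / (sxx * syy)
r = sqrt(r_squared) if b1 >= 0 else -sqrt(r_squared)  # r takes the sign of the slope

print(f"Y = {b0:.1f} + {b1:.5f} x1")
print(f"R-Sq = {100 * r_squared:.1f}%, r = {r:.3f}")
```

Here syy equals the Total SS in the Analysis of Variance table, which is a quick way to confirm the data were entered correctly.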

b. Find and interpret β1 in the context of this problem.
B1 is the slope of the estimated line = .0637. Therefore, for each additional mile that a customer travels, their charges are estimated to increase by about 6 cents.
B0 is the y-intercept. It implies that a customer who travels 0 miles will have charges of $686. Strictly, 0 miles lies outside the observed values, so this is an extrapolation; still, it is possible for someone to have charges of $686 without traveling, so the intercept has a practical interpretation and could be used for forecasting and interpolation.

c. Find and interpret the coefficient of determination (r-squared).
R-squared = 6.2%, which implies that 6.2% of the variation in charges is explained by the number of miles traveled. The adjusted R-squared is 0.0%, which means that very little of the variation in charges is really explained by travel. The very small sample size (n = 14) may contribute to this.

d. Find and interpret the coefficient of correlation.
r = the square root of R-squared = √0.062 = 0.249, which implies a weak positive correlation (it is positive because the slope is positive).

e. Does the data provide significant evidence (α = .05) that Miles can be used to predict Charges? Test the utility of this model using a two-tailed test. Find the observed p-value and interpret.
H0: B1 = 0 (there is no relationship between x and y)
H1: B1 ≠ 0 (there is a relationship between x and y)
p-value = .390
The p-value is not less than the significance level (α = .05), therefore we cannot reject H0. There is not significant evidence to support a relationship between x and y; Y is not significantly impacted by X.
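The test of model utility rests on the t statistic for the slope, which is the coefficient divided by its standard error. A minimal sketch using the values printed in the regression output; the critical value 2.179 is taken from a t table (two-tailed, α = .05, 12 degrees of freedom) and is the only number here not found in the output.

```python
# Values copied from the "Predictor" table in the regression output.
coef, se_coef = 0.06373, 0.07147
n = 14
df = n - 2          # simple regression: two estimated parameters

t_stat = coef / se_coef
T_CRIT = 2.179      # t table: two-tailed, alpha = .05, df = 12

reject_h0 = abs(t_stat) > T_CRIT

# In simple regression the ANOVA F statistic equals the slope's t squared.
f_stat = t_stat**2

print(f"t = {t_stat:.2f}, |t| > {T_CRIT}: {reject_h0}")
print(f"t squared = {f_stat:.3f} (compare with F in the ANOVA table)")
```

The critical-value decision agrees with the p-value decision: |t| is well below 2.179, matching a p-value of .390 that exceeds α.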

f. Find the 95% confidence interval for the mean charges for a customer who travels 2500 miles. Find the 95% prediction interval. What can we say about the charges for a customer who travels 2500 miles?

The 95% confidence interval is (-501, 2192). We are 95% confident that the mean charges for all customers who travel 2500 miles lie between -501 and 2192. The lower limit of the confidence interval is meaningless because we do not have negative charges (customers never give us money).

The 95% prediction interval is (-3074, 4765). We are 95% confident that the charges of an individual customer who travels 2500 miles will lie between -3074 and 4765 (noting that the lower limit is again meaningless). In practice, this means that an individual customer who travels 2500 miles can be expected to charge no more than about $4765.
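Both intervals can be rebuilt from three numbers in the output: Fit, SE Fit and S. A minimal sketch, assuming the t table value 2.179 (two-tailed, α = .05, 12 degrees of freedom). Because it starts from the rounded values printed in the output, the results agree with the software's intervals only to within a dollar or two.

```python
from math import sqrt

# Values copied from the regression output for the new observation (2500 miles).
fit, se_fit, s = 845.0, 618.0, 1689.69
T_CRIT = 2.179  # t table: two-tailed, alpha = .05, df = 12

# Confidence interval for the MEAN response: uncertainty in the fitted line only.
ci_half = T_CRIT * se_fit
conf_int = (fit - ci_half, fit + ci_half)

# Prediction interval for ONE new customer: adds the scatter about the line (S).
pi_half = T_CRIT * sqrt(s**2 + se_fit**2)
pred_int = (fit - pi_half, fit + pi_half)

print(f"95% CI: ({conf_int[0]:.0f}, {conf_int[1]:.0f})")
print(f"95% PI: ({pred_int[0]:.0f}, {pred_int[1]:.0f})")
```

The prediction interval is always wider than the confidence interval because it includes the variability of individual customers around the mean.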

10) Housing Problem.

Regression Analysis: PRICE versus SQ_FT, BEDS, BATHS, GARAGE

The regression equation is
PRICE = - 25.3 + 0.0401 SQ_FT + 2.06 BEDS + 2.90 BATHS + 18.7 GARAGE

Predictor      Coef   SE Coef      T      P
Constant     -25.27     10.77  -2.35  0.021
SQ_FT      0.040114  0.004061   9.88  0.000
BEDS          2.056     2.134   0.96  0.338
BATHS         2.901     3.242   0.89  0.373
GARAGE       18.729     4.466   4.19  0.000

S = 13.8399   R-Sq = 73.6%   R-Sq(adj) = 72.6%

Analysis of Variance
Source           DF     SS     MS      F      P
Regression        4  55013  13753  71.80  0.000
Residual Error  103  19729    192
Total           107  74742

Source  DF  Seq SS
SQ_FT    1   51197
BEDS     1     249
BATHS    1     197
GARAGE   1    3369

Unusual Observations
Obs  SQ_FT   PRICE     Fit  SE Fit  Residual  St Resid
 10   1707   64.00   92.63    2.40    -28.63    -2.10R
 16   1996   75.21  106.28    3.39    -31.07    -2.32R
 19    838   68.69   41.10    4.70     27.60     2.12R
 52   1608  132.00  114.40    5.66     17.60     1.39 X
 59   1725   69.00   98.31    1.82    -29.31    -2.14R
 63   1794   70.95   98.17    2.92    -27.22    -2.01R
 90   2167   84.90  113.98    2.61    -29.08    -2.14R
 91   2170  115.00  123.17    5.25     -8.17    -0.64 X
 98   2282  150.58  120.65    2.34     29.93     2.19R
104   2380  155.00  124.58    2.63     30.42     2.24R
105   2505  156.90  129.60    3.04     27.30     2.02R
107   2804  192.00  160.32    4.69     31.68     2.43R
108   2809  195.00  160.52    4.70     34.48     2.65R

R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large leverage.

Predicted Values for New Observations
New Obs    Fit  SE Fit          95% CI           95% PI
      1  51.51    5.32  (40.96, 62.05)  (22.10, 80.91)X

X denotes a point that is an outlier in the predictors.

Values of Predictors for New Observations
New Obs  SQ_FT  BEDS  BATHS  GARAGE
      1   1200  2.00   2.00    1.00

a. Analyze the above output to determine the regression equation.

PRICE = - 25.3 + 0.0401 SQ_FT + 2.06 BEDS + 2.90 BATHS + 18.7 GARAGE

b. Find and interpret β1 in the context of this problem.
B0: If a house has no square footage, no bedrooms, no bathrooms and no garage, the predicted price is -25.3 (that is, -$25,300). This value has no practical meaning, because such a house lies outside the observed data.

B1: For every 1 unit increase in square footage there will be a .0401 increase in the price of the house, holding the other variables constant (you will pay about 40.1 dollars for each additional square foot).
B2: For every 1 unit increase in bedrooms there will be a 2.06 increase in price, holding the other variables constant (you will pay about 2060 dollars for each additional bedroom).
B3: For every 1 unit increase in bathrooms there will be a 2.90 increase in price, holding the other variables constant (you will pay about 2900 dollars for each additional bathroom).
B4: For every 1 unit increase in garages there will be an 18.7 increase in price, holding the other variables constant (if you have a garage you will pay about 18700 dollars more).

c. Find and interpret the coefficient of determination (r-squared).
R-Sq = 73.6%

This implies that 73.6% of the variation in the price of the house is explained by the size, the number of bedrooms, the number of bathrooms and whether or not it has a garage.

d. Find and interpret the coefficient of correlation.
R = √0.736 = 0.86. There is a strong positive correlation between the independent variables (size, bedrooms, bathrooms and garage) and the price of the house.
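R-Sq, R-Sq(adj), R and F all come from the sums of squares in the Analysis of Variance table. A minimal sketch that recomputes them from the printed SS and DF values; small differences from the output are due to rounding in the printed table.

```python
from math import sqrt

# Values copied from the Analysis of Variance table.
ss_reg, ss_err, ss_tot = 55013, 19729, 74742
df_reg, df_err = 4, 103
df_tot = df_reg + df_err

r_squared = ss_reg / ss_tot
# Adjusted R-squared penalizes for the number of predictors.
adj_r_squared = 1 - (ss_err / df_err) / (ss_tot / df_tot)
multiple_r = sqrt(r_squared)

# F = mean square regression / mean square error.
f_stat = (ss_reg / df_reg) / (ss_err / df_err)

print(f"R-Sq = {100 * r_squared:.1f}%, R-Sq(adj) = {100 * adj_r_squared:.1f}%")
print(f"R = {multiple_r:.2f}, F = {f_stat:.2f}")
```

The small gap between R-Sq and R-Sq(adj) indicates that the model is not being inflated much by the number of predictors relative to the 108 observations.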

e. Does the data provide significant evidence (α = .05) that the model can be used to predict PRICE? Test the utility of this model. Find the observed p-value and interpret.
H0: B1 = B2 = B3 = B4 = 0 (there is no relationship between the predictors and price)
H1: at least one Bi ≠ 0 (there is a relationship between at least one predictor and price)
p-value = .000
The p-value is less than the significance level (α = .05), therefore we can reject H0. Looking at the individual t-tests, there is significant evidence to support the relationship between SQ_FT (p = 0.000) and GARAGE (p = 0.000) and PRICE.

But there is not significant evidence of a relationship between BEDS (p = 0.338) or BATHS (p = 0.373) and PRICE.

f. Find the 95% confidence interval for the mean price of a house with 1200 square feet, 2 bedrooms, 2 bathrooms and 1 garage. Find the 95% prediction interval. What can we say about the price of such a house?

New Obs    Fit  SE Fit          95% CI           95% PI
      1  51.51    5.32  (40.96, 62.05)  (22.10, 80.91)X
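The fitted value and both intervals for the new observation can be reproduced from the coefficient table, SE Fit and S. A minimal sketch, assuming a t critical value of approximately 1.983 for 103 degrees of freedom (two-tailed, α = .05); this is an approximate table value, so the last digit of each bound may differ slightly from the output.

```python
from math import sqrt

# Coefficients copied from the "Predictor" table.
coef = {"SQ_FT": 0.040114, "BEDS": 2.056, "BATHS": 2.901, "GARAGE": 18.729}
intercept = -25.27

# New observation from "Values of Predictors for New Observations".
house = {"SQ_FT": 1200, "BEDS": 2, "BATHS": 2, "GARAGE": 1}

fit = intercept + sum(coef[name] * value for name, value in house.items())

se_fit, s = 5.32, 13.8399
T_CRIT = 1.983  # approximate t value: two-tailed, alpha = .05, df = 103

ci_half = T_CRIT * se_fit                   # mean price of all such houses
pi_half = T_CRIT * sqrt(s**2 + se_fit**2)   # price of one such house
conf_int = (fit - ci_half, fit + ci_half)
pred_int = (fit - pi_half, fit + pi_half)

print(f"Fit = {fit:.2f}")
print(f"95% CI: ({conf_int[0]:.2f}, {conf_int[1]:.2f})")
print(f"95% PI: ({pred_int[0]:.2f}, {pred_int[1]:.2f})")
```

The X flag in the output marks this house as unusual in its predictor values, so the prediction should be treated with some caution.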