
College Statistics

1. The following table shows the number of animal tracks found in a
park. Find the probability that a randomly selected track belongs to
either a deer or a squirrel.

Opossum   Deer   Squirrel   Raccoon   Dog
15        32     97         32        41

(A) 0.1475
(B) 0.4055
(C) 0.4470
(D) 0.5945
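The arithmetic behind question 1 can be checked with a short sketch. The dictionary name `tracks` is only illustrative; the counts are copied from the table above.

```python
# Track counts copied from the table in question 1.
tracks = {"Opossum": 15, "Deer": 32, "Squirrel": 97, "Raccoon": 32, "Dog": 41}

total = sum(tracks.values())
# A track cannot be both deer and squirrel, so the two counts simply add.
p_deer_or_squirrel = (tracks["Deer"] + tracks["Squirrel"]) / total

print(total, round(p_deer_or_squirrel, 4))
```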

2. A study is to be conducted on the proportion of students at Georgia
Tech who participate in a pool for March Madness. The resulting
confidence interval is supposed to have a margin of error of 0.04 with
95% confidence, and there is no previous study to use as a guideline.
What is the minimum number of subjects that should be sampled?
(A) 25
(B) 307
(C) 601
(D) 1201
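A minimal sketch of the usual sample-size formula for a proportion, n = z² p(1 - p) / E², follows. It assumes the conservative guess p = 0.5, which is the standard choice when no earlier estimate is available; the variable names are illustrative.

```python
import math
from statistics import NormalDist

confidence = 0.95
margin = 0.04
p_guess = 0.5  # assumption: worst-case proportion, since no prior study exists

# Two-sided critical value, roughly 1.96 for 95% confidence.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Round up, because a fractional subject cannot be sampled.
n_min = math.ceil(z**2 * p_guess * (1 - p_guess) / margin**2)
print(n_min)
```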

Questions 3-4: You are guessing at random on an 11-question
multiple choice quiz. Each question has five choices, one of which is
correct.

3. What is the probability of getting 5 or more questions correct?
(A) 0.0117
(B) 0.0504
(C) 0.9496
(D) 0.9883

4. How many questions do you expect to get correct?
(A) 2.2
(B) 4.8
(C) 5
(D) 5.5
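Questions 3 and 4 describe a binomial setting with n = 11 trials and success probability p = 1/5. The sketch below computes the upper-tail probability and the expected value directly from the binomial formula; `binom_pmf` is a helper written for this example, not a library function.

```python
from math import comb

n, p = 11, 1 / 5  # 11 questions, one correct choice out of five

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# P(X >= 5): add up the probabilities for 5, 6, ..., 11 correct answers.
p_at_least_5 = sum(binom_pmf(k, n, p) for k in range(5, n + 1))

# The mean of a binomial random variable is n * p.
expected_correct = n * p

print(round(p_at_least_5, 4), round(expected_correct, 1))
```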

5. A 95% confidence interval for the proportion of people who believe
the Loch Ness Monster exists is (0.6234, 0.7368). What single value
would you use to estimate the true proportion of people who believe
in the monster?
(A) 0.1134
(B) 0.6234
(C) 0.6801
(D) 0.7368

6. All of the following statements about confidence intervals are
correct EXCEPT:
(A) Holding other numbers fixed, increasing the level of confidence will result in a wider
confidence interval.
(B) Holding other numbers fixed, increasing sample size will result in a narrower
confidence interval.
(C) The sample mean / proportion will always be inside the confidence interval.
(D) The population mean / proportion will always be inside the confidence interval.

7. Weight, in pounds, is measured for each person in a sample. After
the data are collected, all the weight measurements are converted
from pounds to kilograms by dividing each measurement by 2.2.
Which of the following statistics will remain the same for both units of
measure?
(A) The z-scores of the weight measurements.
(B) The maximum of the weight measurements.
(C) The standard deviation of the weight measurements.
(D) The median of the weight measurements.
(E) The mean of the weight measurements.

8. The weight of a carton of strawberries has mean of 16 ounces and
standard deviation of 1.5 ounces. What can you say about the
distribution of the mean weight of a random sample of 41 cartons?
(A) Mean = 16, standard error = 1.5, unknown shape
(B) Mean = 16, standard error = 1.5, approximately normal
(C) Mean = 16, standard error = 0.234, unknown shape
(D) Mean = 16, standard error = 0.234, approximately normal
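The standard error quoted in the choices for question 8 comes from σ / √n. A quick sketch, with illustrative variable names:

```python
import math

sigma = 1.5  # population standard deviation, in ounces
n = 41       # cartons per sample

# Standard error of the sample mean.
standard_error = sigma / math.sqrt(n)
print(round(standard_error, 3))
```

With n above 30, the central limit theorem is the usual justification for treating the sampling distribution of the mean as approximately normal.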

9. At a certain school, 41% of the students play soccer, 30% play
volleyball, and 14% play both soccer and volleyball. If a student is
chosen at random, find the probability that he/she plays neither
soccer nor volleyball.
(A) 0.71
(B) 0.57
(C) 0.43
(D) 0.413
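Question 9 is an application of the addition rule, P(A or B) = P(A) + P(B) - P(A and B), followed by the complement rule. A small sketch with illustrative names:

```python
p_soccer = 0.41
p_volleyball = 0.30
p_both = 0.14

# Subtract the overlap once so students who play both are not counted twice.
p_either = p_soccer + p_volleyball - p_both

# "Neither" is the complement of "at least one".
p_neither = 1 - p_either
print(round(p_either, 2), round(p_neither, 2))
```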

Questions 10-12: A simple linear regression model was fit to the
situation of using the number of pages in a book (in hundreds) to
predict the number of typos in the book. The equation is y = 1.2 + 3.4x.

10. Interpret the slope.
(A) For every additional page in length, a book is expected to have an extra 3.4 typos on
average.
(B) A 400-page book, on average, should have 3.4 more typos than a 300-page book.
(C) For every additional 3.4 pages in length, a book is expected to have an extra 1.2 typos
on average.
(D) The slope has no practical interpretation in this context.

11. Find the predicted number of typos in a 500-page book.
(A) 17
(B) 18.2
(C) 171.2
(D) 1701.2

12. Explain what it would mean if an actual 500-page book had a
residual of -3.2.
(A) This particular book is definitely an outlier and should be dropped from the model.
(B) This particular book had a predicted value smaller than its actual value.
(C) This particular book had a predicted value larger than its actual value.
(D) A mistake has been made since residuals cannot be negative.
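The regression equation in questions 10-12 takes x in hundreds of pages, so a page count has to be divided by 100 before it is plugged in. The sketch below uses a hypothetical helper `predict_typos` and the usual definition residual = actual - predicted.

```python
def predict_typos(pages):
    """Predicted typos from y = 1.2 + 3.4x, where x is pages in hundreds."""
    x = pages / 100
    return 1.2 + 3.4 * x

predicted_500 = predict_typos(500)

# The slope is the change in prediction per extra 100 pages.
slope_check = predict_typos(400) - predict_typos(300)

# A residual of -3.2 means the actual count sits 3.2 below the prediction.
residual = -3.2
actual_500 = predicted_500 + residual

print(round(predicted_500, 1), round(slope_check, 1), round(actual_500, 1))
```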
13. You run a Box-Cox test on your linear regression model to see
what sort of transformation is needed, if any. The estimate for λ turns
out to be 1.03 with a 95% confidence interval of (0.97, 1.09). What
transformation do you recommend for the response variable?
(A) A square root transformation
(B) An inverse transformation
(C) A log transformation
(D) A transformation is not necessary here

14. Which of the following statements is/are correct?
(I) A confidence interval is for a mean.
(II) A prediction interval is for a specific value.
(III) A confidence interval is wider than a prediction interval.
(A) I only
(B) III only
(C) I and II only
(D) I, II, and III

Questions 15-18: A multiple regression model was run on a sample of
150 high school students to see whether the heights of their mothers
and fathers (in inches) could be used to predict the students' own
height (in inches). Consider the following partial output.

Parameter   Estimate   Standard Error
Intercept   16.967     4.658
Mother      0.299      0.069
Father      0.412      0.051

15. Find the t test statistic for the variable Father.
(A) 0.231
(B) 3.643
(C) 4.333
(D) 8.078
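For a single regression coefficient, the t statistic for testing whether the coefficient is zero is the estimate divided by its standard error. A sketch using the values from the partial output (variable names are illustrative):

```python
# Estimate and standard error for Father, copied from the partial output.
estimate_father = 0.412
se_father = 0.051

# t statistic for H0: coefficient = 0.
t_father = estimate_father / se_father
print(round(t_father, 3))
```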

16. Choose the best way to interpret the estimated coefficient for
Mother.
(A) Every extra inch in height of the mother causes the student to be 0.299 inches taller.
(B) Holding the father's height constant, every additional inch in height from the mother is
associated with an increase of 0.299 inches on average in the student's height.
(C) Holding the father's height constant, every additional inch in height from the mother is
associated with a decrease of 0.299 inches on average in the student's height.
(D) The coefficient of 0.299 does not have a practical interpretation.

17. Suppose the coefficient for Father turns out to be significant.
Choose the best answer:
(A) A confidence interval for Father would contain 0.
(B) A confidence interval for Father would be completely positive.
(C) A confidence interval for Father would be completely negative.
(D) There is not enough information to tell.

18. Find the 95% confidence interval for the coefficient of Mother.
(A) (0.1617, 0.4363)
(B) (0.1626, 0.4354)
(C) (0.1848, 0.4132)
(D) (0.1855, 0.4125)
19. Given that SSR = 432.189 and SSE = 113.456 for a multiple
regression model, compute R².
(A) 0.2079
(B) 0.2625
(C) 0.6274
(D) 0.7921
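R² is the share of total variation explained by the model, SSR / (SSR + SSE). A short sketch with the values from question 19:

```python
ssr = 432.189  # regression (explained) sum of squares
sse = 113.456  # error (unexplained) sum of squares

# Total variation is the sum of the explained and unexplained parts.
sst = ssr + sse
r_squared = ssr / sst
print(round(r_squared, 4))
```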

20. All of the following statements are true about
correlation EXCEPT:
(A) Correlation is always between -1 and 1, inclusive.
(B) Correlation does not change if we change the scales of the explanatory variable (e.g.
inches to feet).
(C) Correlation measures how strong the linear tendency is between the explanatory and the
response variables.
(D) A correlation close to 1 or -1 indicates that the explanatory variable causes the response
variable to behave a certain way.

21. You are running a multiple regression model with eight predictor
variables. At least some of the variables in the full model are
insignificant at α = 0.05. A backwards elimination procedure would
begin with which steps?
(A) Drop the one variable with the highest p-value above 0.05, and refit the model to the
data.
(B) Drop the two variables with the highest p-values above 0.05, and refit the model to the
data.
(C) Drop all variables with p-values above 0.05, and refit the model to the data.
(D) Drop all variables with p-values above 0.05. The variables remaining will provide the
final model.
22. We fit a simple linear regression model using price (in dollars) to
predict the number of packets of dog biscuits sold per day. The
regression equation is y = 98.1 - 9.8x, and R² = 0.5275.

Explain how to interpret the R² in the context of this problem.
(A) 52.75% of the variation in the price is explained by the number of packets of dog
biscuits sold per day.
(B) 52.75% of the variation in the number of packets of dog biscuits sold per day is
explained by the price.
(C) The model is correct 52.75% of the time.
(D) If there is no association between the number of packets of dog biscuits sold and the
price, we have a probability of 0.5275 of getting a slope of -9.8, or a more extreme result.

23. Fifty children were selected at random from students at an
elementary school. Each of these students was classified according
to sugar consumption (high or low) and exercise level (high or low).
Using the resulting data summarized in the frequency table below,
determine the row conditional relative frequency for a child with high
sugar consumption and low exercise level.

(A) 14/50 = 0.280
(B) 14/32 = 0.4375
(C) 18/50 = 0.360
(D) 18/32 = 0.5625

24. What is the best method for checking for constant variance in the
residuals?
(A) Approximate straight line on a QQ-plot
(B) Plot of explanatory variable versus response variable
(C) Plot of fitted values versus residuals
(D) Histogram of the residuals

Questions 25-26: A company claims that their fridges have an average
of 22 gallons of usable space inside. You want to see whether they
are cheating the consumer. To test this claim, you take a sample of 50
fridges and find the sample mean and sample standard deviation to be
21.2 and 3.0, respectively.

25. What would be the appropriate alternative hypothesis? (Hint: is the
customer being cheated if the usable space is too little, too much, or
simply different from the assumed amount?)
(A) Ha: μ = 22
(B) Ha: μ > 21.2
(C) Ha: μ ≠ 22
(D) Ha: μ < 21.2
(E) Ha: μ < 22

26. Compute the appropriate test statistic.
(A) -1.8856
(B) -0.2667
(C) 0.2667
(D) 1.8856
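The one-sample test statistic for question 26 has the form (x̄ - μ₀) / (s / √n). A sketch with the numbers given for the fridge sample (variable names are illustrative):

```python
import math

mu_0 = 22.0         # claimed mean usable space, in gallons
sample_mean = 21.2
sample_sd = 3.0
n = 50

# Standard error of the mean, then the standardized distance from the claim.
standard_error = sample_sd / math.sqrt(n)
test_statistic = (sample_mean - mu_0) / standard_error
print(round(test_statistic, 4))
```

The sign is negative because the sample mean falls below the claimed value.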

27. [Question text missing from the source.]
Choice (A)
Choice (B)
Choice (C)
Choice (D)
Choice (E)

Questions 28-29: You have collected data on heights, in inches, from
41 males and 52 females. The sample standard deviations for males
and females are, respectively, 6.1 and 4.9.

28. Find the appropriate test statistic to test equality of variances
between the genders.
(A) 0.2541
(B) 1.2449
(C) 1.5498
(D) 7.5939
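The F statistic for comparing two variances is a ratio of sample variances. The sketch below assumes the common convention of placing the larger variance in the numerator, so the degrees of freedom are listed in the same order.

```python
n_male, sd_male = 41, 6.1
n_female, sd_female = 52, 4.9

# Ratio of sample variances (standard deviations squared), larger on top.
f_statistic = sd_male**2 / sd_female**2

# Each sample contributes n - 1 degrees of freedom.
df_numerator = n_male - 1
df_denominator = n_female - 1
print(round(f_statistic, 4), df_numerator, df_denominator)
```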

29. On what distribution would you obtain the p-value?
(A) The F distribution with degrees of freedom 41 and 52.
(B) The T distribution with degrees of freedom 91.
(C) The T distribution with degrees of freedom 40.
(D) The chi-square distribution with degrees of freedom 1.
(E) The F distribution with degrees of freedom 40 and 51.

30. You conduct a two-tailed hypothesis test, which turns out to be
significant at α = 0.03. A corresponding confidence interval for the
same test would have what confidence?
(A) 3%
(B) 95%
(C) 97%
(D) 97.5%

Questions 31-32: Consider the following table of data. We want to
know if there is an association between gender and whether a college
student is in a fraternity / sorority.

         Yes   No    Total
Male     12    39    51
Female   33    76    109
Total    45    115   160

31. Find the expected number of females that are not in a sorority
(No).
(A) 30.6563
(B) 36.6563
(C) 76
(D) 78.3438
32. What are the degrees of freedom for this test?
(A) 1
(B) 3
(C) 44
(D) 159
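In a chi-square test of independence, each expected count is (row total × column total) / grand total, and the degrees of freedom are (rows - 1)(columns - 1). A sketch using the observed counts from the table for questions 31-32; the nested dictionary layout is just one convenient representation.

```python
# Observed counts from the two-way table (rows: gender, columns: membership).
observed = {
    "Male": {"Yes": 12, "No": 39},
    "Female": {"Yes": 33, "No": 76},
}

row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
col_totals = {
    col: sum(observed[row][col] for row in observed) for col in ("Yes", "No")
}
grand_total = sum(row_totals.values())

# Expected count under independence for the (Female, No) cell.
expected_female_no = row_totals["Female"] * col_totals["No"] / grand_total

# Degrees of freedom for an r x c table.
df = (len(row_totals) - 1) * (len(col_totals) - 1)
print(expected_female_no, df)
```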

Questions 33-34: A random sample of 20 adults who have frequent
headaches count the number of headaches they experience over one
week. These 20 adults then get chiropractic treatment, and then they
count the number of headaches they experience over the following
week. A hypothesis test is to be conducted to test the following
alternative hypothesis: that chiropractic treatment lowers the
frequency of headaches.

33. Explain what type of test would be appropriate to conduct here.
(A) Paired T-Test
(B) Paired Z-Test
(C) 2 Independent Sample T-Test
(D) 2 Independent Sample Z-Test
(E) Chi-Square Test for Independence

34. Explain what a Type II error would be in this context.
(A) You decide that chiropractic treatment significantly lowers the frequency of headaches,
and in fact it does.
(B) You decide that chiropractic treatment significantly lowers the frequency of headaches,
and in fact it does not.
(C) You decide that chiropractic treatment does not significantly lower the frequency of
headaches, and in fact it does not.
(D) You decide that chiropractic treatment does not significantly lower the frequency of
headaches, and in fact it does.

35. The table below shows the number of people who have and have
not been involved in a car accident categorized by drinking and
smoking habits. If you randomly selected one person, what is the
probability that this person is a drinker and is involved in a car
accident?

(A) 0.1633
(B) 0.2041
(C) 0.3673
(D) 0.5714

36. A study was conducted to see if music has an effect on
productivity of workers. The music was turned on during the working
hours of 32 randomly selected workers. The workers' productivity
level averaged 82 with a standard deviation of 25. On a different day
the music was turned off and there were 50 randomly selected
workers. Their productivity level averaged 58 with a standard
deviation of 15. What is the 90% confidence interval of the difference
between the average productivity levels of these 2 groups of workers?
(A) (15.94, 32.06)
(B) (16.66, 31.34)
(C) (16.74, 31.26)
(D) Unable to determine because assumptions are not met.
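One way to approach question 36 is the large-sample interval (x̄₁ - x̄₂) ± z · √(s₁²/n₁ + s₂²/n₂). The sketch below assumes a z critical value is acceptable because both samples have at least 30 observations; a t-based interval would be slightly wider.

```python
import math
from statistics import NormalDist

mean_on, sd_on, n_on = 82, 25, 32     # music on
mean_off, sd_off, n_off = 58, 15, 50  # music off

difference = mean_on - mean_off

# Standard error for the difference of two independent sample means.
standard_error = math.sqrt(sd_on**2 / n_on + sd_off**2 / n_off)

# Assumption: z critical value for 90% confidence (5% in each tail).
z = NormalDist().inv_cdf(0.95)

lower = difference - z * standard_error
upper = difference + z * standard_error
print(round(lower, 2), round(upper, 2))
```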

37. Based on previous research, the standard deviation of the
distribution of the age at which children begin to walk is estimated to
be 1.5 months. A random sample of children will be selected, and the
age at which each child begins to walk will be recorded. A 90 percent
confidence interval for the average age at which children begin to
walk will be constructed using the data obtained from the sample of
children. Of the following, which is the smallest sample size that will
result in a margin of error of 0.25 month or less for the confidence
interval?
(A) 95
(B) 100
(C) 135
(D) 140
(E) 240
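For a mean with a known standard deviation, the margin of error is z · σ / √n, so the required sample size is n ≥ (z · σ / E)². The sketch below computes that bound and then picks the smallest listed choice that satisfies it; the `choices` list simply mirrors options (A) through (E).

```python
import math
from statistics import NormalDist

sigma = 1.5          # months, from previous research
target_margin = 0.25
z = NormalDist().inv_cdf(0.95)  # 90% confidence, two-sided

# Smallest whole sample size whose margin of error is at most the target.
n_required = math.ceil((z * sigma / target_margin) ** 2)

choices = [95, 100, 135, 140, 240]
smallest_adequate_choice = min(c for c in choices if c >= n_required)
print(n_required, smallest_adequate_choice)
```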

Questions 38-39: A manager wants to know whether the three levels
of employee ranks (A, B, C) have different mean times to complete a
certain task. Consider the following table.

Group   Mean   S²     N
A       24.2   2.54   10
B       27.1   6.64   10
C       30.2   3.76   10

38. Compute SSE.
(A) 19.313
(B) 90.033
(C) 116.460
(D) 180.066

39. Compute MSE.
(A) 4.313
(B) 6.209
(C) 18.624
(D) 31.803
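In one-way ANOVA, SSE pools the within-group variation as Σ (nᵢ - 1) sᵢ², and MSE divides it by the error degrees of freedom N - k. The sketch below reads the third column of the table as the sample variance of each group, which is an interpretation of the table header.

```python
# (sample variance, group size) for ranks A, B, C, read from the table.
groups = {"A": (2.54, 10), "B": (6.64, 10), "C": (3.76, 10)}

# Within-group (error) sum of squares.
sse = sum((n - 1) * variance for variance, n in groups.values())

# Error degrees of freedom: total observations minus number of groups.
total_n = sum(n for _, n in groups.values())
df_error = total_n - len(groups)

mse = sse / df_error
print(round(sse, 3), df_error, round(mse, 3))
```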

Questions 40-42: Consider the following partial one-way ANOVA table
with 44 subjects:

Source      DF    SS        MS       F
Treatment   2     180.067   90.033   __
Error       ___   _______   ______
Total       ___   912.527

40. What are the degrees of freedom for error and total,
respectively?
(A) 41 and 43
(B) 42 and 44
(C) 41 and 44
(D) 42 and 43

41. Compute SSE.
(A) 3.901
(B) 5.068
(C) 732.460
(D) 1092.594

42. Compute the F statistic.
(A) 5.040
(B) 10.135
(C) 51.079
(D) Not enough information to determine.
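The blanks in the ANOVA table for questions 40-42 follow from two identities: the degrees of freedom add up (treatment + error = total) and so do the sums of squares. A sketch that fills them in, with illustrative variable names:

```python
n_subjects = 44
df_treatment = 2
ss_treatment = 180.067
ss_total = 912.527

# Degrees of freedom: total is N - 1, error is what remains.
df_total = n_subjects - 1
df_error = df_total - df_treatment

# Sums of squares partition the same way.
ss_error = ss_total - ss_treatment

# Mean squares and the F ratio.
ms_treatment = ss_treatment / df_treatment
ms_error = ss_error / df_error
f_statistic = ms_treatment / ms_error
print(df_error, df_total, round(ss_error, 3), round(f_statistic, 3))
```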

43. Matthew wanted to determine if the proportion of females for a
certain species of laboratory animal is greater than 0.5. He was given
access to appropriate records that contained information on 15,000
live births for the species. To construct a 99 percent confidence
interval, he selected a random sample of 100 births from the records
and found that 59 births were female. Based on the study, which of
the following expressions is an approximate 99 percent confidence
interval estimate for p, the proportion of females in the 15,000 live
births?

(A)
(B)
(C)
(D)
(E)

44. All of the following techniques are acceptable post hoc procedures
to conduct after an ANOVA has been fit, EXCEPT:
(A) Tukey
(B) Kolmogorov-Smirnov
(C) Scheffé
(D) LSD
45. Monthly rent was determined for each apartment in a random
sample of 250 apartments. The sample mean was $940 and the sample
standard deviation was $55. An approximate 95 percent confidence
interval for the true mean monthly rent for the population of
apartments from which this sample was selected is ($933, $947).
Which of the following statements is a correct interpretation of the 95
percent confidence level?
(A) In this population, about 95 percent of all rental prices are between $933 and $947.
(B) In this sample, about 95 percent of the 250 rental prices are between $933 and $947.
(C) In repeated sampling, the method produces intervals that include the population mean
approximately 95 percent of the time.
(D) In repeated sampling, the method produces intervals that include the sample mean
approximately 95 percent of the time.
(E) There is a probability of 0.95 that the true mean is between $933 and $947.

46. A small business surveyed all 25 of its employees to determine the
proportion who participate in the 401-k retirement program. Which of
the following statements is true?
(A) The small business did not use a random sample, so the information from the survey
will not provide useful information.
(B) The small business should not use the data from this survey because this is an
observational study.
(C) The small business would have to use the survey data to construct a confidence interval
in order to estimate the proportion of employees who participate in the 401-k program.
(D) The small business does not need to use an inference procedure to determine the
proportion of employees who participate in the 401-k program because the survey was a census
of all employees.
(E) The small business can use the result of this survey to prove that working for the small
business causes employees to participate in the 401-k program.

47. Which of the following are assumptions for the ANOVA
procedure?
(I) Independent subjects
(II) Sample size greater than 30
(III) Equal variances
(IV) Normality
(A) I, II, and III only
(B) II and III only
(C) II, III, and IV only
(D) I, II, III, and IV
(E) I, III, and IV only

Questions 48-50: You are conducting a randomized block design,
testing a new type of medicine that's supposed to lower cholesterol,
with the following partial summary:

Source     SS      DF
Medicine   5.20    4
Block      ____    __
Error      0.54    6
Total      12.91   19

48. What was the total sample size?
(A) 4
(B) 6
(C) 9
(D) 20

49. Find the block sum of squares.
(A) 1.17
(B) 1.73
(C) 7.17
(D) 7.71

50. How many blocks were included in the study?
(A) 4
(B) 5
(C) 7
(D) 10
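The missing entries in the summary for questions 48-50 can be recovered by subtraction, since both the sums of squares and the degrees of freedom add up to their totals. The sketch below takes the table values as given and uses illustrative variable names.

```python
ss_medicine, df_medicine = 5.20, 4
ss_error, df_error = 0.54, 6
ss_total, df_total = 12.91, 19

# Total degrees of freedom are N - 1, so N is one more.
total_sample_size = df_total + 1

# Block row is whatever remains after medicine and error are removed.
ss_block = ss_total - ss_medicine - ss_error
df_block = df_total - df_medicine - df_error

# A factor with b levels has b - 1 degrees of freedom.
n_blocks = df_block + 1
print(total_sample_size, round(ss_block, 2), n_blocks)
```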
