Number of orange candies      16
Number of yellow candies       7
Number of green candies       16
Number of purple candies      10
Total                         61
                            Count    Proportion
Number of red candies         317    19%
Number of orange candies      346    21%
Number of yellow candies      321    20%
Number of green candies       352    22%
Number of purple candies      298    18%
Total                        1634
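The proportions in the class table can be checked directly from the counts. The following is a minimal sketch; the variable names are illustrative and not part of the original project.

```python
# Class totals by color, as reported in the table above.
counts = {"red": 317, "orange": 346, "yellow": 321, "green": 352, "purple": 298}

total = sum(counts.values())

# Proportion of each color out of all candies counted by the class.
proportions = {color: n / total for color, n in counts.items()}

for color, p in proportions.items():
    print(f"{color:<7}{counts[color]:>5}  {p:.0%}")
print(f"{'total':<7}{total:>5}")
```

Rounded to whole percentages, these agree with the 19%, 21%, 20%, 22%, and 18% reported above.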
Nicole Denkers
Statistics 1040 Final Project
Five-number summary of the number of candies of each color per bag:

          Min    Q1 (25%)    Median    Q3 (75%)    Max
Red       5      10          12        14          18
Orange    7      10          13        14          21
Yellow    5      9.5         12        14.5        21
Green     6      11.5        14        15          17
Purple    5      9           11        13          16
[Figure: box plots of the number of candies per bag for each color (Red, Orange, Yellow, Green, Purple), drawn from the minimum, first quartile, median, third quartile, and maximum.]
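One common way to draw box plots in a spreadsheet is to stack the gaps between the five summary values. The sketch below assumes that approach and uses the five-number summaries listed earlier. The function name `segments` is hypothetical, and whether the original chart was built this way is an assumption.

```python
# Five-number summaries (min, Q1, median, Q3, max) for each color.
summary = {
    "Red":    (5, 10, 12, 14, 18),
    "Orange": (7, 10, 13, 14, 21),
    "Yellow": (5, 9.5, 12, 14.5, 21),
    "Green":  (6, 11.5, 14, 15, 17),
    "Purple": (5, 9, 11, 13, 16),
}

def segments(five):
    """Return the minimum followed by the gaps between consecutive summary values."""
    lo, q1, med, q3, hi = five
    return (lo, q1 - lo, med - q1, q3 - med, hi - q3)

for color, five in summary.items():
    iqr = five[3] - five[1]  # interquartile range, Q3 - Q1
    print(color, segments(five), "IQR =", iqr)
```

The middle two gaps add up to the interquartile range, which is the height of the box in each plot.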
Organizing and Displaying Quantitative Data: the Number of Candies per Bag
Each bag supposedly weighed the 2.17 oz of Skittles indicated on the packaging, yet the number
of candies varied from bag to bag. The total of 1634 candies from the entire class sample of 27
bags gave a mean of 60.519 candies per bag, and the standard deviation of the number of candies
per bag was 2.471. The frequency distribution was approximately normal, with most bags holding
between 56 and 64 candies and only one outlier, although the graphs appear slightly skewed to
the right. Again, this is not what I expected: I would have assumed that the same amount of each
color/flavor would be produced and packed in each individual bag. The class data also support
my individual bag count.
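The class mean follows directly from the total and the number of bags. A minimal check, with illustrative variable names:

```python
total_candies = 1634  # class total reported above
bags = 27             # number of bags in the class sample

mean_per_bag = total_candies / bags
print(round(mean_per_bag, 3))  # 60.519
```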
Reflection
Quantitative data consist of values that can be counted or measured; in our case, this is the
number of Skittles per bag. Categorical (or qualitative) data consist of values that carry
meaning but cannot be measured numerically and instead describe a category (e.g., the colors of
the Skittles).
Quantitative data can be represented using scatter plots, dot plots, stem plots, and time series
plots.
Categorical data can be represented using pie charts, Pareto charts, and bar graphs.
A confidence interval was used to estimate the proportion of all Skittles that are yellow. We are
99% confident that the interval from 0.171 to 0.221 contains the true population proportion of
yellow Skittles. This means that if many random samples of Skittles bags were selected and an
interval were computed from each one, about 99% of those intervals would contain the true value
of the population proportion.
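The interval for the yellow proportion can be sketched as follows. The sketch assumes the usual normal-approximation interval, p-hat plus or minus z times sqrt(p-hat(1 - p-hat)/n), computed from the class totals (321 yellow out of 1634). The project does not state the exact method, so this is an assumption.

```python
from math import sqrt
from statistics import NormalDist

yellow = 321       # yellow candies in the class sample
n = 1634           # total candies in the class sample
confidence = 0.99

p_hat = yellow / n

# Two-sided critical value from the standard normal distribution.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin
print(f"{lower:.3f} to {upper:.3f}")
```

The result is close to the reported interval. A difference in the last digit can arise if the sample proportion is rounded before the margin of error is computed.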
A confidence interval was used to estimate the mean number of Skittles per bag. We are 95%
confident that the interval from 60.011 to 61.989 contains the mean number of candies per bag
in the population. This means that if many random samples of Skittles bags were selected and an
interval were computed from each one, about 95% of those intervals would contain the true value
of the population mean.
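A sketch of the interval for the mean, using the Student t formula, x-bar plus or minus t times s/sqrt(n). The reported endpoints are reproduced when the inputs are rounded to x-bar = 61 and s = 2.5; that these rounded values were the ones actually used is an assumption. The critical value 2.056 is taken from a t table for 26 degrees of freedom.

```python
from math import sqrt

n = 27          # number of bags in the class sample
x_bar = 61.0    # assumed (rounded) sample mean
s = 2.5         # assumed (rounded) sample standard deviation
t_crit = 2.056  # t table value: 26 degrees of freedom, 95% two-sided

margin = t_crit * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(f"{lower:.3f} to {upper:.3f}")  # 60.011 to 61.989
```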
A confidence interval was used to estimate the standard deviation of the number of Skittles per
bag. We are 98% confident that the interval from 1.887 to 3.650 contains the standard deviation
of the number of candies per bag in the population of Skittles bags. This means that if many
random samples of bags were selected and an interval were computed from each one, about 98% of
those intervals would contain the true value of the population standard deviation.
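A sketch of the interval for the standard deviation, using the chi-square formula sqrt((n - 1)s^2 / chi-square). The reported endpoints are reproduced with s rounded to 2.5, which is an assumption about the input used. The two critical values are taken from a chi-square table for 26 degrees of freedom.

```python
from math import sqrt

n = 27
s = 2.5              # assumed (rounded) sample standard deviation
df = n - 1           # degrees of freedom

# Chi-square table values for 26 degrees of freedom and 98% confidence.
chi2_right = 45.642  # right-tail area 0.01
chi2_left = 12.198   # right-tail area 0.99

lower = sqrt(df * s**2 / chi2_right)
upper = sqrt(df * s**2 / chi2_left)
print(f"{lower:.3f} to {upper:.3f}")  # 1.887 to 3.650
```

The larger critical value gives the lower endpoint because it sits in the denominator.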
Hypothesis Tests:
A hypothesis test is a procedure in which statistical analysis is used to decide whether to reject
or fail to reject the null hypothesis. Its purpose is to assess whether a claim about a population
parameter is supported by the sample data.
The test statistic is 12.471, which falls within the rejection region. We reject H0: there is
sufficient evidence to reject the claim that the mean number of candies in a bag of Skittles is 55.
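The test statistic can be sketched with the one-sample t formula, t = (x-bar - mu0) / (s / sqrt(n)). The value 12.471 is reproduced with x-bar = 61 and s = 2.5; those inputs are assumptions. The significance level and the alternative hypothesis are not stated in the project, so the two-sided 0.05 critical value used below is purely illustrative.

```python
from math import sqrt

n = 27
x_bar = 61.0    # assumed (rounded) sample mean
s = 2.5         # assumed (rounded) sample standard deviation
mu_0 = 55       # claimed mean number of candies per bag

t_stat = (x_bar - mu_0) / (s / sqrt(n))
print(round(t_stat, 3))  # 12.471

# Illustrative critical value: t table, 26 degrees of freedom, two-sided alpha = 0.05.
t_crit = 2.056
reject_h0 = abs(t_stat) > t_crit
print("Reject H0" if reject_h0 else "Fail to reject H0")
```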
The purpose of a confidence interval is to give a range of values that, at a stated level of
confidence, is likely to contain a population parameter.
REFLECTION
Interval Estimates and Hypothesis Tests for:
Population Proportions:
1. The sample must be of random observations
- This condition was met
2. The conditions for the binomial distribution must be satisfied (i.e., a fixed number of
trials, independent trials, two categories of outcomes, and a constant probability for
each trial)
- The binomial distribution conditions were met
3. At least 5 successes (np) and 5 failures (nq) must occur (n = 1634)
- This condition was also met
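The third condition can be checked with the class totals. The sketch below takes "success" to mean a yellow candy, matching the proportion estimated earlier; that choice is an assumption.

```python
n = 1634       # total candies in the class sample
yellow = 321   # yellow candies, treated here as "successes"

p_hat = yellow / n
successes = n * p_hat        # np
failures = n * (1 - p_hat)   # nq

condition_met = successes >= 5 and failures >= 5
print(round(successes), round(failures), condition_met)
```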
Population Mean:
1. The sample must be of random observations
- This condition was met
2. The population must be normally distributed OR the number of observations must be > 30
- The sample size condition was not met (n = 27), but the data were approximately normally
distributed, so the condition was met overall
Population Standard Deviation:
1. The sample must be of random observations
- This condition was met
2. The population must be normally distributed
- This condition was met
Possible sources of error include miscounting, incorrect data entry, color blindness, mistakes
in using Excel, and simple miscalculation.
The sampling method could be improved by using a larger sample, counting each bag more than
once to verify the work, or having another person recount or double-check the data collected.