
Math 1040

Statistics Skittles Project

Ashton Miller
Introduction
The term project in my statistics course this semester focuses on the practical uses of
statistics in our lives. Each student collects data from an individual bag of Skittles, and
the class then combines the data to find the mean number of Skittles per bag and the standard
deviation of that number, taking notes and showing how we got our results. Our overall goal
is to determine how random the numbers and colors in any given bag of Skittles really are.
Data Collection
Red Orange Yellow Green Purple Total
11

12

16

16

61

Organizing and Displaying Categorical Data: Colors

First, to get a better visual representation of our data, we create a pie chart:

[Pie chart of the class color totals: red, orange, yellow, green, purple. Slice labels shown: 208 (23%), 179 (19%), 155 (17%).]

Another way to represent the data is with a Pareto chart, a bar chart with the bars ordered from most to least frequent.

[Pareto chart of the class color totals: red, orange, yellow, green, purple. Bar values: 208, 197, 181, 179, 155; frequency axis from 0 to 250.]
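The percentages on the pie chart and the bar order of the Pareto chart both follow from the five class totals. A minimal sketch of that arithmetic, using the counts from the charts without their color labels; the function name `pareto_table` is only illustrative:

```python
# Class totals per color as shown on the charts (color labels left out).
counts = [208, 197, 181, 179, 155]


def pareto_table(counts):
    """Return (count, percent, cumulative percent) rows, largest count first."""
    total = sum(counts)
    rows = []
    running = 0
    for count in sorted(counts, reverse=True):
        running += count
        rows.append((count, 100 * count / total, 100 * running / total))
    return rows


rows = pareto_table(counts)
print(f"Total candies: {sum(counts)}")
for count, pct, cum_pct in rows:
    print(f"{count:4d}  {pct:5.1f}%  {cum_pct:6.1f}%")
```

Rounding each share to a whole percent gives the labels that appear on the pie chart (for example, 208 out of the total rounds to 23%).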

[Frequency table with columns Bin and Frequency; bins at 12.33, 18.67, and More.]
Organizing and Displaying Quantitative Data: the Number of Candies per Bag
Class Mean: 61.33
Class Standard Deviation: 2.82
5-number Summary: Min: 54, Q1: 57.7, Q2: 61.3, Q3: 63.2, Max: 65

Histogram:
[Histogram of frequency by bin; bins at 12.33, 18.67, and More; frequency axis from 0 to 40.]
The data shows a fairly good bell curve, but it is weighted more to the left. We
can reasonably assume that any random bag of Skittles will contain 57 to 65
candies. Only two bags fall below that range, at 54 and 56 candies, so we can
still confidently state that there are about 60 Skittles per bag.
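Whether the two low bags (54 and 56) count as outliers depends on the rule used, which the report does not specify. A minimal sketch, assuming the common 1.5 × IQR convention and taking Q1 and Q3 from the 5-number summary:

```python
# Quartiles as reported in the 5-number summary.
q1, q3 = 57.7, 63.2
low_bags = [54, 56]  # the two low bag totals mentioned in the text

# ASSUMPTION: outliers are values more than 1.5 * IQR beyond the quartiles.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr


def is_outlier(value):
    """True if the value lies outside the 1.5 * IQR fences."""
    return value < lower_fence or value > upper_fence


flags = {bag: is_outlier(bag) for bag in low_bags}
print(f"IQR = {iqr:.2f}, fences = ({lower_fence:.2f}, {upper_fence:.2f})")
print(flags)
```

Under this assumed rule the lower fence is about 49.45, so both bags sit inside the fences; they are low compared with the rest of the class, but this convention would not flag them.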
The main difference between categorical and quantitative data is that
categorical data sorts things into groups, like the colors of Skittles: red,
orange, yellow, green, purple. Quantitative data is numerical, such as a count
or measurement, like how many Skittles of each color are in a bag.
Confidence Intervals
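A confidence interval is a range of values, calculated from the sample, that is
likely to contain the true population value, such as the mean number of
Skittles per bag. The confidence level states how confident you are in the
estimate, and the margin of error gives the range around it.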
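As a minimal sketch of such an interval, the class mean and standard deviation can be turned into a confidence interval for the mean number of candies per bag. Two values here are assumptions, not figures from the report: the number of bags `n = 15` (inferred from 920 total candies divided by the mean of 61.33) and the 95% confidence level. The sketch uses a normal (z) critical value from the standard library; with a sample this small, a t critical value would give a somewhat wider interval.

```python
from math import sqrt
from statistics import NormalDist

mean = 61.33       # class mean from the report
sd = 2.82          # class standard deviation from the report
n = 15             # ASSUMPTION: number of bags, inferred from 920 / 61.33
confidence = 0.95  # ASSUMPTION: confidence level chosen for illustration


def mean_confidence_interval(mean, sd, n, confidence):
    """Return (low, high) for a z-based confidence interval of the mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin


low, high = mean_confidence_interval(mean, sd, n, confidence)
print(f"{confidence:.0%} confidence interval: ({low:.2f}, {high:.2f})")
```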

A hypothesis test is closely related to a confidence interval, but more
importantly it is the method by which we can mathematically reject or fail to
reject a given claim. By determining the critical values and the p-value, we
find the range of results that is consistent with the claim and then test how
likely our sample would be if the claim were true.
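A minimal sketch of a hypothesis test on the color data, written as a one-proportion z-test. The claim being tested (each of the five colors makes up 20% of all candies) and the 0.05 significance level are hypothetical choices for illustration, not part of the report; the count 208 is the largest color total and 920 is the sum of all five totals.

```python
from math import sqrt
from statistics import NormalDist

successes = 208  # largest color total in the class data
n = 920          # total candies across all five colors
p0 = 0.20        # HYPOTHETICAL claim: each color is 20% of all candies
alpha = 0.05     # ASSUMPTION: significance level chosen for illustration


def one_proportion_z_test(successes, n, p0):
    """Return (z statistic, two-tailed p-value) for H0: p = p0."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = one_proportion_z_test(successes, n, p0)
critical = NormalDist().inv_cdf(1 - alpha / 2)
reject = p_value < alpha
print(f"z = {z:.3f}, critical value = {critical:.3f}, p-value = {p_value:.4f}")
print("Reject the claim" if reject else "Fail to reject the claim")
```

Comparing the test statistic with the critical value and comparing the p-value with the significance level always lead to the same decision.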
As long as you have the sample mean or sample proportion, the standard
deviation, and the sample size, confidence intervals are fairly simple to
compute. Hypothesis testing needs the same values plus a significance level,
which sets the cutoff for rejecting the claim and helps give a big-picture idea
of what the data is telling you.
There are three different methods for getting these values, depending on which
pieces you are missing at the start, but the overall goal is to find the values
above, plug them in, and obtain the results.