
SOCRATES

469 / 470 BC - 399 BC

Socratic Ignorance
"I know that I know nothing"
STATISTICS HISTORY
 Some scholars pinpoint the origin of
statistics to 1662, with the publication of
Natural and Political Observations upon the
Bills of Mortality by John Graunt.
 Its mathematical foundations were laid in
the 17th century with the development of
probability theory by Blaise Pascal and
Pierre de Fermat.
 Probability theory arose from the study of
games of chance. Carl Friedrich Gauss is credited with
developing the method of least squares around 1794-1795;
Adrien-Marie Legendre first published it in 1805.
STATISTICS APPLICATIONS
 Early applications of statistical thinking
revolved around the needs of states to
base policy on demographic and
economic data, hence its stat- etymology.
The scope of the discipline of statistics
broadened in the early 19th century to
include the collection and analysis of data
in general.
 The term statistics is ultimately derived
from the New Latin statisticum collegium
("council of state") and the Italian word
statista ("statesman" or "politician").
 Today, statistics is widely employed in
government, business, and the natural
and social sciences.
INTRO TO STATISTICS
 WHAT IS STATISTICS? (video)
Resources
 www.collegeboard.com

 www.whfreeman.com/tps3e
 Register to take Online Quizzes
 Send me the result
gil_op@hotmail.com

 www.ti.com
"Imagination is more important than knowledge."
Einstein’s Riddle
ALBERT EINSTEIN IS SAID TO HAVE WRITTEN THIS
RIDDLE EARLY IN THE 20th CENTURY, AND TO HAVE
CLAIMED THAT 98% OF THE WORLD POPULATION
WOULD NOT BE ABLE TO SOLVE IT.

ARE YOU IN THE TOP 2% OF INTELLIGENT PEOPLE IN THE
WORLD?
SOLVE THE RIDDLE AND FIND OUT.
There are no tricks, just pure logic, so good
luck and don't give up.

1. In a street there are five houses, painted
five different colours.
2. In each house lives a person of a different
nationality.
3. These five homeowners each drink a
different kind of beverage, smoke a different
brand of cigar and keep a different pet.

THE QUESTION: WHO OWNS THE FISH?


HINTS
1. The Brit lives in a red house.
2. The Swede keeps dogs as pets.
3. The Dane drinks tea.
4. The Green house is next to, and on the left of
the White house.
5. The owner of the Green house drinks coffee.
6. The person who smokes Pall Mall rears birds.
7. The owner of the Yellow house smokes Dunhill.
8. The man living in the centre house drinks milk.
9. The Norwegian lives in the first house.
10. The man who smokes Blends lives next to the one who keeps
cats.
11. The man who keeps horses lives next to the man who smokes
Dunhill.
12. The man who smokes Blue Master drinks beer.
13. The German smokes Prince.
14. The Norwegian lives next to the blue house.
15. The man who smokes Blends has a neighbour who drinks
water.
Einstein's Riddle - ANSWER

The German owns the fish.
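
The answer can be checked by brute force. The sketch below is only an illustration in Python (the slides do not name a language): it gives every colour, nationality, beverage, cigar and pet a house position from 0 to 4 and keeps the arrangements that satisfy all fifteen hints. The helper names `next_to` and `fish_owners` are made up for this example.

```python
from itertools import permutations

POSITIONS = range(5)  # house positions 0..4, counted from the left


def next_to(a, b):
    """True when two house positions are adjacent."""
    return abs(a - b) == 1


def fish_owners():
    """Return the fish owner's nationality for every arrangement that fits the hints."""
    owners = []
    for red, green, white, yellow, blue in permutations(POSITIONS):
        if white - green != 1:                              # hint 4
            continue
        for brit, swede, dane, norwegian, german in permutations(POSITIONS):
            if brit != red or norwegian != 0:               # hints 1, 9
                continue
            if not next_to(norwegian, blue):                # hint 14
                continue
            for tea, coffee, milk, beer, water in permutations(POSITIONS):
                if dane != tea or green != coffee or milk != 2:   # hints 3, 5, 8
                    continue
                for pall_mall, dunhill, blends, blue_master, prince in permutations(POSITIONS):
                    if yellow != dunhill:                   # hint 7
                        continue
                    if blue_master != beer or german != prince:   # hints 12, 13
                        continue
                    if not next_to(blends, water):          # hint 15
                        continue
                    for dogs, birds, cats, horses, fish in permutations(POSITIONS):
                        if swede != dogs or pall_mall != birds:   # hints 2, 6
                            continue
                        if not next_to(blends, cats):       # hint 10
                            continue
                        if not next_to(horses, dunhill):    # hint 11
                            continue
                        nationality = {brit: "Brit", swede: "Swede", dane: "Dane",
                                       norwegian: "Norwegian", german: "German"}
                        owners.append(nationality[fish])
    return owners


print(fish_owners())
```

Because the search collects every arrangement that passes the hints, a one-element result also shows that the hints pin down a single solution.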


"Do not worry about your difficulties in Mathematics.
I can assure you mine are still greater."
Fundamental Definitions
 Statistics
 Population
 Sample
DEFINITIONS

Like any new field, you have to learn the vocabulary
in order to understand what is going on. In the
beginning it seems like a lot because the terms may
be unfamiliar, or they may be words that you know
but that are used in a different way. There is no
way around it but to memorize the meanings. It's
like learning a foreign language: you just have to learn
what the new words mean.
 Definition: Statistics refers to a set of methods and
rules for organizing, summarizing and interpreting
information.
POPULATIONS AND SAMPLES

 In research, we are trying to find out general
information about a class of people (e.g., why do
people commit crimes?). As a researcher, I want to
know something about people in general. This
group of people in general is called a population.

 Definition: A population is the set of all
individuals of interest in a particular study.
 E.g., all people in the United States
 All students in school
 Students in 3rd Semester
 Population

 You want to know how many people in Mexico prefer Coke


over Pepsi… Define the Population:

 Give another example of population:

 All individuals in the U.S. form not only a large population but
also a very diverse one, so it is hard to find factors that relate to
them all.
 I could not include every individual from the group in
my study. So I want to take a sample from the population
and hope it will represent the whole group.

 Definition: A sample is a set of individuals selected
from a population, usually intended to represent the
population in a research study.

 How would you Sample all people in the US?

 Sample for Cola drinkers in Mexico?

 Sample your own example:


 When we describe information, we use different
terms to represent populations and samples.

 Definition: A parameter is a value which describes a
population.
 Define a parameter for the cola drinkers in Mexico:
 Define a parameter in your own example:

 Definition: A statistic is a value which describes a
sample.
 Define a statistic for the cola drinkers in Mexico:
 Define a statistic in your own example:
Once we have data, there are two things we can do with
it: we can describe it, and we can use it to make
generalizations. These are the two different
roles of statistics.

 Definition: Descriptive statistics are statistical
procedures that summarize, organize and simplify data.
Write an example:

 Definition: Inferential statistics are techniques that
allow us to study samples and then make
generalizations about the populations from which they
were selected.
Write an example:
There is usually some difference between the way the
sample looks and the way the population looks. This
difference is known as sampling error.

 Definition: Sampling error is the discrepancy that
exists between the sample statistic and the population
parameter (e.g., the “margin of error” in voters’ polls).
Write an example:

We want to reduce sampling error whenever possible.

 One way to try to ensure that our sample is
representative is to use random selection.

 Definition: Random selection or random sampling is a
process for obtaining a sample from a population that
requires that every individual in the population have
the same chance of being selected for the sample. A
sample obtained by this method is called a random
sample.

Write an example:
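
The definitions of population, sample, parameter, statistic, sampling error and random selection can be tied together in a few lines of Python. This is only a sketch: the population of 100,000 cola drinkers and the 60% who prefer Coke are invented for illustration, and the name `proportion_coke` is hypothetical.

```python
import random

rng = random.Random(2024)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 cola drinkers, 60% of whom prefer Coke.
# Both numbers are made up; the slides give no data for this example.
population = ["Coke"] * 60_000 + ["Pepsi"] * 40_000


def proportion_coke(people):
    """Fraction of the given individuals who prefer Coke."""
    return sum(1 for choice in people if choice == "Coke") / len(people)


# Parameter: a value that describes the whole population.
parameter = proportion_coke(population)

# Random selection: rng.sample gives every individual the same chance of selection.
sample = rng.sample(population, k=500)

# Statistic: a value that describes the sample.
statistic = proportion_coke(sample)

# Sampling error: discrepancy between the sample statistic and the population parameter.
sampling_error = statistic - parameter

print(f"parameter      = {parameter:.3f}")
print(f"statistic      = {statistic:.3f}")
print(f"sampling error = {sampling_error:+.3f}")
```

Changing the seed draws a different random sample, so the statistic and the sampling error change while the parameter stays fixed.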
The Scientific Method
and the Design of Experiments

The scientific method is a process for studying behavior
that relies on objectivity. It requires that we try to
keep personal biases from influencing the outcome
of our studies.

Theory → Hypothesis → Data collection

 Definition: Theory – an integrated and overarching
set of principles that explains and predicts phenomena

 Definition: Hypothesis – a specific testable
prediction (usually) derived from a theory
Mean, Median, Mode, and Range
 The "mean" is the "average" you're used to, where you
add up all the numbers and then divide by the number
of numbers.
 The "median" is the "middle" value in the list of
numbers. To find the median, your numbers have to be
listed in numerical order, so you may have to rewrite
your list first.
 The "mode" is the value that occurs most often. If no
number is repeated, then there is no mode for the list.
 The "range" is just the difference between the largest
and smallest values.
Example for Mean, Median, Mode, and Range
Find the mean, median, mode, and range for the
following list of values:
13, 18, 13, 14, 13, 16, 14, 21, 13

 The mean is the usual average, so:


(13 + 18 + 13 + 14 + 13 + 16 + 14 + 21 + 13) ÷ 9 = 15
 The median is the middle value, so we have to rewrite
the list in order:
13, 13, 13, 13, 14, 14, 16, 18, 21
 There are nine numbers in the list, so the middle one
will be the (9 + 1) ÷ 2 = 10 ÷ 2 = 5th number:
13, 13, 13, 13, 14, 14, 16, 18, 21
So the median is 14.
 The mode is the number that is repeated more often
than any other, so 13 is the mode.

 The largest value in the list is 21, and the smallest is 13,
so the range is 21 – 13 = 8.

 mean: 15
median: 14
mode: 13
range: 8
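
The same four summaries can be checked with Python's standard `statistics` module. This is a minimal sketch; the variable names are chosen only for this example.

```python
import statistics

values = [13, 18, 13, 14, 13, 16, 14, 21, 13]

mean = statistics.mean(values)            # sum of the values divided by how many there are
median = statistics.median(values)        # middle value of the sorted list
mode = statistics.mode(values)            # most frequent value
value_range = max(values) - min(values)   # largest minus smallest

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("range:", value_range)
```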
M&Ms Activity
DATA PRODUCTION
Producing data
 Survey
 Surveys are popular ways to gauge public opinion
 The idea of a survey:
 Select a sample of people to represent a larger population.
 Ask the individuals in the sample some questions and record their
responses.
 Use sample results to draw some conclusions about the population.

 Observational study
 In an observational study, we observe individuals and
measure variables of interest but do not attempt to influence
the responses.
 Experiment
 In an experiment, we deliberately do something to individuals
in order to observe their responses.
Example Observational study VS
Experiment
 Census
 Phone survey
 Vaccines
 Interview
 Comparing two drugs
 Exam
 PAAR students
Exercise Observational study - Experiment
 Design an Observational Study.
   Question?
   Population?
   Sample?
   Data production?
   Analysis?
   Conclusion?
 Design an Experiment.
   Question?
   Population?
   Sample?
   Data production?
   Analysis?
   Conclusion?
Homework Exercises (from pg 11)
 P1
 P2
 P3
 P4
 P5
DATA ANALYSIS
Individuals and Variables
 Individuals are the objects described by a set of data (people,
animals, things).
 Variables are any characteristics of an individual. A variable
can take different values for different individuals.
 Categorical variable. Places an individual into one of several
groups or categories.
 Quantitative variable. Takes numerical values for which
arithmetic operations (adding, average…) make sense.

 Id: Individuals, Variables (Categorical/Quantitative)


Education in the United States
State | Region | Population (1000s) | SAT Verbal | SAT Math | Percent taking | Percent No HS | Teachers' pay ($1000)
CA | PAC | 35,894 | 499 | 519 | 54 | 18.9 | 54.3
CO | MTN | 4,601 | 551 | 553 | 27 | 11.3 | 40.7
CT | NE | 3,504 | 512 | 514 | 84 | 12.5 | 53.6
 Example. Education in the US
Distribution
 The Distribution of a variable tells us what values the
variable takes and how often it takes these values.
 Describing Categorical variables
 Bar graph
 Side-by-side Bar graph
 Do you wear your seat belt?
Region | Percent wearing seat belts, 2003 | Percent wearing seat belts, 1998
Northeast | 74 | 66.4
Midwest | 75 | 63.6
South | 80 | 78.9
West | 84 | 80.8

[Bar graph: Percents of Front-seat Passengers Wearing Seat belts in 2003]
[Side-by-side bar graph: Percent Wearing Seat belts, 1998 vs 2003]
 Describing Quantitative variables
 Dotplot

The number of goals scored by the US women's soccer team
in 34 games played during the 2004 season:
3 0 2 7 8 2 4 3 5 1 1 4 5 3 1 1 3 3 3 2 1
2 2 2 4 3 5 6 1 5 5 1 1 5

[Dotplot: goals scored by the US women's soccer team in 2004; horizontal axis: Goals scored, 0 to 10]

Use your TI-84 Plus:
Find Mode:
Mean:
Median:
Range:
Graph the distribution
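
If no calculator is at hand, the dotplot and the four summaries can also be produced with a short Python sketch. The function name `text_dotplot` and the star-per-game layout are choices made for this example, not part of the original activity.

```python
from collections import Counter
import statistics

# Goals per game, copied from the 34 games listed on the slide
goals = [3, 0, 2, 7, 8, 2, 4, 3, 5, 1, 1, 4, 5, 3, 1, 1, 3, 3, 3, 2, 1,
         2, 2, 2, 4, 3, 5, 6, 1, 5, 5, 1, 1, 5]

counts = Counter(goals)  # how many games ended with each goal total


def text_dotplot(counts, low, high):
    """One row per value, one star per game with that many goals."""
    rows = []
    for value in range(low, high + 1):
        rows.append(f"{value:>2} | " + "*" * counts.get(value, 0))
    return "\n".join(rows)


print(text_dotplot(counts, 0, 10))
print("mode:  ", statistics.mode(goals))
print("mean:  ", round(statistics.mean(goals), 2))
print("median:", statistics.median(goals))
print("range: ", max(goals) - min(goals))
```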
 Exploring Relationships between variables

Airline | On time | Delayed | Percent of late flights
Alaska Airlines | 3274 | 501 | 13.3 %
America West | 6438 | 787 | 10.9 %

Departure city | Alaska Airlines: On time | Delayed | % of late flights | America West: On time | Delayed | % of late flights
Los Angeles | 497 | 62 | 11.1 % | 694 | 117 | 14.4 %
Phoenix | 221 | 12 | 5.2 % | 4840 | 415 | 7.9 %
San Diego | 212 | 20 | 8.6 % | 383 | 65 | 14.5 %
San Francisco | 503 | 102 | 16.8 % | 320 | 129 | 28.7 %
Seattle | 1841 | 305 | 14.2 % | 201 | 61 | 23.2 %
Total | 3274 | 501 | 13.3 % | 6438 | 787 | 10.9 %

Many relationships between two variables
are influenced by other variables lurking in the background.
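
The effect of the lurking variable can be seen by recomputing the percents from the flight counts. The Python sketch below uses the on-time and delayed counts from the table; the function names are made up for this example.

```python
# (on time, delayed) counts for each airline and departure city
flights = {
    "Alaska Airlines": {
        "Los Angeles": (497, 62),
        "Phoenix": (221, 12),
        "San Diego": (212, 20),
        "San Francisco": (503, 102),
        "Seattle": (1841, 305),
    },
    "America West": {
        "Los Angeles": (694, 117),
        "Phoenix": (4840, 415),
        "San Diego": (383, 65),
        "San Francisco": (320, 129),
        "Seattle": (201, 61),
    },
}


def percent_delayed(on_time, delayed):
    """Delayed flights as a percent of all flights."""
    return 100 * delayed / (on_time + delayed)


def overall_percent_delayed(airline):
    """Percent delayed after pooling every departure city."""
    on_time = sum(o for o, _ in flights[airline].values())
    delayed = sum(d for _, d in flights[airline].values())
    return percent_delayed(on_time, delayed)


for city in flights["Alaska Airlines"]:
    alaska = percent_delayed(*flights["Alaska Airlines"][city])
    west = percent_delayed(*flights["America West"][city])
    print(f"{city:14s} Alaska {alaska:5.1f} %   America West {west:5.1f} %")

for airline in flights:
    print(f"{airline:16s} overall {overall_percent_delayed(airline):5.1f} %")
```

Alaska Airlines has the lower percent of delays in every city, yet the higher percent overall. Most America West flights leave from Phoenix, where delays are rare, while most Alaska Airlines flights leave from Seattle, where delays are more common, so departure city is the lurking variable.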
[Side-by-side bar graph: Comparing the percents of delayed flights
for the two airlines at five airports (Los Angeles, Phoenix, San Diego,
San Francisco, Seattle); series: Alaska Airlines, America West]
Class Exercises (from pg 19)
 EXERCISES
 P.7
 P.8
 P.9
 P.10
 P.12
PROBABILITY
Probability: What are the Chances?
 When you toss a coin…
 What is the probability of getting heads?
[Plot: Proportion of Heads (vertical axis, 0 to 1) against
Number of Tosses (horizontal axis: 1, 5, 10, 50, 100, 500, 1000)]
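
A plot like this one can be reproduced by simulation. The Python sketch below assumes a fair coin and uses a seeded pseudo-random generator in place of real tosses; `running_proportions` is a name invented for this example.

```python
import random


def running_proportions(checkpoints, seed=1):
    """Toss a simulated fair coin and record the proportion of heads at each checkpoint."""
    rng = random.Random(seed)
    wanted = set(checkpoints)
    heads = 0
    proportions = {}
    for toss in range(1, max(wanted) + 1):
        if rng.random() < 0.5:  # assumption: heads and tails are equally likely
            heads += 1
        if toss in wanted:
            proportions[toss] = heads / toss
    return proportions


for tosses, proportion in running_proportions([1, 5, 10, 50, 100, 500, 1000]).items():
    print(f"{tosses:5d} tosses: proportion of heads = {proportion:.3f}")
```

With few tosses the proportion jumps around; as the number of tosses grows it settles near 0.5. A different seed gives a different path but the same long-run pattern.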

 Probability: what happens in the long run.


The Big idea of Probability
 Chance behavior is unpredictable in the short run,
but has a regular and predictable pattern in the long
run.
 Games of Chance:
 Texas Hold’em, Blackjack, Roulette, Dice

 Probability quantifies the pattern of chance variation.


Basic Probability
 When all outcomes are equally likely, the probability of a given
event can be thought of as the number of ways that event can happen divided
by the number of ways that any possible outcome could
happen. If we identify the set of all possible outcomes as
the "sample space" and denote it by S, and label the desired
event as E, then the probability of event E can be written

P(E) = n(E) / n(S)

where n(E) is the number of outcomes in E and n(S) is the number of outcomes in S.
How Likely? = What is the Probability?

What is the Probability? i.e., Betting Dice

Rolls in Craps

Die | 1 | 2 | 3 | 4 | 5 | 6
1 | Snake Eyes | Ace Deuce | Easy Four | Five (Fever Five) | Easy Six | Natural or Seven Out
2 | Ace Deuce | Hard Four | Five (Fever Five) | Easy Six | Natural or Seven Out | Easy Eight
3 | Easy Four | Five (Fever Five) | Hard Six | Natural or Seven Out | Easy Eight | Nine (Nina)
4 | Five (Fever Five) | Easy Six | Natural or Seven Out | Hard Eight | Nine (Nina) | Easy Ten
5 | Easy Six | Natural or Seven Out | Easy Eight | Nine (Nina) | Hard Ten | Yo (Yo-leven)
6 | Natural or Seven Out | Easy Eight | Nine (Nina) | Easy Ten | Yo (Yo-leven) | Boxcars
The following chart shows the dice combinations needed to roll each number

Dice Roll Possible Dice Combinations


2 1-1
3 1-2, 2-1
4 1-3, 2-2, 3-1
5 1-4, 2-3, 3-2, 4-1
6 1-5, 2-4, 3-3, 4-2, 5-1
7 1-6, 2-5, 3-4, 4-3, 5-2, 6-1
8 2-6, 3-5, 4-4, 5-3, 6-2
9 3-6, 4-5, 5-4, 6-3
10 4-6, 5-5, 6-4
11 5-6, 6-5
12 6-6
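
Counting the combinations in the chart gives the probability of each roll: the number of combinations for a total divided by the 36 possible combinations of two dice. A minimal Python sketch, assuming two fair six-sided dice (the name `probability_of_total` is illustrative):

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered pairs (first die, second die)
sample_space = list(product(range(1, 7), repeat=2))


def probability_of_total(total):
    """Number of pairs that add up to `total`, divided by the size of the sample space."""
    favourable = [roll for roll in sample_space if sum(roll) == total]
    return Fraction(len(favourable), len(sample_space))


for total in range(2, 13):
    print(f"{total:2d}: {probability_of_total(total)}")
```

Seven has the most combinations (six of the 36), which is why it is the most likely roll.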
Exploring Probability.
Playing Cards, Dice, Spinners, and Coins
Game | Theoretical Probability | # of attempts | # of wins | Experimental Probability | # of attempts | # of wins | Experimental Probability
Draw a card with a heart on it. Be sure to replace the drawn card and shuffle the cards before the next attempt. | 1 out of 4 or .25 | 4 | | | 100 | |
Roll the die – Roll the number 3. | 1 out of 6 or about .17 | 4 | | | 100 | |
Spinner – Spin the color red. | 1 out of 6 or about .17 | 4 | | | 100 | |
Coin toss – The coin must land on heads. | 1 out of 2 or .50 | 4 | | | 100 | |
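
The activity can be rehearsed on a computer before trying it with real cards, dice, spinners and coins. In the Python sketch below each attempt is simulated as a win with the game's theoretical probability; that is an assumption standing in for the physical game, and the names `GAMES` and `experimental_probability` are made up for this example.

```python
import random

# Theoretical probability of a win in each game, as listed in the table
GAMES = {
    "Draw a heart": 1 / 4,
    "Roll a 3": 1 / 6,
    "Spin red": 1 / 6,
    "Coin lands heads": 1 / 2,
}


def experimental_probability(p_win, attempts, rng):
    """Simulate `attempts` tries and return wins divided by attempts."""
    wins = sum(1 for _ in range(attempts) if rng.random() < p_win)
    return wins / attempts


rng = random.Random(42)  # fixed seed so the run is repeatable
for game, p_win in GAMES.items():
    after_4 = experimental_probability(p_win, 4, rng)
    after_100 = experimental_probability(p_win, 100, rng)
    print(f"{game:17s} theoretical {p_win:.2f}   4 attempts {after_4:.2f}   100 attempts {after_100:.2f}")
```

With only 4 attempts the experimental probability can land far from the theoretical value; with 100 attempts it is usually much closer.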
STATISTICAL INFERENCE
Drawing Conclusions from Data
 Have you ever cheated on a test or exam? “Yes” 48%

 Internet survey of 1200 students, aged 13 to 17, between


January 23 and February 10, 2003.

 If all 13 to 17 year-old students were asked the same
question, would exactly 48% have answered “yes”?

 What about with a second sample or a third sample?

Variation is everywhere!
 Probability provides a description of how the sample results
will vary in relation to the true population percent.
 We rely on probability to help us answer research questions
with a known degree of confidence.

 Based on the sampling method in the previous example, we can
say the estimate of 48% is very likely to be within 3 percentage
points of the true population percent.
 That is, we can be confident that between 45% and 51% of all
teenage students would say that they have cheated on a test.
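
One common way to arrive at a figure close to 3% is the normal-approximation margin of error for a sample proportion. The Python sketch below assumes a simple random sample and a 95% confidence level (z = 1.96); the slides state neither, so treat both as assumptions made for this example.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion.

    z = 1.96 corresponds to 95% confidence (an assumption, not from the slides).
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


p_hat = 0.48   # sample proportion answering "yes"
n = 1200       # number of students surveyed

moe = margin_of_error(p_hat, n)
low, high = p_hat - moe, p_hat + moe

print(f"margin of error: {100 * moe:.1f} percentage points")
print(f"interval: {100 * low:.0f}% to {100 * high:.0f}%")
```

The calculation only accounts for sampling error under random selection; it does not account for other problems an Internet survey may have.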

 Statistical Inference allows us to use the results of properly


designed experiments and observational studies to draw
conclusions that go beyond the data themselves.
REVIEW EXERCISES
Exercises. P13, P14, P17, P18
Chapter Exs. P19,P20, P22, P23, P26

On-line Quiz. *20% First Period grade


Chapter P: What is Statistics?
www.whfreeman.com/tps3e
Register as a Student
Instructor e-mail gil_op@hotmail.com
