
Statistics

Chapter 4: Random Variables and Distribution

Where We're Going

Develop the notion of a random variable
Numerical data and discrete random variables
Discrete random variables and their probabilities

4.1: Two Types of Random Variables

A random variable is a variable that assumes numerical values associated with the random outcome of an experiment, where one (and only one) numerical value is assigned to each sample point.

A discrete random variable can assume a countable number of values.

Example: number of steps to the top of the Eiffel Tower*

A continuous random variable can assume any value along a given interval of a number line.

Example: the time a tourist stays at the top once s/he gets there

*Believe it or not, the answer ranges from 1,652 to 1,789. See Great Buildings

Discrete random variables

Number of sales

Number of calls

Shares of stock

People in line

Mistakes per page

Continuous random variables

Length

Depth

Volume

Time

Weight

4.2: Probability Distributions for Discrete Random Variables

The probability distribution of a discrete random variable is a graph, table or formula that specifies the probability associated with each possible outcome the random variable can assume.

$p(x) \ge 0$ for all values of $x$

$\sum_x p(x) = 1$

Say a random variable x follows this pattern:
$p(x) = (0.3)(0.7)^{x-1}$
for integer x > 0 (x = 1, 2, 3, ...).

This table gives the probabilities (rounded to two digits) for x between 1 and 10.

 x    p(x)
 1    .30
 2    .21
 3    .15
 4    .10
 5    .07
 6    .05
 7    .04
 8    .02
 9    .02
10    .01
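
The table values can be checked with a few lines of Python. This is a minimal sketch, assuming the formula $p(x) = (0.3)(0.7)^{x-1}$ for x = 1, 2, 3, ...; the names `p` and `table` are illustrative only.

```python
def p(x):
    """Probability p(x) = 0.3 * 0.7**(x - 1) for x = 1, 2, 3, ..."""
    return 0.3 * 0.7 ** (x - 1)

# Probabilities for x between 1 and 10.
table = {x: p(x) for x in range(1, 11)}
for x, prob in table.items():
    print(f"{x:2d}  {prob:.4f}")

# The ten values sum to less than 1 because x can also exceed 10.
total_1_to_10 = sum(table.values())
print(f"sum for x = 1..10: {total_1_to_10:.4f}")
```

The probabilities over all x > 0 still sum to 1; the first ten terms simply leave out the tail beyond x = 10.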

4.3: Expected Values of Discrete Random Variables

The mean, or expected value, of a discrete random variable is

$\mu = E(x) = \sum x\,p(x).$

The variance of a discrete random variable x is

$\sigma^2 = E[(x - \mu)^2] = \sum (x - \mu)^2\,p(x).$

The standard deviation of a discrete random variable x is

$\sigma = \sqrt{E[(x - \mu)^2]} = \sqrt{\sum (x - \mu)^2\,p(x)}.$
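
The definitions of the mean, variance and standard deviation translate directly into code. The sketch below is one possible way to write them, assuming the distribution is stored as a dictionary mapping each value x to p(x); the function names and the coin-toss distribution are hypothetical and not taken from the slides.

```python
import math

def mean(dist):
    """E(x) = sum of x * p(x) over all values x."""
    return sum(x * px for x, px in dist.items())

def variance(dist):
    """sigma^2 = sum of (x - mu)^2 * p(x) over all values x."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * px for x, px in dist.items())

def std_dev(dist):
    """sigma = square root of the variance."""
    return math.sqrt(variance(dist))

# Hypothetical example: number of heads in two tosses of a fair coin.
coin = {0: 0.25, 1: 0.50, 2: 0.25}
print(mean(coin), variance(coin), std_dev(coin))
```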
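                                              Chebyshev's Rule    Empirical Rule
$P(\mu - \sigma < x < \mu + \sigma)$          $\ge 0$             $\approx .68$
$P(\mu - 2\sigma < x < \mu + 2\sigma)$        $\ge .75$           $\approx .95$
$P(\mu - 3\sigma < x < \mu + 3\sigma)$        $\ge .89$           $\approx 1.00$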

In a roulette wheel in a U.S. casino, a $1 bet on even wins $1 if the ball falls on an even number (same for odd, or red, or black).
The probability of winning this bet is 47.37%.
P(win $1) = .4737
P(lose $1) = .5263
$\mu = (+\$1)(.4737) + (-\$1)(.5263) = -\$.0526$
$\sigma = .9986$

On average, bettors lose about a nickel for each dollar they put down on a bet like this.
(These are the best bets for patrons.)
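
The roulette figures can be reproduced numerically. This sketch assumes the standard American wheel with 38 pockets, 18 of which are even (0 and 00 count as losses), which gives the 47.37% quoted above; the variable names are illustrative.

```python
import math

# Assumption: 38 pockets, 18 of them winning for a bet on "even".
p_win = 18 / 38
outcomes = {+1.0: p_win, -1.0: 1 - p_win}  # payoff in dollars -> probability

# Expected payoff and standard deviation of a single $1 bet.
mu = sum(x * p for x, p in outcomes.items())
sigma = math.sqrt(sum((x - mu) ** 2 * p for x, p in outcomes.items()))

print(f"P(win) = {p_win:.4f}")
print(f"mu     = {mu:.4f}")
print(f"sigma  = {sigma:.4f}")
```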

Binomial Distribution

4 Properties of Binomial Distribution

1. Fixed number of Trials (n)
2. Two outcomes in a trial, SUCCESS or FAILURE
3. Trials are independent
4. Probability of success (p) remains constant

Tree Diagram: Throwing a die

X ~ B(n, p)
X = number of successes in n trials
X ~ B(3, 1/6)

Is there a formula for calculating Binomial Probabilities rather than drawing a tree diagram?

There are five things you need to do to work a binomial story problem.
1. Define Success first. Success must be for a single
trial. Success = "Rolling a 6 on a single die"
2. Define the probability of success (p): p = 1/6
3. Find the probability of failure (q): q = 5/6
4. Define the number of trials: n = 3
5. Define the number of successes out of those trials (r)

The General Binomial Probability Formula

$P(X = r) = \binom{n}{r}\,p^{r}\,q^{n-r}$

r = number of successes out of those trials
n = number of trials
p = probability of success
q = probability of failure
Where: q = 1 - p

In the old days, there was a probability of 0.8 of success in any attempt to make a telephone call. Calculate the probability of having 7 successes in 10 attempts.
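
A short sketch of how this could be computed, assuming the usual binomial formula $P(X = r) = \binom{n}{r} p^{r} q^{n-r}$ with q = 1 - p; the function name `binomial_pmf` is illustrative.

```python
import math

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p): C(n, r) * p**r * q**(n - r), with q = 1 - p."""
    q = 1 - p
    return math.comb(n, r) * p ** r * q ** (n - r)

# Telephone example: n = 10 attempts, p = 0.8, r = 7 successes.
p_seven = binomial_pmf(7, 10, 0.8)
print(f"P(7 successes in 10 attempts) = {p_seven:.4f}")

# Die example, X ~ B(3, 1/6): probabilities of 0, 1, 2 or 3 sixes.
die = [binomial_pmf(r, 3, 1 / 6) for r in range(4)]
print([round(v, 4) for v in die])
```

The telephone example works out to roughly 0.20, and the four die probabilities sum to 1, as any complete probability distribution must.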

Mean and Variance

For X ~ B(n, p): $\mu = np$ and $\sigma^2 = npq$

4.5: The Poisson Distribution

The Poisson distribution is a discrete probability distribution for the counts of events that occur randomly in a given interval of time (or space).

The Poisson distribution can be used to calculate the probabilities of various numbers of "successes" based on the mean number of successes. In order to apply the Poisson distribution, the various events must be independent.

Many experimental situations occur in which we observe the counts of events within a set unit of time, area, volume, length etc. For example,
The number of cases of a disease in different towns
The number of mutations in set sized regions of a chromosome
The number of dolphin pod sightings along a flight path through a region
The number of particles emitted by a radioactive source in a given time
The number of births per hour during a given day
FORMULA
The formula for the Poisson probability mass function is

$P(x; \mu) = \frac{e^{-\mu}\,\mu^{x}}{x!}$

where
e is the base of natural logarithms (2.7183)
$\mu$ is the mean number of "successes"
x is the number of "successes" in question

EXAMPLE
The average number of homes sold by the Acme Realty company is 2 homes per day. What is the probability that exactly 3 homes will be sold tomorrow?
Solution: This is a Poisson experiment in which we know the following:
$\mu$ = 2;
x = 3;
e = 2.71828, since e is a constant equal to approximately 2.71828.
We plug these values into the Poisson formula as follows:
$P(x; \mu) = (e^{-\mu})(\mu^{x}) / x!$
$P(3; 2) = (2.71828^{-2})(2^{3}) / 3!$
$P(3; 2) = (0.13534)(8) / 6$
$P(3; 2) = 0.180$
Thus, the probability of selling 3 homes tomorrow is 0.180.
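
The same calculation can be sketched in Python, assuming the Poisson formula $P(x; \mu) = e^{-\mu}\mu^{x}/x!$; the function name `poisson_pmf` is illustrative.

```python
import math

def poisson_pmf(x, mu):
    """P(x; mu) = exp(-mu) * mu**x / x! for x = 0, 1, 2, ..."""
    return math.exp(-mu) * mu ** x / math.factorial(x)

# Acme Realty: mean of 2 homes per day, exactly 3 homes sold tomorrow.
p_three_homes = poisson_pmf(3, 2)
print(f"P(3; 2) = {p_three_homes:.3f}")
```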
EXAMPLE
Suppose you knew that the mean number of calls to a fire station on a weekday is 8. What is the probability that on a given weekday there would be 11 calls?

$\mu$ = 8;

x = 11;

e = 2.71828, since e is a constant equal to approximately 2.71828.
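
The slide lists the inputs but not the result. A minimal sketch of the remaining arithmetic, assuming the Poisson formula $P(x; \mu) = e^{-\mu}\mu^{x}/x!$ applies with $\mu$ = 8 and x = 11 (variable names are illustrative):

```python
import math

mu = 8    # mean number of calls on a weekday
x = 11    # number of calls in question

# P(11; 8) = e^(-8) * 8^11 / 11!
p_eleven_calls = math.exp(-mu) * mu ** x / math.factorial(x)
print(f"P(11; 8) = {p_eleven_calls:.4f}")
```

Under these assumptions the probability of exactly 11 calls comes out at roughly 0.072.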

Changing the size of the interval
Suppose we know that births in a hospital occur randomly at an average rate of 1.8 births per hour.
What is the probability that we observe 5 births in a given 2 hour interval?

Well, if births occur randomly at a rate of 1.8 births per 1 hour interval
Then births occur randomly at a rate of 3.6 births per 2 hour interval
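
Once the rate has been rescaled to the 2 hour interval, the question reduces to a single Poisson probability. The sketch below assumes the formula $P(x; \mu) = e^{-\mu}\mu^{x}/x!$; the variable names are illustrative.

```python
import math

rate_per_hour = 1.8
hours = 2
mu = rate_per_hour * hours   # 3.6 births expected in a 2 hour interval

# Probability of observing exactly 5 births in the 2 hour interval.
x = 5
p_five_births = math.exp(-mu) * mu ** x / math.factorial(x)
print(f"mean = {mu:.1f}, P(5 births in 2 hours) = {p_five_births:.4f}")
```

With these inputs the probability is roughly 0.138.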

Sum of two Poisson variables
Now suppose we know that in hospital A births occur randomly at an average rate of 2.3 births per hour and in hospital B births occur randomly at an average rate of 3.1 births per hour.
What is the probability that we observe 7 births in total from the two hospitals in a given 1 hour period?

So if we let X = No. of births in a given hour at hospital A
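
The slide breaks off at this point. One way to finish the calculation, assuming births at the two hospitals are independent, is to use the fact that the sum of independent Poisson variables is again Poisson, with mean 2.3 + 3.1 = 5.4. The sketch below also checks that shortcut against a direct sum over every way of splitting 7 births between the hospitals; all names are illustrative.

```python
import math

def poisson_pmf(x, mu):
    """P(x; mu) = exp(-mu) * mu**x / x!"""
    return math.exp(-mu) * mu ** x / math.factorial(x)

rate_a = 2.3   # births per hour at hospital A
rate_b = 3.1   # births per hour at hospital B
total = 7

# Shortcut: the total is Poisson with mean rate_a + rate_b (independence assumed).
p_direct = poisson_pmf(total, rate_a + rate_b)

# Check: sum over k births at A and (7 - k) births at B.
p_split = sum(poisson_pmf(k, rate_a) * poisson_pmf(total - k, rate_b)
              for k in range(total + 1))

print(f"P(7 births in total) = {p_direct:.4f} (check: {p_split:.4f})")
```

Both routes give a probability of about 0.12.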

Cumulative Poisson Probability
A cumulative Poisson probability refers to the probability that the Poisson random variable is greater than some specified lower limit and less than some specified upper limit.
Example 1
Suppose the average number of lions seen on a 1-day safari is 5. What is the probability that tourists will see fewer than four lions on the next 1-day safari?

Solution: This is a Poisson experiment in which we know the following:
$\mu$ = 5;
x = 0, 1, 2, or 3;
e = 2.71828, since e is a constant equal to approximately 2.71828.

To solve this problem, we need to find the probability that tourists will see 0, 1, 2, or 3 lions. Thus, we need to calculate the sum of four probabilities: P(0; 5) + P(1; 5) + P(2; 5) + P(3; 5). To compute this sum, we use the Poisson formula:
$P(x \le 3, 5) = P(0; 5) + P(1; 5) + P(2; 5) + P(3; 5)$
$P(x \le 3, 5) = [(e^{-5})(5^{0}) / 0!] + [(e^{-5})(5^{1}) / 1!] + [(e^{-5})(5^{2}) / 2!] + [(e^{-5})(5^{3}) / 3!]$
$P(x \le 3, 5) = [(0.006738)(1) / 1] + [(0.006738)(5) / 1] + [(0.006738)(25) / 2] + [(0.006738)(125) / 6]$
$P(x \le 3, 5) = [0.0067] + [0.03369] + [0.084224] + [0.140375]$
$P(x \le 3, 5) = 0.2650$
Thus, the probability of seeing no more than 3 lions is 0.2650.
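
The cumulative probability is just a sum of individual Poisson probabilities, which is easy to sketch in code. The function names below are illustrative, and the formula $P(x; \mu) = e^{-\mu}\mu^{x}/x!$ is assumed.

```python
import math

def poisson_pmf(x, mu):
    """P(x; mu) = exp(-mu) * mu**x / x!"""
    return math.exp(-mu) * mu ** x / math.factorial(x)

def poisson_cdf(x_max, mu):
    """P(X <= x_max): sum of P(x; mu) for x = 0, 1, ..., x_max."""
    return sum(poisson_pmf(x, mu) for x in range(x_max + 1))

# Lions example: mean of 5 per safari, at most 3 lions seen.
p_at_most_three = poisson_cdf(3, 5)
print(f"P(x <= 3; 5) = {p_at_most_three:.4f}")
```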

4.6: The Hypergeometric Distribution

In the binomial situation, each trial was independent.

Drawing cards from a deck and replacing the drawn card each time

If the card is not replaced, each trial depends on the previous trial(s).

The hypergeometric distribution can be used in this case.
Randomly draw n elements from a set of N elements, without replacement. Assume there are r successes and N-r failures in the N elements. The hypergeometric random variable is the number of successes, x, drawn from the r available in the n selections.
$P(x) = \frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}$

where
N = the total number of elements
r = number of successes in the N elements
n = number of elements drawn
x = the number of successes in the n elements
$P(x) = \frac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}$

$\mu = \frac{nr}{N}$

$\sigma^2 = \frac{r(N-r)\,n(N-n)}{N^2(N-1)}$
Suppose a customer at a pet store wants to buy two hamsters for his daughter, but he wants two males or two females (i.e., he wants only two hamsters in a few months).
If there are ten hamsters, five male and five female, what is the probability of drawing two of the same sex? (With hamsters, it's virtually a random selection.)

$P(M = 2) = P(F = 2) = \frac{\binom{5}{2}\binom{10-5}{2-2}}{\binom{10}{2}} = \frac{(10)(1)}{45} = .22$

$P(M = 2 \text{ or } F = 2) = P(M = 2) + P(F = 2) = 2(.22) = .44$
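
The hamster calculation can be sketched with the hypergeometric formula, assuming N = 10 hamsters, r = 5 of the sex in question and n = 2 drawn; the function name `hypergeom_pmf` is illustrative.

```python
import math

def hypergeom_pmf(x, N, r, n):
    """P(x) = C(r, x) * C(N - r, n - x) / C(N, n)."""
    return math.comb(r, x) * math.comb(N - r, n - x) / math.comb(N, n)

# Two males out of two drawn (five males among ten hamsters).
p_two_males = hypergeom_pmf(2, N=10, r=5, n=2)

# By symmetry the same probability holds for two females, and the
# two events cannot happen together, so the probabilities add.
p_same_sex = 2 * p_two_males
print(f"P(M = 2) = {p_two_males:.4f}, P(same sex) = {p_same_sex:.4f}")
```

Without intermediate rounding the result is 4/9, about .444, consistent with the .44 obtained above.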

Continuous Random Variable


Normal Distribution
The normal distribution refers to a family of continuous probability distributions described by the normal equation.

$z = (X - \mu) / \sigma$
where X is a normal random variable, $\mu$ is the mean of X, and $\sigma$ is the standard deviation of X

Example
An average light bulb manufactured by the Acme Corporation lasts 300 days with a standard deviation of 50 days. Assuming that bulb life is normally distributed, what is the probability that an Acme light bulb will last at most 365 days?

Solution:
The value of the normal random variable is 365 days.
The mean is equal to 300 days.
The standard deviation is equal to 50 days.
$z = (X - \mu) / \sigma = (365 - 300) / 50$
$z = 1.3$

The answer is: P(X < 365) = 0.90.
Hence, there is a 90% chance that a light bulb will burn out within 365 days.
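
The table lookup can be replaced by a direct computation. This is a minimal sketch that evaluates the standard normal cumulative probability through the error function, using the identity P(Z < z) = (1 + erf(z / sqrt(2))) / 2; the function and variable names are illustrative.

```python
import math

def standard_normal_cdf(z):
    """P(Z < z) for a standard normal Z, computed with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mean_life = 300.0   # days
sd_life = 50.0      # days
x = 365.0           # days

# Convert to a z-score, then look up the cumulative probability.
z = (x - mean_life) / sd_life
p_at_most_365 = standard_normal_cdf(z)
print(f"z = {z:.2f}, P(X < 365) = {p_at_most_365:.4f}")
```

The computed value is about 0.903, which rounds to the 0.90 stated in the solution.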



Continuous Random Variable


Standard Normal Distribution
The standard normal distribution is a special case of the
normal distribution. It is the distribution that occurs when a
normal random variable has a mean of zero and a standard deviation of
one.
Standard Score (aka, z Score)
The normal random variable of a standard normal distribution is called
a standard score or a z-score. Every normal random variable X can be
transformed into a z score via the following equation:
$z = (X - \mu) / \sigma$
where X is a normal random variable, $\mu$ is the mean of X, and $\sigma$ is the standard deviation of X.
Example
Molly earned a score of 940 on a national achievement test. The
mean test score was 850 with a standard deviation of 100. What
proportion of students had a higher score than Molly? (Assume
that test scores are normally distributed.)
(A) 0.10
(B) 0.18
(C) 0.50
(D) 0.82
(E) 0.90
First, we transform Molly's test score into a z-score, using the z-score transformation equation.
$z = (X - \mu) / \sigma = (940 - 850) / 100 = 0.90$
Then, using the standard normal distribution table, we find the cumulative probability associated with the z-score. In this case, we find P(Z < 0.90) = 0.8159.
Therefore, P(Z > 0.90) = 1 - P(Z < 0.90) = 1 - 0.8159 = 0.1841.
Thus, we estimate that 18.41 percent of the students tested had a higher score than Molly. The correct answer is (B).
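
The same result can be obtained without a printed table. This sketch uses `NormalDist` from Python's standard `statistics` module (available from Python 3.8) as a stand-in for the standard normal table; the variable names are illustrative.

```python
from statistics import NormalDist

score = 940
mean_score = 850
sd_score = 100

# z-score transformation, then the upper-tail probability P(Z > z).
z = (score - mean_score) / sd_score
p_below = NormalDist().cdf(z)      # NormalDist() defaults to mean 0, sd 1
p_higher = 1 - p_below
print(f"z = {z:.2f}, P(Z < z) = {p_below:.4f}, P(Z > z) = {p_higher:.4f}")
```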