
Random Variables and Probability Distributions
1. Discrete Random Variables and Probability Distributions
Random Variables
If, by a given rule, every point U of an event space S is assigned a unique
real value X = X(U), that is, if X is a real-valued function defined on S,
then X is called a random variable (rv), a stochastic variable, or sometimes a
variate. The range of the function X, i.e. the set of all values that X takes,
is called the spectrum of the random variable. The spectrum may be discrete or
continuous, and the random variable is accordingly said to be discrete or
continuous.
Example 1.
Consider the random experiment of tossing a coin. The event space contains
two points, head and tail, so S = {Head, Tail}. Now define a rv X by
X(Head) = 1 and X(Tail) = 0. The spectrum of X consists of the two points 0 and 1,
and for a fair coin P(X=0) = P(X=1) = 0.5.
Example 2.
Let the random experiment consist of throwing a die, and let X denote the number
on the turned-up face of the die. Then X is a random variable with X(face 1) = 1,
X(face 2) = 2, and so on, so that for a fair die P(X=1) = 1/6, ..., P(X=6) = 1/6.

Example 3.
Consider the experiment of tossing two dice and observing the total of the points on
the two dice. To each of the 36 equally likely outcomes, X associates a value.
For example, X(1,3) = 4 and X(2,5) = 7. The possible values of X are
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.
The set of simple events satisfying X = 7 is
{X=7} = {(1,6); (6,1); (2,5); (5,2); (3,4); (4,3)}
and so P(X=7) = 6/36 = 1/6.
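
The counting argument in the two-dice example can be checked by brute-force enumeration. The sketch below is only an illustration, assuming two fair six-sided dice; the helper names `X` and `prob` are chosen here and are not part of the notes.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))

def X(outcome):
    """Random variable: total of the points on the two dice."""
    return outcome[0] + outcome[1]

def prob(value):
    """P(X = value), counting favourable outcomes among the 36."""
    favourable = [s for s in outcomes if X(s) == value]
    return Fraction(len(favourable), len(outcomes))

print(sorted(s for s in outcomes if X(s) == 7))  # the six simple events in {X=7}
print(prob(7))  # 1/6
```

Using `Fraction` keeps the probabilities exact, so the result can be compared directly with 6/36 = 1/6.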
Probability Distribution or Probability Mass Function:
The probability distribution or probability mass function (pmf) of a discrete rv is
defined for every number x by p(x) = P(X=x) = P(all s ∈ S : X(s) = x).

Example. Suppose we go to a university bookstore during the first week of classes and
observe whether the next person buying a computer buys a laptop or a desktop
model. Define
X = 1 if the customer purchases a laptop computer
X = 0 if the customer purchases a desktop computer

If 20% of all purchasers during that week select a laptop, the pmf for X is
p(0) = P(X=0) = P(next customer purchases a desktop model) = 0.8
p(1) = P(X=1) = P(next customer purchases a laptop model) = 0.2
p(x) = P(X=x) = 0 for x ≠ 0 or 1
An equivalent description is
       0.8  if x = 0
p(x) = 0.2  if x = 1
       0    if x ≠ 0 or 1
A picture of this pmf is called a line graph.
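
As a small illustration, the bookstore pmf can be written as a function that returns 0 for every number other than 0 and 1. This is a minimal sketch; the dictionary representation is an assumption made here for convenience.

```python
def p(x):
    """pmf of X: X = 1 for a laptop purchase, X = 0 for a desktop purchase."""
    pmf = {0: 0.8, 1: 0.2}
    # Any number that is not a possible value of X has probability 0.
    return pmf.get(x, 0.0)

print(p(0), p(1), p(2))  # 0.8 0.2 0.0

# A valid pmf is non-negative and sums to 1 over the possible values.
total = p(0) + p(1)
```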

The Cumulative Distribution Function


The cumulative distribution function (cdf) F(x) of a discrete rv X with pmf p(x) is
defined for every number x by
$F(x) = P(X \le x) = \sum_{y:\, y \le x} p(y)$

For any number x, F(x) is the probability that the observed value of X will be at
most x.
Example. The pmf of Y is
y      1    2    3    4
p(y)  .4   .3   .2   .1
We first determine F(y) for each value in the set {1, 2, 3, 4} of possible values:
F(1) = P(Y ≤ 1) = P(Y=1) = p(1) = .4
F(2) = P(Y ≤ 2) = P(Y=1 or 2) = p(1) + p(2) = .7
F(3) = P(Y ≤ 3) = P(Y=1 or 2 or 3) = p(1) + p(2) + p(3) = .9
F(4) = P(Y ≤ 4) = P(Y=1 or 2 or 3 or 4) = 1

Now for any other number y, F(y) will equal the value of F at the closest possible
value of Y to the left of y. For example, F(2.7) = P(Y ≤ 2.7) = P(Y ≤ 2) = .7, and
F(3.999) = F(3) = .9.
The cdf is thus
        0    if y < 1
        .4   if 1 ≤ y < 2
F(y) =  .7   if 2 ≤ y < 3
        .9   if 3 ≤ y < 4
        1    if 4 ≤ y
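
The rule "sum p(v) over all possible values v ≤ y" translates directly into code. The following sketch uses the pmf of Y from this example; the names `pmf_Y` and `F` are illustrative.

```python
# pmf of Y from the example: possible values 1, 2, 3, 4.
pmf_Y = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}

def F(y):
    """cdf of Y: sum of p(v) over all possible values v with v <= y."""
    return sum(prob for v, prob in pmf_Y.items() if v <= y)

# F is constant between possible values and jumps at each of them.
for y in (0.5, 1, 2.7, 3.999, 4, 10):
    print(y, round(F(y), 10))
```

Rounding is applied only when printing, because sums of decimal fractions such as .4 + .3 are not represented exactly in floating point.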

The graph of F(y) is a step function that jumps at each of the possible values
1, 2, 3 and 4.
[Figure: graph of F(y)]


More generally, the probability that X falls in a specified interval is easily obtained
from the cdf. For example,
P(2 ≤ X ≤ 4) = p(2) + p(3) + p(4)
             = [p(0) + ... + p(4)] - [p(0) + p(1)]
             = P(X ≤ 4) - P(X ≤ 1)
             = F(4) - F(1)
Proposition: For any two numbers a and b with a ≤ b,
$P(a \le X \le b) = F(b) - F(a^-)$
where $a^-$ represents the largest possible X value that is strictly less than a. In
particular, if the only possible values are integers and if a and b are integers,
then
P(a ≤ X ≤ b) = P(X = a or a+1 or ... or b)
             = F(b) - F(a - 1)
Taking a = b yields P(X=a) = F(a) - F(a - 1) in this case.
Example. Let X = the number of days of sick leave taken by a
randomly selected employee of a large company during a particular
year. If the maximum number of allowable sick days per year is 14,
possible values of X are 0, 1, ..., 14. With F(0) = 0.58, F(1) = 0.72,
F(2) = 0.76, F(3) = 0.81, F(4) = 0.88, and F(5) = 0.94,
P(2 ≤ X ≤ 5) = P(X = 2, 3, 4, or 5) = F(5) - F(1) = 0.94 - 0.72 = 0.22
and P(X=3) = F(3) - F(2) = 0.81 - 0.76 = 0.05
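
The integer-valued form of the proposition can be applied mechanically to the sick-leave figures. This is a sketch that uses only the six cdf values quoted in the example; the function name `interval_prob` is hypothetical.

```python
# cdf values quoted in the example (only F(0) through F(5) are given).
F = {0: 0.58, 1: 0.72, 2: 0.76, 3: 0.81, 4: 0.88, 5: 0.94}

def interval_prob(a, b):
    """P(a <= X <= b) = F(b) - F(a - 1) for integers 1 <= a <= b <= 5."""
    return F[b] - F[a - 1]

print(round(interval_prob(2, 5), 2))  # 0.22
print(round(interval_prob(3, 3), 2))  # 0.05, i.e. P(X = 3)
```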
Expected value and variance for a discrete random variable.
Just as for sample and population data, it is often useful to describe a rv in
terms of its mean and variance. The (long-run) mean of a rv X is called the
expected value and is denoted by E(X). For a discrete rv, it is the weighted average
of all possible numerical values of the variable, with the respective probabilities
used as weights.

Definition: Let X be a discrete rv with set of possible values D and pmf p(x). The
expected value or mean value of X, denoted by E(X) or $\mu_X$, is
$E(X) = \mu_X = \sum_{x \in D} x \cdot p(x)$

The Variance of X
Let X have pmf p(x) and expected value $\mu$. Then the variance of X, denoted by V(X)
or $\sigma_X^2$ or just $\sigma^2$, is
$V(X) = \sum_{x \in D} (x - \mu)^2 \cdot p(x) = E[(X - \mu)^2]$

The standard deviation (SD) of X is
$\sigma_X = \sqrt{\sigma_X^2}$
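
To see the three definitions at work, the sketch below evaluates them for the pmf of Y used in the cdf example earlier (possible values 1 to 4 with probabilities .4, .3, .2, .1). The variable names are illustrative only.

```python
import math

# pmf of Y from the earlier cdf example.
pmf = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}

# E(Y): probability-weighted average of the possible values.
mean = sum(x * p for x, p in pmf.items())

# V(Y): probability-weighted average of squared deviations from the mean.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

# SD: square root of the variance.
sd = math.sqrt(variance)

print(round(mean, 6), round(variance, 6), round(sd, 6))  # 2.0 1.0 1.0
```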
Exercise 1. A mail-order computer business has six telephone lines. Let X denote
the number of lines in use at a specified time. Suppose the pmf of X is as given in
the accompanying table.
x      0     1     2     3     4     5     6
p(x)  0.10  0.15  0.20  0.25  0.20  0.06  0.04

Calculate the probability of each of the following events.
a. {at most 3 lines are in use}
b. {fewer than 3 lines are in use}
c. {at least 3 lines are in use}
d. {between 2 and 5 lines, inclusive, are in use}
e. {between 2 and 4 lines, inclusive, are in use}
f. {at least 4 lines are in use}
Exercise 2. An automobile service facility specializing in engine tune-ups knows
that 45% of all tune-ups are done on four-cylinder automobiles, 40% on
six-cylinder automobiles, and 15% on eight-cylinder automobiles. Let X = the number
of cylinders on the next car to be tuned.
a. What is the pmf of X?
b. Draw both a line graph and a probability histogram for the pmf of part (a).

Exercise 3. Bill Johnson has just bought a VCR from Jim's Videotape Service at a cost of $300.
He now has the option of buying an extended service warranty offering 5 years of coverage
for $100. After talking to friends and reading reports, Bill believes the following maintenance
expenses could be incurred during the next five years.

Expense ($)    0     50    100   150   200   250   300
Probability   0.35  0.25  0.15  0.10  0.08  0.05  0.02

Find the expected value of the anticipated maintenance costs. Should Bill pay $100 for
the warranty?
Exercise 4. The pmf for X = the number of major defects on a
randomly selected appliance of a certain type is
x      0     1     2     3     4
p(x)  0.08  0.15  0.45  0.27  0.05

Compute the following:
a. E(X)
b. V(X) directly from the definition
c. The standard deviation of X
d. V(X) by using the shortcut formula V(X) = E(X^2) - [E(X)]^2
2. Continuous Random Variables
A random variable X is said to be continuous if its set of possible values is an
entire interval of numbers, that is, if for some A < B, any number x between A and B
is possible.
Example. If a chemical compound is randomly selected and its pH X is determined,
then X is a continuous rv because any pH value between 0 and 14 is possible. If
more is known about the compound selected for analysis, then the set of possible
values might be a subinterval of [0, 14], such as 5.5 ≤ x ≤ 6.5, but X would still be
continuous.
Probability Distributions for Continuous Random Variables
Let X be a continuous rv. Then a probability distribution or probability density
function (pdf) of X is a function f(x) such that for any two numbers a and b with
a ≤ b,
$P(a \le X \le b) = \int_a^b f(x)\,dx$
That is, the probability that X takes on a value in the interval [a, b] is the area
under the graph of the density function between a and b. The graph of f(x) is often
referred to as the density curve.
[Figure: density curve f(x) with the interval from a to b marked]

For f(x) to be a pdf, it must satisfy the following two conditions:
1. f(x) ≥ 0 for all x
2. $\int_{-\infty}^{\infty} f(x)\,dx$ = area under the entire graph of f(x) = 1
Example: Suppose I take a bus to work, and that every 5 minutes a bus arrives at
my stop. Because of variation in the time that I leave my house, I don't always
arrive at the bus stop at the same time, so my waiting time X for the next bus is a
continuous random variable. The set of possible values of X is the interval [0, 5].
One possible pdf for X is
f(x) = 1/5   if 0 ≤ x ≤ 5
f(x) = 0     otherwise

The pdf f(x) is graphed in the following figure. Clearly f(x) ≥ 0, and the area under
the graph is 5(1/5) = 1. Also, the probability that I will wait between 1 and 3 minutes
is
$P(1 \le X \le 3) = \int_1^3 f(x)\,dx = \int_1^3 \frac{1}{5}\,dx = \frac{2}{5}$
[Figure: graph of the pdf f(x) on [0, 5], with the area P(1 ≤ X ≤ 3) between x = 1 and x = 3 marked]
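
Both pdf conditions and the probability P(1 ≤ X ≤ 3) for the bus example can be checked numerically. The sketch below uses a simple midpoint rule written for this illustration (the helper `integrate` is not a library function); because f(x) is zero outside [0, 5], integrating over [0, 5] stands in for the integral over the whole real line.

```python
def f(x):
    """Uniform pdf on [0, 5] from the bus example."""
    return 1 / 5 if 0 <= x <= 5 else 0.0

def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

print(round(integrate(f, 0, 5), 6))  # 1.0, total area under the density curve
print(round(integrate(f, 1, 3), 6))  # 0.4, i.e. P(1 <= X <= 3) = 2/5
```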
The cumulative distribution function (cdf) F(x)
The cumulative distribution function (cdf) F(x) for a discrete rv X gives, for any
specified number x, the probability P(X ≤ x). It is obtained by summing the pmf p(y)
over all possible values y satisfying y ≤ x. The cdf of a continuous rv gives the same
probabilities P(X ≤ x) and is obtained by integrating the pdf f(y) between the limits
-∞ and x.
The cumulative distribution function F(x) for a continuous rv X is defined for every
number x by
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy$

For each x, F(x) is the area under the density curve to the left of x.
[Figure: A pdf and associated cdf, illustrating F(8)]
Using F(x) to compute Probabilities
The importance of the cdf here, just as for discrete rvs, is that the probabilities of
various intervals can be computed from a formula for, or a table of, F(x).
Proposition: Let X be a continuous rv with pdf f(x) and cdf F(x). Then for any
number a,
P(X > a) = 1 - F(a)
and for any two numbers a and b with a < b,
P(a ≤ X ≤ b) = F(b) - F(a)
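
As an illustration of the proposition, take the bus waiting-time example from earlier. Integrating its pdf gives F(x) = x/5 on [0, 5]; this closed form is derived here and is not stated in the notes, so treat the sketch as an assumption-labelled example.

```python
def F(x):
    """cdf of the bus waiting time (derived here: F(x) = x/5 on [0, 5])."""
    if x < 0:
        return 0.0
    if x > 5:
        return 1.0
    return x / 5

# P(X > a) = 1 - F(a)
print(round(1 - F(2), 10))     # 0.6

# P(a <= X <= b) = F(b) - F(a)
print(round(F(3) - F(1), 10))  # 0.4
```

The second value agrees with the 2/5 obtained by integrating the pdf directly between 1 and 3.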

Obtaining f(x) from F(x)


For X discrete, the pmf is obtained from the cdf by taking the difference between
two F(x) values. The continuous analog of a difference is a derivative. The following
result is a consequence of the Fundamental Theorem of Calculus.
If X is a continuous rv with pdf f(x) and cdf F(x), then at every x at which the
derivative F'(x) exists, F'(x) = f(x).
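
A quick numerical check of F'(x) = f(x), again for the bus example, where F(x) = x/5 on [0, 5] (a formula derived from the pdf, not given in the notes). The central-difference helper `derivative` is written for this sketch.

```python
def F(x):
    """cdf of the bus waiting time (derived: F(x) = x/5 on [0, 5])."""
    return min(max(x / 5, 0.0), 1.0)

def derivative(g, x, h=1e-6):
    """Central-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

print(round(derivative(F, 2.0), 6))  # 0.2, matching f(2) = 1/5
print(round(derivative(F, 7.0), 6))  # 0.0, matching f(7) = 0
```

At the corners x = 0 and x = 5 the cdf has no derivative, which is why the result is stated only for points where F'(x) exists.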
Expected Values for Continuous Random Variables
Definition: The expected or mean value of a continuous rv X with pdf f(x) is
$\mu_X = E(X) = \int_{-\infty}^{\infty} x \cdot f(x)\,dx$

Proposition: If X is a continuous rv with pdf f(x) and h(X) is any function of X, then
$E[h(X)] = \mu_{h(X)} = \int_{-\infty}^{\infty} h(x) \cdot f(x)\,dx$

The variance of a continuous random variable
The variance of a continuous random variable X with pdf f(x) and mean value $\mu$ is
$\sigma_X^2 = V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 \cdot f(x)\,dx = E[(X - \mu)^2]$

The standard deviation (SD) of X is $\sigma_X = \sqrt{V(X)}$
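
These integrals can be approximated numerically. The sketch below does so for the bus waiting-time pdf (uniform on [0, 5]) using a hand-written midpoint rule; the helper `integrate` is an illustrative assumption, and the exact values for this pdf are E(X) = 5/2 and V(X) = 25/12.

```python
def f(x):
    """Uniform pdf on [0, 5] from the bus example."""
    return 1 / 5 if 0 <= x <= 5 else 0.0

def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# f is zero outside [0, 5], so integrating over [0, 5] is enough.
mean = integrate(lambda x: x * f(x), 0, 5)
variance = integrate(lambda x: (x - mean) ** 2 * f(x), 0, 5)
sd = variance ** 0.5

print(round(mean, 4), round(variance, 4), round(sd, 4))  # 2.5 2.0833 1.4434
```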


Example 1. The distribution of the amount of gravel (in tons)
sold by a particular construction supply company in a given
week is a continuous random variable X with pdf
f(x) = (3/2)(1 - x^2)   if 0 ≤ x ≤ 1
f(x) = 0                otherwise
Find the cdf, E(X) and V(X).
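
One way to work this example, for the pdf f(x) = (3/2)(1 - x^2) on [0, 1], is sketched below. The variance step uses the standard identity V(X) = E(X^2) - [E(X)]^2, which is assumed here rather than derived in these notes.

```latex
% Worked solution for f(x) = (3/2)(1 - x^2) on [0, 1], and 0 otherwise.
% Outside the interval: F(x) = 0 for x < 0 and F(x) = 1 for x > 1.
\begin{align*}
F(x)   &= \int_0^x \tfrac{3}{2}\,(1 - y^2)\,dy
        = \tfrac{3}{2}\Bigl(x - \tfrac{x^3}{3}\Bigr),
          \qquad 0 \le x \le 1, \\
E(X)   &= \int_0^1 x \cdot \tfrac{3}{2}\,(1 - x^2)\,dx
        = \tfrac{3}{2}\Bigl(\tfrac{1}{2} - \tfrac{1}{4}\Bigr)
        = \tfrac{3}{8}, \\
E(X^2) &= \int_0^1 x^2 \cdot \tfrac{3}{2}\,(1 - x^2)\,dx
        = \tfrac{3}{2}\Bigl(\tfrac{1}{3} - \tfrac{1}{5}\Bigr)
        = \tfrac{1}{5}, \\
V(X)   &= E(X^2) - [E(X)]^2
        = \tfrac{1}{5} - \tfrac{9}{64}
        = \tfrac{19}{320} \approx 0.059 .
\end{align*}
```

As a check, F(1) = (3/2)(1 - 1/3) = 1, so the density integrates to 1 as required.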
Exercise 1. Let X denote the amount of time for which a book on 2-hour
reserve at a college library is checked out by a randomly selected student, and
suppose that X has density function
f(x) = 0.5x   if 0 ≤ x ≤ 2
f(x) = 0      otherwise
Calculate the following probabilities:
a. P(X ≤ 1)
b. P(0.5 ≤ X ≤ 1.5)
c. P(1.5 < X)
Exercise 2. A college professor never finishes his lecture before the bell rings to end
the period and always finishes his lecture within two minutes after the bell
rings. Let X = the time that elapses between the bell and the end of the lecture, and
suppose the pdf of X is
f(x) = kx^2   if 0 ≤ x ≤ 2
f(x) = 0      otherwise
a. Find the value of k.
b. What is the probability that the lecture ends within 1 minute of the bell ringing?
c. What is the probability that the lecture continues beyond the bell for between
60 and 90 seconds?
d. What is the probability that the lecture continues for at least 90 seconds beyond
the bell?
Exercise 3. The cdf of checkout duration X as described in Exercise 1 is
F(x) = 0       for x < 0
F(x) = x^2/4   for 0 ≤ x < 2
F(x) = 1       for 2 ≤ x
Use this to compute the following:
a. P(X ≤ 1)
b. P(0.5 ≤ X ≤ 1)
c. P(X > 0.5)
d. The median checkout duration $\tilde{\mu}$ [solve $0.5 = F(\tilde{\mu})$]
e. F'(x), to obtain the density function f(x)
Exercise 4. Let X denote the checkout duration with the pdf given in Exercise 1
above.
a. Compute E(X)
b. Compute V(X) and $\sigma_X$
c. If the borrower is charged an amount h(X) = X^2 when checkout duration is X,
compute the expected charge E[h(X)].
