
Unit 11

Probability and Calculus I


Discrete Distributions
Terminology:
Each problem will concern an experiment, which can result in any one of a number of different outcomes.
The set consisting of all possible outcomes of an experiment is called the sample space, S, for the experiment. For any set A, we use |A| to denote the number of elements in A (the size of A). Thus |S| is the number of different possible outcomes of the experiment.
Any collection of outcomes (i.e., any subset of the sample space) is called an event.
Associated with each event is a number between 0 and 1 called its probability, which measures the likelihood of the event occurring when the experiment is performed.
Notice: All of the outcomes in a sample space must be defined so as to be mutually exclusive, i.e., only one can occur at a time. As well, the sample space must include all of the possible outcomes. So any time that an experiment is performed, one and only one element of S is observed (i.e., occurs).
Theorem 11.1. Let S be the sample space for some experiment and E be any event defined on that sample space. If all outcomes of the experiment are equally likely to occur, then the probability associated with E, denoted prob(E), is given by
prob(E) = |E| / |S|
i.e., prob(E) = (number of outcomes in E) / (number of outcomes in S).
Example 1. For the experiment "roll a fair die", find the probabilities of: A, the event that an even number is rolled; B, the event that an odd number is rolled; and C, the event that either a 1 or a 2 is rolled.
Solution: If an experiment consists of rolling a fair die, then the possible
outcomes are the numbers which may be rolled. We see that
S = {1, 2, 3, 4, 5, 6} with |S| = 6
and we have
A: roll an even number A = {2, 4, 6} so |A| = 3
B: roll an odd number B = {1, 3, 5} so |B| = 3
C: roll a 1 or a 2 C = {1, 2} so |C| = 2
Since each number is equally likely to come up when we roll a fair die, we
have:
prob(A) = |A| / |S| = 3/6 = 1/2
prob(B) = |B| / |S| = 3/6 = 1/2
prob(C) = |C| / |S| = 2/6 = 1/3
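The counting in Theorem 11.1 is easy to check by brute force. Below is a minimal Python sketch (not part of the original notes) that enumerates the sample space for the die and computes |E| / |S| for each of the three events in Example 1.

    from fractions import Fraction

    # Sample space for one roll of a fair die
    S = {1, 2, 3, 4, 5, 6}

    # Events from Example 1
    A = {s for s in S if s % 2 == 0}   # an even number is rolled
    B = {s for s in S if s % 2 == 1}   # an odd number is rolled
    C = {1, 2}                         # a 1 or a 2 is rolled

    def prob(E, S):
        # Probability of event E when all outcomes in S are equally likely
        return Fraction(len(E), len(S))

    print(prob(A, S), prob(B, S), prob(C, S))   # 1/2 1/2 1/3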
When the various outcomes of an experiment are not equally likely to occur, then we find the probability of an event by adding up the probabilities of the various outcomes contained in the event. Notice: This is effectively what we have done for equally likely outcomes, too.
Example 2. An urn contains 1 yellow ball, 2 red balls, 3 blue balls and 4
green balls. One ball is drawn from the urn at random. Find the probability
that either a yellow or a blue ball is drawn.
Solution: We can define the outcomes of this experiment as the colour of the ball drawn. Letting Y, R, B and G denote yellow, red, blue and green, respectively, we have S = {Y, R, B, G}.
The urn contains 10 balls, and each of these balls is equally likely to be drawn. Since we have 1 yellow, 2 red, 3 blue and 4 green balls, we have 1 chance in 10 of drawing a yellow ball, 2 chances in 10 of drawing a red ball, 3 chances in 10 of drawing a blue ball and 4 chances in 10 of drawing a green ball. Thus we see that prob(Y) = 1/10, prob(R) = 2/10, prob(B) = 3/10 and prob(G) = 4/10.
Letting E be the event that either a yellow ball or a blue ball is drawn, we see that E = {Y, B}, so we have:
prob(E) = prob(Y) + prob(B) = 1/10 + 3/10 = 4/10 = 0.4
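The same sum-over-outcomes calculation can be written in a few lines of Python. This sketch (not part of the original notes) assigns each colour its probability from Example 2 and sums over the outcomes in the event.

    from fractions import Fraction

    # Probability of each outcome (colour of the ball drawn), as in Example 2
    outcome_prob = {
        "Y": Fraction(1, 10),
        "R": Fraction(2, 10),
        "B": Fraction(3, 10),
        "G": Fraction(4, 10),
    }

    # Event E: a yellow or a blue ball is drawn
    E = {"Y", "B"}

    prob_E = sum(outcome_prob[o] for o in E)
    print(prob_E)   # 2/5, i.e. 0.4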
There are various properties of probabilities which must hold:
Remember that, by definition, a probability is always a number between 0 and 1 (inclusive).
When any experiment is performed, some outcome will be observed. The set S must contain all possible outcomes, and some outcome must occur, so the event S, i.e., the event that some outcome in the sample space will be observed, must happen, i.e., is a certainty, so its probability is 1. Similarly, the empty event, i.e., the event ∅ (the empty set), cannot occur. Its probability is 0, because there is no likelihood that it will occur.
Recall that for any sets A and B, A ∪ B denotes the union of A and B, i.e., the set of all elements which are either in set A or in set B (or in both), while A ∩ B denotes the intersection of these sets, i.e., the set of all elements which appear in both sets. Consider some events A and B defined on the sample space S of some experiment. Suppose that events A and B have no outcomes in common, so that A ∩ B = ∅. Now consider the event A ∪ B. We can find the probability of A ∪ B by summing the probabilities of all outcomes which are contained in this event. But since A ∩ B = ∅, each of the outcomes in A ∪ B is either in A or in B, but not in both. Therefore, before summing the probabilities of these outcomes, we could group them into those which are in A and those which are in B. If we now sum the probabilities of all outcomes in each of the groups, what we get is prob(A) and prob(B). That is, as long as A ∩ B = ∅, we have prob(A ∪ B) = prob(A) + prob(B).
Notice: If there is one or more outcome which is contained in A and also in B, so that A ∩ B ≠ ∅, then this relationship is not true, because the probability of such an outcome is counted twice in prob(A) + prob(B) (once in each part), but is counted only once in prob(A ∪ B).
The complement of an event is defined to be the set of all possible outcomes which do not appear in that event. We denote the complement of event A by Aᶜ. Clearly, A and Aᶜ cannot both occur on the same performance of the experiment, since no outcome appears in both sets (i.e., A ∩ Aᶜ = ∅, so prob(A ∩ Aᶜ) = 0). Therefore, prob(A ∪ Aᶜ) = prob(A) + prob(Aᶜ). But since Aᶜ contains all the elements of S which are not in A, then A ∪ Aᶜ = S, and since prob(S) = 1, we have prob(A) + prob(Aᶜ) = 1, which gives prob(Aᶜ) = 1 − prob(A).
We summarize these properties in the following theorem:
Theorem 11.2. Let S be a sample space of an experiment and let A and B be any events defined on S, where ∅ denotes the empty set and Aᶜ denotes the event complementary to A. Then:
1. 0 ≤ prob(A) ≤ 1
2. prob(∅) = 0 and prob(S) = 1
3. if A and B cannot both occur at the same time (i.e., A ∩ B = ∅), then prob(A ∪ B) = prob(A) + prob(B)
4. prob(Aᶜ) = 1 − prob(A)
Random Variables
Definition 11.3. A Random Variable (abbreviated as R.V.) is a function defined on a sample space, usually denoted by upper case letters near the end of the alphabet, say X or Y or Z. That is, a random variable associates a particular value with each of the elements (i.e., outcomes) in a sample space.
Definition 11.4. A discrete Random Variable is a random variable that may take on any of a finite number of values. (That is, the range of the function is a finite set.)
Example 3. An experiment consists of tossing 2 fair dice. Let X be the random variable corresponding to the sum of the spots showing on the 2 dice. Find prob{X ≤ 4} and prob{X > 4}.
Solution: Each of the dice will show an integer number of spots, from 1 to 6. When we sum two such numbers, we always get an integer between 2 and 12 (inclusive). Thus the possible values of the random variable X are 2, 3, ..., 12. Since there are only 11 possible values, X is a discrete random variable.
We can denote the possible outcomes of the experiment in the form (a, b), where a is the number showing on the first die and b is the number showing on the second die. The sample space, S, contains 36 possible outcomes, as shown below. Of course, since each die is equally likely to show any of the 6 possible numbers, each of the outcomes in S is equally likely to occur.
S = { (1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
      (2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
      (3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
        ...
      (6,1) (6,2) (6,3) (6,4) (6,5) (6,6) },   |S| = 36
We are interested in the event {X ≤ 4}. For any particular outcome (a, b), the value of X corresponding to that outcome is x = a + b. We can find the probability that the value of X is at most 4 by examining the outcomes in the sample space and identifying those which give a sum no bigger than 4. We see that there are 6 such outcomes, namely (1,1), (1,2), (2,1), (1,3), (2,2) and (3,1), so we have |{X ≤ 4}| = 6. And since |S| = 36, we get
prob{X ≤ 4} = |{X ≤ 4}| / |S| = 6/36 = 1/6
The probability that the sum of the spots on the two dice adds up to more than 4 can also be determined from the sample space. We could look at all of the outcomes and identify which of them correspond to sums greater than 4. However, it is easier simply to recognize that the outcomes which have a sum greater than 4 are exactly the outcomes which do not have a sum less than or equal to 4. That is, the event {X > 4} contains precisely those outcomes which are not in the event {X ≤ 4}, so {X > 4} is the complement of {X ≤ 4}. Since the sample space contains 36 outcomes, of which 6 are in {X ≤ 4}, then there are 36 − 6 = 30 of them which are not in {X ≤ 4} and hence are in {X > 4}. So we see that
prob{X > 4} = |{X > 4}| / |S| = 30/36 = 5/6
Alternatively, since {X > 4} = {X ≤ 4}ᶜ, we can find this by
prob{X > 4} = 1 − prob{X ≤ 4} = 1 − 1/6 = 5/6
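As a check on Example 3, the 36 outcomes can be generated with a double loop and the two events counted directly. This is a sketch (not part of the original notes), using the same (a, b) notation as above.

    from fractions import Fraction
    from itertools import product

    # Sample space: ordered pairs (a, b), one number per die
    S = list(product(range(1, 7), repeat=2))            # |S| = 36

    at_most_4   = [(a, b) for (a, b) in S if a + b <= 4]
    more_than_4 = [(a, b) for (a, b) in S if a + b > 4]

    print(Fraction(len(at_most_4), len(S)))      # 1/6
    print(Fraction(len(more_than_4), len(S)))    # 5/6
    print(1 - Fraction(len(at_most_4), len(S)))  # 5/6, via the complement rule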
Definition 11.5. Suppose a discrete Random Variable, X, has possible values denoted by x_1, x_2, x_3, ..., x_n. Then associated with the R.V. X we have the Probability Function, f_i, defined by
f_i = prob{X = x_i}
The probability function of a discrete R.V. X can be presented either as a table of values, or as a bar graph in which there is a bar of height f_i at each possible value x_i.
For instance, for the random variable X from the previous example, f_i is described by either of the following charts. (You can confirm the values of f_i shown here by examining the Sample Space shown previously and determining how many of the 36 possible outcomes give the value x_i.)
x_i   f_i
 2    1/36
 3    2/36
 4    3/36
 5    4/36
 6    5/36
 7    6/36
 8    5/36
 9    4/36
10    3/36
11    2/36
12    1/36
[Bar graph: f_i plotted against x_i, with bars rising from 1/36 at x_i = 2 to 6/36 at x_i = 7, then falling back to 1/36 at x_i = 12.]
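The whole table of f_i values can also be generated by counting how many of the 36 outcomes give each possible sum. A brief Python sketch (not part of the original notes):

    from collections import Counter
    from itertools import product

    # Count how many of the 36 outcomes give each value of X = a + b
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

    for x in sorted(counts):
        print(x, f"{counts[x]}/36")   # 2 1/36, 3 2/36, ..., 7 6/36, ..., 12 1/36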
Example 4. Three fair coins are tossed. Let X be the number of heads which
come up. Make a table showing the probability function of X.
Solution: We can denote the possible outcomes of this experiment as lists of
3 letters, H or T, denoting whether each of the 3 tosses comes up heads (H)
or tails (T). For instance, HTH denotes that the first toss was heads, the second toss came up tails and on the third toss heads was observed again.
We get the sample space
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Since the coins are fair, heads and tails are equally likely to come up on each toss, so each of these 8 possible outcomes is equally likely to occur. Hence (since we must have prob(S) = 1) the probability of each of these outcomes is 1/8.
X, the number of heads which come up, could be 0, 1, 2 or 3. That is, the possible values of the random variable X are 0, 1, 2 and 3. For instance, the event {X = 0} can only happen when all 3 tosses come up tails, so {X = 0} = {TTT}, and we see that prob{X = 0} = 1/8. That is, for x_1 = 0, we have f_1 = 1/8.
To find the other values of the probability function, we just do this for each possible value of X. That is, for possible value x_i, we think about the event {X = x_i} and determine f_i = prob{X = x_i} by determining how many possible outcomes of the experiment correspond to that event.
We make a table showing the values of x_i, i.e., the possible values of X, and the corresponding values of f_i. (The events corresponding to the possible values of X are shown here as an (optional) extra column at the right of the table.)
x_i   f_i   Event {X = x_i}
 0    1/8   {TTT}
 1    3/8   {HTT, THT, TTH}
 2    3/8   {HTH, HHT, THH}
 3    1/8   {HHH}
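The table in Example 4, including the optional event column, can be reproduced by grouping the 8 equally likely outcomes according to their number of heads. A minimal sketch (not part of the original notes):

    from fractions import Fraction
    from itertools import product

    # The 8 equally likely outcomes HHH, HHT, ..., TTT as strings
    S = ["".join(t) for t in product("HT", repeat=3)]

    for x in range(4):                                  # possible values of X
        event = [s for s in S if s.count("H") == x]     # outcomes with x heads
        print(x, Fraction(len(event), len(S)), event)
    # 0 1/8 ['TTT'],  1 3/8 [...],  2 3/8 [...],  3 1/8 ['HHH']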
Mean and Variance
Definition 11.6. Let X be a random variable whose possible values are x_1, x_2, ..., x_n. Let f_i = prob{X = x_i} be the probability function for X. Then the mean or expected value of X, denoted by μ (or μ_X), is given by
μ = Σ_{i=1}^{n} x_i f_i
Note: μ is the Greek letter mu (pronounced "mew").
Also Note: Σ_{i=1}^{n} means "take the sum, for all values of i from i = 1 up to i = n, of whatever comes next". So Σ_{i=1}^{n} x_i f_i means x_1 f_1 + x_2 f_2 + ... + x_n f_n.
The use of the expression "expected value" interchangeably with the term "mean" arises from the following interpretation: If an experiment is repeated very many times, and then the observed values of X are averaged, we would expect that this average value of the observed X's would be approximately μ. That is, when we repeat the experiment N times (where N is large), we expect to observe value x_i about N f_i times, so if we add up all observations and divide by N (i.e., average the observations), we expect the resulting value to be
(Σ_{i=1}^{n} x_i (N f_i)) / N = Σ_{i=1}^{n} x_i f_i = μ.
(However, it won't often be exactly that value in practice.)
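This long-run interpretation is easy to see in a simulation. The sketch below (not part of the original notes, and using Python's random module) averages a large number of simulated rolls of a fair die and compares the result to the exact mean computed from the definition.

    import random
    from fractions import Fraction

    # Exact mean of one roll of a fair die, from the definition mu = sum of x_i * f_i
    mu = sum(x * Fraction(1, 6) for x in range(1, 7))     # 7/2, i.e. 3.5

    # Average of many simulated rolls: close to mu, but rarely exactly equal to it
    n_trials = 100_000
    average = sum(random.randint(1, 6) for _ in range(n_trials)) / n_trials

    print(float(mu), average)    # e.g. 3.5 3.50212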
Example 5. Find the mean of the random variable X in Example 4.
Solution: From the table we found in Example 4, we have
μ = Σ_{i=1}^{4} x_i f_i = (0)(1/8) + (1)(3/8) + (2)(3/8) + (3)(1/8) = 0 + 3/8 + 6/8 + 3/8 = 12/8 = 3/2
Example 6. Three coins are tossed. Let Y be the (absolute) difference between the number of heads and the number of tails that come up. Find the mean of Y.
Solution: We have the same sample space as in the previous example. That is, we have S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}. We find the value of Y corresponding to each outcome by finding the number of heads and the number of tails and subtracting the smaller of these numbers from the larger. That is, we have:
outcome:     HHH HHT HTH HTT THH THT TTH TTT
# Heads:      3   2   2   1   2   1   1   0
# Tails:      0   1   1   2   1   2   2   3
Value of Y:   3   1   1   1   1   1   1   3
We see that the probability function for Y is:
y_i   f_i
 1    6/8 = 3/4
 3    2/8 = 1/4
Thus the mean of Y is given by:
μ = Σ_{i=1}^{2} y_i f_i = (1)(3/4) + (3)(1/4) = 3/4 + 3/4 = 6/4 = 3/2
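Because the 8 outcomes are equally likely, the mean of Y can also be obtained as the plain average of the values of Y over the sample space. A short sketch (not part of the original notes) that reproduces the 3/2 found above:

    from fractions import Fraction
    from itertools import product

    S = ["".join(t) for t in product("HT", repeat=3)]   # the 8 equally likely outcomes

    def Y(outcome):
        # Absolute difference between the number of heads and the number of tails
        return abs(outcome.count("H") - outcome.count("T"))

    mean_Y = Fraction(sum(Y(s) for s in S), len(S))
    print(mean_Y)    # 3/2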
Knowing the mean of a random variable X tells us some information about the random variable, by telling us where its probability function is centred (i.e., its average value). However, if this is all we know, it doesn't really tell us very much about the probability distribution, because we don't know how spread out the possible values are. It is also useful to have a measure of how spread out the values are (i.e., whether they are mostly closely grouped near the mean, or are widely dispersed, at greater distances from the mean).
The variance of a probability distribution is the most commonly used
measure of how spread out the probability distribution is.
Definition 11.7. The variance of a random variable X which has mean μ is denoted by σ² (or σ²_X) and is defined as:
σ² = Σ_{i=1}^{n} (x_i − μ)² f_i = (Σ_{i=1}^{n} x_i² f_i) − μ²
Notice: This definition states 2 different formulas for σ². It can easily be shown that the 2 formulas are equivalent. The first formula is the formal definition of variance. However, the second formula is much easier to use in practice.
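One way to see that the two formulas agree (this derivation is not in the original notes, but it uses only facts from this unit, namely Σ f_i = 1 and Σ x_i f_i = μ) is to expand the square and split the sum:
Σ_{i=1}^{n} (x_i − μ)² f_i = Σ_{i=1}^{n} (x_i² − 2μ x_i + μ²) f_i
                           = Σ_{i=1}^{n} x_i² f_i − 2μ Σ_{i=1}^{n} x_i f_i + μ² Σ_{i=1}^{n} f_i
                           = Σ_{i=1}^{n} x_i² f_i − 2μ·μ + μ²·1
                           = (Σ_{i=1}^{n} x_i² f_i) − μ²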
Example 7. Three coins are tossed. You will win $2 if heads comes up more
often than tails, but you will lose $1 if tails comes up more often than heads.
Calculate your expected winnings, and the variance of your winnings.
Solution: Let X be the amount you win, in dollars. Then X has possible values 2 and −1. (Note: a loss is a negative win.) We have the same sample space as before, i.e.
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
We see that {X = 2} = {more heads than tails} = {HHH, HHT, HTH, THH}, with |{X = 2}| = 4, so that prob{X = 2} = 4/8 = 1/2.
Similarly, {X = −1} = {more tails than heads} = {HTT, THT, TTH, TTT}, with |{X = −1}| = 4, so that prob{X = −1} = 4/8 = 1/2 as well. Thus, we have the probability function
 x_i   f_i
  2    1/2
 −1    1/2
Your expected net winnings is just the expected value, i.e., mean, of X, which is (2)(1/2) + (−1)(1/2) = 1 − 1/2 = 1/2, i.e., $0.50. That is, if you were to play this game many times, on average you would expect to win about $0.50 per play.
We can calculate the variance of your winnings using the second formula from the definition. This gives:
σ² = (Σ_{i=1}^{2} x_i² f_i) − μ²
   = [(2)²(1/2) + (−1)²(1/2)] − (1/2)²
   = 4(1/2) + 1(1/2) − 1/4
   = 5/2 − 1/4
   = 9/4
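As with the mean, the variance has a long-run interpretation: over many plays, the average squared deviation of your winnings from μ = 1/2 should be close to σ² = 9/4 = 2.25. A small simulation sketch (not part of the original notes; it uses Python's random module, and the helper name play is made up for illustration):

    import random

    def play():
        # One play of the game: win $2 if more heads than tails, lose $1 otherwise
        heads = sum(random.choice([0, 1]) for _ in range(3))
        return 2 if heads >= 2 else -1

    n_trials = 100_000
    wins = [play() for _ in range(n_trials)]

    mean = sum(wins) / n_trials
    variance = sum((w - mean) ** 2 for w in wins) / n_trials

    print(mean, variance)    # close to 0.5 and 2.25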