
The mathematical structure of probability distributions that underlies Statistics is quite complex. Students are bewildered by terms such as pdf and cdf, mean and variance, and the relationships between the terms. Carefully chosen exercises help, and it is particularly instructive if a variety of such exercises have a common starting point which is easy to understand. "Choosing a point at random in a square" is a simple, intuitive statistical experiment. With micro-computers in the classroom it can be demonstrated easily. We show in this article just how much of the elementary theory of probability distributions can be explored within this basic framework. At the same time, difficulties experienced by students are highlighted in the hope that readers may respond with suggestions for overcoming them.

Why use such a sample space? If you require motivation for its study, why not consider the old fairground game of rolling a coin of diameter d onto a plane of squares of side l? Alternatively, consider the meeting of friends problem whereby two friends agree to meet at a particular coffee house for lunch, arriving at random within the lunch hour, and they agree to wait only 15 minutes. Both contexts pose obvious interesting questions, and in each case the unit square is the correct sample space.

THE UNIT SQUARE AS A SAMPLE SPACE

Consider the square with vertices at the points (0,0), (1,0), (0,1) and (1,1). A point chosen at random from the square has (random) co-ordinates X, Y. By "at random" we understand that the probability of the random point lying in a particular region of the square is proportional to the area of the region (and hence, since the total area equals 1, is equal to the area).

PROBABILITIES AND RANDOM VARIABLES

Our sample space immediately provides two random variables, X and Y. As a first step students must answer probability questions about X and Y by determining appropriate regions of the unit square and calculating areas. The calculation of $P(X < \tfrac12)$ (with answer $\tfrac12$) involves realising that the event $X < \tfrac12$ consists of all points in the square to the left of the line $x = \tfrac12$. Note that, as with many situations in Statistics, functional relationships are used inversely. (The random variable X is strictly a function which maps a point (x, y) onto x, but what is really important is to be able to use this function in reverse, and determine the regions of points (x, y) such that $X(x, y) < \tfrac12$.) I suspect this aspect of functions receives insufficient attention in A-level Mathematics courses, at least from a statistical point of view.
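The idea that an event is a region of the square, and its probability the area of that region, can be demonstrated on a computer. The following is a minimal sketch using only Python's standard library; the function name `estimate_probability`, the sample size and the seed are illustrative assumptions, not part of the original article.

```python
import random

def estimate_probability(event, n_points=100_000, seed=0):
    """Estimate P(event) for a point chosen at random in the unit square.

    `event` is a function of (x, y) returning True when the point lies in
    the region of interest, so the estimate approximates that region's area.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()  # random co-ordinates (X, Y)
        if event(x, y):                    # is the point inside the region?
            hits += 1
    return hits / n_points

# The event "X < 1/2" is the set of points to the left of the line x = 1/2.
p_left_half = estimate_probability(lambda x, y: x < 0.5)
print(round(p_left_half, 3))
```

Passing the event as a function of (x, y) mirrors the "function in reverse" point made above: the code never asks for the value of X alone, only whether the sampled point falls in the region that the inequality defines.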
A typical set of exercises might be:

(i) P(X < …)
(ii) P(Y > …)
(iii) P(X < … AND Y > …)

Students find the answers without too much difficulty and may be prompted to enquire whether the fact that answer (iii) = answer (i) × answer (ii) is a freak or not. (Hence to independence of random variables.)

The great thing about random variables is that they breed well! The usual arithmetical processes can be applied to random variables on the same sample space to give birth to more. Here are some examples and some suggested calculations, each of which should be accompanied by a quick sketch of the appropriate region of the sample space.

(a) $S = X + Y$: $P(S < \tfrac12) = \tfrac18$
(b) $D = X - Y$: $P(D < \tfrac12) = \tfrac78$
(c) $M = XY$: $P(M < \tfrac12) = \tfrac12 + \tfrac12 \log 2$
(d) $R = Y/X$: $P(R < 2) = \tfrac34$
(e) $L = \min(X, Y)$: $P(L < \tfrac12) = \tfrac34$
(f) $U = \max(X, Y)$: $P(U < \tfrac12) = \tfrac14$

Each of these examples seems to cause difficulties. A general problem is the reluctance to deal with inequalities. Despite answering (i), (ii) and (iii) above, students balk at determining the region such that $X + Y < \tfrac12$ and need to be persuaded to look at the boundary $X + Y = \tfrac12$ and infer the region from that. Whilst this may be successful, applying the same technique to (b) often leads to an incorrect region! (I suggest a check by inserting a point into the inequality.)

Question (c) causes problems for similar reasons and, furthermore, the calculation of the area involves splitting into separate parts, corresponding to the area of the rectangle $X \le \tfrac12$, and the area under the hyperbola $xy = \tfrac12$ from $\tfrac12$ to 1.

Question (d) should be easier than (c) but is not, because of a reluctance to manipulate inequalities to turn $Y/X < 2$ into $Y < 2X$.

The last two questions appear to present particular difficulties, probably because of lack of familiarity with the ideas of maximum and minimum as binary operations. They are, however, particularly fruitful questions to ask as they link directly with the basic axiom of addition of probabilities. The probability $P(L < \tfrac12)$ is clearly (?) equal to $P(X < \tfrac12 \text{ OR } Y < \tfrac12)$, whilst $P(U < \tfrac12) = P(X < \tfrac12 \text{ AND } Y < \tfrac12)$. Hence (f) can be answered easily using the independence of X and Y. Of course, (e) can also be answered similarly if we concentrate first on the complementary event, $P(L > \tfrac12) = P(X > \tfrac12 \text{ AND } Y > \tfrac12)$. The two possible calculations of (e) are written below, together with appropriate diagrams.

$$P(L < \tfrac12) = P(X < \tfrac12 \text{ OR } Y < \tfrac12) = P(X < \tfrac12) + P(Y < \tfrac12) - P(X < \tfrac12 \text{ AND } Y < \tfrac12) = \tfrac12 + \tfrac12 - \tfrac12 \times \tfrac12 = \tfrac34$$

$$P(L > \tfrac12) = P(X > \tfrac12 \text{ AND } Y > \tfrac12) = P(X > \tfrac12)\,P(Y > \tfrac12) = \tfrac14, \quad\text{so}\quad P(L < \tfrac12) = \tfrac34$$

DISTRIBUTIONS AND DENSITIES

When the sample space is explicit, the route to information concerning random variables is as follows:

(i) Calculate $F_W(s) = P(W \le s)$ for all relevant s;
(ii) Calculate $f_W(s) = \dfrac{dF_W(s)}{ds}$;
(iii) Calculate E(W) and Var(W) by integration involving $f_W(s)$.

Parts (ii) and (iii) are mechanical and do not usually present problems. However, (i) does! The use of a dummy variable (here I have used s) seems to cause conceptual problems. Students are happy to calculate $P(X \le \tfrac12)$ and similar probabilities for particular numerical values, but become mildly disconcerted with calculating $P(X \le s)$ for $0 \le s \le 1$.

A related difficulty is that they very often fail to note the phrase "for all relevant s". Take for example the distribution of S = X + Y. Here is a typical solution.

$$P(X + Y \le s) = \tfrac{s^2}{2}, \qquad 0 \le s \le 1$$

But X + Y can take values in the range 0 to 2, and moreover the nature of the calculation changes, dependent on s being less than or greater than 1. Here is the full calculation.

$$P(X + Y \le s) = \tfrac{s^2}{2}, \qquad 0 \le s \le 1$$

$$P(X + Y \le s) = 1 - \tfrac{(2 - s)^2}{2}, \qquad 1 \le s \le 2$$

A typical reaction to this is "How can you have two answers to one question?"! Moreover the disquiet continues when you suggest they should differentiate this cdf to find the pdf and hence calculate the mean and variance of S = X + Y. The integral has to be split into two integrals:

$$E(X + Y) = \int_0^1 s \cdot s \, ds + \int_1^2 s \cdot (2 - s) \, ds = \tfrac13 + \tfrac23 = 1.$$

(Of course, $E(X + Y) = E(X) + E(Y)$ has to be the answer, and $E(X) = E(Y) = \tfrac12$ is trivial!) It is worth continuing the example to show that $\mathrm{Var}(X + Y) = \tfrac16 = \mathrm{Var}(X) + \mathrm{Var}(Y)$. (The equality holds because X and Y are independent.)

Working through similar calculations for the other random variables mentioned is also instructive. For example:

$$P(L \le s) = 1 - (1 - s)^2, \qquad 0 \le s \le 1$$

$$f_L(s) = 2(1 - s), \qquad 0 \le s \le 1$$

($f_L(s)$ is the probability density function of L.)

It is worth pointing out that the graph of $f_L$ shows that small values of L are more likely than large values, not surprising since L is the minimum of two random quantities. Furthermore, $E(L) = \tfrac13$ and so $E(U)$ must equal $\tfrac23$. Why? Because $L + U = X + Y$,

$$E(L) + E(U) = E(L + U) = E(X + Y) = 1.$$

If time permits, make the student calculate Var(L) and Var(U). Then it is easy to check that

$$\mathrm{Var}(L) + \mathrm{Var}(U) \ne \mathrm{Var}(L + U) = \mathrm{Var}(X + Y) = \tfrac16.$$

Thus we have demonstrated that the law of addition of variances does not work for correlated variables L, U but does work for independent variables X, Y.

What about M = XY? Calculation of the density of M is straightforward.

$$P(M \le s) = s + \int_s^1 \frac{s}{x} \, dx = s - s \log s$$

and eventually $E(M) = \tfrac14$. This, of course, must be the answer as X and Y are independent, hence $E(XY) = E(X) \times E(Y)$.

This leaves the ratio R = Y/X. Does E(R) = 1? (Argument: E(R) "=" E(Y)/E(X) = 1.) A most emphatic No! In fact, the expected value of R is infinite, which is not difficult to prove providing the correct cdf is calculated. (See the remarks for S = X + Y.)
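The comparison of variances for the pairs (X, Y) and (L, U) can also be checked by simulation. The sketch below is illustrative only: the function names, sample size and seed are assumptions, and the exact values it is compared against ($\mathrm{Var}(L) = \mathrm{Var}(U) = \tfrac1{18}$, obtained by integrating against $f_L(s) = 2(1 - s)$ and, by symmetry, $f_U(s) = 2s$) are worked out here rather than quoted from the article.

```python
import random

def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def simulate_variances(n=200_000, seed=2):
    """Sample n random points in the unit square and return the sample
    variances of X, Y, L = min(X, Y), U = max(X, Y) and S = X + Y."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [rng.random() for _ in range(n)]
    ls = [min(x, y) for x, y in zip(xs, ys)]
    us = [max(x, y) for x, y in zip(xs, ys)]
    ss = [x + y for x, y in zip(xs, ys)]  # note L + U = X + Y = S
    return {"X": variance(xs), "Y": variance(ys),
            "L": variance(ls), "U": variance(us), "S": variance(ss)}

v = simulate_variances()
print("Var(X) + Var(Y) =", v["X"] + v["Y"], " Var(X + Y) =", v["S"])
print("Var(L) + Var(U) =", v["L"] + v["U"], " Var(L + U) =", v["S"])
```

The first printed pair should nearly agree, while the second should show a clear gap (roughly $\tfrac19$ against $\tfrac16$), reflecting the positive correlation between L and U.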

The cdf of R is in two parts:

$$P(Y/X \le s) = \tfrac{s}{2}, \qquad 0 \le s \le 1$$

$$P(Y/X \le s) = 1 - \tfrac{1}{2s}, \qquad 1 \le s < \infty$$

A FINAL EXAMPLE

For the final example, return to the meeting of friends problem. Let A and B be the two friends, and suppose that A arrives at time X (measured in hours, $0 \le X \le 1$) and B arrives at time Y. The usual problem is to calculate the probability that they actually meet. Assuming the 15 minute waiting time, this probability is simply given by

$$P(|X - Y| < \tfrac14) = 1 - (\tfrac34)^2 = \tfrac{7}{16}.$$

A more interesting question is to calculate the distribution of the waiting time of A (call this W) and hence evaluate E(W). A problem arises because W is neither a discrete nor a continuous random variable: it takes the values 0 and 1/4 with positive probability, whereas individual values between 0 and 1/4 are possible but have zero probability. So, such mixed-type random variables are not merely mathematical artefacts: they do occur in the simplest of probability spaces (the unit square) in an uncontrived manner. The calculation of $P(W \le s)$ is straightforward if care is taken! Here are the steps with diagrams.

$$P(W = 0) = P(X - \tfrac14 < Y < X) = \tfrac{7}{32}$$

$$P(W \le s) = P(X - \tfrac14 < Y < X + s) = 1 - \tfrac12 (\tfrac34)^2 - \tfrac12 (1 - s)^2, \qquad 0 \le s < \tfrac14$$

$$P(W \le \tfrac14) = 1$$

Below is a sketch of the curve $P(W \le s)$, and a simple way to calculate E(W) is to calculate the area of the region above this curve, bounded by y = 1 (see Sykes, 1981), and the answer is $\tfrac16$.

This is a quick tour de force of the sort of calculations that are possible using fairly obvious random variables generated in the unit square. The examples are worthy exercises for student and teacher alike. As I hope I have demonstrated, many aspects of the theory of distributions and expected values are illustrated, and particular emphasis is placed on the path

Probability → cdf → expected values,

which is possible because we start with a simple but rich sample space: the unit square.
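The meeting of friends problem lends itself to a quick simulation. The following sketch makes its modelling assumptions explicit: A waits zero time if B is already there and still waiting, waits Y − X if B arrives within the next quarter of an hour, and otherwise waits the full quarter of an hour (the end of the lunch hour is ignored). The function names, sample size and seed are illustrative assumptions; under this model the exact values are $\tfrac{7}{16}$ for the chance of meeting, $\tfrac{7}{32}$ for P(W = 0), and $\tfrac16$ of an hour for E(W).

```python
import random

WAIT = 0.25  # the agreed 15 minutes, measured in hours

def waiting_time_of_a(x, y, wait=WAIT):
    """Waiting time W of friend A (arriving at x) for friend B (arriving at y)."""
    if x - wait < y < x:
        return 0.0    # B arrived first and is still waiting
    if x <= y < x + wait:
        return y - x  # B turns up while A is waiting
    return wait       # B has already left, or comes too late

def simulate(n=200_000, seed=1):
    """Return estimates of P(meet), P(W = 0) and E(W)."""
    rng = random.Random(seed)
    meet = zero = 0
    total = 0.0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        w = waiting_time_of_a(x, y)
        meet += abs(x - y) < WAIT
        zero += (w == 0.0)
        total += w
    return meet / n, zero / n, total / n

p_meet, p_zero, mean_wait = simulate()
print(p_meet, p_zero, mean_wait)
```

Because W takes the values 0 and 1/4 with positive probability, a histogram of the simulated waiting times would show two spikes with a continuous spread between them, which is the mixed-type behaviour described in the text.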

Reference

Sykes, A.M. (1981). An Alternative Approach to the Mean. Teaching Statistics, 3(3), 82-87.

Quote 9

All models are wrong, some are useful.

G.E.P. Box per M. Sandford.
