
K B Athreya

J L Doob played a critical role in the development of probability theory in the world from 1935 onwards. The goal of the present article is to explain to the readers of Resonance what probability theory is all about.

K B Athreya is a retired professor of mathematics and statistics at Iowa State University, Ames, Iowa, in the USA. He spends several months in India visiting schools, colleges and universities. He enjoys teaching mathematics and statistics at all levels. He loves Indian classical and folk music.

Probability theory provides the mathematical basis for the study of random phenomena, that is, phenomena whose outcome is not predictable ahead of time. In this article, we try to provide a more detailed answer.

Introduction

Let us start with an example each of random and non-random (also called deterministic) phenomena:

i) What will be the temperature at 4pm a week from now at the 18th Cross and Margosa Road intersection in Bengaluru?

ii) Drop a ball at a football ground and observe whether it falls down to the ground or not.

Can you say which of these two is random and which one non-random?

By and large, most physical and natural phenomena can be classified into one or other of these two categories. The readers are invited to construct their own examples of real world phenomena of both kinds (say, two each).

Keywords: Random variables, distribution function, statistical inference, error function, law of large numbers.

Over the last few centuries, mathematical methods have been developed to study many deterministic (i.e., non-random) phenomena.

GENERAL ⎜ ARTICLE

The study of the motion of physical objects over time by Newton led to his famous three laws of motion as well as many important developments in the theory of ordinary differential equations.

Similarly, the construction and study of buildings led to important results in geometry in many parts of the world, such as India, China, the Middle East and Greece. Also, advances in quantum mechanics, relativity, etc., were based on deep results from the theory of ordinary and partial differential equations.

Early Beginnings

A mathematical study of random phenomena could be said to have originated in the calculations of odds in some gambling problems in the 18th century Europe. The principal models considered were binomial distributions and their Poisson approximations, and later on normal approximations. For example, if a coin is tossed n times independently (i.e., the outcomes of any subset of these n tosses have no effect on the outcomes of the remaining tosses) and the probability of 'heads' in any one toss is p, 0 ≤ p ≤ 1, then it can be shown that the probability of getting r heads in n tosses is simply

p_{n,r} ≡ C(n, r) p^r (1 − p)^{n−r}, r = 0, 1, 2, ..., n,

where C(n, r) = n!/(r!(n−r)!) is the binomial coefficient. This collection of (n + 1) numbers {p_{n,r}, r = 0, 1, 2, ..., n} is called the binomial probability distribution B(n, p), 0 ≤ p ≤ 1, n = 0, 1, 2, .... Note that p_{n,r} is non-negative and Σ_{r=0}^{n} p_{n,r} = (p + (1 − p))^n by the binomial theorem and hence is equal to 1. Later on, it was shown by Poisson that this quantity p_{n,r} ≡ C(n, r) p^r (1 − p)^{n−r} could be approximated for each r, 0 ≤ r ≤ n, by p_r ≡ e^{−λ} λ^r / r! if n is large and p is small but np is neither large nor small but close to some λ, 0 < λ < ∞. This collection {p_r ≡ e^{−λ} λ^r / r!, r = 0, 1, 2, ...} of numbers is called the Poisson (λ) probability distribution, 0 < λ < ∞. It may be checked that Σ_{r=0}^{∞} p_r = 1.
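As a quick numerical check of the Poisson approximation described above, the following sketch compares the two collections of numbers; the values n = 1000 and p = 0.002 (so np = 2) are hypothetical choices in the "n large, p small" regime:

```python
from math import comb, exp, factorial

def binom_pmf(n, p, r):
    # p_{n,r} = C(n, r) p^r (1 - p)^(n - r)
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(lam, r):
    # p_r = e^(-lam) lam^r / r!
    return exp(-lam) * lam**r / factorial(r)

n, p = 1000, 0.002            # hypothetical: n large, p small, np = 2
lam = n * p
for r in range(5):
    print(f"r={r}: binomial={binom_pmf(n, p, r):.5f}  Poisson={poisson_pmf(lam, r):.5f}")
```

For these parameters the two columns agree to about three decimal places, which is exactly Poisson's observation.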

Let n be large and let p_n be not necessarily small but such that σ_n² := n p_n (1 − p_n) tends to infinity. Then, the binomial probabilities can be approximated by a Gaussian distribution. More precisely, the sum of the binomial probabilities C(n, r) p_n^r (1 − p_n)^{n−r} over those r with a < (r − n p_n)/σ_n < b will converge to Φ(b) − Φ(a), where Φ(y) is equal to the integral of the standard normal density over (−∞, y). This can be translated into a probability statement: as n → ∞,

Prob( a < (X_n − n p_n)/√(n p_n (1 − p_n)) < b ) → Prob( a < Y < b ) ≡ ∫_a^b (1/√(2π)) e^{−x²/2} dx,

where X_n is a random variable with the binomial (n, p_n) distribution and Y is a random variable that is normally distributed with mean EY ≡ 0 and variance V(Y) = EY² − (EY)² = 1. This is also referred to as an example of the Central Limit Theorem¹ (CLT) and could be thought of as a refinement of the weak law of large numbers, which says: for each ε > 0,

Prob( |X_n/n − p_n| > ε ) → 0

as n → ∞, provided n p_n (1 − p_n) → ∞. Later, both these results were proved for a much larger class of distributions than just the binomial (n, p_n) cases.

¹ The great mathematician George Pólya coined the term 'central', meaning fundamental. An issue of Resonance, Vol.19, No.4, 2014 is devoted to Pólya.
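The normal approximation can also be checked by simulation. The sketch below (with hypothetical choices n = 600, p = 0.3, and 2000 repetitions) draws binomial(n, p) values, standardizes them, and compares the frequency of landing in (a, b) with Φ(b) − Φ(a):

```python
import random
from math import erf, sqrt

def std_normal_cdf(y):
    # Phi(y) = (1 + erf(y / sqrt(2))) / 2
    return 0.5 * (1.0 + erf(y / sqrt(2.0)))

random.seed(0)
n, p = 600, 0.3                          # hypothetical parameters
mean, sd = n * p, sqrt(n * p * (1 - p))
a, b = -1.0, 1.0
trials = 2000

hits = 0
for _ in range(trials):
    x = sum(1 for _ in range(n) if random.random() < p)   # one binomial(n, p) draw
    if a < (x - mean) / sd < b:
        hits += 1

est = hits / trials
print(est, std_normal_cdf(b) - std_normal_cdf(a))  # both ≈ 0.68
```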

Kolmogorov’s Model

A mathematical theory as a basis for studying random

phenomena was provided by the great Russian mathe-

2

Resonance, Vol.3, No.4, 1998. matician A N Kolmogorov2 around 1930. About twenty

GENERAL ⎜ ARTICLE

years earlier, Henri Lebesgue of France extended the no- Kolmogorov saw in

tion of length of intervals in R, the real line, to a large Lebesgue's theory of

class M of sets in R, now called Lebesgue measurable measure on , an

sets. The extended function λ on M satisﬁed the condi- appropriate

tion that (R, M, λ) is a measure space, i.e., M is known mathematical model

now as a σ-algebra of subsets of R that included all inter- for studying random

vals and λ : M → [0, ∞] was such that λ is a measure. phenomena.

(See precise deﬁnition later.)

Kolmogorov saw in Lebesgue’s theory of measure on R,

an appropriate mathematical model for studying ran-

dom phenomena.

First, one identifies the set Ω of possible outcomes associated with the given random phenomenon. This set Ω is called the sample space and a typical individual element ω in Ω is called a sample point. Even though the outcome of the experiment is not predictable ahead of time, one may be able to determine the 'chances' that some particular statement about the outcome is valid. The set of ω's for which a given statement is valid is called an event. Thus, an event is a subset of the sample space Ω.

After identifying the sample space, one identifies a class F of subsets of Ω (not necessarily all of P(Ω), the power set of Ω, i.e., the collection of all possible subsets of Ω) and then a set function P on F such that for an event A in F, P(A) will represent the chance of the event A happening. Thus, to a given random phenomenon, one associates a triplet (Ω, F, P) where Ω is the set of all possible outcomes (called the sample space), F ⊂ P(Ω) is a collection of subsets (called the events collection) and P is a function on F to [0, 1] (called a probability distribution). It is reasonable to impose the following conditions on F and P.

i) A ∈ F should imply Aᶜ ∈ F (where Aᶜ is the complement of A, i.e., Aᶜ = {ω : ω ∉ A}), i.e., if A is an event then A not happening, i.e., Aᶜ, should also be an event.

ii) A1, A2 ∈ F should imply A1 ∪ A2 ∈ F, i.e., if A1 and A2 are events then at least one of the two events A1 and A2 happening should also be an event.

iii) For all A in F, P(A) should be in [0, 1] with P(Ω) = 1 and P(∅) = 0, where ∅ is the empty set.

iv) A1, A2 ∈ F, A1 ∩ A2 = ∅ should imply P(A1 ∪ A2) = P(A1) + P(A2), i.e., if A1 and A2 are mutually exclusive events then the probability of at least one of them happening should simply be the sum of the probabilities of A1 and A2.

Thus, F is closed under complementation and finite unions, and P is a finitely additive set function on F, i.e.,

P( ∪_{i=1}^{k} A_i ) = Σ_{i=1}^{k} P(A_i)

for any A_1, A_2, ..., A_k in F with A_i ∩ A_j = ∅ for i ≠ j.

Next, it is reasonable to require that F be closed under monotone increasing unions and P be monotone continuous from below. That is, if {A_n}_{n≥1} is a sequence of events in F such that A_n ⊂ A_{n+1} for each n ≥ 1, then the 'event' A ≡ ∪_{n=1}^{∞} A_n of at least one of the A_n's happening should be in F and P(A) should equal lim P(A_n).

This requirement is imposed by the practical idea that if A is a complicated subset of Ω but can be approximated by a non-decreasing sequence {A_n}_{n≥1} of events such that the above holds, then A should be an event and P(A_n) should be close to P(A) for large n. Thus, in addition to conditions (i)–(iv) on F and P, it is natural to require the following:

v) A_n ∈ F, A_n ⊂ A_{n+1} for all n ≥ 1 should imply A ≡ ∪_{n=1}^{∞} A_n ∈ F and P(A_n) ↑ P(A) as n → ∞.

That is, F should be closed under monotone increasing unions and P should be monotone continuous from below (mcfb). This last condition (v) looks very natural, but along with (i)–(iv) it forces (Ω, F, P) to be a measure space, i.e., the following holds:

vi) F is a σ-algebra (i.e., closed under complementation and countable unions) and P : F → [0, 1] is a measure, i.e., P is countably additive:

P( ∪_{n=1}^{∞} B_n ) = Σ_{n=1}^{∞} P(B_n)

for any {B_n}_{n≥1} ⊂ F such that B_n ∩ B_m = ∅ for n ≠ m.

Such a triplet (Ω, F, P) is called a probability space. Thus, Kolmogorov's model for the study of a random phenomenon E is to determine Ω, the sample space, i.e., the set of all possible outcomes of E, a collection F of events and a probability set function P mapping F to [0, 1] so that the triplet (Ω, F, P) is a measure space, i.e., condition (vi) holds with P(Ω) = 1.

Some Examples.

Example 1 (Finite Sample Space). Let Ω ≡ {ω_1, ω_2, ..., ω_k}, k < ∞, and F ≡ P(Ω), the power set of Ω, i.e., the collection of all possible subsets of Ω (show that there are exactly 2^k of them). Now every probability set function P on F is necessarily of the form

P(A) = Σ_{i=1}^{k} p_i I_A(ω_i),

where {p_i}_{i=1}^{k} are such that p_i ≥ 0 for all i, Σ_{i=1}^{k} p_i = 1, and I_A(ω) = 1 if ω is in A and 0 if ω is not in A.


Example 1 provides a model for random experiments with finitely many possible outcomes. An important example of this is in finite population sampling, used extensively by the National Sample Survey Organization of the Government of India as well as many market research groups.

Let {U_1, U_2, ..., U_N} be a finite population of N units or objects. These could be individuals in a city, districts in a state, acreage under cultivation of some crops, etc. In a typical sample survey procedure, one chooses a subset of size n (n usually small compared to N), makes measurements on the chosen subset and uses this data to make inferences about the big population. Here, each sample point is a subset of size n. Thus, the sample space Ω consists of k = C(N, n) = N!/(n!(N−n)!) sample points and the probabilities p_i of selecting the i-th sample are determined by a given sampling scheme. In the so-called simple random sampling without replacement (SRSWOR), each p_i = 1/k, i = 1, 2, ..., k, where k = C(N, n). Other examples include coin tossing (a finite number of times), rolls of dice, and card games such as Bridge.
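A small illustration of SRSWOR, with a hypothetical population of N = 5 units: the sample space is the set of all C(5, 2) = 10 subsets of size n = 2, each carrying probability 1/k:

```python
from itertools import combinations
from math import comb

population = ["U1", "U2", "U3", "U4", "U5"]   # hypothetical population, N = 5
n = 2

samples = list(combinations(population, n))   # every size-n subset is a sample point
k = comb(len(population), n)
print(len(samples), k)   # both 10
print(1 / k)             # probability of each sample under SRSWOR: 0.1
```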

Another important example with a finite sample space is from statistical mechanics in particle physics. Suppose S ≡ {s = (i_1, i_2, i_3) : i_j ∈ {0, 1, −1}, j = 1, 2, 3} is a set of sites. Note that there are 3 × 3 × 3 = 27 sites in S. Suppose at each site s in S there is a spin ω(s) that could be +1 or −1. Consider the collection Ω of all spin functions ω mapping S to {+1, −1}. Then the size of Ω is 2^27, a finite but large number. Call a typical element ω in Ω a configuration. Physicists assign probabilities to any configuration ω by using a parameter β, the temperature T, and a function V(ω), called the potential function. It is of the form

p(ω) = e^{−(β/T) V(ω)} / Z_{β,T},

where Z_{β,T} ≡ Σ_{ω′ ∈ Ω} e^{−(β/T) V(ω′)} is called the partition function.

The probability distribution {p(ω) : ω ∈ Ω} is called the Gibbs distribution. Computing {p(ω)} is a very challenging task since computing the partition function Z_{β,T} is quite difficult. Even more so is computing the mean and variance of some function g : Ω → R with respect to the Gibbs distribution, that is, computing λ_1 and λ_2 − λ_1², where λ_k = Σ_ω (g(ω))^k p(ω), k a positive integer. For this, the physicists Metropolis et al [2] invented a method in the early 1950's. Statisticians discovered this paper in the early 1990's and coined the term Markov Chain Monte Carlo (MCMC), and since then this subject, i.e., MCMC, has seen some rapid growth. (See [1] Section 9.3, [2].)
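The Metropolis method can be sketched in a few lines. The tiny 4-site system and the potential V(ω) = −Σ ω(s) below are hypothetical choices, small enough that the answer is known exactly; the chain proposes flipping one spin and accepts with probability min(1, e^{−(β/T)(V(ω′) − V(ω))}):

```python
import random
from math import exp

random.seed(1)
SITES = 4                    # hypothetical tiny system (not the 27-site one above)
beta_over_T = 0.5

def V(w):
    # hypothetical potential: lower energy when spins are +1
    return -sum(w)

def metropolis(steps):
    w = [random.choice((-1, 1)) for _ in range(SITES)]
    total = 0.0
    for _ in range(steps):
        i = random.randrange(SITES)
        w2 = list(w)
        w2[i] = -w2[i]                                   # propose flipping one spin
        # accept with probability min(1, exp(-(beta/T)(V(w2) - V(w))))
        if random.random() < min(1.0, exp(-beta_over_T * (V(w2) - V(w)))):
            w = w2
        total += sum(w)                                  # accumulate g(w) = sum of spins
    return total / steps                                 # MCMC estimate of E[g]

est = metropolis(200_000)
print(est)  # ≈ 4 * tanh(0.5) ≈ 1.85, the exact mean for this independent-spin potential
```

The point of the method is that each step needs only the ratio p(ω′)/p(ω), so the intractable partition function Z_{β,T} never has to be computed.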

Example 2 (Countably Infinite Sample Space). Here, Ω ≡ {ω_1, ω_2, ...} is a countably infinite set, F = P(Ω), the power set of Ω, and

P(A) = Σ_{i=1}^{∞} p_i I_A(ω_i),

where p_i ≥ 0 and Σ_{i=1}^{∞} p_i = 1.

An example of this is the experiment of recording the number of radioactive emissions during a given period [0, T] from a specified radioactive source. Here, Ω = {0, 1, 2, ...} and {p_i}_{i≥0} is typically a Poisson distribution of the form

p_i = e^{−λ} λ^i / i!, i = 0, 1, 2, ..., 0 < λ < ∞.

Example 3 (Real-Valued Random Variables). Let Ω ≡ R, F ≡ B(R), the Borel σ-algebra in R, i.e., the smallest σ-algebra containing all intervals. (See the definition of a σ-algebra given earlier.) Let F : R → [0, 1] be a cumulative distribution function (CDF), i.e.,

i) x_1 ≤ x_2 ⇒ F(x_1) ≤ F(x_2),

ii) lim_{x→−∞} F(x) = 0 and lim_{x→+∞} F(x) = 1,

iii) F is right continuous, i.e., F(x) = lim_{y↓x} F(y) for all x.

Then there exists a unique probability measure μ_F on (R, B(R)) such that μ_F((a, b]) = F(b) − F(a) for all a < b. Let X : Ω → R be the identity map, i.e., X(ω) = ω. This serves as a model for a single real-valued random variable X. We give below a number of examples of F's that are probability distribution functions on R = (−∞, ∞).

Normal N(μ, σ²): −∞ < μ < ∞, 0 < σ < ∞,

F(x) = (1/(√(2π) σ)) ∫_{−∞}^{x} e^{−(u−μ)²/(2σ²)} du, −∞ < x < ∞.

Gamma (α, p): 0 < α, p < ∞,

F(x) = 0 for x ≤ 0 and F(x) = (α^p/Γ(p)) ∫_{0}^{x} e^{−αu} u^{p−1} du for x > 0,

where Γ(p) = ∫_{0}^{∞} e^{−u} u^{p−1} du.

Beta (α, β): 0 < α, β < ∞,

F(x) = 0 for x ≤ 0, F(x) = (1/B(α, β)) ∫_{0}^{x} y^{α−1} (1 − y)^{β−1} dy for 0 ≤ x ≤ 1, and F(x) = 1 for x > 1,

where B(α, β) = ∫_{0}^{1} y^{α−1} (1 − y)^{β−1} dy.

Cauchy (γ, σ): −∞ < γ < ∞, 0 < σ < ∞,

F(x) = (1/π) ∫_{−∞}^{x} (1/σ) · 1/(((y−γ)/σ)² + 1) dy, −∞ < x < ∞.

Binomial (n, p): here F is a step function, constant on each interval k ≤ x < k + 1:

F(x) = 0 for x < 0, F(x) = Σ_{r=0}^{[x]} C(n, r) p^r (1 − p)^{n−r} for 0 ≤ x ≤ n, and F(x) = 1 for x > n.

Geometric (p):

F(x) = 0 for x < 0 and F(x) = Σ_{r=0}^{[x]} (1 − p)^r p for x ≥ 0.

Poisson (λ):

F(x) = 0 for x < 0 and F(x) = Σ_{r=0}^{[x]} e^{−λ} λ^r / r! for x ≥ 0.
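Two of the CDFs above are easy to evaluate directly. The sketch below, assuming the forms just listed, uses the error function for the normal CDF and a finite sum for the binomial CDF:

```python
from math import comb, erf, floor, sqrt

def normal_cdf(x, mu=0.0, sigma=1.0):
    # F(x) = (1/(sqrt(2 pi) sigma)) * integral of exp(-(u-mu)^2/(2 sigma^2)) over (-inf, x]
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

def binomial_cdf(x, n, p):
    # step function: sum of C(n, r) p^r (1-p)^(n-r) for r = 0, ..., [x]
    if x < 0:
        return 0.0
    r_max = min(n, floor(x))
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(r_max + 1))

print(normal_cdf(0.0))            # 0.5 by symmetry
print(binomial_cdf(2.7, 4, 0.5))  # F(2.7) = F(2) = (1 + 4 + 6)/16 = 0.6875
```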

More generally, given a function F : R → R that is non-decreasing, i.e., x_1 ≤ x_2 ⇒ F(x_1) ≤ F(x_2), there is a measure μ_F defined on the Borel σ-algebra B(R) of R such that

μ_F((a, b]) = F(b+) − F(a+),

where F(x+) ≡ lim_{y↓x} F(y).

Example 4 (Random Vectors). Let Ω ≡ R^k and F ≡ B(R^k), the Borel σ-algebra in R^k, i.e., the smallest σ-algebra containing all sets of the form

(a_1, b_1) × (a_2, b_2) × ... × (a_k, b_k).

Let μ be a probability measure on B(R^k), so that μ(R^k) = 1. Let F(x) ≡ μ((−∞, x]), where x = (x_1, x_2, ..., x_k) ∈ R^k and (−∞, x] ≡ {y = (y_1, y_2, ..., y_k) : −∞ < y_i ≤ x_i, i ≤ k}.

Then, F is called a k-variate CDF. It satisfies some well-known conditions. Conversely, given such an F, there exists a unique probability measure μ_F on (R^k, B(R^k)). For details see [1], Section 1.3. The identity map X(ω) = ω is a model for the notion of a k-dimensional random vector.

Example 5 (Random Sequences). Let Ω ≡ R^∞ ≡ {ω : ω : N → R}, N = {1, 2, 3, ...}. For each k ∈ N, let μ_k be a probability measure on (R^k, B(R^k)) as in Example 4. Suppose {μ_k}_{k≥1} satisfies μ_{k+1}(A × R) = μ_k(A) for all A ∈ B(R^k). Let F be the σ-algebra generated by the class C of finite dimensional sets of the form A × R × R × R × ..., where A ∈ B(R^k) for some 1 ≤ k < ∞. Then, by Kolmogorov's consistency theorem [1], there exists a probability measure μ on (R^∞, F) such that for all k < ∞ and A ∈ B(R^k), μ(A × R × R × R × ...) = μ_k(A). This is a model for a sequence of random variables {X_k}_{k≥1} such that for every 1 ≤ k < ∞, the probability distribution of (X_1, X_2, ..., X_k) under μ coincides with μ_k.

Example 6 (Random Functions). Let T be a nonempty set. For example, T could be a finite set, a countable set, an interval, or a subset of some Euclidean space. Let Ω ≡ {f : T → R} be the set of all real-valued functions ω on T. Suppose we want to model the choice of an element ω from Ω by a random mechanism. Kolmogorov proved a result known as the consistency theorem to make this precise. Suppose for every finite vector (t_1, t_2, ..., t_k), k < ∞, of elements from T there is a probability measure μ_{(t_1, t_2, ..., t_k)}(·) on (R^k, B(R^k)). Suppose this family of probability measures satisfies:

(i) μ_{(t_{π(1)}, ..., t_{π(k)})}(A_{π(1)} × ... × A_{π(k)}) = μ_{(t_1, ..., t_k)}(A_1 × ... × A_k) for every permutation π of (1, 2, ..., k), where A_1, A_2, ..., A_k are Borel sets in R;

(ii) μ_{(t_1, t_2, ..., t_k, t_{k+1})}(A_1 × A_2 × ... × A_k × R) = μ_{(t_1, t_2, ..., t_k)}(A_1 × A_2 × ... × A_k).

Then, there exists a σ-algebra B_T of subsets of Ω and a probability measure μ_T on B_T such that for any (t_1, t_2, ..., t_k) and A_1, A_2, ..., A_k in B(R), the Borel σ-algebra of R,

μ_T(ω(t_1) ∈ A_1, ..., ω(t_k) ∈ A_k) = μ_{(t_1, t_2, ..., t_k)}(A_1 × A_2 × ... × A_k).

See [1], Section 1.3 for a proof and further details.

An example of this when T = [0, ∞) is the standard Brownian motion. Here, for every t_1, t_2, ..., t_k ∈ T = [0, ∞), the probability distribution μ_{(t_1, t_2, ..., t_k)}(·) is that of a k-variate normal distribution with mean vector (0, 0, ..., 0) and covariance matrix σ_{ij} ≡ min(t_i, t_j) ([1], Section 10.2).
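On any finite set of times, this Brownian family can be simulated with independent Gaussian increments. The sketch below (hypothetical times 1 and 2, 20,000 paths) empirically checks the covariance σ_{12} = min(1, 2) = 1:

```python
import random
from math import sqrt

random.seed(2)

def brownian_path(times):
    """Values of a standard Brownian path at an increasing list of times (from 0)."""
    w, t_prev, path = 0.0, 0.0, []
    for t in times:
        w += random.gauss(0.0, sqrt(t - t_prev))  # increment over (t_prev, t] is N(0, t - t_prev)
        path.append(w)
        t_prev = t
    return path

n_paths = 20_000
acc = 0.0
for _ in range(n_paths):
    w1, w2 = brownian_path([1.0, 2.0])
    acc += w1 * w2          # both coordinates have mean 0, so this estimates Cov(W_1, W_2)

cov = acc / n_paths
print(cov)  # ≈ min(1, 2) = 1
```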

If T is a singleton and Ω = R, then the random element ω is called a random variable. If T is a finite set, the random element ω of Ω_T ≡ R^T, the set of all functions from T to R, is called a random vector. If T is a countable set, it is called a random sequence. If T is an interval, it is called a random function. If T is a subset of R^k, it is called a random field. Typically, Ω ≡ R^T, the collection of all real-valued functions on T, is very large. But the σ-algebra B_T in Kolmogorov's construction is not very large. This makes many interesting quantities, such as M = sup{|ω(t)| : t ∈ T}, not B_T-measurable, i.e., {ω : M(ω) ≤ a} need not be in B_T for all a in R. Since the set {ω : M(ω) ≤ m} need not be in B_T, the probability that M ≤ m, where m is a given real number, cannot be discussed. J L Doob devised a method called separability to take care of this problem [3].

Mean, Variance, Moments of a Random Variable

Let (Ω, F, P) be a probability space. Then a function X : Ω → R is called a random variable on (Ω, F, P) if sets of the form {ω : X(ω) ≤ a} are in F for each a ∈ R. These sets are hence events, and one can talk about the probability distribution of X, i.e., F_X(a) ≡ P(ω : X(ω) ≤ a). This F_X(·) is called the cumulative distribution function of X. It satisfies:

i) x_1 ≤ x_2 ⇒ F_X(x_1) ≤ F_X(x_2),

ii) F_X is right continuous, i.e., F_X(x) = lim_{y↓x} F_X(y) for all x, and

iii) F_X(−∞) ≡ lim_{x↓−∞} F_X(x) = 0 and F_X(∞) ≡ lim_{x↑∞} F_X(x) = 1.

For any Borel set A, the probability P(ω : X(ω) ∈ A) will coincide with μ_{F_X}(A), where μ_{F_X}(·) is the Stieltjes measure on (R, B(R)) induced by F_X(·).

If X is a simple random variable, i.e., it takes only finitely many distinct real values {a_1, a_2, ..., a_k}, then the expected value EX of X, or the mean value of X, is defined as

EX ≡ Σ_{i=1}^{k} a_i p_i, where p_i ≡ P(ω : X(ω) = a_i).

If X is a non-negative random variable, it can be approximated by a sequence {X_n}_{n≥1} of simple random variables such that for each sample point ω in Ω,

X_n(ω) ≥ 0, X_n(ω) ≤ X_{n+1}(ω) for all n ≥ 1, and lim_n X_n(ω) = X(ω).

One can then define the mean value of X, i.e., EX, by setting it equal to lim_n EX_n. It can be shown [1] that {EX_n}_{n≥1} is a non-decreasing sequence in n and that lim_n EX_n will be the same for all admissible sequences. It could be +∞. Here, the properties that F is a σ-algebra and P is countably additive are crucially used.

Next, for any real-valued random variable X on (Ω, F, P), let

X⁺(ω) ≡ max{X(ω), 0},  X⁻(ω) ≡ max{−X(ω), 0}.

Then it can be shown that both X⁺ and X⁻ are non-negative random variables on (Ω, F, P) and, for every ω, X(ω) = X⁺(ω) − X⁻(ω). So, it is natural to define EX, the expected value of X, as EX = EX⁺ − EX⁻, provided at least one of the two quantities EX⁺, EX⁻ is finite. Typically, one requires both EX⁺ and EX⁻ to be finite. This renders E|X| < ∞. So we see that EX is well defined for any random variable X such that E|X| < ∞.

For any random variable X, the k-th moment of X for a positive integer k is defined as EX^k, provided E|X^k| < ∞. The variance of a random variable X is defined as V(X) ≡ E(X − EX)², provided EX² < ∞. It can be seen that if EX² < ∞, then V(X) = EX² − (EX)². The reader is invited to compute the mean EX and the variance V(X) for random variables X with the probability distributions F(·) mentioned earlier in Example 3.
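As one instance of this exercise, the sketch below computes EX and V(X) numerically for the normal distribution with hypothetical parameters μ = 1.5 and σ = 2, integrating against the density with a plain midpoint rule (truncated to μ ± 10σ, where the neglected tail mass is negligible):

```python
from math import exp, pi, sqrt

mu, sigma = 1.5, 2.0   # hypothetical parameters

def density(u):
    # normal N(mu, sigma^2) density
    return exp(-(u - mu) ** 2 / (2 * sigma**2)) / (sqrt(2 * pi) * sigma)

def moment(k, steps=200_000):
    """Midpoint-rule approximation of E[X^k] for the normal density above."""
    lo, hi = mu - 10 * sigma, mu + 10 * sigma
    h = (hi - lo) / steps
    return sum((lo + (i + 0.5) * h) ** k * density(lo + (i + 0.5) * h)
               for i in range(steps)) * h

ex = moment(1)            # EX, should approach mu
vx = moment(2) - ex**2    # V(X) = EX^2 - (EX)^2, should approach sigma^2
print(ex, vx)             # ≈ 1.5 and 4.0
```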

Laws of Large Numbers and CLT.

There are two results in probability theory that make the subject very useful in applications. This area of application of probability theory to the real world is often termed the field of statistics. It involves collecting data (i.e., generating random variables) according to well-defined rules of probability theory and then making inferences about the underlying population based on the data (referred to as statistical inference). A fundamental notion needed for these two results is that of the independence of random variables. Let E be a random experiment, (Ω, F, P) be a probability space associated with E, and X_1, X_2, ..., X_k be k real-valued random variables (k < ∞) defined on (Ω, F, P). Recall that a real-valued random variable X on a probability space (Ω, F, P) is simply a function X from Ω to R such that for each a in R the set {ω : X(ω) ≤ a} is in F. This is often expressed by saying that X is a measurable function on (Ω, F) to R. Note that X being measurable depends on F and not on P. It can also be verified that X : Ω → R is a random variable on (Ω, F) if and only if {ω : X(ω) ∈ B} ∈ F for all B in B(R), the Borel σ-algebra of R, i.e., the smallest σ-algebra containing intervals of the form (α, β), α, β ∈ R. This property is expressed by saying that X is (F, B(R))-measurable [1].

A finite collection X_1, X_2, ..., X_k, k < ∞, of real-valued random variables on a space (Ω, F) is said to be independent with respect to the probability measure (distribution) P if for any a_1, a_2, ..., a_k in R,

P{ω : X_1(ω) ≤ a_1, X_2(ω) ≤ a_2, ..., X_k(ω) ≤ a_k} = P{ω : X_1(ω) ≤ a_1} ··· P{ω : X_k(ω) ≤ a_k},

i.e., the joint distribution function factors into the product of the individual distribution functions, i.e., it equals Π_{i=1}^{k} F_{X_i}(a_i), where F_{X_i}(a_i) ≡ P{ω : X_i(ω) ≤ a_i}.

A family {X_t(ω) : t ∈ T} of real-valued random variables on a probability space (Ω, F, P), where T is an arbitrary index set, is said to be independent with respect to P if for every finite set {t_1, t_2, ..., t_k} ⊂ T, k < ∞, the random variables X_{t_1}(ω), X_{t_2}(ω), ..., X_{t_k}(ω) are independent with respect to P.

An example of an infinite sequence of independent random variables is the following. Let Ω = [0, 1], F = B[0, 1], the Borel σ-algebra of [0, 1], and P = Lebesgue measure. For each ω, let ω ≡ Σ_{i=1}^{∞} δ_i(ω)/2^i be the binary expansion of ω in base 2. Then, it can be shown that for each k < ∞, the functions δ_1(ω), ..., δ_k(ω) are independent on this (Ω, F, P), with each δ_i having the distribution

P{ω : δ_i(ω) = 0} = 1/2 = P{ω : δ_i(ω) = 1},

called the Bernoulli(1/2) distribution. One just needs to verify that {ω : δ_1(ω) = s_1, ..., δ_k(ω) = s_k} for any given s_1, ..., s_k ∈ {0, 1} is simply an interval of length 1/2^k in [0, 1]. A similar result holds for the expansion to base p, where p is an integer > 1.
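This independence claim is easy to probe empirically. The sketch below draws ω uniformly from [0, 1), reads off the first two binary digits δ_1(ω), δ_2(ω), and checks that each of the four patterns (s_1, s_2) occurs with frequency close to 1/2² = 1/4:

```python
import random

random.seed(3)

def digits(omega, k):
    """First k binary digits delta_1(omega), ..., delta_k(omega)."""
    ds = []
    for _ in range(k):
        omega *= 2
        d = int(omega)        # next binary digit is the integer part after doubling
        ds.append(d)
        omega -= d
    return tuple(ds)

counts = {}
n = 100_000
for _ in range(n):
    pattern = digits(random.random(), 2)   # random.random() plays the role of omega
    counts[pattern] = counts.get(pattern, 0) + 1

for pattern in sorted(counts):
    print(pattern, counts[pattern] / n)    # each frequency ≈ 0.25
```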

The following results, known as the 'laws of large numbers', are consequences of slightly more general results due to Kolmogorov.

Theorem 1 ((Weak) Law of Large Numbers). Let X_1, X_2, X_3, ..., X_n be independent random variables on some probability space (Ω, F, P) such that

i) P{ω : X_i(ω) ≤ a} ≡ F(a), a ∈ R, is the same for all i = 1, 2, ..., n, and

ii) E|X_1| < ∞ (see definition given earlier).

Then, for every ε > 0, P{ω : |X̄_n(ω) − EX_1| > ε} → 0 as n → ∞, where X̄_n ≡ (X_1 + X_2 + ··· + X_n)/n and EX_1 is as defined earlier.

Theorem 2 ((Strong) Law of Large Numbers). Let X_1, X_2, ... be a sequence of random variables on some probability space (Ω, F, P) such that for each n < ∞, X_1, X_2, ..., X_n satisfy the hypothesis of Theorem 1. Then

P{ω : X̄_n(ω) → EX_1 as n → ∞} = 1.

These results make the subject of statistics very useful. If the mean value λ of a random variable X is not known, it can be estimated from sample data. More precisely, let X_1, X_2, ..., X_n be a sample of n independent copies of X; then, by the law of large numbers, i.e., Theorem 1, the sample mean X̄_n converges to λ as n tends to infinity. This is called the IID Monte Carlo (IIDMC) method.

An example of this is opinion polls in election surveys. Suppose there are two candidates A and B contesting for a position in a city with a large electorate. Suppose the organizers of candidate A want to estimate the support for A in that city. They choose a small sample of people from that city, find out the support that A has in that sample, and use that to estimate the support A enjoys in the whole city.
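The opinion-poll example can be sketched as an IIDMC computation. The city-wide support level 0.56 below is a hypothetical value the pollster does not know; the sample mean recovers it as n grows:

```python
import random

random.seed(4)
true_support = 0.56   # hypothetical, unknown to the pollster

def poll(n):
    """Sample mean of n independent 0/1 responses: X-bar_n, a point estimate of EX."""
    responses = [1 if random.random() < true_support else 0 for _ in range(n)]
    return sum(responses) / n

print(poll(100))         # a rough estimate from a small sample
est = poll(100_000)
print(est)               # ≈ 0.56: X-bar_n -> EX by the law of large numbers
```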

The estimate X̄_n ≡ (X_1 + X_2 + ··· + X_n)/n based on n independent observations X_1, X_2, ..., X_n of a random variable X is often referred to as a point estimate for the quantity λ ≡ EX. Another kind of estimate, called an interval estimate or a confidence interval I_n for λ ≡ EX based on the observations X_1, X_2, ..., X_n, is generated by the use of the CLT in probability theory (referred to earlier in this article). We give this below.

Central Limit Theorem: Let X_1, X_2, ..., X_n, ... be independent identically distributed real-valued random variables. Let EX_1² < ∞. Let EX_1 = μ and 0 < V(X_1) ≡ EX_1² − (EX_1)² ≡ σ² < ∞. Let X̄_n ≡ (1/n) Σ_{i=1}^{n} X_i for n = 1, 2, .... Then, for any −∞ < a < b < ∞,

i) lim_{n→∞} P( a ≤ √n (X̄_n − μ)/σ ≤ b ) = ∫_a^b (1/√(2π)) e^{−x²/2} dx,

ii) lim_{n→∞} P( a ≤ √n (X̄_n − μ)/σ_n ≤ b ) = ∫_a^b (1/√(2π)) e^{−x²/2} dx,

where

σ_n² ≡ (1/n) Σ_{i=1}^{n} (X_i − X̄_n)², n ≥ 1.

The function Φ(y) ≡ ∫_{−∞}^{y} (1/√(2π)) e^{−x²/2} dx, −∞ < y < ∞, is called the standard normal distribution function or Gaussian distribution, named after the great German mathematician Carl F Gauss³. The function φ(y) = dΦ(y)/dy = (1/√(2π)) e^{−y²/2} is called the standard normal probability density function. The graph of the curve (x, φ(x)) as x varies over (−∞, ∞) looks like a bell and is referred to as the bell curve.

³ Resonance, Vol.2, No.6, 1997.

Suppose, given 0 < α < 1, one wants to produce an interval I_n based on observations X_1, X_2, ..., X_n such that

P(μ ∈ I_n) → (1 − α) as n → ∞,

where μ = EX_1. For this, one first chooses a ∈ (0, ∞) such that

Φ(a) − Φ(−a) = ∫_{−a}^{+a} (1/√(2π)) e^{−u²/2} du = (1 − α).

Note that if

I_n ≡ ( X̄_n − a σ_n/√n , X̄_n + a σ_n/√n ),

then

P(μ ∈ I_n) = P( −a ≤ √n (X̄_n − μ)/σ_n ≤ a ) → (1 − α).

Thus, I_n is called a confidence interval of level (1 − α) for the parameter μ = EX_1. Typically, one chooses α to be 0.05 and the corresponding interval I_n is called a 95% level confidence interval.
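Putting the pieces together, the sketch below computes a 95% confidence interval (α = 0.05, for which a ≈ 1.96 since Φ(1.96) − Φ(−1.96) ≈ 0.95) from hypothetical uniform(0, 1) observations, so that μ = EX_1 = 0.5:

```python
import random
from math import sqrt

random.seed(5)
a = 1.96   # Phi(1.96) - Phi(-1.96) ≈ 0.95, i.e. alpha = 0.05

def confidence_interval(xs):
    """95% CI: (X-bar_n - a*sigma_n/sqrt(n), X-bar_n + a*sigma_n/sqrt(n))."""
    n = len(xs)
    xbar = sum(xs) / n
    sigma_n = sqrt(sum((x - xbar) ** 2 for x in xs) / n)   # sigma_n as in part ii) of the CLT
    half = a * sigma_n / sqrt(n)
    return xbar - half, xbar + half

xs = [random.random() for _ in range(10_000)]   # hypothetical data with mu = 0.5
lo, hi = confidence_interval(xs)
print(lo, hi)   # an interval that should cover mu = 0.5 for about 95% of samples
```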

It may be noted that the CLT is a refinement of the law of large numbers, which says that if E|X_1| < ∞ then X̄_n − μ → 0 as n → ∞ with probability one. The CLT says that if EX_1² < ∞, then, while (X̄_n − μ) → 0, the difference (X̄_n − μ) decays at the rate of 1/√n. One may ask what happens when EX_1² = ∞. This requires the study of what are called stable distributions [1].

Suggested Reading

[1] K B Athreya and S N Lahiri, Measure Theory and Probability Theory, Springer, New York. (See also TRIM Series Vol.36 and 41, 2006.)
[2] K B Athreya, M Delampady and T Krishnan, MCMC Methods, Resonance, April, July, October, December, 2003.
[3] J L Doob, Stochastic Processes, John Wiley, New York, 1953.

Address for Correspondence: K B Athreya, Department of Mathematics, Iowa State University, Ames, Iowa, USA. Email: kbathreya@gmail.com
