
STAA 567 Lec 6: Monte Carlo Integration

(Instructor: Nishant Panda)


Additional References
1. (IMC) : Introducing Monte Carlo Methods with R, Christian P. Robert & George Casella, Springer
2. (MMT) : Monte Carlo Methods, Robert & Casella

Introduction

Last lecture we saw how to generate random samples from probability distributions. In this lecture, we
will use that tool to approximate integrals. Most statistical properties are expressed as an expectation,
which is an integral! In most cases, it is not possible to compute such an integral exactly. In this lecture
we will explore a stochastic technique for evaluating integrals called Monte Carlo integration. As the name
suggests, it involves sampling from probability distributions. Note that there are deterministic approaches
to approximating integrals (e.g., quadrature rules) that are studied in numerical analysis. Such methods generally
cannot be extended to higher dimensions. We will not study such methods even though they are an important part
of the modern statistician's toolbox.

Expectation, meet Monte Carlo Integration!

Let $X \sim f$ be a random variable (it could be multidimensional!) and consider a real-valued function $g$ and the
corresponding random variable $g(X)$. Then the expectation of $g(X)$ is given by

$$
E[g(X)] = \int_{\mathcal{X}} g(x) f(x)\,dx,
$$

where $\mathcal{X}$ is the range of $X$ (the support of $f(x)$). Two common examples of $g$ are $g(x) = x$ and $g(x) = (x - E[X])^2$.
In these cases, $E[g(X)]$ is the expectation of $X$ and the variance of $X$, respectively.
Let $\theta = E[g(X)]$. The Monte Carlo algorithm to compute an approximation to $\theta$, denoted by $\hat{\theta}_{MC}$, is as
follows:
1. Generate $n$ i.i.d. samples from $f(x)$: $X_1, X_2, \ldots, X_n$. Then,
2. the Monte Carlo approximation is given by

$$
\hat{\theta}_{MC} = \frac{1}{n} \sum_{i=1}^{n} g(X_i).
$$
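As a minimal sketch (the helper name mc.estimate and its arguments are ours, not from the lecture), the two steps above can be wrapped in a small R function:

mc.estimate <- function(g, sampler, n){
  # step 1: draw X_1, ..., X_n from f
  x.sam <- sampler(n)
  # step 2: average the g(X_i)
  mean(sapply(x.sam, g))
}

For instance, Example 1 below corresponds to calling mc.estimate(g, runif, 1000).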

Example 1: Compute the integral of

$$
g(x) = (\cos(50x) + \sin(20x))^2,
$$

from 0 to 1.
A deterministic approximation (very good approximation!) to this integral can be computed using the
integrate function in R
g <- function(x){
(cos(50*x) + sin(20*x))^2
}
integrate(g, 0, 1)

## 0.9652009 with absolute error < 1.9e-10
Note that we are not explicitly given a density here! First realize that

$$
\int_0^1 g(x)\,dx = \int_0^1 g(x) f(x)\,dx,
$$

where $f(x) = 1$ is the uniform density on $(0, 1)$! Thus, we can view the integral above as the expectation $E[g(X)]$,
where $X \sim U(0, 1)$. The Monte Carlo approximation can then be computed as follows:
n.sam <- 1000
# step 1: generate n i.i.d samples from f (in this case uniform(0,1))
x.sam <- runif(n.sam)

# compute the MC approximation


theta.mc <- sum(sapply(x.sam, g))/n.sam

print(theta.mc)
## [1] 0.9063239
Does increasing n.sam improve our approximation?
n.sam <- 10000
# step 1: generate n i.i.d samples from f (in this case uniform(0,1))
x.sam <- runif(n.sam)

# compute the MC approximation


theta.mc <- sum(sapply(x.sam, g))/n.sam

print(theta.mc)
## [1] 0.9780032
Definitely! In fact, by the strong law of large numbers,

$$
\hat{\theta}_{MC} \to \theta \quad \text{a.s.}
$$

as $n \to \infty$.
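To see this convergence empirically, here is a small sketch (added for illustration; the seed and sample sizes are arbitrary) that recomputes the Example 1 estimate for increasing n, to be compared against the integrate value of about 0.9652:

set.seed(1)
for (n in c(1e2, 1e3, 1e4, 1e5)){
  # Monte Carlo estimate based on n uniform draws
  print(c(n = n, estimate = mean(g(runif(n)))))
}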

Example 2: Let $X \sim N(\mu, \sigma^2)$ and let $t$ be a real number. Approximate the probability $P(X < t)$
using Monte Carlo integration. For concreteness take $\mu = 5$, $\sigma = 1.25$ and $t = 7.1$.
Note that $P(X < t)$ is given by the integral

$$
P(X < t) = \int_{-\infty}^{t} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx.
$$

Again, we don't need Monte Carlo integration here, as this is just pnorm(t) in R:
pnorm(7.1, mean = 5, sd = 1.25)
## [1] 0.9535213
How do we use the Monte Carlo integration method here? We are not given a function $g$ here. First note
that we can write

$$
P(X < t) = \int_{-\infty}^{t} 1 \cdot \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx
         + \int_{t}^{\infty} 0 \cdot \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx.
$$

This gives us the following: let $g = I_{(X < t)}$ be the indicator function for the set $\{X < t\}$ and let $f(x)$ be the
normal density $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$. Then,

$$
P(X < t) = \int_{\mathbb{R}} g(x) f(x)\,dx = E[g(X)].
$$

Thus our Monte Carlo approximation is given by

$$
\hat{\theta}_{MC} = \frac{1}{n} \sum_{i=1}^{n} I_{(X_i < t)}.
$$

This is the empirical distribution function evaluated at $t$ (STAA 572)! Let us run this in R:


g.ind <- function(x, t){
if (x < t) return(1)
else return(0)
}
n.sam <- 1000
# Step 1: Sample of size n from the density f (in this case N(5, 1.25))
x.sam <- rnorm(n.sam, mean = 5, sd = 1.25)

# Step 2: get the monte carlo estimate


theta.mc <- sum(sapply(x.sam, g.ind, t = 7.1))/n.sam

print(theta.mc)
## [1] 0.964
Very close to the answer given by pnorm.
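Since comparisons in R are vectorized, the same estimate can also be written as a one-liner (an equivalent alternative, shown only for illustration):

mean(x.sam < 7.1)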

(Home Assignment!) Let $X \sim \text{Exp}(0.5)$ be an exponential random variable with rate $\lambda = 0.5$.
Approximate $P(X > 2)$ using a Monte Carlo approximation and check your answer using pexp.

Example 3: Let $A = \{(x, y) : x^2 + y^2 \le 1\}$. Compute $\text{Area}(A)$ using Monte Carlo integration.

Since $A$ is the disc centered at the origin with radius 1, we know that the answer is $\pi$! This is an integral,
given by

$$
\text{Area}(A) = \int_A dx\,dy.
$$

How do we use Monte Carlo integration? We can think of $X \sim U(-1, 1)$ and $Y \sim U(-1, 1)$ as independent
random variables with joint density $\frac{1}{4}$, i.e.

$$
f_{X,Y}(x, y) = \frac{1}{4} \quad \text{for } -1 \le x, y \le 1.
$$

Let $B$ be the square $-1 \le x, y \le 1$. Then we can write $\text{Area}(A)$ as

$$
\text{Area}(A) = 4\,P((X, Y) \in A) = 4 \int_B I_A\, f_{X,Y}\,dx\,dy = 4\,E[I_A].
$$

Thus we need to sample from $f_{X,Y}$, and our function $g$ is just the indicator of the set $A$, i.e. $g = I_A$. Let us
write the Monte Carlo method:

x.sam <- runif(10000, min = -1, max = 1)
y.sam <- runif(10000, min = -1, max = 1)
joint.sam <- cbind(x.sam, y.sam)

g.indA <- function(point){


if ((point[1]^2 + point[2]^2) <= 1) return (1.0)
else return(0)
}

theta.mc <- sum(apply(joint.sam, 1, g.indA))/nrow(joint.sam)


4*theta.mc
## [1] 3.16
Let us visualize this!
joint.data <- data.frame(joint.sam)
colnames(joint.data) <- c("x", "y")
# create a logical variable indicating which points fall inside the unit circle
joint.data$Accepted <- (joint.data$x^2 + joint.data$y^2 < 1)
library(ggplot2)
# we will create a scatter plot of the sampled (x, y) points
# and color each point by the Accepted variable,
# i.e. by whether the point falls inside the unit circle

plot_ar <- ggplot(joint.data, aes(x=x, y=y)) +


geom_point(shape=20, aes(color=Accepted)) +
stat_function(fun = function(x) sqrt(1-x^2), size=1.5) +
stat_function(fun = function(x) -1*sqrt(1-x^2), size=1.5) +
coord_fixed()

# beautify : Increase font size in axes


plot_ar + theme(axis.text.x =
element_text(face = "bold",size = 12),
axis.text.y =
element_text(face = "bold", size = 12),
axis.line = element_line(colour = "black",
size = 1, linetype = "solid"))

[Figure: scatter plot of the 10,000 sampled points in the square, colored by Accepted (TRUE = inside the unit circle, FALSE = outside), with the circle boundary drawn on top.]

Statistical properties of the Monte Carlo Estimator


Note that $\hat{\theta}_{MC}$ is an estimator for $\theta$. What is the expected value of $\hat{\theta}_{MC}$?

$$
E\left[\hat{\theta}_{MC}\right] = \frac{1}{n}\, n\, E[g(X)] = \theta.
$$

Thus, the Monte Carlo estimator is an unbiased estimator. In order to talk about how good an estimate this is for
a given $n$, we need to figure out the variance of our Monte Carlo estimator. Assuming that $g(X)^2$ has finite
expectation, let $\text{var}[g(X)] = \sigma^2$. Then the variance of $\hat{\theta}_{MC}$ is given by

$$
\text{var}\left[\hat{\theta}_{MC}\right] = \frac{1}{n^2}\, n\, \text{var}[g(X)] = \frac{\sigma^2}{n}.
$$

We can estimate $\sigma^2$ from the (Monte Carlo) sample using the sample variance, i.e.

$$
S^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( g(X_i) - \hat{\theta}_{MC} \right)^2.
$$

Thus, the standard error (SE) of our Monte Carlo estimate is approximately given by

$$
SE = \sqrt{\text{var}\left[\hat{\theta}_{MC}\right]} \approx \frac{S}{\sqrt{n}}.
$$

By the CLT, we can now build a confidence interval with an error (about the Monte Carlo estimate) that is
proportional to

$$
\frac{S}{\sqrt{n}}.
$$

Thus, the larger the Monte Carlo sample, the smaller the error around our Monte Carlo estimate. However, note
that this error is proportional to $n^{-1/2}$. Hence, in order to reduce our error by half, we need to quadruple our
Monte Carlo samples!
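As an illustration, here is a small sketch (the variable names are ours, not from the notes) that attaches a standard error and an approximate 95% CLT confidence interval to the Example 1 estimate:

n.sam <- 10000
x.sam <- runif(n.sam)
g.sam <- g(x.sam)                # g(X_i) for each sample (g from Example 1)
theta.mc <- mean(g.sam)          # Monte Carlo estimate
SE <- sd(g.sam)/sqrt(n.sam)      # standard error: S / sqrt(n)
# approximate 95% confidence interval about the estimate
c(theta.mc - 1.96*SE, theta.mc + 1.96*SE)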
Can we do better than the Monte Carlo estimate? To be precise, a better estimate should also be unbiased
(or asymptotically unbiased) but should have lower variance than the Monte Carlo estimator. First let's look at
Example 2 again and compute the probability $P(X > 9.1)$ using the Monte Carlo approximation.

Example 2 revisited with t = 9.1
g.ind2 <- function(x, t){
if (x > t) return(1)
else return(0)
}
n.sam <- 1000
# Step 1: Sample of size n from the density f (in this case N(5, 1.25))
x.sam <- rnorm(n.sam, mean = 5, sd = 1.25)

# Step 2: get the monte carlo estimate


theta.mc <- sum(sapply(x.sam, g.ind2, t = 9.1))/n.sam

print(theta.mc)
## [1] 0
1 - pnorm(9.1, mean = 5, sd = 1.25)
## [1] 0.0005190354
var.fun <- function(x, t, theta){
if (x > t) return((1 - theta)^2)
else return(theta^2)
}
sample.var <- sum(sapply(x.sam, var.fun, t = 9.1, theta = theta.mc))/(n.sam - 1)

SE = round(sqrt(sample.var/n.sam), 4)
print(SE)
## [1] 0
Oops! 1000 samples weren't able to locate this rare event (with $p \approx 0.00052$, we expect only about $np \approx 0.5$ samples above 9.1 in 1000 draws). Let us see if quadrupling the sample size helps:
n.sam <- 4000
# Step 1: Sample of size n from the density f (in this case N(5, 1.25))
x.sam <- rnorm(n.sam, mean = 5, sd = 1.25)

# Step 2: get the monte carlo estimate


theta.mc <- sum(sapply(x.sam, g.ind2, t = 9.1))/n.sam

print(theta.mc)
## [1] 0
var.fun <- function(x, t, theta){
if (x > t) return((1 - theta)^2)
else return(theta^2)
}
sample.var <- sum(sapply(x.sam, var.fun, t = 9.1, theta = theta.mc))/(n.sam - 1)

SE = round(sqrt(sample.var/n.sam), 4)
print(SE)
## [1] 0
Because this rare event is not sampled until we hit a large $n$, we don't get a stable estimate of the Monte
Carlo variance. And even if we do, we need to increase $n$ by a factor of 4 to reduce the error by a factor of 2.
One other way to reduce the error in our estimate is by reducing $\sigma$ (and hence $S$). Note that if we can reduce $\sigma$
by a factor of 2, our error is reduced by a factor of 2 for the same $n$. Monte Carlo techniques that reduce
variance are called variance reduction techniques. One of the most popular of these techniques is importance
sampling.

Importance Sampling

First some notation. Let $f(x)$ be a probability density. We make a slight change in our notation for
expectation to emphasize the corresponding density:
if $X \sim f$, then we denote the expectation of $X$ by $E_f[X]$ to emphasize that the density is $f$.
The main idea here is to take samples from a different distribution so as to reduce the variance of the estimator.
Consider the following motivating example (from MMT, page 90).

Example 4: Let $X \sim f$, where $f$ is the Cauchy distribution $\text{Cauchy}(0, 1)$. Let us compute $P(X > 2)$
in two different ways.
First way: Standard Monte Carlo (see Example 2).

$$
P(X > 2) = E_f\left[I_{X > 2}\right].
$$

Our function is $g(X) = I_{X > 2}$ and the variance of the estimator is

$$
\frac{\text{var}[g(X)]}{n}.
$$

Note that $g(X)$ is a Bernoulli random variable and its variance is given by $p(1 - p)$ with parameter
$p = P(X > 2)$. In a real problem we don't know this and would estimate it from the sample variance.
However, because this is a toy example, let us cheat and use R:
p <- 1 - pcauchy(2)
p*(1 - p)
## [1] 0.1258027
Hence, the variance of the usual Monte Carlo estimator is given by $0.126/n$. Note that the event $\{X > 2\}$ is
relatively rare, so most of our Cauchy samples carry no information about it. Let us evaluate $P(X > 2)$ in a
different way, using the fact that this distribution is symmetric:

$$
P(X > 2) = \frac{1}{2} - \int_0^2 \frac{1}{\pi(1 + x^2)}\,dx.
$$

But note that we can write this as

$$
P(X > 2) = \frac{1}{2} - \int_0^2 \frac{2}{\pi(1 + x^2)}\, h(x)\,dx,
$$

where $h(x) = 1/2$, i.e. $h$ is the $U(0, 2)$ density. Thus,

$$
P(X > 2) = \frac{1}{2} - E_h[g(X)],
$$

where $g(x) = \frac{2}{\pi(1 + x^2)}$. Now, let us use Monte Carlo with this expectation! The variance of this estimator is

$$
\frac{\text{var}\left[\frac{2}{\pi(1 + X^2)}\right]}{n}.
$$

This can be shown to be equal to $0.0285/n$.
Hence, we have reduced the variance by a factor of $0.126/0.0285 \approx 4$, and thus our error is reduced by a factor of 2.
This will be our motivation when we delve deeper into importance sampling in the next lecture.
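To check these variance figures empirically, here is a minimal simulation sketch (the seed and sample size are illustrative, not from the notes) comparing the two estimators:

set.seed(42)
n <- 10000
# Estimator 1: standard Monte Carlo with Cauchy samples
est1 <- mean(rcauchy(n) > 2)
# Estimator 2: 1/2 minus the average of g(U), with U ~ Uniform(0, 2)
u <- runif(n, min = 0, max = 2)
g.u <- 2/(pi*(1 + u^2))
est2 <- 0.5 - mean(g.u)
c(truth = 1 - pcauchy(2), standard = est1, transformed = est2)
# empirical variances of the quantities being averaged (roughly 0.126 and 0.0285)
c(var(rcauchy(n) > 2), var(g.u))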
