
# Lecture 7: Putting the Model into a MATLAB Program

Numerical methods using MATLAB

April 2009


## Discretizing DP: quick tour

Our first task is to discretize the DP given in the Bellman equation

V(x) = max_u [ r(x, u) + β V(x′) ]

x′ = g(x, u)

where β is the discount factor and x′ denotes the next-period state.

To represent the system, we discretize the state x and the control variable u. If both are scalars, the state space is represented by a column vector of size n, i.e.,

{x^min = x_1, x_2, ..., x_{n−1}, x^max = x_n}′

We also use a row vector of size m to represent the choices available for the control variable:

{u^min = u_1, u_2, ..., u_{m−1}, u^max = u_m}
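Concretely, such grids take only a couple of lines to build. This is an illustrative NumPy sketch, not the lecture's CompEcon code; the bounds and grid sizes are arbitrary placeholders:

```python
import numpy as np

# Hypothetical bounds and grid sizes (placeholders, not from the lecture)
x_min, x_max, n = 0.0, 10.0, 101   # state grid: column vector of size n
u_min, u_max, m = 0.0, 5.0, 51     # control grid: row vector of size m

x_grid = np.linspace(x_min, x_max, n)   # x_1 = x_min, ..., x_n = x_max
u_grid = np.linspace(u_min, u_max, m)   # u_1 = u_min, ..., u_m = u_max
```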


## Discretizing DP: quick tour 2

Given these approximating discrete state and control variables, the position of the system is represented simply by a pair of the current state and the chosen control, (x_i, u_j). Inserting these into the state transition function gives

x′ = g(x_i, u_j)

This x′ is approximated by choosing the closest element of {x^min = x_1, x_2, ..., x_{n−1}, x^max = x_n}′. Denote this approximation operator by approx. We then have

x′ = approx[g(x_i, u_j)] = x_k
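The approx operator is just a nearest-grid-point lookup. A minimal Python sketch (the grid here is a made-up example):

```python
import numpy as np

def approx(grid, x):
    """The approx[.] operator: index and value of the grid point nearest to x."""
    k = int(np.argmin(np.abs(grid - x)))
    return k, grid[k]

x_grid = np.linspace(0.0, 1.0, 11)      # hypothetical state grid
k, xk = approx(x_grid, 0.4234)          # snaps to the nearest grid point, 0.4
```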


## Discretizing DP: quick tour 3

The above operation can be used to rewrite the DP as

V(x_j) = max_u [ r(x_j, u) + Σ_{j′} P_{jj′}(u) V(x_{j′}) ]

wherein, for each control u_i, P_{jj′}(u_i) is an n × n array such that

p^i_{jj′} = 1 if approx[g(x_j, u_i)] = x_{j′}, and = 0 otherwise.
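This indicator array can be filled in directly. The sketch below uses a made-up transition rule g and tiny placeholder grids; only the construction of the indicator array itself is the point:

```python
import numpy as np

x_grid = np.linspace(0.0, 1.0, 5)   # n = 5 state grid points (placeholder)
u_grid = np.array([0.0, 0.1])       # m = 2 control values (placeholder)
n, m = len(x_grid), len(u_grid)

def g(x, u):                        # hypothetical transition x' = g(x, u)
    return 0.9 * x + u

# P[i, j, k] = 1 if approx[g(x_j, u_i)] = x_k, and 0 otherwise
P = np.zeros((m, n, n))
for i, u in enumerate(u_grid):
    for j, x in enumerate(x_grid):
        k = int(np.argmin(np.abs(x_grid - g(x, u))))
        P[i, j, k] = 1.0
```

Each (i, j) slice is a degenerate distribution: exactly one entry equals 1.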


## Infinite Horizon DP

Solving the DP is then reduced to finding the P (i.e., the policy) that maximizes V. To obtain numerical solutions for infinite-horizon DP, we have two basic programming strategies. Both exploit the recursive nature of the DP.

In one strategy, we exploit the recursive structure to improve the value function itself, whereas in the second strategy, we try to improve the policy function.

If the standard set of conditions (a concave reward, a convex feasibility set, and a strictly positive discount rate) needed to ensure the concavity of the value function is satisfied, the first approach is guaranteed to converge, because the problem is essentially to find the maximum of a concave function. [See the bottom of section 2.]

The catch, though, is that this convergence is often very slow, especially when the discount rate is close to zero.
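The first strategy, value-function iteration, can be sketched in a few lines. Everything below (reward, transition rule, grids, discount factor) is a toy placeholder, not the lecture's model; the point is the repeated application of the Bellman operator:

```python
import numpy as np

n, m, beta = 20, 15, 0.95                 # grid sizes and discount factor (placeholders)
x = np.linspace(0.1, 2.0, n)              # state grid (placeholder)
u = np.linspace(0.0, 1.0, m)              # control grid (placeholder)

R = np.log(np.maximum(x[None, :] - u[:, None], 1e-8))   # toy reward r(x, u), m x n
# index of approx[g(x, u)] on the grid, for a toy transition g(x, u) = 0.9 x + 0.1 u
nxt = np.argmin(np.abs(x[None, None, :]
                       - (0.9 * x[None, :, None] + 0.1 * u[:, None, None])), axis=2)

V = np.zeros(n)
for _ in range(2000):
    Q = R + beta * V[nxt]                 # action values, m x n
    V_new = Q.max(axis=0)                 # Bellman operator: V <- max_u [r + beta V(x')]
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new
```

Because the operator is a β-contraction, the error shrinks by a factor β per sweep, which is exactly why convergence is slow when β is near 1 (discount rate near zero).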


## Infinite Horizon DP 2

In the second approach, we convert the problem into one of root finding:

v(x) − max_u [ r(x, u) + β v(g(x, u)) ] = 0

Then, instead of recursion on v, we update the policy function h(x) such that the expression above gets closer and closer to zero. Specifically, we proceed as follows:


## Infinite Horizon DP 3

1. Specify h_0(x).
2. Use h_0(x) to compute v_1: i.e., solve v = r(x, h_0(x)) + β P(g(x, h_0(x))) v, which gives v_1 = [I − β P]^{−1} r(x, h_0(x)).
3. Compute h_1(x) to maximize r(x, h_1(x)) + β P(g(x, h_1(x))) v_1.
4. Use h_1(x) to compute v_2, and so on, until convergence.

Policy iterations are generally faster, at least in the sense of the number of iterations needed to obtain convergence, but you need to compute [I − β P]^{−1}, which can be time consuming if the discretized state space is large.
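These four steps can be sketched directly. As before, the reward, transition, grids, and β are an illustrative toy setup, not the lecture's code; the evaluation step solves the linear system rather than forming the inverse explicitly:

```python
import numpy as np

n, m, beta = 20, 15, 0.95                 # grid sizes and discount factor (placeholders)
x = np.linspace(0.1, 2.0, n)
u = np.linspace(0.0, 1.0, m)
R = np.log(np.maximum(x[None, :] - u[:, None], 1e-8))   # toy reward r(x, u), m x n
nxt = np.argmin(np.abs(x[None, None, :]
                       - (0.9 * x[None, :, None] + 0.1 * u[:, None, None])), axis=2)

h = np.zeros(n, dtype=int)                # step 1: initial policy h_0(x)
for _ in range(100):
    # step 2: policy evaluation, v = [I - beta P_h]^(-1) r(x, h(x))
    P_h = np.zeros((n, n))
    P_h[np.arange(n), nxt[h, np.arange(n)]] = 1.0
    v = np.linalg.solve(np.eye(n) - beta * P_h, R[h, np.arange(n)])
    # step 3: policy improvement, pick h to maximize r(x, h(x)) + beta P v
    h_new = (R + beta * v[nxt]).argmax(axis=0)
    if np.array_equal(h_new, h):          # step 4: stop when the policy repeats
        break
    h = h_new
```

Note that the n × n linear solve in step 2 is the expensive part the slide warns about when the discretized state space is large.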


## Compecon Toolbox

The main program in CompEcon used for solving discrete-time, discrete-state DP is ddpsolve.m. It can be used for either {finite or infinite horizon}, {stochastic or deterministic} DP, using {value function or policy function} iterations.

The typical program to obtain a DP solution with this main program involves the following steps.

1. For deterministic DP, compute the transition index array (transfunc) using getindex. For stochastic DP, first compute and discretize the transition probability matrix; if the errors are serially correlated, augment the state vector by including the realization of the (discretized) error term. Obtain the transition probability (transprob) array P.


## Discretizing the state and control variables

Two key ingredients of DP are: (1) the payoff function, r(x, u), and (2) the state transition equation x_{t+1} = g(x_t, u_t). Define

P(x′ | x, u) = Pr[x_{t+1} = x′ | x_t = x, u_t = u]

We approximate {x_t} by an N-by-1 column vector,

{x^min = x_1, x_2, ..., x_{N−1}, x^max = x_N}

Henceforth, we approximate the state by x_j (j = 1, ..., N). In many cases, the bounds (x^min, x^max) are easy to determine. For example, x^min = 0 is a natural candidate for most economic state variables. x^max can often be imputed by taking the maximum possible value of x_{t+1} given x_t, i.e., by solving the fixed-point condition

x = max_u g(x, u)
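One way to impute x^max is to iterate the map x ← max_u g(x, u) until it settles at its fixed point. A sketch with a made-up transition function and control grid (both placeholders):

```python
import math

u_grid = [i / 49.0 for i in range(50)]     # placeholder control grid on [0, 1]

def g(x, u):                               # hypothetical transition function
    return 1.4 * math.sqrt(x) + 0.1 * u

x = 1.0
for _ in range(200):
    x_next = max(g(x, u) for u in u_grid)  # x <- max_u g(x, u)
    if abs(x_next - x) < 1e-12:
        break
    x = x_next
x_max = x                                  # imputed upper bound for the state grid
```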


## State-space representation 2

Given this approximation, compute

approx[g(x_j, u)]

wherein the operator approx[x] assigns the nearest grid point as the approximate value of x. Write j′ for the index of the selected grid point:

x_{j′} = approx[g(x_j, u)]

Since this procedure assigns the index j′ for a given j and control value u, it can be represented by a degenerate transition probability matrix P(x′ | x, u) of the form:


## State-space representation 3

P(u) is the matrix whose (j, j′) entry is the indicator 1{approx[g(x_j, u)] = x_{j′}}:

P(u) =
[ 1{approx[g(x_1, u)] = x_1}  ...  1{approx[g(x_1, u)] = x_N} ]
[           ...               ...             ...             ]
[ 1{approx[g(x_N, u)] = x_1}  ...  1{approx[g(x_N, u)] = x_N} ]

Each row contains a single 1, in the column selected by approx[g(x_j, u)], and 0 elsewhere.

Now, instead of deterministic DP, assume

x_{t+1} = g(x_t, u_t) + ε_t,   ε_t ~ f

Then we have

x_{t+1} − g(x_t, u_t) ~ f


## State-space representation 4

This translates into

P_{jj′}(u) = Pr[x_{t+1} = x_{j′} | x_t = x_j, u_t = u]
= Pr[approx[g(x_j, u)] + ε_t = x_{j′}]
= f[x_{j′} − approx[g(x_j, u)]]

With this transition probability matrix, the Bellman equation is now re-written as

V_t(x_j) = max_u [ r(x_j, u) + Σ_{j′} P_{jj′}(u) V_{t+1}(x_{j′}) ]


## State-space representation 5

Although not essential, the textbook MATLAB programs also discretize the policy function. To do so, let

{u^min = u_1, u_2, ..., u_{M−1}, u^max = u_M}

be the discretized candidates for u. [Notice, however, that the control variable may not have any immediate and obvious candidates for u^min or u^max. In such a case, it is often worthwhile to compute the steady state

x̃ = g(x̃, ũ)

to obtain the steady-state value ũ of u, as well as

0 = g(x^u, ũ^u)

x^u = g(x^u, 0)


## State-space representation 6

ũ^u is the maximum value of u keeping the next state variable nonnegative, and x^u is the maximum sustainable level of x.]

In CompEcon, gridmake.m can be used to generate the discretized state/control variable space.

Having discretized both the state and control variables, the DP program is reduced to choosing an element of the [M × N] action-state grid to obtain the next state, which is an element of the N-vector of states.

I.e., the Bellman equation is now represented as an operation in which the choice of a policy for a given state fully determines the probability distribution over the states in which the system will reside in the next period.

In the CompEcon toolbox, this operation is handled in the following manner. The choice (policy) is represented by an element of a 3-dimensional array (m × n × n, where m is the number of action alternatives and n is the number of grid points for the discretized state).

## State-space representation 7

s_ijk = {choose action i at state j} → {leads to state k in the next period}

s_ijk = 1 for the corresponding state/action, and = 0 otherwise

s_ijk = p_ijk (a probability, if the DP is stochastic)


## State-space representation 8

For example, recall the example of DP above:

Maximize Σ_{t=0}^{T} β^t u(c_t)

subject to k_{t+1} = A k_t^α − c_t

Then the state variable is k_t, the control variable is c_t, and the transition equation x_{t+1} = g(x_t, u_t) is k_{t+1} = A k_t^α − c_t. Let c be the discretized control variable vector

c = (c_1, c_2, ..., c_m)′

and k be the discretized state vector

k = (k_1, k_2, ..., k_n)′


## State-space representation 9

Compute

approx[A k^α − c]

for all possible configurations of c and k. CompEcon has a program called getindex.m which facilitates this operation. Specifically, what it does is assign the closest element of the discretized state vector to g(k, u):

```matlab
kn = A*k.^alpha - c;       % next-period capital k' = A*k^alpha - c
knext = getindex(kn, k);   % index of the nearest grid point
```

wherein k is the discretized k vector.


## State-space representation 10

To recap, at each moment, (action, state) can be represented by an (m × n) matrix.

The operation above gives us the approximated discretized value of the next state, which can be represented by an element of the k vector.

Hence the entire operation can be summarized by a 3-dimensional array P (m × n × n). For example, if, say, c_t = c_2, k_t = k_3, and approx[A k_3^α − c_2] = k_7, then P(2, 3, 7) = 1 and P(2, 3, n) = 0 for n ≠ 7. Compute this for all possible configurations of (c, k) to obtain P.

For stochastic DP, suppose

k_{t+1} = A k_t^α − c_t + ε_t


## State-space representation 11

Then we have

P(m, n, s) = Pr[ ε = k_s − approx[A k_m^α − c_n] ]

which, once ε is binned into the grid cells, is given by

P(m, n, s) = Pr[ k̄_{s−1} − approx[A k_m^α − c_n] < ε ≤ k̄_s − approx[A k_m^α − c_n] ]
= Φ(k̄_s − approx[A k_m^α − c_n]) − Φ(k̄_{s−1} − approx[A k_m^α − c_n])

wherein k̄_s is the upper bound of k for the s-th grid cell of k, and Φ is the CDF of the standard normal.
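Filling this transition array from the normal CDF over grid cells can be sketched as follows. The grids and the parameters A and α are placeholders, and the noise is assumed standard normal as in the text:

```python
import numpy as np
from math import erf, sqrt

def Phi(z):                                    # standard normal CDF
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

k = np.linspace(0.5, 4.0, 9)                   # capital grid (placeholder)
c = np.array([0.2, 0.5])                       # consumption grid (placeholder)
A, alpha = 1.0, 0.5                            # technology parameters (placeholders)

# kbar[s] = upper bound of the s-th grid cell (midpoints, +/- inf at the ends)
kbar = np.concatenate(([-np.inf], (k[:-1] + k[1:]) / 2.0, [np.inf]))

P = np.zeros((len(k), len(c), len(k)))
for m in range(len(k)):
    for n in range(len(c)):
        g = k[np.argmin(np.abs(k - (A * k[m]**alpha - c[n])))]  # approx[A k^alpha - c]
        for s in range(len(k)):
            # P(m, n, s) = Phi(kbar_s - g) - Phi(kbar_{s-1} - g)
            P[m, n, s] = Phi(kbar[s + 1] - g) - Phi(kbar[s] - g)
```

With the tails lumped into the end cells, each (m, n) slice is a proper probability distribution over next-period states.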


## State-space representation 12: Serially correlated noise

Suppose the noise is serially correlated. In principle, the matter can be handled simply by expanding the state space to include the sufficient statistic for predicting the future noise. For example, if the noise follows an AR(1) process:

ε_t = ρ ε_{t−1} + η_t,  η_t ~ N(0, 1)

Pr(ε_t = ε′ | ε_{t−1} = ε) = Pr(η_t = ε′ − ρε) = n(ε′ − ρε)

Discretize this AR process by a transition probability matrix PE. Use this matrix to construct a new matrix.
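One common way to build PE is Tauchen-style discretization; here is a sketch for the AR(1) above (the function name and the grid width of three unconditional standard deviations are my choices, not from the slides):

```python
import numpy as np
from math import erf, sqrt

def tauchen_ar1(rho, n_grid, width=3.0):
    """Discretize eps_t = rho*eps_{t-1} + eta_t, eta_t ~ N(0, 1), on an
    equi-spaced grid spanning +/- width unconditional std deviations.
    Returns the grid and the transition matrix PE (rows sum to one)."""
    Phi = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))
    sigma_eps = 1.0 / sqrt(1.0 - rho**2)         # unconditional std of eps
    grid = np.linspace(-width * sigma_eps, width * sigma_eps, n_grid)
    step = grid[1] - grid[0]
    PE = np.empty((n_grid, n_grid))
    for i in range(n_grid):
        d = grid - rho * grid[i]                 # distance from conditional mean
        PE[i, 0] = Phi(d[0] + step / 2)          # open interval at the bottom...
        PE[i, -1] = 1.0 - Phi(d[-1] - step / 2)  # ...and at the top
        for j in range(1, n_grid - 1):
            PE[i, j] = Phi(d[j] + step / 2) - Phi(d[j] - step / 2)
    return grid, PE
```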


## Now in this case, the state vector is 2-dimensional, so the transition matrix must be modified accordingly.

In the CompEcon toolbox there is a program called gridmake.m that constructs k-dimensional discretized state vectors.

The length of the vectors constructed by this program will be the size of the state matrix, n. Hence the P array can quickly become huge.

This is easier said than done. Suppose you discretize x by 102 discrete points and discretize the AR(1) by 52 approximate realizations of ε_t.

The new matrix now has dimension 5304 by 5304! Most of the elements in this gigantic matrix are zero, so we can use a SPARSE matrix representation to save workspace, but still....
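A sketch of what gridmake.m does, plus the Kronecker-product trick for the joint transition when the two state components move independently (Python/NumPy here; in Matlab you would call gridmake and kron directly, and scipy.sparse.kron is the sparse variant that keeps the 5304-by-5304 array manageable). The small Px and PE below are toy matrices for illustration:

```python
import numpy as np

def gridmake(*vecs):
    """Cartesian product of 1-D grids; as in CompEcon's gridmake.m,
    the first vector varies fastest down the rows."""
    mesh = np.meshgrid(*vecs, indexing='ij')
    return np.column_stack([m.ravel(order='F') for m in mesh])

# Joint transition over (x, eps) when the components evolve independently:
Px = np.array([[0.9, 0.1], [0.2, 0.8]])   # transition of the endogenous state x
PE = np.array([[0.7, 0.3], [0.4, 0.6]])   # transition of the AR(1) shock eps
P_joint = np.kron(PE, Px)                 # 4x4; use scipy.sparse.kron for big grids
```

The kron ordering matches the gridmake ordering: x varies fastest within each eps block.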



Binomial Tree

## This is by far the most oft-used start-up model in computational finance to numerically approximate various options.

To start, we discretize and approximate the continuous-time, continuous-state Brownian motion by adding up discrete-time, discretized realizations of the standard normal random variable.

The problem is to value the standard American put option: the right to sell a security, whose current price follows a Brownian motion with drift, on or before the expiration period T at the exercise (strike) price K.

Let us discretize the time between now (0) and T into N equi-spaced discrete periods.


## We also discretize the Brownian motion as follows.

u = exp(σ√Δt)
q = Pr(p′ = pu) = 1/2 + (1/2)(r − σ²/2)√Δt / σ
Pr(p′ = p/u) = 1 − q
β = exp(−r Δt)
Δt = T/N
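These formulas translate directly into a small helper (hypothetical name; a Python sketch, though the lecture uses Matlab):

```python
from math import exp, sqrt

def binomial_params(sigma, r, T, N):
    """Up factor u, up probability q, per-period discount beta, and time
    step dt for the binomial approximation on the slide, with dt = T/N."""
    dt = T / N
    u = exp(sigma * sqrt(dt))
    q = 0.5 + 0.5 * (r - 0.5 * sigma**2) * sqrt(dt) / sigma
    beta = exp(-r * dt)
    return u, q, beta, dt
```

Note that q is a valid probability only when Δt is small enough relative to the drift, which is one reason N must be large.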



## Example: Binomial Tree 3

With this approximation, the price is multiplied by u (an up move) or by 1/u (a down move) in each period. [The log of the price follows a Brownian motion with drift r.]

Hence the entire development of the price process from time zero until N can be represented by a binomial tree whose nodes range from p0 u^(−N) up to p0 u^(N).

Obviously, the option is worthless after T; therefore the optimal choice at T (assuming you have not exercised by then) is to exercise the option as long as pT is less than or equal to K. Hence we have

V_T(p) = max[K − p, 0]

This (expected) value can easily be computed from the binomial tree for a given value of p_{T−1}. Denote this value by EV_T(p_{T−1}). As of t = T − 1, the Bellman equation is

V_{T−1}(p_{T−1}) = max[K − p_{T−1}, EV_T(p_{T−1})]
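The whole backward recursion fits in a few lines. A sketch (Python; the function name is mine), using the u, q, and β defined earlier:

```python
from math import exp, sqrt

def american_put(p0, K, sigma, r, T, N):
    """Value an American put by backward induction on a recombining
    binomial tree: at time t, node j (j up-moves) has price p0*u**(2j - t)."""
    dt = T / N
    u = exp(sigma * sqrt(dt))
    q = 0.5 + 0.5 * (r - 0.5 * sigma**2) * sqrt(dt) / sigma
    beta = exp(-r * dt)
    # terminal condition V_T(p) = max(K - p, 0)
    V = [max(K - p0 * u**(2 * j - N), 0.0) for j in range(N + 1)]
    for t in range(N - 1, -1, -1):
        V = [max(K - p0 * u**(2 * j - t),                  # exercise now
                 beta * (q * V[j + 1] + (1 - q) * V[j]))   # keep: discounted EV
             for j in range(t + 1)]
    return V[0]
```

Because the "exercise now" branch is included at t = 0, the result is never below the intrinsic value max(K − p0, 0).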


## Example: Binomial Tree 4

Again we can use the binomial tree to compute the expected value EV_{T−1}(p_{T−2}), and so on. The solution is complete after N such backward iterations.

In this example, actions and states are trivial:

action (control) = {keep the option = 1, exercise the option = 2}
state = {p_t}
payoff = K − p_t if exercised, 0 otherwise


## The P array for this example is

P(2, n, n) = 0

because the game ends once the option is used, and

P(1, i, min(i + 1, n)) = q
P(1, i, max(i − 1, 1)) = 1 − q
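Constructing this array is a one-liner per edge. A Python sketch (0-based indices, whereas the slide is 1-based; the function name is mine):

```python
import numpy as np

def option_P(n, q):
    """P[x, i, j]: probability of moving from price node i to node j under
    action x. Action 0 (keep) follows the tree with reflecting endpoints;
    action 1 (exercise) is all zeros because the game ends once used."""
    P = np.zeros((2, n, n))
    for i in range(n):
        P[0, i, min(i + 1, n - 1)] += q
        P[0, i, max(i - 1, 0)] += 1.0 - q
    return P
```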



## Example 2: Rust's model

These two examples (asset management models) are taken from Rust (1987), in which he estimates the model for an engine maintenance and re-building schedule.

The rough idea runs as follows. The mechanic of a bus company (his name appears in the title of the paper) has a fleet of buses and is responsible for their maintenance. Periodically, bus engines break down, and the probability of a breakdown is a monotonically increasing function of the time elapsed since the last overhaul and of the engine's age.

The action (policy) choice is therefore {do nothing, service (overhaul), replace}.



## Example 2: Rust's model 2

The model sets the maximum number of years that each engine can be used, n. Older engines tend to run into problems more often, and incidences may result in delay or stoppage of bus services.

The periodic services (overhauls), which cost k, can reduce such incidences.

Replacing the engine costs c. Denote the gross revenue from the engine (bus) by p(a, s). Denote by f(a, s, x) the reward from the engine with age a, total number of services done on the engine s, and current action x.


## Hence the state transition is given by

g(a, s, x) = (a + 1, s), if x = no action
= (a + 1, s + 1), if x = service
= (1, 0), if x = replace
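As code, the transition is just a case split (a Python sketch; in the lecture's Matlab setting this would be a switch statement):

```python
def g(a, s, x):
    """Next state (age, services) given current state and action x."""
    if x == 'no action':
        return (a + 1, s)
    if x == 'service':
        return (a + 1, s + 1)
    if x == 'replace':
        return (1, 0)       # fresh engine: age 1, no services yet
    raise ValueError(x)
```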


## f(a, s, x) = p(a, s), if x = no action

= p(a, s + 1) − k, if x = service
= p(0, 0) − c, if x = replace
p(n, s) = −∞



## Example 2: Rust's model 5

The last equation forces the replacement of engines at age n. This DP has two state variables, a and s, and is given by

V(a, s) = max[ p(a, s) + βV(a + 1, s), p(a, s + 1) − k + βV(a + 1, s + 1), p(0, 0) − c + βV(1, 1) ]

## Rust uses the maintenance and replacement records of a bus company and estimates the relevant parameters by maximum likelihood applied directly to the solution of the DP.
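A minimal value-iteration sketch for this Bellman equation (Python, 0-based indices, deterministic version without the estimation error term; p is the revenue table with p[n−1, :] = −∞ forcing replacement at the maximum age, and the discount factor β is an assumption of this sketch):

```python
import numpy as np

def solve_engine(p, k, c, beta=0.95, tol=1e-8, max_iter=2000):
    """Iterate V(a, s) = max[ p(a,s) + beta*V(a+1, s),
                              p(a, s+1) - k + beta*V(a+1, s+1),
                              p(0, 0) - c + beta*V(0, 0) ] to convergence."""
    A, S = p.shape
    V = np.zeros((A, S))
    for _ in range(max_iter):
        Vn = np.empty_like(V)
        for a in range(A):
            for s in range(S):
                a1, s1 = min(a + 1, A - 1), min(s + 1, S - 1)
                keep = p[a, s] + beta * V[a1, s]
                serv = p[a, s1] - k + beta * V[a1, s1]
                repl = p[0, 0] - c + beta * V[0, 0]
                Vn[a, s] = max(keep, serv, repl)
        if np.max(np.abs(Vn - V)) < tol:
            return Vn
        V = Vn
    return V
```

Because replacement always yields a finite value, the −∞ rows make replacement the unique optimal action at the maximum age, as intended.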



## In a nutshell, the numerical solution of a DP is a model prediction of the state and control variables. In some exceptional cases this is evident, as we obtain a closed-form solution for these endogenous variables as a function of the set of parameters of the model.

In general, however, DPs do not have closed-form solutions, but the solution still depends on the parameters. Suppose we have a record (data) of the endogenous and key exogenous variables. Then the numerical solutions can be used as predictions of the model, and as such we can use the DP for estimation.

Crucially, however, we need to make explicit the sense in which the prediction of the DP is not perfect, just as any parametric model used in econometrics does not generally give us a perfect fit.



## One way to introduce uncertainty (i.e., imperfect information for the econometrician) is to allow an error term in the reward function:

f̂(a, s, x) = f(a, s, x) + ε_t

Suppose we have a record of bus engines: overhauls and replacements for each engine, but the econometrician does not know the replacement cost nor the objective (reward) function. The econometrician postulates a parametric functional form for the objective function and uses an additional unknown constant to represent the replacement cost.



## Example 2: Rust's model 7

Denote by D_ti(a_ti) the maintenance record of bus i at age a_ti:

D_ti(a_ti) = 0 if no overhaul or replacement
= 1 if overhauled
= 2 if replaced
a_ti = a_{t−1,i} + 1 unless D_{t−1,i}(a_{t−1,i}) = 2
= 1 if D_{t−1,i}(a_{t−1,i}) = 2

## Solving the DP under a given set of parameters provides us with the prediction, which is in general a function of f̂(a, s, x) = f(a, s, x) + ε_t: i.e., it gives us the probability that D_ti(a_ti) takes 0, 1, or 2.
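The estimation step then maximizes the likelihood of the observed records under the DP-implied choice probabilities. A schematic log-likelihood (all names hypothetical; computing `prob` from the DP solution is the hard part of Rust's method, not shown here):

```python
import numpy as np

def log_likelihood(prob, D):
    """prob[t, d]: DP-implied probability that record t shows decision d
    (0 = nothing, 1 = overhaul, 2 = replace); D[t]: the observed decision.
    Returns sum_t log prob[t, D[t]], to be maximized over the parameters."""
    return float(np.sum(np.log(prob[np.arange(len(D)), D])))
```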
Rust used the maximum likelihood estimation method and obtained these key parameters. Rust, John (1987), "Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher," Econometrica 55(5): 999-1033.