
IE 528 Dynamic Programming
Homework Assignment 1
Due October 4

1. (10 points) In this class we will often optimize to maximize the expected value, but sometimes this seems counter-intuitive. For example, let's say I flip a fair coin ten times. I make you the following offer: before I start flipping, you can choose option one, in which I agree to give you $2 if the first flip is a head, or option two, in which I agree to give you $1000 if all ten flips are tails. (a) What is the expected payoff of options one and two? (b) If I permit you to decide on the option after you see the first flip, and if you will choose an action that maximizes your expected payoff, what is the value of information (in this case the value of the first flip) for this example? (c) Many people would choose option two. Why? (d) How can we incorporate such a phenomenon into our DP formulation (specifically the \(g\) function) so that we take this into account?

2. (5 points) Recall our definition of a strategy: \(x_n \mapsto u_n(x_n)\), with \(u_n(x_n) \in U_n(x_n)\). This takes any state \(x_n\) and maps it to an admissible action. What does it mean if, under a specific strategy, \(u_n(x_n) = u_n\) for all \(x_n\)?

3. (10 points) Show that if \(\bar{g}_N(x) \le g_N(x)\) for all \(x\), then \(\bar{J}_n(x) \le J_n(x)\) for all \(n\) and \(x\), where these systems differ only by their termination costs, as given above.

4. (15 points) Prove that fixed lot policies are optimal (or rather, that non-fixed lot policies are suboptimal) in the context of the Wagner-Whitin algorithm.

5. (10 points) (Bertsekas, Exercise 1.9.) The DP algorithm given in class is based on the assumption of an additive cost function. Now develop a DP-like algorithm for the case where the cost function has the multiplicative form

\[ E_{w_k,\; 0 \le k \le N-1} \left\{ g_N(x_N) \cdot g_{N-1}(x_{N-1}, u_{N-1}, w_{N-1}) \cdots g_0(x_0, u_0, w_0) \right\}, \]

and \(g_k(x_k, u_k, w_k) \ge 0\) for all \(x_k\), \(u_k\), \(w_k\), and \(k\).
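The arithmetic behind Problem 1, parts (a) and (b), can be sketched in a few lines. This is only an illustration under one assumption that the assignment does not spell out: the value of information is taken to be the expected payoff when choosing after seeing the first flip, minus the best expected payoff when choosing beforehand. All variable names are made up for this sketch.

```python
from fractions import Fraction

# Data from Problem 1: a fair coin, ten flips, two options.
P_HEAD = Fraction(1, 2)
N_FLIPS = 10
PAYOFF_ONE = 2      # paid if the first flip is a head
PAYOFF_TWO = 1000   # paid if all ten flips are tails

# (a) Expected payoff of each option, chosen before any flip.
expected_one = P_HEAD * PAYOFF_ONE
expected_two = (1 - P_HEAD) ** N_FLIPS * PAYOFF_TWO
value_without_info = max(expected_one, expected_two)

# (b) Choose after seeing the first flip (assumed definition of the
# value of information: expected payoff with the flip observed minus
# the best expected payoff without it).
# After a head: option one pays for sure, option two is worth nothing.
value_if_head = max(Fraction(PAYOFF_ONE), Fraction(0))
# After a tail: option one is worth nothing, option two needs nine more tails.
value_if_tail = max(Fraction(0), (1 - P_HEAD) ** (N_FLIPS - 1) * PAYOFF_TWO)
value_with_info = P_HEAD * value_if_head + (1 - P_HEAD) * value_if_tail

value_of_information = value_with_info - value_without_info

print(expected_one, expected_two, value_of_information)
```

Exact fractions are used instead of floats so that the two expected payoffs, which are very close, can be compared without rounding noise.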

6. (15 points) Formulate the inventory example given in class: in any period \(n\) I first place an order \(u_n\), have the order arrive, have a random demand materialize (\(d_n\) from some distribution \(D\)), fill the order as much as possible (excess demands are lost), and then hold inventory for the next period. I get \(r\) revenue for every item sold, pay \(c\) for every item ordered, and \(h\) for every item held. I start with no inventory, have an \(N\)-period problem, and have salvage value \(s\) for unsold goods. (a) Define \(x_n\), \(u_n\), \(w_n\), \(f_n\), \(g_n\), and \(J\) for this example, for \(0 \le n \le N\). (b) For \(N = 1\) (you have demands in periods 0 and 1), \(D\) discrete uniform between one and four, \(r = 10\), \(c = 5\), \(h = 1\), and \(s = 2\), find the optimal policy if you start with no inventory at the start of period zero. HINT: Prove first that to find the optimal policy it is sufficient to consider \(u_n\) such that \(1 \le x_n + u_n \le 4\). Then remember to solve recursively, starting from the last period. (c) Assume for the inventory example that the demand follows the distribution \(d_{n+1} \sim \mathrm{Poisson}(\lambda_{n+1})\), where \(\lambda_{n+1} = \lambda_n - (x_n + u_n - d_n)^-\). In other words, if there are lost sales in period \(n\) (i.e., customers are left unsatisfied), then this decreases the demand intensity for the next period. Define \(x_n\), \(u_n\), \(w_n\), \(f_n\), \(g_n\), and \(J\) for this example, for \(0 \le n \le N\). (If any are the same as above you may just say so.)

7. (15 points) (Bertsekas, Exercise 2.14.) Consider the shortest path problem in class, except that the number of nodes in the graph may be countably infinite (although the number of outgoing arcs from each node is finite). We assume that the length of each arc is a positive integer. Furthermore, there is at least one path from the origin node \(s\) to the destination node \(t\). Consider the label correcting algorithm as stated and initialized in class, except that UPPER is initially set to some integer that is an upper bound to the shortest distance from \(s\) to \(t\). Show that the algorithm will terminate in a finite number of steps with UPPER equal to the shortest distance from \(s\) to \(t\). HINT: Show that there is a finite number of nodes whose shortest distance from \(s\) does not exceed the initial value of UPPER.
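For reference, the label correcting method that Problem 7 refers to can be sketched as follows. This follows the usual textbook statement of the method (an OPEN list, labels \(d_i\), and the threshold UPPER) and may differ in detail from the version given in class. The function name, the FIFO rule for picking nodes from OPEN, and the small example graph are assumptions made for illustration only.

```python
import math
from collections import deque

def label_correcting(arcs, s, t, upper=math.inf):
    """Label correcting sketch.

    arcs maps each node to a list of (child, arc_length) pairs.
    upper is the initial value of UPPER (infinity in the standard
    initialization, a finite upper bound in Problem 7).
    Returns the final UPPER and the parent pointers.
    """
    d = {s: 0}                 # labels; nodes not in d have label infinity
    parent = {}
    open_list = deque([s])     # OPEN, served first-in first-out (an assumption)
    while open_list:
        i = open_list.popleft()
        for j, a_ij in arcs.get(i, []):
            candidate = d[i] + a_ij
            # Accept only if the label improves and stays below UPPER.
            if candidate < min(d.get(j, math.inf), upper):
                d[j] = candidate
                parent[j] = i
                if j == t:
                    upper = candidate
                elif j not in open_list:
                    open_list.append(j)
    return upper, parent

# Hypothetical graph with positive integer arc lengths (not from the assignment).
example_arcs = {
    "s": [("a", 1), ("b", 4)],
    "a": [("b", 2), ("t", 6)],
    "b": [("t", 1)],
}
final_upper, parents = label_correcting(example_arcs, "s", "t", upper=10)
print(final_upper, parents)
```

Because a node enters OPEN only when its new label is strictly below UPPER, nodes farther from \(s\) than the initial UPPER are never explored, which is the observation the hint in Problem 7 points toward.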

8. (20 points) Consider the following graph with the arc lengths shown next to the arcs:

(a) Find a shortest path from each node to node 6 by using the DP algorithm. (b) Find a shortest path from node 1 to node 6 by using the label correcting method.
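The graph for Problem 8 is not reproduced in this text, so the sketch below runs the DP recursion \(J_k(i) = \min_j \left[ a_{ij} + J_{k+1}(j) \right]\), with \(J_{N-1}(i) = a_{it}\), on a made-up six-node graph. The arc lengths, the function names, and the choice of directed arcs are all assumptions and should be replaced with the data from the actual figure.

```python
import math

def dp_shortest_paths(nodes, length, t):
    """Shortest distance from every node to t by the DP recursion.

    length maps directed arcs (i, j) to their lengths.
    Returns (J, succ): J[i] is the distance to t, succ[i] the next node.
    """
    n = len(nodes)
    # Terminal stage: J_{N-1}(i) = a_{it}, with zero cost at t itself.
    J = {i: 0 if i == t else length.get((i, t), math.inf) for i in nodes}
    succ = {i: t if (i, t) in length else None for i in nodes}
    for _ in range(n - 2):
        J_next = dict(J)  # staying put is free, so values never increase
        for (i, j), a_ij in length.items():
            if i != t and a_ij + J[j] < J_next[i]:
                J_next[i] = a_ij + J[j]
                succ[i] = j
        J = J_next
    return J, succ

def path_from(start, succ, t):
    """Follow successor pointers from start until t is reached."""
    path = [start]
    while path[-1] != t:
        path.append(succ[path[-1]])
    return path

# Hypothetical arc lengths (NOT the graph from the assignment).
example_nodes = [1, 2, 3, 4, 5, 6]
example_length = {
    (1, 2): 2, (1, 3): 5, (2, 3): 1, (2, 4): 7, (3, 4): 2,
    (3, 5): 6, (4, 5): 1, (4, 6): 5, (5, 6): 2,
}
J, succ = dp_shortest_paths(example_nodes, example_length, t=6)
print(J, path_from(1, succ, 6))
```

With \(N\) nodes, a shortest path uses at most \(N - 1\) arcs, which is why the loop runs \(N - 2\) times after the terminal stage.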
