
dynamic programming

Definition: Solve an optimization problem by caching subproblem solutions (memoization) rather than recomputing them.

optimization problem
Definition: A computational problem in which the object is to find the best of all possible solutions. More formally, find a solution in the feasible region which has the minimum (or maximum) value of the objective function. Note: An optimization problem asks, what is the best solution? A decision problem asks, is there a solution with a certain characteristic? For instance, the traveling salesman problem is an optimization problem, while the corresponding decision problem asks if there is a Hamiltonian cycle with a cost less than some fixed amount k. From Algorithms and Theory of Computation Handbook, pages 29-20 and 34-17, Copyright 1999 by CRC Press LLC. Appearing in the Dictionary of Computer Science, Engineering and Technology, Copyright 2000 CRC Press LLC.

http://mat.gsia.cmu.edu/classes/dynamic/dynamic.html

First Example
Let's begin with a simple capital budgeting problem. A corporation has $5 million to allocate to its three plants for possible expansion. Each plant has submitted a number of proposals on how it intends to spend the money. Each proposal gives the cost of the expansion (c) and the total revenue expected (r). The following table gives the proposals generated:

Table 1: Investment Possibilities

            Plant 1       Plant 2       Plant 3
Proposal    c     r       c     r       c     r
1           0     0       0     0       0     0
2           1     5       2     8       1     4
3           2     6       3     9       -     -
4           -     -       4     12      -     -

(costs c and revenues r in millions of dollars)

Each plant will only be permitted to enact one of its proposals. The goal is to maximize the firm's revenues resulting from the allocation of the $5 million. We will assume that any of the $5 million we don't spend is lost (you can work out how a more reasonable assumption will change the problem as an exercise).

A straightforward way to solve this is to try all possibilities and choose the best. In this case, there are only 3 x 4 x 2 = 24 ways of allocating the money. Many of these are infeasible (for instance, proposals 3, 4, and 1 for the three plants cost $6 million). Others are feasible, but very poor (like proposals 1, 1, and 2, which is feasible but returns only $4 million). Here are some disadvantages of total enumeration:

1. For larger problems the enumeration of all possible solutions may not be computationally feasible.
2. Infeasible combinations cannot be detected a priori, leading to inefficiency.
3. Information about previously investigated combinations is not used to eliminate inferior, or infeasible, combinations.

Note also that this problem cannot be formulated as a linear program, for the revenues returned are not linear functions of the money spent. One method of calculating the solution is as follows: Let's break the problem into three stages: each stage represents the money allocated to a single plant. So stage 1 represents the money allocated to plant 1, stage 2 the money to plant 2, and stage 3 the money to plant 3. We will artificially place an ordering on the stages, saying that we will first allocate to plant 1, then plant 2, then plant 3. Each stage is divided into states. A state encompasses the information required to go from one stage to the next. In this case the states for stages 1, 2, and 3 are:

{0,1,2,3,4,5}: the amount of money spent on plant 1, represented as x_1,
{0,1,2,3,4,5}: the amount of money spent on plants 1 and 2 (x_2), and
{5}: the amount of money spent on plants 1, 2, and 3 (x_3).

Unlike linear programming, the x_j do not represent decision variables: they are simply representations of a generic state in the stage. Associated with each state is a revenue. Note that to make a decision at stage 3, it is only necessary to know how much was spent on plants 1 and 2, not how it was spent. Also notice that we will want x_3 to be 5. Let's try to figure out the revenues associated with each state. The only easy possibility is in stage 1, the states x_1. Table 2 gives the revenue f_1(x_1) associated with each value of x_1.

Table 2: Stage 1 computations.

x_1         0   1   2   3   4   5
f_1(x_1)    0   5   6   6   6   6

We are now ready to tackle the computations for stage 2. In this case, we want to find the best solution for both plants 1 and 2. If we want to calculate the best revenue for a given x_2, we simply go through all the plant 2 proposals, allocate the given amount of funds to plant 2, and use the above table to see how plant 1 will spend the remainder. For instance, suppose we want to determine the best allocation for state x_2 = 4. In stage 2 we can do one of the following proposals:

1. Proposal 1 (cost 0, revenue 0),
2. Proposal 2 (cost 2, revenue 8),
3. Proposal 3 (cost 3, revenue 9), or
4. Proposal 4 (cost 4, revenue 12).

Proposal 1 gives revenue of 0, leaves 4 for stage 1, which returns 6. Total: 6. Proposal 2 gives revenue of 8, leaves 2 for stage 1, which returns 6. Total: 14. Proposal 3 gives revenue of 9, leaves 1 for stage 1, which returns 5. Total: 14. Proposal 4 gives revenue of 12, leaves 0 for stage 1, which returns 0. Total: 12.

The best thing to do with four units is proposal 2 at plant 2 and proposal 3 at plant 1, or proposal 3 at plant 2 and proposal 2 at plant 1, each returning 14. In either case, the revenue for being in state x_2 = 4 is 14. The rest of Table 3 can be filled out similarly.

Table 3: Stage 2 computations.

x_2         0   1   2   3   4   5
f_2(x_2)    0   5   8   13  14  17

We can now go on to stage 3. The only value we are interested in is f_3(5). Once again, we go through all the proposals for this stage, determine the amount of money remaining, and use Table 3 to decide the value for the previous stages. So here we can do the following at plant 3:

Proposal 1 gives revenue 0, leaves 5. Previous stages give 17. Total: 17. Proposal 2 gives revenue 4, leaves 4. Previous stages give 14. Total: 18.

Therefore, the optimal solution is to implement proposal 2 at plant 3, proposal 2 or 3 at plant 2, and proposal 3 or 2 (respectively) at plant 1. This gives a revenue of 18.

If you study this procedure, you will find that the calculations are done recursively. Stage 2 calculations are based on stage 1, stage 3 only on stage 2. Indeed, given you are at a state, all future decisions are made independent of how you got to the state. This is the principle of optimality, and all of dynamic programming rests on this assumption.

We can sum up these calculations in the following formulas: Denote by r(k_j) the revenue for proposal k_j in stage j, and by c(k_j) the corresponding cost. Let f_j(x_j) be the revenue of state x_j at stage j. Then we have the following calculations:

f_1(x_1) = max { r(k_1) : c(k_1) <= x_1 }

and

f_j(x_j) = max { r(k_j) + f_{j-1}(x_j - c(k_j)) : c(k_j) <= x_j },  for j = 2, 3.

All we were doing with the above calculations was determining these functions. The computations were carried out in a forward procedure. It was also possible to calculate things from the ``last'' stage back to the first stage. We could define

y_1 = amount allocated to stages 1, 2, and 3,
y_2 = amount allocated to stages 2 and 3, and
y_3 = amount allocated to stage 3.
This defines a backward recursion. Graphically, this is illustrated in Figure 1.

Figure 1: Forward vs. Backward Recursion

Corresponding formulas are: Let

f_3(y_3) be the optimal revenue for stage 3, given y_3,
f_2(y_2) be the optimal revenue for stages 2 and 3, given y_2, and
f_1(y_1) be the optimal revenue for stages 1, 2, and 3, given y_1.

The recursion formulas are:

f_3(y_3) = max { r(k_3) : c(k_3) <= y_3 }

and

f_j(y_j) = max { r(k_j) + f_{j+1}(y_j - c(k_j)) : c(k_j) <= y_j },  for j = 1, 2.

If you carry out the calculations, you will come up with the same answer. You may wonder why I have introduced backward recursion, particularly since the forward recursion seems more natural. In this particular case, the ordering of the stages made no difference. In other cases, though, there may be computational advantages of choosing one over another. In general, the backward recursion has been found to be more effective in most applications. Therefore, in the future, I will be presenting only the backward recursion, except in cases where I wish to contrast the two recursions.
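The backward recursion is easy to mechanize. Below is a short Python sketch (the function name best_revenue is ours, not from the notes) using the proposal costs and revenues of Table 1, as reconstructed from the worked calculations above.

```python
# Proposals (cost, revenue) for each plant, as in Table 1 (millions).
proposals = [
    [(0, 0), (1, 5), (2, 6)],           # plant 1
    [(0, 0), (2, 8), (3, 9), (4, 12)],  # plant 2
    [(0, 0), (1, 4)],                   # plant 3
]

def best_revenue(budget=5):
    """Backward recursion: f[j][y] = best revenue from plants j..3
    when y million dollars is available for them."""
    n = len(proposals)
    f = [[0] * (budget + 1) for _ in range(n + 1)]  # f[n][y] = 0: no plants left
    for j in range(n - 1, -1, -1):
        for y in range(budget + 1):
            f[j][y] = max(r + f[j + 1][y - c]
                          for c, r in proposals[j] if c <= y)
    return f[0][budget]
```

Running best_revenue() reproduces the optimal revenue of 18 found above.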

A second example
Dynamic programming may look somewhat familiar. Both our shortest path algorithm and our method for CPM project scheduling have a lot in common with it. Let's look at a particular type of shortest path problem. Suppose we wish to get from A to J in the road network of Figure 2.

Figure 2: Road Network

The numbers on the arcs represent distances. Due to the special structure of this problem, we can break it up into stages. Stage 1 contains node A, stage 2 contains nodes B, C, and D, stage 3 contains nodes E, F, and G, stage 4 contains H and I, and stage 5 contains J. The states in each stage correspond just to the node names. So stage 3 contains states E, F, and G. If we let S denote a node in stage j and let f_j(S) be the shortest distance from node S to the destination J, we can write

f_j(S) = min { d(S,Z) + f_{j+1}(Z) : Z a node in stage j+1 },

where d(S,Z) denotes the length of arc SZ. This gives the recursion needed to solve this problem. We begin by setting f_5(J) = 0. Here are the rest of the calculations:

Stage 4.

During stage 4, there are no real decisions to make: you simply go to your destination J. So you get:

f_4(H) = 3 by going to J, and
f_4(I) = 4 by going to J.

Stage 3. Here there are more choices. Here's how to calculate f_3(F). From F you can either go to H or I. The immediate cost of going to H is 6. The following cost is f_4(H) = 3. The total is 9. The immediate cost of going to I is 3. The following cost is f_4(I) = 4, for a total of 7. Therefore, if you are ever at F, the best thing to do is to go to I. The total cost is 7, so f_3(F) = 7.

The next table gives all the calculations:

You now continue working back through the stages one by one, each time completely computing a stage before continuing to the preceding one. The results are: Stage 2.

Stage 1.

Common Characteristics
There are a number of characteristics that are common to these two problems and to all dynamic programming problems:

1. The problem can be divided into stages, with a decision required at each stage. In the capital budgeting problem the stages were the allocations to a single plant. The decision was how much to spend. In the shortest path problem, they were defined by the structure of the graph. The decision was where to go next.

2. Each stage has a number of states associated with it. The states for the capital budgeting problem corresponded to the amount spent at that point in time. The states for the shortest path problem were the nodes reached.

3. The decision at one stage transforms one state into a state in the next stage. The decision of how much to spend gave a total amount spent for the next stage. The decision of where to go next defined where you arrived in the next stage.

4. Given the current state, the optimal decision for each of the remaining stages does not depend on the previous states or decisions. In the budgeting problem, it is not necessary to know how the money was spent in previous stages, only how much was spent. In the path problem, it was not necessary to know how you got to a node, only that you did.

5. There exists a recursive relationship that identifies the optimal decision for stage j, given that stage j+1 has already been solved.

6. The final stage must be solvable by itself.

The last two properties are tied up in the recursive relationships given above. The big skill in dynamic programming, and the art involved, is to take a problem and determine stages and states so that all of the above hold. If you can, then the recursive relationship makes finding the values relatively easy. Because of the difficulty in identifying stages and states, we will do a fair number of examples.

The Knapsack Problem.


The knapsack problem is a particular type of integer program with just one constraint. Each item that can go into the knapsack has a size and a benefit. The knapsack has a certain capacity. What should go into the knapsack so as to maximize the total benefit?

As an example, suppose we have three items as shown in Table 4, and suppose the capacity of the knapsack is 5.

Table 4: Knapsack Items

Item   Size   Benefit
1      2      65
2      3      80
3      1      30

The stages represent the items: we have three stages j = 1, 2, 3. The state x_j at stage j represents the total weight of items j and all following items in the knapsack. The decision at stage j is how many of item j to place in the knapsack. Call this value k_j. Writing w_j and b_j for the size and benefit of item j, this leads to the following recursive formulas: Let f_j(x_j) be the value of using x_j units of capacity for items j and following. Let floor(a) represent the largest integer less than or equal to a. Then

f_3(x_3) = 30 x_3

and

f_j(x_j) = max { b_j k_j + f_{j+1}(x_j - w_j k_j) : k_j = 0, 1, ..., floor(x_j / w_j) },  for j = 1, 2.

An Alternative Formulation
There is another formulation for the knapsack problem. This illustrates how arbitrary our definitions of stages, states, and decisions are. It also points out that there is some flexibility on the rules for dynamic programming. Our definitions required a decision at a stage to take us to the next stage (which we would already have calculated through backwards recursion). In fact, it could take us to any stage we have already calculated. This gives us a bit more flexibility in our calculations. The recursion I am about to present is a forward recursion. For a knapsack problem, let the stages be indexed by w, the weight filled. The decision is to determine the last item added to bring the weight to w. There is just one state per stage. Let g(w) be the maximum benefit that can be gained from a w pound knapsack. Continuing to use w_j and b_j as the weight and benefit, respectively, for item j, the following relates g(w) to previously calculated g values:

g(0) = 0

and

g(w) = max { b_j + g(w - w_j) : w_j <= w }.

Intuitively, to fill a w pound knapsack, we must end off by adding some item. If we add item j, we end up with a knapsack of size w - w_j to fill. To illustrate on the above example:

g(0) = 0.
g(1) = 30: add item 3.
g(2) = 65: add item 1.
g(3) = 95: add item 1 or 3.
g(4) = 130: add item 1.
g(5) = 160: add item 1 or 3.

This gives a maximum of 160, which is gained by adding 2 of item 1 and 1 of item 3.
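The forward recursion can be sketched in Python as follows (the item sizes and benefits are those of Table 4; the function name knapsack is ours):

```python
# Forward recursion g(w) for the knapsack problem of Table 4.
weights = {1: 2, 2: 3, 3: 1}
benefits = {1: 65, 2: 80, 3: 30}

def knapsack(capacity=5):
    """g[w] = maximum benefit obtainable from a w-pound knapsack;
    last[w] records the final item added, for recovering a solution."""
    g = [0] * (capacity + 1)
    last = [None] * (capacity + 1)
    for w in range(1, capacity + 1):
        for j in weights:
            if weights[j] <= w and benefits[j] + g[w - weights[j]] > g[w]:
                g[w] = benefits[j] + g[w - weights[j]]
                last[w] = j
    return g, last
```

Here g[5] = 160, matching the calculation above, and following last back from w = 5 recovers one optimal solution (two of item 1 and one of item 3).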

Equipment Replacement
In the network homework, you already saw how to formulate and solve an equipment replacement problem using a shortest path algorithm. Let's look at an alternative dynamic programming formulation. Suppose a shop needs to have a certain machine over the next five year period. Each new machine costs $1000. The cost of maintaining the machine during its ith year of operation is m(1) = $60, m(2) = $80, and m(3) = $120. A machine may be kept up to three years before being traded in. The trade-in value after i years is s(1) = $800, s(2) = $600, and s(3) = $500. How can the shop minimize costs over the five year period?

Let the stages correspond to each year. The state is the age of the machine for that year. The decisions are whether to keep the machine or trade it in for a new one. Let f_t(x) be the minimum cost incurred from time t to time 5, given the machine is x years old in time t. Since we have to trade in at time 5,

f_5(x) = -s(x).

Now consider other time periods. If you have a three year old machine in time t, you must trade in, so

f_t(3) = 1000 - s(3) + m(1) + f_{t+1}(1).

If you have a two year old machine, you can either trade or keep.

Trade costs you 1000 - s(2) + m(1) + f_{t+1}(1).
Keep costs you m(3) + f_{t+1}(3).

So the best thing to do with a two year old machine is the minimum of the two. Similarly

f_t(1) = min { 1000 - s(1) + m(1) + f_{t+1}(1), m(2) + f_{t+1}(2) }.

Finally, at time zero, we have to buy, so

f_0 = 1000 + m(1) + f_1(1).

This is solved with backwards recursion as follows:

Stage 5. f_5(1) = -800, f_5(2) = -600, f_5(3) = -500.

Stage 4. f_4(1) = min { 1000 - 800 + 60 + f_5(1), 80 + f_5(2) } = min { -540, -520 } = -540 (trade);
f_4(2) = min { 1000 - 600 + 60 + f_5(1), 120 + f_5(3) } = min { -340, -380 } = -380 (keep);
f_4(3) = 1000 - 500 + 60 + f_5(1) = -240 (trade).

Stage 3. f_3(1) = min { -280, -300 } = -300 (keep); f_3(2) = min { -80, -120 } = -120 (keep); f_3(3) = 20 (trade).

Stage 2. f_2(1) = min { -40, -40 } = -40 (trade or keep); f_2(2) = min { 160, 140 } = 140 (keep); f_2(3) = 260 (trade).

Stage 1. f_1(1) = min { 220, 220 } = 220 (trade or keep).

Stage 0. f_0 = 1000 + 60 + f_1(1) = 1280.

So the cost is 1280, and one solution is to trade in years 1 and 2. There are other optimal solutions.
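These calculations can be checked with a few lines of Python (a sketch; the cost figures are those given above, and the function name min_cost is ours):

```python
# Equipment replacement by backward recursion: new machine $1000,
# maintenance m during year i of operation, trade-in value s after i years.
COST = 1000
m = {1: 60, 2: 80, 3: 120}
s = {1: 800, 2: 600, 3: 500}

def min_cost(horizon=5):
    """f[t][x] = minimum cost from time t to the horizon,
    given an x-year-old machine at time t."""
    f = {horizon: {x: -s[x] for x in (1, 2, 3)}}  # must trade at the end
    for t in range(horizon - 1, 0, -1):
        f[t] = {}
        for x in (1, 2, 3):
            trade = COST - s[x] + m[1] + f[t + 1][1]
            options = [trade]
            if x < 3:  # may keep only machines younger than 3 years
                options.append(m[x + 1] + f[t + 1][x + 1])
            f[t][x] = min(options)
    return COST + m[1] + f[1][1]  # must buy at time 0
```

min_cost() returns 1280, in agreement with the hand calculation.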

The Traveling Salesperson Problem


We have seen that we can solve one type of integer programming (the knapsack problem) with dynamic programming. Let's try another. The traveling salesperson problem is to visit a number of cities in the minimum distance. For instance, a politician begins in New York and has to visit Miami, Dallas, and Chicago before returning to New York. How can she minimize the distance traveled? The distances are as in Table 5.

Table 5: TSP example problem.

The real problem in solving this is to define the stages, states, and decisions. One natural choice is to let stage t represent visiting t cities, and let the decision be where to go next. That leaves us with the states. Imagine we chose the city we are in to be the state. We could not make the decision where to go next, for we do not know where we have gone before. Instead, the state has to include information about all the cities visited, plus the city we ended up in. So a state is represented by a pair (i, S) where S is the set of t cities already visited and i is the last city visited (so i must be in S). This turns out to be enough to get a recursion. The stage 3 calculations are

f_3(i, S) = distance from i back to New York.

For other stages, the recursion is

f_t(i, S) = min { d(i, j) + f_{t+1}(j, S + {j}) : j not in S },

where d(i, j) is the distance between cities i and j.
You can continue with these calculations. One important aspect of this problem is the so called curse of dimensionality. The state space here is so large that it becomes impossible to solve even moderate size problems. For instance, suppose there are 20 cities. The number of states in the 10th stage is more than a million. For 30 cities, the number of states in the 15th stage is more than a billion. And for 100 cities, the number of states at the 50th stage is more than 5,000,000,000,000,000,000,000,000,000,000. This is not the sort of problem that will go away as computers get better.
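The (i, S) recursion (known in the literature as the Held-Karp algorithm) can be sketched in Python. Since Table 5 is not reproduced here, the distance matrix below is illustrative only:

```python
from itertools import combinations, permutations

# Held-Karp recursion on states (i, S). The distances below are
# illustrative; Table 5 is not reproduced in these notes.
cities = ['NY', 'Miami', 'Dallas', 'Chicago']
dist = {
    ('NY', 'Miami'): 1334, ('NY', 'Dallas'): 1559, ('NY', 'Chicago'): 809,
    ('Miami', 'Dallas'): 1343, ('Miami', 'Chicago'): 1397,
    ('Dallas', 'Chicago'): 921,
}
d = lambda a, b: dist.get((a, b)) or dist[(b, a)]

def tsp():
    """f[(i, S)] = shortest way to finish the tour (return to NY)
    given we are at city i having visited the set S."""
    others = cities[1:]
    f = {}
    # Last stage: everything visited, return home.
    for i in others:
        f[(i, frozenset(others))] = d(i, 'NY')
    # Work back through smaller visited sets.
    for t in range(len(others) - 1, 0, -1):
        for S in combinations(others, t):
            S = frozenset(S)
            for i in S:
                f[(i, S)] = min(d(i, j) + f[(j, S | {j})]
                                for j in others if j not in S)
    return min(d('NY', j) + f[(j, frozenset({j}))] for j in others)
```

For four cities the state space is tiny, so the result can be checked against brute-force enumeration of all tours; the point of the recursion is that it avoids that enumeration for larger instances.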

Nonadditive Recursions
Not every recursion must be additive. Here is one example where we multiply to get the recursion. A student is currently taking three courses. It is important that he not fail all of them. If the probability of failing French is p_1, the probability of failing English is p_2, and the probability of failing Statistics is p_3, then the probability of failing all of them is p_1 p_2 p_3. He has left himself with four hours to study. How should he minimize his probability of failing all his courses? Table 6 gives the probability of failing each course given he studies for a certain number of hours on that subject.

Table 6: Student failure probabilities. (What kind of student is this?)

We let stage 1 correspond to studying French, stage 2 to English, and stage 3 to Statistics. The state will correspond to the number of hours studying for that stage and all following stages. Let f_t(x) be the probability of failing course t and all following courses, assuming x hours are available. Denote the entries in the above table as p_t(k), the probability of failing course t given k hours are spent on it.

The final stage is easy:

f_3(x) = p_3(x).

The recursion is as follows:

f_t(x) = min { p_t(k) f_{t+1}(x - k) : k = 0, 1, ..., x }.

We can now solve this recursion: Stage 3.

Stage 2.

So, the optimum way of dividing time between studying English and Statistics is to spend it all on Statistics. Stage 1.

The overall optimal strategy is to spend one hour on French, and three on Statistics. The probability of failing all three courses is about 29%.
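The multiplicative recursion is mechanical to carry out in code. Table 6 is not reproduced in these notes, so the failure probabilities in the Python sketch below are hypothetical; the point is only that the recursion multiplies rather than adds:

```python
# Multiplicative backward recursion. The probabilities are hypothetical
# (Table 6 is not reproduced here): p[t][k] = probability of failing
# course t given k hours spent on it (t = 1 French, 2 English, 3 Stats).
p = {
    1: [0.80, 0.70, 0.65, 0.62, 0.60],
    2: [0.75, 0.70, 0.67, 0.65, 0.62],
    3: [0.90, 0.70, 0.50, 0.35, 0.30],
}

def min_fail_all(hours=4):
    """f[t][x] = min probability of failing course t and all later
    courses when x hours remain; returns f[1][hours]."""
    f = {4: [1.0] * (hours + 1)}           # no courses left
    for t in (3, 2, 1):
        f[t] = [min(p[t][k] * f[t + 1][x - k] for k in range(x + 1))
                for x in range(hours + 1)]
    return f[1][hours]
```

With these illustrative numbers the minimum probability of failing everything works out to 0.18; with the real Table 6 the same code would reproduce the 29% figure above.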

Stochastic Dynamic Programming


In deterministic dynamic programming, given a state and a decision, both the immediate payoff and next state are known. If we know either of these only as a probability function, then we have a stochastic dynamic program. The basic ideas of determining stages, states, decisions, and recursive formulae still hold: they simply take on a slightly different form.

Uncertain Payoffs
Consider a supermarket chain that has purchased 6 gallons of milk from a local dairy. The chain must allocate the 6 gallons to its three stores. If a store sells a gallon of milk, then the chain receives revenue of $2. Any unsold milk is worth just $.50. Unfortunately, the demand for milk is uncertain, and is given in the following table:

The goal of the chain is to maximize the expected revenue from these 6 gallons. (This is not the only possible objective, but a reasonable one.) Note that this is quite similar to some of our previous resource allocation problems: the only difference is that the revenue is not known for certain. We can, however, determine an expected revenue for each allocation of milk to a store. For instance, the value of allocating 2 gallons to store 1 is:

We can do this for all allocations to get the following values:

We have changed what looked to be a stochastic problem into a deterministic one! We simply use the above expected values. The resulting problem is identical to our previous resource allocation problems. We have a stage for each store. The states for stage 3 are the number of gallons given to store 3 (0, 1, 2, 3); the states for stage 2 are the number of gallons given to stores 2 and 3 (0, 1, 2, 3, 4, 5, 6); and the state for stage 1 is the number of gallons given to stores 1, 2, and 3 (6). The decision at stage i is how many gallons to give to store i. If we let the above table be represented by r_i(k) (the value of giving k gallons to store i), then the recursive formulae are

f_3(x) = r_3(x)

and

f_i(x) = max { r_i(k) + f_{i+1}(x - k) : k = 0, 1, ..., x },  for i = 1, 2.
If you would like to work out the values, you should get a valuation of $9.75, with one solution assigning 1 gallon to store 1, 3 gallons to store 2 and 2 gallons to store 3.
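A Python sketch of the two steps: converting uncertain demand into an expected-revenue table, then running the usual deterministic recursion. The demand distributions below are hypothetical (the original demand table is not reproduced here), so the resulting value differs from the $9.75 above:

```python
# Step 1: expected revenue per allocation; step 2: deterministic DP.
# The demand distributions are hypothetical placeholders.
PRICE, SALVAGE = 2.0, 0.5

# demand[i][d] = probability store i faces demand for d gallons
demand = {
    1: {1: 0.4, 2: 0.3, 3: 0.3},
    2: {1: 0.2, 2: 0.5, 3: 0.3},
    3: {1: 0.3, 2: 0.4, 3: 0.3},
}

def expected_revenue(i, k):
    """Expected revenue from sending k gallons to store i."""
    total = 0.0
    for d, prob in demand[i].items():
        sold = min(d, k)
        total += prob * (PRICE * sold + SALVAGE * (k - sold))
    return total

def best_allocation(gallons=6):
    """f[i][x] = best expected revenue from stores i..3 with x gallons."""
    f = {4: [0.0] * (gallons + 1)}
    for i in (3, 2, 1):
        f[i] = [max(expected_revenue(i, k) + f[i + 1][x - k]
                    for k in range(x + 1)) for x in range(gallons + 1)]
    return f[1][gallons]
```

Once the expected-revenue table is built, the stochastic character of the problem has disappeared: best_allocation is exactly the deterministic resource allocation recursion.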

Uncertain States
A more interesting use of uncertainty occurs when the state that results from a decision is uncertain. For example, consider the following coin tossing game: a coin will be tossed 4 times. Before each toss, you can wager $0, $1, or $2 (provided you have sufficient funds). You begin with $1, and your objective is to maximize the probability you have $5 at the end of the coin tosses. We can formulate this as a dynamic program as follows: create a stage for the decision point before each flip of the coin, and a ``final'' stage, representing the result of the final coin flip. There is a state in each stage for each possible amount you can have. For stage 1, the only state is ``1''; for each of the others, you can set it to ``0,1,2,3,4,5'' (of course, some of these states are not possible, but there is no sense in worrying too much about that). Now, if we are in stage i and bet k when we have x dollars, then with probability .5 we will have x - k dollars, and with probability .5 we will have x + k dollars next period. Let f_i(x) be the probability of ending up with at least $5 given we have $x before the ith coin flip. This gives us the following recursion:

f_5(x) = 1 if x >= 5, and 0 otherwise;

f_i(x) = max { .5 f_{i+1}(x + k) + .5 f_{i+1}(x - k) : k = 0, 1, 2 and k <= x },  for i = 1, ..., 4.

Note that the next state is not known for certain, but is a probabilistic mixing of states. We can still easily determine f_4 from f_5, then f_3 from f_4, and so on back to f_1.
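This recursion is fully specified, so we can carry it out in a few lines of Python (the function name win_probability is ours):

```python
# The coin-tossing game: 4 flips, start with $1, bet 0, 1, or 2 each
# time, maximize the probability of ending with at least $5.
def win_probability(start=1, flips=4, goal=5, max_bet=2):
    top = start + max_bet * flips          # most money we can ever hold
    f = [1.0 if x >= goal else 0.0 for x in range(top + 1)]
    for _ in range(flips):                 # work back one flip at a time
        f = [max(0.5 * f[x - k] + 0.5 * f[min(x + k, top)]
                 for k in range(min(max_bet, x) + 1))
             for x in range(top + 1)]
    return f[start]
```

The answer is f_1(1) = 0.1875: betting $1 on the first flip (and playing on optimally) gives a 3-in-16 chance of reaching $5.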

Another example comes from the pricing of stock options. Suppose we have the option to buy Netscape stock at $150. We can exercise this option any time in the next 10 days (an American option, rather than a European option that could only be exercised 10 days from now). The current price of Netscape is $140. We have a model of Netscape stock movement that predicts the following: on each day, the stock will go up by $2 with probability .4, stay the same with probability .1, and go down by $2 with probability .5.

Note that the overall trend is downward (probably counterfactual, of course). The value of the option if we exercise it at price x is x - 150 (we will only exercise at prices above 150). We can formulate this as a stochastic dynamic program as follows: we will have stage i for each day i, just before the exercise-or-keep decision. The state for each stage will be the stock price of Netscape on that day. Let f_i(x) be the expected value of the option on day i given that the stock price is x. Then, the optimal decision is given by:

f_i(x) = max { x - 150, .4 f_{i+1}(x + 2) + .1 f_{i+1}(x) + .5 f_{i+1}(x - 2) }

and

f_{10}(x) = max { x - 150, 0 }.

Given the size of this problem, it is clear that we should use a spreadsheet to do the calculations. There is one major difference between stochastic dynamic programs and deterministic dynamic programs: in the latter, the complete decision path is known. In a stochastic dynamic program, the actual decision path will depend on the way the random aspects play out. Because of this, ``solving'' a stochastic dynamic program involves giving a decision rule for every possible state, not just along an optimal path.
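The spreadsheet calculation can equally be sketched in Python. As an assumption, we take the daily move to be +$2 with probability .4, $0 with probability .1, and -$2 with probability .5, so that the probabilities sum to one and the drift is downward:

```python
# American option value by backward recursion: strike 150, 10-day
# horizon, daily move +2 w.p. .4, 0 w.p. .1, -2 w.p. .5 (assumed model).
STRIKE, DAYS = 150, 10
P_UP, P_SAME, P_DOWN = 0.4, 0.1, 0.5

def option_value(price=140):
    """f(i, x) = expected value of the option on day i at price x."""
    lo, hi = price - 2 * DAYS, price + 2 * DAYS   # reachable price range
    prices = range(lo, hi + 1, 2)
    f = {x: max(x - STRIKE, 0) for x in prices}   # day 10: exercise or expire
    for day in range(DAYS - 1, -1, -1):
        g = {}
        for x in prices:
            hold = (P_UP * f.get(x + 2, 0) + P_SAME * f[x]
                    + P_DOWN * f.get(x - 2, 0))
            g[x] = max(x - STRIKE, hold)          # exercise now, or keep
        f = g
    return f[price]
```

The solution is a decision rule: for each day, the set of prices at which exercising beats holding, not a single decision path.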

``Linear'' decision making


Many decision problems (and some of the most frustrating ones) involve choosing one out of a number of choices where future choices are uncertain. For example, when getting (or not getting!) a series of job offers, you may have to make a decision on a job before knowing if another job is going to be offered to you. Here is a simplification of these types of problems: Suppose we are trying to find a parking space near a restaurant. This restaurant is on a long stretch of road, and our goal is to park as close to the restaurant as possible. There are T spaces leading up to the restaurant, one spot right in front of the restaurant, and T after the restaurant, numbered t = -T, ..., -1, 0, 1, ..., T.

Each spot can either be full (with probability, say, .9) or empty (.1). As we pass a spot, we need to make a decision to take the spot or try for another (hopefully better) spot. The value for parking in spot t is |t|, its distance from the restaurant. If we do not get a spot, then we slink away in embarrassment at large cost M. What is our optimal decision rule?

We can have a stage for each spot t. The states in each stage are either e (for empty) or o (for occupied). The decision is whether to park in the spot or not (you cannot park if the state is o). If we let f_t(e) and f_t(o) be the expected values for each state, then we have:

f_t(o) = .9 f_{t+1}(o) + .1 f_{t+1}(e),
f_t(e) = min { |t|, .9 f_{t+1}(o) + .1 f_{t+1}(e) },

with f_{T+1}(o) = f_{T+1}(e) = M (we drove past every spot without parking).

In general, the optimal rule will look something like, take the first empty spot on or after spot t (where t will be negative).
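Here is a Python sketch of this recursion; T, the occupancy probability, and the penalty M are illustrative parameters, not values from the notes:

```python
# The parking problem: spots t = -T..T, cost |t| for parking at spot t,
# probability .9 that a spot is occupied, penalty M if we never park.
def parking_rule(T=5, p_occupied=0.9, M=100.0):
    """Returns (f, park): f[t] maps state 'e'/'o' to expected cost,
    and park[t] says whether to take spot t when it is empty."""
    p_empty = 1.0 - p_occupied
    f = {T + 1: {'e': M, 'o': M}}          # drove past the last spot
    park = {}
    for t in range(T, -T - 1, -1):
        keep_going = (p_occupied * f[t + 1]['o']
                      + p_empty * f[t + 1]['e'])
        park[t] = abs(t) < keep_going      # park iff it beats moving on
        f[t] = {'o': keep_going, 'e': min(abs(t), keep_going)}
    return f, park
```

Reading off park gives the decision rule: the spots at which an empty space should be taken rather than passed up.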

Dynamic Programming
http://www.sbc.su.se/~per/molbioinfo2001/dynprog/dynamic.html

The following is an example of global sequence alignment using Needleman/Wunsch techniques. For this example, the two sequences to be globally aligned are

G A A T T C A G T T A (sequence #1)
G G A T C G A (sequence #2)

So M = 11 and N = 7 (the lengths of sequence #1 and sequence #2, respectively). A simple scoring scheme is assumed where:

Si,j = 1 if the residue at position i of sequence #1 is the same as the residue at position j of sequence #2 (match score); otherwise Si,j = 0 (mismatch score)
w = 0 (gap penalty)

Three steps in dynamic programming


1. Initialization 2. Matrix fill (scoring) 3. Traceback (alignment)

Initialization Step
The first step in the global alignment dynamic programming approach is to create a matrix with M + 1 columns and N + 1 rows where M and N correspond to the size of the sequences to be aligned. Since this example assumes there is no gap opening or gap extension penalty, the first row and first column of the matrix can be initially filled with 0.

Matrix Fill Step


One possible (inefficient) solution of the matrix fill step finds the maximum global alignment score by starting in the upper left hand corner in the matrix and finding the maximal score Mi,j for each position in the matrix. In order to find Mi,j for any i,j it is only necessary to know the score for the matrix positions to the left, above and diagonal to i,j. In terms of matrix positions, it is necessary to know Mi-1,j, Mi,j-1 and Mi-1,j-1. For each position, Mi,j is defined to be the maximum score at position i,j; i.e.
Mi,j = MAXIMUM[ Mi-1, j-1 + Si,j (match/mismatch in the diagonal), Mi,j-1 + w (gap in sequence #1), Mi-1,j + w (gap in sequence #2)]

Note that in the example, Mi-1,j-1 will be red, Mi,j-1 will be green and Mi-1,j will be blue. Using this information, the score at position 1,1 in the matrix can be calculated. Since the first residue in both sequences is a G, S1,1 = 1, and by the assumptions stated at the beginning, w = 0. Thus, M1,1 = MAX[M0,0 + 1, M1, 0 + 0, M0,1 + 0] = MAX [1, 0, 0] = 1. A value of 1 is then placed in position 1,1 of the scoring matrix.

Since the gap penalty (w) is 0, the rest of row 1 and column 1 can be filled in with the value 1. Take the example of row 1. At column 2, the value is the max of 0 (for a mismatch), 0 (for a vertical gap) or 1 (horizontal gap). The rest of row 1 can be filled out similarly until we get to column 8. At this point, there is a G in both sequences (light blue). Thus, the value for the cell at row 1 column 8 is the maximum of 1 (for a match), 0 (for a vertical gap) or 1 (horizontal gap). The value will again be 1. The rest of row 1 and column 1 can be filled with 1 using the above reasoning.

Now let's look at column 2. The location at row 2 will be assigned the value of the maximum of 1 (mismatch), 1 (horizontal gap) or 1 (vertical gap). So its value is 1. At the position column 2 row 3, there is an A in both sequences. Thus, its value will be the maximum of 2 (match), 1 (horizontal gap), 1 (vertical gap), so its value is 2. Moving along to position column 2 row 4, its value will be the maximum of 1 (mismatch), 1 (horizontal gap), 2 (vertical gap), so its value is 2. Note that for all of the remaining positions except the last one in column 2, the choices for the value will be exactly the same as in row 4 since there are no matches. The final row will contain the value 2, since it is the maximum of 2 (match), 1 (horizontal gap) and 2 (vertical gap).

Using the same techniques as described for column 2, we can fill in column 3.

After filling in all of the values the score matrix is as follows:

Traceback Step
After the matrix fill step, the maximum alignment score for the two test sequences is 6. The traceback step determines the actual alignment(s) that result in the maximum score. Note that with a simple scoring algorithm such as the one used here, there are likely to be multiple maximal alignments. The traceback step begins in the M,N position in the matrix, i.e. the position that leads to the maximal score. In this case, there is a 6 in that location.

Traceback takes the current cell and looks to the neighbor cells that could be direct predecessors. This means it looks to the neighbor to the left (gap in sequence #2), the diagonal neighbor (match/mismatch), and the neighbor above it (gap in sequence #1). The algorithm for traceback chooses as the next cell in the sequence one of the possible predecessors. In this case, the neighbors are marked in red. They are all also equal to 5.

Since the current cell has a value of 6 and the scores are 1 for a match and 0 for anything else, the only possible predecessor is the diagonal match/mismatch neighbor. If more than one possible predecessor exists, any can be chosen. This gives us a current alignment of
(Seq #1)  A
          |
(Seq #2)  A

So now we look at the current cell and determine which cell is its direct predecessor. In this case, it is the cell with the red 5.

The alignment as described in the above step adds a gap to sequence #2, so the current alignment is
(Seq #1)  T A
            |
(Seq #2)  _ A

Once again, the direct predecessor produces a gap in sequence #2.

After this step, the current alignment is


(Seq #1)  T T A
              |
(Seq #2)  _ _ A

Continuing on with the traceback step, we eventually get to a position in column 0 row 0 which tells us that traceback is completed. One possible maximum alignment is :

Giving an alignment of :
G A A T T C A G T T A
|   |   | |   |     |
G G A _ T C _ G _ _ A

An alternate solution is:

Giving an alignment of :
G _ A A T T C A G T T A
|     |   | |   |     |
G G _ A _ T C _ G _ _ A

There are more alternative solutions, each resulting in a maximal global alignment score of 6. Since the number of optimal alignments can grow exponentially, most dynamic programming implementations will only print out a single solution.
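The whole procedure, matrix fill plus one traceback, fits in a short Python sketch (the function name align is ours; where several predecessors tie, it prefers the diagonal, so it reports just one of the maximal alignments):

```python
# Needleman/Wunsch global alignment with the simple scheme from the
# text: match = 1, mismatch = 0, gap penalty w = 0.
def align(seq1="GAATTCAGTTA", seq2="GGATCGA", match=1, mismatch=0, w=0):
    M, N = len(seq1), len(seq2)
    # Matrix fill: score[i][j] = best score aligning seq1[:i] with seq2[:j].
    score = [[0] * (N + 1) for _ in range(M + 1)]
    for i in range(1, M + 1):
        for j in range(1, N + 1):
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,   # diagonal
                              score[i][j - 1] + w,       # gap in seq #1
                              score[i - 1][j] + w)       # gap in seq #2
    # Traceback from the M,N corner, choosing one predecessor per step.
    a1, a2, i, j = [], [], M, N
    while i > 0 or j > 0:
        s = match if i and j and seq1[i - 1] == seq2[j - 1] else mismatch
        if i and j and score[i][j] == score[i - 1][j - 1] + s:
            a1.append(seq1[i - 1]); a2.append(seq2[j - 1]); i -= 1; j -= 1
        elif i and score[i][j] == score[i - 1][j] + w:
            a1.append(seq1[i - 1]); a2.append('_'); i -= 1
        else:
            a1.append('_'); a2.append(seq2[j - 1]); j -= 1
    return score[M][N], ''.join(reversed(a1)), ''.join(reversed(a2))
```

align() returns a score of 6 together with one gapped pair of sequences; stripping the gaps recovers the two original sequences.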

Advanced Dynamic Programming Tutorial


If you haven't looked at an example of a simple scoring scheme, please go to the simple dynamic programming example. The following is an example of global sequence alignment using Needleman/Wunsch techniques. For this example, the two sequences to be globally aligned are

G A A T T C A G T T A (sequence #1)
G G A T C G A (sequence #2)

So M = 11 and N = 7 (the lengths of sequence #1 and sequence #2, respectively). An advanced scoring scheme is assumed where:

Si,j = 2 if the residue at position i of sequence #1 is the same as the residue at position j of sequence #2 (match score); otherwise Si,j = -1 (mismatch score) w = -2 (gap penalty)

Initialization Step
The first step in the global alignment dynamic programming approach is to create a matrix with M + 1 columns and N + 1 rows where M and N correspond to the size of the sequences to be aligned. The first row and first column of the matrix can be initially filled with 0.

Matrix Fill Step


One possible (inefficient) solution of the matrix fill step finds the maximum global alignment score by starting in the upper left hand corner in the matrix and finding the maximal score Mi,j for each position in the matrix. In order to find Mi,j for any i,j it is only necessary to know the score for the matrix positions to the left, above and diagonal to i,j. In terms of matrix positions, it is necessary to know Mi-1,j, Mi,j-1 and Mi-1,j-1. For each position, Mi,j is defined to be the maximum score at position i,j; i.e.
Mi,j = MAXIMUM[ Mi-1, j-1 + Si,j (match/mismatch in the diagonal), Mi,j-1 + w (gap in sequence #1), Mi-1,j + w (gap in sequence #2)]

Note that in the example, Mi-1,j-1 will be red, Mi,j-1 will be green and Mi-1,j will be blue. Using this information, the score at position 1,1 in the matrix can be calculated. Since the first residue in both sequences is a G, S1,1 = 2, and by the assumptions stated earlier, w = -2. Thus, M1,1 = MAX[M0,0 + 2, M1,0 - 2, M0,1 - 2] = MAX[2, -2, -2]. A value of 2 is then placed in position 1,1 of the scoring matrix. Note that there is also an arrow placed back into the cell that resulted in the maximum score, M[0,0].

Moving down the first column to row 2, we can see that there is once again a match in both sequences. Thus, S1,2 = 2, so M1,2 = MAX[M0,1 + 2, M1,1 - 2, M0,2 - 2] = MAX[0 + 2, 2 - 2, 0 - 2] = MAX[2, 0, -2] = 2. A value of 2 is then placed in position 1,2 of the scoring matrix, and an arrow is placed to point back to M0,1, which led to the maximum score.

Looking at column 1, row 3, there is not a match in the sequences, so S1,3 = -1. M1,3 = MAX[M0,2 - 1, M1,2 - 2, M0,3 - 2] = MAX[0 - 1, 2 - 2, 0 - 2] = MAX[-1, 0, -2] = 0. A value of 0 is then placed in position 1,3 of the scoring matrix, and an arrow is placed to point back to M1,2, which led to the maximum score.

We can continue filling in the cells of the scoring matrix using the same reasoning. Eventually, we get to column 3, row 2. Since there is not a match in the sequences at this position, S3,2 = -1. M3,2 = MAX[ M2,1 - 1, M3,1 - 2, M2,2 - 2] = MAX[0 - 1, -1 - 2, 1 - 2] = MAX[-1, -3, -1] = -1.

Note that in the above case, there are two different ways to get the maximum score. In such a case, pointers are placed back to all of the cells that can produce the maximum score.

The rest of the score matrix can then be filled in. The completed score matrix will be as follows:
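The fill step can be sketched in a few lines of Python. This is my own transcription of the rule above (the function and variable names are not from the tutorial); the first row and column are initialised to 0, as this page describes (the classic Needleman/Wunsch formulation instead initialises them with multiples of the gap penalty):

```python
def fill_matrix(seq1, seq2, match=2, mismatch=-1, gap=-2):
    """Global alignment matrix fill; M[i][j] is the best score for
    aligning the first i residues of seq1 with the first j of seq2."""
    m, n = len(seq1), len(seq2)
    # First row and column initialised to 0, as in the text.
    M = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1] + s,  # match/mismatch on the diagonal
                          M[i][j - 1] + gap,    # gap in sequence #1
                          M[i - 1][j] + gap)    # gap in sequence #2
    return M

M = fill_matrix("GAATTCAGTTA", "GGATCGA")
print(M[1][1], M[11][7])  # → 2 3 (the hand-computed cell and the final score)
```

Running this reproduces the hand calculations above: M1,1 = 2, M3,2 = -1, and the final score M11,7 = 3.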

Traceback Step
After the matrix fill step, the maximum global alignment score for the two sequences is 3. The traceback step will determine the actual alignment(s) that result in the maximum score. The traceback step begins in position M,N of the matrix, i.e. the bottom right-hand corner, the position at which both sequences are globally aligned.

Since we have kept pointers back to all possible predecessors, the traceback step is simple. At each cell, we look to see where we move next according to the pointers. To begin, the only possible predecessor is the diagonal match.

This gives us an alignment of


A | A


We can continue to follow the path using a single pointer until we get to the following situation.

The alignment at this point is


T C A G T T A | | | | T C _ G _ _ A

Note that there are now two possible neighbors that could result in the current score. In such a case, one of the neighbors is arbitrarily chosen.

Once the traceback is completed, it can be seen that there are only two possible paths leading to a maximal global alignment.

One possible path is as follows:

This gives an alignment of


G A A T T C A G T T A | | | | | | G G A _ T C _ G _ _ A

The other possible path is as follows:

This gives an alignment of

G A A T T C A G T T A | | | | | | G G A T _ C _ G _ _ A

Remembering that the scoring scheme is +2 for a match, -1 for a mismatch, and -2 for a gap, both sequences can be tested to make sure that they result in a score of 3.
G A A T T C A G T T A
| |   | |   |       |
G G A _ T C _ G _ _ A
+2 -1 +2 -2 +2 +2 -2 +2 -2 -2 +2

2-1+2-2+2+2-2+2-2-2+2=3
G A A T T C A G T T A
| |   | |   |       |
G G A T _ C _ G _ _ A
+2 -1 +2 +2 -2 +2 -2 +2 -2 -2 +2

2-1+2+2-2+2-2+2-2-2+2 = 3, so both of these alignments do indeed result in the maximal alignment score.
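This column-by-column check can be done mechanically. The helper below is my own sketch (not part of the original tutorial), scoring two equal-length alignment rows with '_' marking a gap:

```python
def alignment_score(row1, row2, match=2, mismatch=-1, gap=-2):
    """Score an alignment column by column: +2 match, -1 mismatch,
    -2 per gap, matching the scheme used in the text."""
    total = 0
    for x, y in zip(row1, row2):
        if x == '_' or y == '_':
            total += gap        # a gap in either sequence
        elif x == y:
            total += match      # identical residues
        else:
            total += mismatch   # differing residues
    return total

print(alignment_score("GAATTCAGTTA", "GGA_TC_G__A"))  # → 3
print(alignment_score("GAATTCAGTTA", "GGAT_C_G__A"))  # → 3
```

Both of the traceback alignments score 3, agreeing with the matrix fill result.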

Dynamic programming: an introduction


by David K. Smith and PASS Maths
http://plus.maths.org/issue3/dynamic/

In this issue's article "Mathematics, marriage and finding somewhere to eat", David Smith investigated the problem of finding the best potential partner from a fixed number of potential partners using a technique known as "optimal stopping". Inevitably, mathematicians and mathematical psychologists have constructed other models of the problem...

Measuring potential partners


An alternative way of looking at the problem assumes that you know that each potential partner has a score. Helen of Troy is fabled to have had "a face that could launch a thousand ships". There's an old joke that beauty can therefore be measured: one helen is the beauty needed to launch a thousand ships, one millihelen is therefore the beauty required to launch just one ship!

Helen: her affair with Paris, Prince of Troy, started the Trojan wars.

Suppose we use this scale to measure each potential partner's score, from 0 millihelens up to a maximum of 1000 millihelens, with all values equally likely. We must now search for a rule which will make sure that the average score of the partner we choose is as large as possible. The obvious way to look at this would be to think about what you would do when you met the first potential partner: you would compare this partner's score with what you might expect to get later on if you rejected them. Unfortunately, working out what you might expect later on involves a complicated mixture of all the possible decisions you could make, and quickly becomes too much to work out.

Dynamic programming
Instead, a mathematical way of thinking about it is to look at what you should do at the end, if you get to that stage. So you think about the best decision with the last potential partner (which you must choose), then the last but one, and so on. This way of tackling the problem backwards is dynamic programming.

The word Programming in the name has nothing to do with writing computer programs. Mathematicians use the word to describe a set of rules which anyone can follow to solve a problem; they do not have to be written in a computer language. Dynamic programming was the brainchild of an American mathematician, Richard Bellman, who described the way of solving problems where you need to find the best decisions one after another. In the forty-odd years since this development, the number of uses and applications of dynamic programming has increased enormously. For example, in 1982 David Kohler used dynamic programming to analyse the best way to play the game of darts. In his paper, published in the Journal of the Operational Research Society, he acknowledges "The Wheatsheaf at Writtle and the Norfolk Hotel in Nairobi for making research facilities available to me".

In darts, each player must score exactly 301, starting and finishing on a double.

Solving the potential partner problem


The dynamic programming approach to the potential partner problem starts by thinking about what happens when you are faced with the last partner. If you need to make a decision about potential partner number N, then you must accept their score (which we'll call XN) and live happily ever after! When you encounter potential partner number N-1, all that you know are the value XN-1 and what you expect to get if you wait. You expect to get the average value of XN, and that will be 500 because XN varies from 0 to 1000. Common sense says that you take the better of these, so your rule will be:

If XN-1 is more than 500, accept that potential partner. If not, go on to potential partner number N.

When you encounter potential partner number N-2, you know XN-2 and the average value of the score you will get by waiting. Half the time, waiting will mean that you accept potential partner number N-1, whose score will be between 500 and 1000, averaging 750; the other half of the time, you will pass over that potential partner and you will expect a score of 500. So, waiting will give you an average score of 625, and taking the better option gives the rule:

If XN-2 is more than 625, accept that potential partner. If not, go on to potential partner number N-1.

And so on. This is much simpler than starting with potential partner number 1, trying to think of all the possible sequences of decisions, and working forwards. For each potential partner that you meet, the best set of decisions afterwards will give a critical value for comparison; if the potential partner does better than it, choose that partner. If not, go on, even though the future is not certain. The critical values when N=10 are:

One of the characteristics of dynamic programming is that the solution to smaller problems is built into that of larger ones. Thus, if you wanted to know the critical values when there are only 6 potential partners, all you need to do is look at the last 6 values in the table, 800, 775 and so on.
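The backward recursion can be written out in a few lines of Python. This is my own sketch under the article's assumptions (scores uniform on 0 to 1000, so if continuing is worth v, then one step earlier it is worth v + (1000 - v)^2/2000: with probability v/1000 the next partner scores below v and you keep waiting, otherwise you accept a score averaging (v + 1000)/2). The function name is mine:

```python
def critical_values(n):
    """Critical values for partners 1 .. n-1; the last partner must be
    accepted, so one remaining partner is worth 500 on average."""
    v = 500.0
    values = [v]
    for _ in range(n - 2):
        # Waiting: prob v/1000 the next score is below v (wait again, worth v);
        # otherwise accept, averaging (v + 1000)/2.  This simplifies to:
        v = v + (1000.0 - v) ** 2 / 2000.0
        values.append(v)
    return list(reversed(values))   # in the order the partners are met

print([round(v) for v in critical_values(10)])
# → [850, 836, 820, 800, 775, 742, 695, 625, 500]
```

The last two values reproduce the 625 and 500 derived above, and the last six (800, 775, 742, 695, 625, 500) are the table for 6 potential partners, illustrating how the solution to smaller problems is built into that of larger ones.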

Acknowledgements
This article was based on material written by Dr David K. Smith, Mathematical Statistics and Operational Research Department, University of Exeter.

Dynamic Programming Algorithm (DPA) for Edit-Distance


http://www.csse.monash.edu.au/~lloyd/tildeAlgDS/Dynamic/Edit/

The words `computer' and `commuter' are very similar, and a change of just one letter, p->m, will change the first word into the second. The word `sport' can be changed into `sort' by the deletion of the `p', or equivalently, `sort' can be changed into `sport' by the insertion of `p'.

The edit distance of two strings, s1 and s2, is defined as the minimum number of point mutations required to change s1 into s2, where a point mutation is one of:
1. change a letter,
2. insert a letter, or
3. delete a letter.

The following recurrence relations define the edit distance, d(s1,s2), of two strings s1 and s2:
d('', '') = 0                 -- '' = empty string
d(s, '') = d('', s) = |s|     -- i.e. length of s
d(s1+ch1, s2+ch2) = min( d(s1, s2) + if ch1=ch2 then 0 else 1 fi,
                         d(s1+ch1, s2) + 1,
                         d(s1, s2+ch2) + 1 )

The first two rules above are obviously true, so it is only necessary to consider the last one. Here, neither string is the empty string, so each has a last character, ch1 and ch2 respectively. Somehow, ch1 and ch2 have to be accounted for in an edit of s1+ch1 into s2+ch2. If ch1 equals ch2, they can be matched for no penalty, i.e. 0, and the overall edit distance is d(s1,s2). If ch1 differs from ch2, then ch1 could be changed into ch2, at a cost of 1, giving an overall cost of d(s1,s2)+1. Another possibility is to delete ch1 and edit s1 into s2+ch2, costing d(s1,s2+ch2)+1. The last possibility is to edit s1+ch1 into s2 and then insert ch2, costing d(s1+ch1,s2)+1. There are no other alternatives. We take the least expensive, i.e. the min, of these alternatives.

The recurrence relations imply an obvious ternary-recursive routine. This is not a good idea, however, because it is exponentially slow, and impractical for strings of more than a very few characters.

Examination of the relations reveals that d(s1,s2) depends only on d(s1',s2') where s1' is shorter than s1, or s2' is shorter than s2, or both. This allows the dynamic programming technique to be used. A two-dimensional matrix, m[0..|s1|, 0..|s2|], is used to hold the edit distance values:

m[i,j] = d(s1[1..i], s2[1..j])
m[0,0] = 0
m[i,0] = i, i=1..|s1|
m[0,j] = j, j=1..|s2|

m[i,j] = min(m[i-1,j-1] + if s1[i]=s2[j] then 0 else 1 fi,
             m[i-1, j] + 1,
             m[i, j-1] + 1 ), i=1..|s1|, j=1..|s2|
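The matrix relations translate directly into code. The following is my own Python sketch of the DPA (the function name is mine, not from the page):

```python
def edit_distance(s1, s2):
    # m[i][j] = d(s1[1..i], s2[1..j]), per the recurrence above
    m = [[0] * (len(s2) + 1) for _ in range(len(s1) + 1)]
    for i in range(len(s1) + 1):
        m[i][0] = i                 # m[i,0] = i
    for j in range(len(s2) + 1):
        m[0][j] = j                 # m[0,j] = j
    for i in range(1, len(s1) + 1):
        for j in range(1, len(s2) + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            m[i][j] = min(m[i - 1][j - 1] + cost,  # match or change a letter
                          m[i - 1][j] + 1,         # delete a letter of s1
                          m[i][j - 1] + 1)         # insert a letter of s2
    return m[len(s1)][len(s2)]

print(edit_distance("computer", "commuter"))  # → 1
print(edit_distance("sport", "sort"))         # → 1
```

This reproduces the introductory examples: one change turns `computer' into `commuter', and one deletion turns `sport' into `sort'.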

m[,] can be computed row by row. Row m[i,] depends only on row m[i-1,]. The time complexity of this algorithm is O(|s1|*|s2|). If s1 and s2 have a `similar' length, about `n' say, this complexity is O(n^2), much better than exponential!

For example:

appropriate m-eaning
|||||  ||||| |||
approximate matching

d(s1,s2) = 7

L. Allison

Complexity
The time-complexity of the algorithm is O(|s1|*|s2|), i.e. O(n^2) if the lengths of both strings are about `n'. The space-complexity is also O(n^2) if the whole of the matrix is kept for a trace-back to find an optimal alignment. If only the value of the edit distance is needed, only two rows of the matrix need be allocated; they can be "recycled", and the space complexity is then O(|s2|), i.e. O(n).
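The two-row trick looks like this in Python (my own sketch; only the previous and current rows of the matrix are kept, so the space used is linear in the row length):

```python
def edit_distance_two_rows(s1, s2):
    prev = list(range(len(s2) + 1))          # row m[0,] = 0, 1, ..., |s2|
    for i in range(1, len(s1) + 1):
        curr = [i]                           # m[i,0] = i
        for j in range(1, len(s2) + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            curr.append(min(prev[j - 1] + cost,  # diagonal: match/change
                            prev[j] + 1,         # delete
                            curr[j - 1] + 1))    # insert
        prev = curr                          # "recycle": current row becomes previous
    return prev[len(s2)]

print(edit_distance_two_rows("sport", "sort"))  # → 1
```

It returns the same values as the full-matrix version, but no trace-back is possible since the rest of the matrix has been discarded.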

Variations
The costs of the point mutations can be varied to be numbers other than 0 or 1. Linear gap-costs are sometimes used where a run of insertions (or deletions) of length `x', has a cost of `ax+b', for constants `a' and `b'. If b>0, this penalises numerous short runs of insertions and deletions.

Longest Common Subsequence

The longest common subsequence (LCS) of two sequences, s1 and s2, is a subsequence of both s1 and of s2 of maximum possible length. The more alike that s1 and s2 are, the longer is their LCS.

Other Algorithms
There are faster algorithms for the edit distance problem, and for similar problems. Some of these algorithms are fast if certain conditions hold, e.g. the strings are similar, or dissimilar, or the alphabet is large, etc. Ukkonen (1983) gave an algorithm with worst case time complexity O(n*d) and average complexity O(n+d^2), where n is the length of the strings and d is their edit distance. This is fast for similar strings, where d is small, i.e. when d<<n.

Applications

File Revision
The Unix command diff f1 f2 finds the difference between files f1 and f2, producing an edit script to convert f1 into f2. If two (or more) computers share copies of a large file F, and someone on machine-1 edits F (keeping the original as F.bak), making a few changes, to give F.new, it might be very expensive and/or slow to transmit the whole revised file F.new to machine-2. However, diff F.bak F.new will give a small edit script which can be transmitted quickly to machine-2, where the local copy of the file can be updated to equal F.new. diff treats a whole line as a "character" and uses a special edit-distance algorithm that is fast when the "alphabet" is large and there are few chance matches between elements of the two strings (files). In contrast, there are many chance character-matches in DNA, where the alphabet size is just 4, {A,C,G,T}.

Try `man diff' to see the manual entry for diff.

Remote Screen Update Problem


If a computer program on machine-1 is being used by someone from a screen on (distant) machine-2, e.g. via rlogin etc., then machine-1 may need to update the screen on machine-2 as the computation proceeds. One approach is for the program (on machine-1) to keep a "picture" of what the screen currently is (on machine-2) and another picture of what it should become. The differences can be found (by an algorithm related to edit-distance) and only the differences transmitted, saving on transmission band-width.

Spelling Correction
Algorithms related to the edit distance may be used in spelling correctors. If a text contains a word, w, that is not in the dictionary, a `close' word, i.e. one with a small edit distance to w, may be suggested as a correction. Transposition errors are common in written text. A transposition can be treated as a deletion plus an insertion, but a simple variation on the algorithm can treat a transposition as a single point mutation.
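The transposition variation mentioned above can be sketched by adding a fourth case to the DPA (this is my own sketch of the restricted, "optimal string alignment" form of the idea, not code from the original page): if the last two characters of the prefixes are swapped copies of each other, the pair can be explained as a single transposition.

```python
def edit_distance_transpose(s1, s2):
    # Standard edit-distance DPA, extended so that an adjacent
    # transposition (e.g. "teh" -> "the") costs one point mutation.
    m = [[0] * (len(s2) + 1) for _ in range(len(s1) + 1)]
    for i in range(len(s1) + 1):
        m[i][0] = i
    for j in range(len(s2) + 1):
        m[0][j] = j
    for i in range(1, len(s1) + 1):
        for j in range(1, len(s2) + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            m[i][j] = min(m[i - 1][j - 1] + cost,  # match/change
                          m[i - 1][j] + 1,         # delete
                          m[i][j - 1] + 1)         # insert
            # New case: the last two characters are swapped.
            if (i > 1 and j > 1 and s1[i - 1] == s2[j - 2]
                    and s1[i - 2] == s2[j - 1]):
                m[i][j] = min(m[i][j], m[i - 2][j - 2] + 1)
    return m[len(s1)][len(s2)]

print(edit_distance_transpose("teh", "the"))  # → 1 (plain edit distance gives 2)
```

This is useful in a spelling corrector, where a transposed pair of letters should count as one slip of the fingers, not two.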

Plagiarism Detection
The edit distance provides an indication of similarity that might be too close in some situations ... think about it.

Molecular Biology

The edit distance gives an indication of how `close' two strings are. Similar measures are used to compute a distance between DNA sequences (strings over {A,C,G,T}) or protein sequences (over an alphabet of 20 amino acids), for various purposes, e.g.:
1. to find genes or proteins that may have shared functions or properties
2. to infer family relationships and evolutionary trees over different organisms

An example of a DNA sequence from `Genebank' can be found [here]. The simple edit distance algorithm would normally be run on sequences of at most a few thousand bases.

Speech Recognition
Algorithms similar to those for the edit-distance problem are used in some speech recognition systems: find a close match between a new utterance and one in a library of classified utterances.

Notes
1. V. I. Levenshtein. Binary codes capable of correcting deletions, insertions and reversals. Doklady Akademii Nauk SSSR 163(4) p845-848, 1965; also Soviet Physics Doklady 10(8) p707-710, Feb 1966.
Discovered the basic DPA for edit distance.

2. S. B. Needleman and C. D. Wunsch. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Jrnl Molec. Biol. 48 p443-453, 1970.
Defined a similarity score on molecular-biology sequences, with an O(n^2) algorithm that is closely related to those discussed here.

3. Hirschberg (1975) presented a method of recovering an alignment (of an LCS) in O(n^2) time but in only linear, O(n), space; see [here].

4. E. Ukkonen. On approximate string matching. Proc. Int. Conf. on Foundations of Comp. Theory, Springer-Verlag, LNCS 158 p487-495, 1983.
Worst case O(nd)-time, average case O(n+d^2)-time algorithm for edit-distance, where d is the edit-distance between the two strings.

5. See also exact, as opposed to approximate, (sub-)string [matching].

6. More research information on "the" DPA and Bioinformatics [here].

7. If your programming language does not support 2-dimensional arrays, and requires arrays or strings to be indexed from zero upwards, some home-grown address translation will be needed to program the DPA defined above.

Exercises
1. Give a DPA for the longest common subsequence (LCS) problem.
2. Modify the edit distance DPA so that it treats a transposition as a single point-mutation.
