
Exploring Algorithms*

Traveling Salesperson I: Brute Force, Greedy, and Heuristics

Introduction

Imagine you have the unfortunate job of traveling to 10 different towns in your area each
month in order to deliver something important. Needless to say, you'd like to make this
road trip as short as possible. Each town is a different distance away from your town and
from each other town. How do you figure out a route that will minimize the distance
traveled? You may recognize this as the famous Traveling Salesperson Problem (TSP).

TSP is a very important NP-complete problem. It maps to many real-world problems, making it one of the most thoroughly explored of the NP-complete problems. People need to solve this problem, despite the fact that there is no known polynomial-time solution for all instances. We will explore TSP in this module as a means of illustrating a host of different algorithms and approaches to solving problems. Just about everything has been tried with TSP!

How to Solve It?

One way to solve this problem (and any other NP-complete problem) is to enumerate all
possible routes, of which there are 10! (3,628,800) for 10 towns, and then choose the
shortest. This is a brute force algorithm. It would only take a couple seconds on a typical
PC to compute all the possible routes and distances for 10 towns. And you would only
need to do it once.

There are problems with brute force algorithms, however. One is combinatorial explosion: if we add one more town, the number of routes we need to compute increases by a factor of 11, or 1,000%. If we go up to 20 towns, we could not possibly enumerate all the routes in our lifetime!

It is often possible to apply some clever techniques to reduce the number of enumerations
that a brute force algorithm produces. Why not minimize the work involved if we can?
Greedy algorithms sometimes provide a way to reduce the work involved with a brute
force algorithm.

A greedy algorithm is one where we make the best choice at each stage of an algorithm
given the immediate information available. These choices do not take into account all the
data available from all stages of the algorithm. Sometimes the immediate information is
enough and an optimal solution is found; sometimes it's not enough, and non-optimal
solutions are found.

There is a simple greedy algorithm for solving TSP: Start at any city. From there, go to
the closest unvisited city. Continue until you have visited all the cities, at which point you
return to the starting city. In practice, this works fairly well, but it may or may not find an
optimal solution.

Check out this resource for some other interesting examples. Then continue on in this
resource through the discussion of greedy algorithms.

As shown in the counting change and knapsack examples, greedy algorithms are
interesting alternatives to brute force algorithms. As mentioned earlier, the difficulty with
greedy algorithms is there is no guarantee that you will find the optimal solution, whereas
with brute force an optimal solution is guaranteed. Greedy algorithms are useful
regardless, because they are usually easy to define and can provide good approximations
of an optimal solution. More important, they significantly reduce the work involved.
Which do you use in which situation? Here is a frequently-cited guideline: If the
computing time is not too long and you only need to compute the solution once, brute
force works fine. If the computing time is too long to do brute force and/or you need to
compute the solution often, it makes sense to try and reduce the number of solutions that
we need to process. This can be done with a greedy algorithm.

Still More Ways to Solve It

Another option for reducing work is heuristics. A heuristic is an informal, practical method, like a guideline or "rule of thumb", to help solve a problem. As with greedy algorithms, we use heuristics to reduce the work compared to a brute force algorithm. Also like greedy algorithms, we are not guaranteed an optimal solution, but it will likely be a usable solution.

Consider the following approach to solving the Traveling Salesperson Problem. Start by
picking around 1000 random enumerations. Then of those 1000, pick the best one - that
would be the one with the shortest total distance. It's highly unlikely that the one you find
is the best. But it will be pretty good. Plus testing random orderings is easy. It does not
take much time to check 1000 of them.

Now comes the heuristic: the 2-opt technique. We randomly pick two links between cities
in our best random solution. We then remove those links and replace them with others
that keep the route connected.

In this illustration, we remove the red links 5-1 and 4-8 on the left, and replace them with
the blue links 4-5 and 8-1 on the right. Notice that we need to reverse some of the other
links to keep the route connected.

Next we check if this change improves the overall distance traveled. If it does, we use the
new ordering and try swapping other random links. We keep going until we have tried a
large number of swaps with no improvement, at which point we stop. The solution we
end with may not be optimal, but it's a very good one.

This heuristic can be generalized to k-opt, that is, you can rearrange three, four, or more
links in the same way we did two. The more links you remove, however, the more
complicated it is to put the route back together again so it's connected.

Another Greedy Example

The greedy algorithm we discussed above for solving TSP is called the nearest neighbor
algorithm. This is where we start at any city, and from there, go to the closest unvisited
city. Continue until you have visited all the cities, at which point you return to the starting
city. There is a more sophisticated version of this greedy algorithm which considers all
closest cities at each stage.

Here is an example which illustrates the technique:

Name the cities x1, x2, x3, x4, x5, x6. We will start with x1. City x4 is closest to x1
according to the cost matrix, so we create a route consisting of those two cities: x1-x4-x1.
Now we consider the two cities in our route and find the closest city to one of these. City
x3 is closest to x4 so we put x3 before x4 in the new route = x1-x3-x4-x1.

There are now two cities each 3 units from cities in this route: x2 and x6. Suppose we
pick x2 and insert it before x3 to obtain the following: x1-x2-x3-x4-x1. City x6 is still 3
units away from x1, so we insert x6 before x1 to get: x1-x2-x3-x4-x6-x1. Finally, x5 is
within 4 units of x3 and x6 so we choose to insert it before x6. This gives us a near-
minimal solution of x1-x2-x3-x4-x5-x6-x1 with a cost of 19.

We can mix heuristics in with a greedy algorithm to see if we can improve on this
solution. One observation is the overall distance generally depends on the starting city.
Thus, by applying the algorithm to each city in turn as the starting point and taking the
shortest of the six solutions obtained, we could get an improvement.

Traveling Salesperson II: Divide and Conquer

In the previous lesson, we saw how brute force and greedy approaches could help us
solve TSP. Another way to solve this problem uses a technique with which you are
probably familiar: divide and conquer. This technique works by recursively breaking a
problem down into two or more sub-problems of the same (or related) type. This
"dividing" goes on until the sub-problems become simple enough to be solved directly.
Then we "conquer", that is, the solutions to the sub-problems are combined to give a
solution to the original problem.

You may have seen this approach in binary search:

int binsearch(int x, int v[], int low, int high)
{
    /* recursive binary search: find x in v[low]..v[high];
       return index of location, or -1 if not found */
    int mid;

    mid = (low + high) / 2;
    if (x == v[mid])
        return mid;
    else if ((x < v[mid]) && (low < mid))
        return binsearch(x, v, low, mid-1);
    else if ((x > v[mid]) && (high > mid))
        return binsearch(x, v, mid+1, high);
    else
        return -1;
}

You may also have seen it in recursive sorting algorithms such as mergesort or quicksort:


MergeSort(L)
    if (length of L > 1) {
        Split list into first half and second half
        MergeSort(first half)
        MergeSort(second half)
        Merge first half and second half into sorted list
    }

Divide and conquer algorithms tend to be inherently simple and efficient, if the sub-problems have been properly divided and combined. It's the combination step that is usually the most challenging to design. This technique also lends itself to parallel implementation, where the sub-problems are run on separate processors.

There is a divide and conquer heuristic for TSP. This is a common approach for fleet and
transportation companies who have to solve TSP all the time! Basically, they take the
route map and divide the stops into smaller groups. Then, they build routes between these
groups. Note that this application of TSP removes the requirement of visiting each stop
exactly once.

Streets often divide into regions naturally, particularly if a highway, river or some other
natural barrier cuts through a region. It's also common to assign all the stops in a region
to one driver. This gives the driver an opportunity to become familiar with the area,
which increases speed and efficiency.

This is an interesting approach for TSP, but it does not "solve" it. Solving it requires that we come up with a polynomial algorithm that works for all possible instances of the problem. What divide and conquer provides is a heuristic that helps us get closer to a solution by dividing the route map into smaller route maps that are easier to solve.

Traveling Salesperson III: Branch and Bound

What If We Need an Optimal Solution?

In the previous lessons, we presented a number of ways to solve TSP, but none (except
brute force) guarantee an optimal solution. Brute force is only feasible for a small number
of cities. Is there a non-brute-force approach that guarantees an optimal solution?
Fortunately, there is.

To learn about this approach, we start with search trees and backtracking algorithms.
Backtracking algorithms try each possible solution until they find the right one. It is a
depth-first search of the set of possible solutions. During the search, if it turns out the
current alternative being considered doesn't work, the search backtracks to the place
where the different alternatives were presented. Then, it tries the next alternative. It keeps
going until all alternatives have been tried. Backtracking is commonly implemented
recursively.

A search tree is a way of representing the execution of a backtracking algorithm. We start at the root and then add nodes to the tree to represent our exploration as we work towards a solution. If we reach a dead end or a leaf of the tree, we backtrack to explore other paths. The classic example is a maze search, where the nodes generated from a parent represent an attempt to move up, down, right, or left.

Branch and Bound Searching

Branch and bound searching is a variation of backtracking for problems where we are
looking for an optimal solution. The trick is to calculate for each new node in a search
tree a bound on what solutions this node will produce.

In other words, when we add a new node to the search tree, we somehow compute the
value of the best possible solution this node could produce. These bounds serve two
purposes. First, if we already have a solution that is better than the bound we calculate for
a node, we need not bother to explore this path. Second, we can use the bounds as a
heuristic to determine which nodes to expand first.

We will apply this idea to TSP. First we need to determine a systematic way of examining every possible circuit. Consider the following map:

We can generate a tree with every possible combination of links, which represents a brute-force approach. But we want to "prune" the tree somehow, that is, we don't want to explore all possible paths in the tree. One way to do this is to provide for each node some restrictions defining which links are included and excluded from our Traveling Salesperson tour.

Let's say the root node has no restrictions. From the root node, we branch to two
possibilities: one including a first link, the other excluding it. (By "first", we assume that
the links are somehow ordered. Any order will do: just sort them alphabetically: a-b, a-c,
a-d...) We continue this recursively until every possibility has been generated. One of
these possibilities will be the optimal tour.

The basic rule we will use is: each node must have exactly two links incident upon it
(since we are building a tour). We’ll see how this is used in a moment.

A Bounding Function

Now we need a bounding function. It can be proven that no tour can cost less than half of
the total of the least-cost links incident on each node. For example, in the graph above the
least cost links incident on a are (a, d) and (a, b) which total 5. For node b we have (a, b)
and (b, c) for a total of 6. The lower bound on a complete tour for this map turns out to be
(5+6+8+7+9)/2 = 17.5.

Now consider a tour subject to some restrictions. Suppose the search constrains us to
include (a, e) and exclude (b, c). The two cheapest links incident on a are now (a, e) —
which we are stuck with — and (a, d) for a total of 6. For b we get (a, b) and (b, e) for a
cost of 7. For c we must select (a, c) and (c, d) for 13, and so on. The lower bound cost
subject to these constraints is 20.5.

Whenever a new node is added to our search space, we first infer whatever we can about
which links are included. For example, if we are constrained to include both (a, b) and (a,
c), then we know that we can't include (a, d) and (a, e). Otherwise we would have more
than two links incident on a.

Given these constraints, we will then compute the lower bound for a tour. If this bound is
worse than a solution we have already found, we can ignore this node and backtrack. If
not, we will generate the two children for this node, and use their bounds as a heuristic
for which to expand first. A picture of a complete trace of this algorithm is given below.

1. We begin the search with no constraints. The lower bound for the solution is 17.5.
2. The first link to consider is (a, b). Node B includes this link, node C does not. The
bounds for each of these nodes are computed. We will expand B first since it has a
better bound than C.
3. The next link to consider is (a, c). Node D includes (a, c), node E does not. For
node D we can infer that both (a, d) and (a, e) must be excluded. Again, we will
choose the more promising node to expand first. In this case we will expand E.
4. Node E produces F and G. F includes (a, d) and G does not. For node F, we can
infer that (a, e) must be excluded; for node G we must include (a, e) (otherwise
G would have only one link incident on a). We choose to expand F first.
5. The next link to consider is (b, c). Node H includes this link, Node I does not. In
both of these nodes we now have enough information to infer a complete tour.
The tour in H costs 23, the tour in I costs 21. This is the best tour found so far.
6. We now have completely expanded F, so we backtrack to E. But the other child of
E, node G, produces a tour costing 23 in the best case. Since this is worse than the
tour found in I, we can prune this node.
7. We now backtrack to B, and discover that we can prune D as well. (Its bound of
20.5 rounds up to 21, which is no better than our best tour.)
8. We now backtrack to A, where we consider its second child C.
9. At this level, we were considering link (a, c). Node J includes this link, K does
not. We may immediately prune K, since its lower bound is 21.
10. J produces L and M. M can also be immediately pruned.
11. L produces N and O. Both of these are complete tours. As it turns out, N produces
a tour costing only 19. This is now recorded as the best tour.

When these calls unwind, we discover that the entire tree has been searched, although
parts have been pruned. Thus, the cost of the best circuit is 19, given the starting point of
A.

Note that this algorithm produces the best possible solution to the traveling salesperson
problem. Since TSP is known to be NP-complete, no polynomial-time algorithm for it is
known (and none exists unless P = NP). Although this branch and bound algorithm is not
polynomial, it is usually much faster than simply exploring every possible circuit. In the
worst case, however, it could be exponential and mirror a brute-force approach.

Dynamic Programming Techniques

One of the most powerful techniques for solving problems is to break them down into
smaller, more easily solved parts. Smaller problems are not as overwhelming, and they
allow us to focus on details that can be lost when we are looking at the overall problem.
In addition, breaking a problem into smaller instances of the same type of problem often
suggests a recursive algorithm.

There are categories of algorithms based on breaking problems down into smaller
problems. In a previous lesson, we saw that divide and conquer breaks a problem into
parts. These algorithms typically split the problem in half, solve each half, then bring the
halves back together to form a full solution (for example, quicksort). Dynamic
programming typically breaks a problem down into subproblems, solves a smaller
problem, and then saves the solution to this smaller problem so we can use it later.

The Fibonacci sequence is often used to illustrate the power of dynamic programming.
The sequence is defined by the following recurrence relation:

F(0) = 0
F(1) = 1
F(n) = F(n-1) + F(n-2)

This very easily translates into a recursive function:

int Fibonacci(int n) {

    if ((n == 0) || (n == 1))
        return n;
    else
        return (Fibonacci(n-1) + Fibonacci(n-2));
}

Seems simple enough. What is the running time of Fibonacci()? Consider the call
Fibonacci(4).

You may remember from your study of binary trees that a nearly complete one, like the
one above, has around 2^n nodes, where n is the height of the tree (number of levels - 1).
That means the number of function calls is around 2^n, which is exponential, given each
function call is O(1). That's very expensive for such a simple little function!

Dynamic programming provides a very elegant way to solve Fibonacci and other
expensive problems. Notice in the above tree all the redundant work being done. We
make multiple calls to F(0) and F(1). If we were to store the value of F(0) and F(1) and
just access that stored value when we needed it, rather than make another recursive call,
that would save a lot of time. That's exactly how dynamic programming works. Here is
how Fibonacci would be written using dynamic programming:
int fib(int n) {

    int f[n+1];        /* f[i] holds the ith Fibonacci number */

    f[0] = 0;
    if (n > 0)
        f[1] = 1;

    for (int i = 2; i <= n; i++)
        f[i] = f[i-1] + f[i-2];

    return f[n];
}

The first thing to notice about this function is that its running time is O( n ): it just has a for-loop that iterates approximately n times. This is much more efficient than the recursive version. In addition, we are storing successive values of Fibonacci in f[i], which is a key characteristic of dynamic programming. We are trading space for time.

Problems that can be solved with dynamic programming have two important properties:

• Optimal substructure: A problem has this property if an optimal solution can be
constructed efficiently from optimal solutions to its subproblems.
• Overlapping subproblems: A problem can be broken down into subproblems
which are reused several times. This is what allows us to save the results and
reuse those.

We can apply dynamic programming in a top-down fashion: The problem is broken into
subproblems, and these subproblems are solved and the solutions saved, in case they are
needed. This is the approach that is implemented recursively.

We can also apply dynamic programming in a bottom-up fashion: The subproblems that
might be needed are solved in advance and then used to build up solutions to larger
problems. It's not always apparent which subproblems will be needed, which makes
bottom-up challenging. On the other hand, bottom-up is usually cheaper than top-down in
usage of space.

The Checkerboard problem is another excellent example of the application of dynamic programming. TSP improvements have also been developed using this technique.

What to Use When?

In the previous lessons, we came up with different solutions to TSP (and many other
problems) based on a variety of algorithmic approaches. In this lesson, we summarize the
approaches and provide guidelines for which to use when, and then apply these
guidelines to some other interesting problems.

As a software engineer, it is important to have a deep understanding of these approaches, and how and when to use them. In addition, software engineers and computer scientists require a working knowledge of the most common algorithms in various categories of problems. This resource provides a good catalog. Spend some time looking through the categories to make sure you can use these algorithms in your own work.

Here is a summary of the approaches we have discussed thus far:

• Brute Force: This technique is often the first thing to try. These algorithms are
obvious, easy to implement, and can provide an optimal solution. On the
downside, they are usually slow, sometimes so slow that brute force may not be
workable. If you only need to compute the solution once and the running time is
reasonable, brute force may be a valid approach.

Example: For TSP, compute every possible tour and pick the one with the shortest
total distance. This is workable for n <= 10.

• Greedy: A greedy algorithm is one where we make the best choice at each stage of
an algorithm given the immediate information available. These choices do not
take into account all the data available from all stages of the algorithm.
Sometimes the immediate information is enough and an optimal solution is found;
sometimes it's not enough, and non-optimal solutions are found. As we have seen,
greedy algorithms do not guarantee an optimal solution.

In order for a greedy algorithm to work optimally, the problem must exhibit two
properties:

o Optimal substructure: A problem has this property if an optimal solution
can be constructed efficiently from optimal solutions to its subproblems.
o Greedy Choice: This refers to how a greedy algorithm makes the best
possible choice at each stage given the currently available information. It
iteratively makes one greedy choice after another, reducing each given
problem into a smaller one. If an optimal solution is found using a greedy
algorithm, then the information currently available at each stage was
sufficient to find the optimal solution.

An algorithm for finding a minimal spanning tree is an example of a greedy algorithm that guarantees an optimal solution, because the problem has the optimal substructure and greedy choice properties. One such algorithm was developed by Robert Prim. Here is a page with more details on minimal spanning trees, and an interesting animation of Kruskal's algorithm, another greedy algorithm that finds an optimal solution.

TSP is a problem that does not have the optimal substructure property. There is a
requirement in building a TSP tour to visit each city exactly once and return
home. This "exactly once" makes breaking any TSP map into sub-maps often
impossible. For a greedy solution to work, the subproblems must be independent,
that is, the solution of one subproblem can not affect the solution to another
subproblem. This is not the case with sub-maps of TSP. Notice in the problem of
finding a minimal spanning tree, the subproblems are independent.

TSP does not have the greedy choice property either. Recall the nearest-neighbor algorithm we proposed for TSP: start at any city; from there, go to the closest unvisited city; continue until you have visited all the cities, at which point you return to the starting city. This is an example of a greedy algorithm, but it does not provide an optimal solution. The distances of the nearest neighbors to a particular city are not enough information to find the minimal total distance.

How can you tell if a problem has these properties? This would be useful information, because the right greedy algorithm will give an optimal solution. Unfortunately, there is no simple, general test for whether a greedy strategy will result in an optimal solution. A typical approach is to try some greedy algorithms and see how they perform. Often, we can get a good sense of whether a problem has the two required properties by trying to solve it with a greedy approach.

If a greedy algorithm produces an optimal solution, however, it is often possible to prove that the solution is optimal (for example, there is a proof that Prim's algorithm for finding an MST produces an optimal solution).

• Heuristics: These are guidelines or "rules of thumb" that can be used to define an
algorithm or improve another algorithm. It's always a good idea to go the extra
step of looking for simple enhancements. There is usually a trade-off between the
degree of improvement and the cost in running time.

Example: If we use the greedy nearest-neighbor algorithm for finding a solution to TSP, we often apply a heuristic to see if it provides any improvement. We start at any city; from there, go to the closest unvisited city; continue until we have visited all the cities, at which point we return to the starting city. The heuristic is to run this algorithm n times (for n cities), giving each city a turn to be the starting city. This will often find a shorter total distance.

• Divide and Conquer: This technique works by recursively breaking a problem
down into two or more subproblems of the same type, until these subproblems
become simple enough to be solved directly. The solutions to the subproblems are
then combined to give a solution to the original problem.

This is a common technique for sorts and searches. It can be very efficient for
certain types of problems, particularly when the subproblems do not overlap. If
they do, for example in the Fibonacci recurrence relation, divide and conquer is
often exponential, and we look to dynamic programming as an alternative.

Divide and conquer algorithms can provide elegant and efficient solutions for
problems that break down into subproblems naturally. For some problems, divide
and conquer is the only simple way to solve them. An excellent example of this is
the Towers of Hanoi.

Examples: Mergesort, Quicksort, Binary search

• Dynamic Programming: A divide and conquer technique where we save the
results of subproblems that repeat. Problems that can be solved with dynamic
programming have two important properties:
o Optimal substructure: A problem has this property if an optimal solution
can be constructed efficiently from optimal solutions to its subproblems.
o Overlapping subproblems: A problem can be broken down into
subproblems which are reused several times. This is what allows us to
save the results and reuse those.

We can apply dynamic programming in a top-down fashion: The problem is broken into subproblems, and these subproblems are solved and the solutions saved, in case they are needed. This is the approach that is implemented recursively.

We can also apply dynamic programming in a bottom-up fashion: The subproblems that might be needed are solved in advance and then used to build up solutions to larger problems. It's not always apparent which subproblems will be needed, which makes bottom-up challenging. On the other hand, bottom-up is usually cheaper than top-down in usage of space.

We use dynamic programming when we apply a divide and conquer technique and notice the repeating subproblems. It is also a good alternative when a greedy approach does not provide an optimal enough solution.

Example: The iterative implementation of the Fibonacci sequence.

• Branch and Bound: This represents a depth-first search of a problem space. We
organize the search for a solution with a search tree, and try to prune the tree
with a bounding function related to the domain of the problem. This is an
effective way of searching for an optimal solution to NP-complete or other
sophisticated problems. In the worst case, depending on how the search is
organized, every possible solution might be examined, which will have the same
running time as brute force. Typically, this is not the case, but it usually isn't a
polynomial solution either.

Example: The branch and bound solution to TSP. Here are some other
applications of branch and bound.

The following problems are solved by applying the guidelines provided above. The goal
is to help you develop problem-solving skills and instincts on which techniques work best
in different situations. Before reading through these problems, check out this resource
which provides a good starting point for solving any problem.

Problem #1

Let's start with a problem that allows us to use common algorithms. The problem is to
find the kth smallest element in a list of integers.

Here are some possible algorithms that solve this problem:

• Sorting: We could just sort the list and then extract the kth smallest element from
the start of the list. The running time for the most efficient sort is O( n log n )
(QuickSort, MergeSort, HeapSort).
• If we only need the smallest element (k = 1), we can iterate over the entire list
keeping track of the smallest element found thus far. The running time for this is O( n ).
• Do an incomplete selection sort. That is, find the smallest value and move it to the
beginning of the list. Find the next smallest value and move it to the second
position. Keep doing this until you have moved the kth element into position. The
running time is O( k*n ).
• Use the Hoare Selection Algorithm (quickselect). This is based on QuickSort:

  function select(list, k, left, right) {
      choose a pivot value list[pivotIndex]
      pivotNewIndex := partition(list, left, right, pivotIndex)
      if k = pivotNewIndex
          return list[k]
      else if k < pivotNewIndex
          return select(list, k, left, pivotNewIndex-1)
      else
          return select(list, k, pivotNewIndex+1, right)
  }

  Unlike QuickSort, select recurses into only the partition that contains the kth
  element, so the average running time is O( n ). It can degrade to O( n^2 ) if bad
  partition values are consistently chosen.
• We can use a data structure, like a balanced binary search tree augmented with
subtree sizes (an order-statistic tree), to minimize the search time for the kth
element. A query then runs in O( log n ), but we have the overhead of inserting
and deleting such that the tree remains balanced and in search tree form.

So which one do we choose? There are many factors to consider in deciding on an appropriate algorithm.

• How often will the algorithm be run? If it's just once, then a linear time solution is
probably fine. If we have to make this selection again and again, it might be better
to make the up-front investment of doing the sort so the selection can be done in
O( 1 ) time, assuming the input does not change.

• How big is the input? Will it change? If so, how big will the input grow or shrink
over time? The input size complicates things. If it is very large, the sort becomes
expensive; maybe the incomplete selection sort would be a good compromise. If
the input changes frequently, the sort will have to be re-done, but perhaps not
every time: if an inserted element falls after the kth element, the first k positions
are unaffected. What's the cost of making that determination?

• How important is speed? If it's really not that important (maybe this process can
be run in the background), then a linear solution may be fine.

• We refer you again to this resource, which has a great set of questions to consider
in both designing and choosing an algorithm.

Problem #2

We often need to calculate binomial coefficients in combinatorial problems. In mathematics, a binomial coefficient is a coefficient of any of the terms in the expansion of the binomial (x+y)^n.

Here is an example of how binomial coefficients are used in combinatorics. Let's say
there are n ice cream toppings to choose from. If one wishes to create an ice cream
sundae with exactly k toppings, then the binomial coefficient expresses how many
different types of such k-topping ice cream sundaes are possible.

If it has been a while since you have worked with binomial coefficients, here is a quick
review.

We are interested in designing an algorithm to calculate a binomial coefficient. This time, we present the solutions in order of increasing efficiency.

• Just apply the formula for C( n,k ): n! / ((n - k)! * k!).

To avoid extra calculations, we might use the trick described in this resource.
• If we only have to compute one binomial coefficient, brute force calculation
works fine. Another idea, if we don't have to do the calculation often, is to use the
recursive definition, assuming we have a factorial function:

choose(m, n) = fact( m ) / (fact( n ) * fact(m-n))

As you can imagine, this is extremely slow, particularly if we use the recursive
version of factorial!

• If we have to do it frequently, we can compute Pascal's triangle once and then do
lookups. This is a dynamic programming approach, and is described here.
• Finally, the most elegant solution of all is one given in the Lilavati, a text written
over 850 years ago, and described here. This runs in O( n ) time.

* Except as otherwise noted, the content of this presentation is licensed under the
Creative Commons Attribution 2.5 License.
