Catalan Numbers

Let’s start with a strange-sounding question: in how many ways can a collection of
parentheses appear in a sentence? Say, for example, that a sentence has exactly one pair
of parentheses. Ignoring the rest of the text, there’s no question: the parentheses must
appear in the order (). But suppose now that a sentence has two pairs of parentheses.
It might have two separate parenthetical statements, so that the parentheses appear like
this:
()()
or it could have one parenthetical statement included in another, so that if you stripped
away all the text the parentheses would be in this order:

(())

These are the only possibilities, so we would say there are two ways for two pairs of
parentheses to appear in a sentence.
What about three pairs? At this point, you should stop reading and try to work out
on your own how many ways three pairs of parentheses might appear in a sentence; we’ll
tell you the answer below.

The answer is that there are five ways they might appear:

()()() ()(()) (())() (()()) and ((()))

Of course, we’re talking mathematics here, not literary style: you won’t find too many
sentences (in non-technical writing at least) with three pairs of parentheses arranged as in
the second of these configurations (and if you did it would (probably) be a fairly convoluted
sentence). This is QR, not Expos. For four pairs it turns out there are 14 ways:

(((()))) ((()())) ((())()) ((()))() (()(())) (()()()) (()())()

(())(()) (())()() ()((())) ()(()()) ()(())() ()()(()) and ()()()()

In general, we ask: in how many ways can n pairs of parentheses appear in a sentence?
The answer is called the nth Catalan number, and is denoted cn . By convention, we say
that the 0th Catalan number c0 is 1. Thus we have seen that

c0 = 1
c1 = 1
c2 = 2
c3 = 5 and
c4 = 14

and the next few are


c5 = 42
c6 = 132 and
c7 = 429.
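
The values listed so far can be checked by brute force. The following is a minimal sketch, not part of the original discussion: the function name `balanced_strings` is our own, and it simply builds every arrangement one symbol at a time, never closing a pair that has not been opened.

```python
def balanced_strings(n):
    """Return every grammatical arrangement of n pairs of parentheses."""
    results = []

    def extend(prefix, opened, closed):
        # A complete arrangement uses all n left and all n right parentheses.
        if opened == n and closed == n:
            results.append(prefix)
            return
        # We may open a new pair as long as some pairs remain unused.
        if opened < n:
            extend(prefix + "(", opened + 1, closed)
        # We may close a pair only if one is currently open.
        if closed < opened:
            extend(prefix + ")", opened, closed + 1)

    extend("", 0, 0)
    return results


# Print n alongside the number of arrangements found for n = 0, ..., 7.
for n in range(8):
    print(n, len(balanced_strings(n)))
```

The counts printed for n = 0 through 7 should agree with the values of c0 through c7 given above. Listing the strings themselves quickly becomes impractical as n grows, which is one reason to look for a formula.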

The Catalan numbers are a fascinating sequence of numbers that arise in a surprising
variety of counting problems. In the remainder of this discussion, we’ll describe a pair
of formulas for the Catalan numbers: a recursive formula (one that expresses each
Catalan number in terms of the previous ones) and a direct one. We’ll also talk
about some of the ways they arise.

A recursion relation
The Catalan numbers, like the Fibonacci numbers, satisfy a recursion relation: that
is, there is a rule for finding each Catalan number if you know all the preceding ones. The
rule, though, is very different from that for the Fibonacci numbers.
To derive it, let’s go back to the list of the (14) ways that four pairs of parentheses
can appear, but let’s try to do it systematically. Of course, any such sequence has to start
with a left parenthesis. We ask: where in the sequence is its mate, that is, the right
parenthesis that closes the parenthetical clause it begins? For example, one possibility is
that its mate comes right after it: in other words, that the second symbol in the sequence
is a right parenthesis, immediately ending the clause begun by the first. What follows then
would be simply an arrangement of the remaining three pairs of parentheses, so that the
whole sequence would appear as

( ) {3 pairs}.

The number of such sequences is just the number of arrangements of three pairs of paren-
theses, that is, c3 = 5. Explicitly, these are:

() ()()()    () ()(())    () (())()    () (()())    and    () ((()))

Next, it could be that between the initial left parenthesis and its mate there is exactly one
pair of parentheses, and after its mate there are two pairs: schematically,

( {1 pair} ) {2 pairs}.

Now, there is no choice about the arrangement of the single pair between the initial open
parenthesis and its mate; but we do have c2 = 2 choices for the last two pairs. There are
thus altogether 2 sequences of this type: explicitly,

( () ) ()()    and    ( () ) (()).

The third possibility is that between the initial left parenthesis and its mate there are
exactly two pairs of parentheses, and after its mate there is one pair: schematically,

( {2 pairs} ) {1 pair}.

As before, there is no choice about the arrangement of the single pair following the initial
open parenthesis and its mate; but we do have c2 = 2 choices for the two pairs nested
inside them. There are thus altogether 2 sequences of this type: explicitly,

( ()() ) ()    and    ( (()) ) ().

Finally, the last possibility is that the mate of the initial open parenthesis is the final
right parenthesis: in other words, the entire sequence is part of one parenthetical clause,
or schematically,

( {3 pairs} ).

There are c3 = 5 such sequences, corresponding to the five possible arrangements of the
three pairs of parentheses in the middle: explicitly,

( ()()() )    ( ()(()) )    ( (())() )    ( (()()) )    and    ( ((())) ).

We add up all these possibilities, and we see that c4 = 5 + 2 + 2 + 5 = 14.
We can use the same approach to calculate any Catalan number cn assuming we know
all the ones before it. Basically, we do exactly what we’ve just done here for c4 : we break
up all possible sequences according to where the mate of the initial parenthesis appears—
that is, how many pairs appear between them. In other words, every sequence of n pairs
of parentheses has schematically the form

( {i pairs} ) {n − i − 1 pairs}

for some i (including i = 0 and i = n − 1 as possibilities). To specify a sequence of this
form we have to choose one of the $c_i$ ways of arranging the i pairs nested inside the initial
parenthesis and its mate, and to choose one of the $c_{n-i-1}$ arrangements of the parentheses
following. There are thus $c_i \cdot c_{n-i-1}$ sequences of this form, and so we arrive at the formula

$$c_n = c_0 c_{n-1} + c_1 c_{n-2} + c_2 c_{n-3} + \cdots + c_{n-2} c_1 + c_{n-1} c_0.$$

To express this in words: suppose you write out the Catalan numbers from $c_0$ to
$c_{n-1}$; for example, if n = 5, this would be the sequence

1 1 2 5 14

Now write the same sequence in reverse directly below this one:

1 1 2 5 14
14 5 2 1 1

Now multiply each pair of numbers lying directly over one another:

14 5 4 5 14

and add up these products:

14 + 5 + 4 + 5 + 14 = 42

The result will then be the next Catalan number!—in this case, c5 = 42. Which you have
to admit is a lot simpler than writing out all 42 ways.
Exercise. Check that this rule works for the first four Catalan numbers (starting with c1 ),
and use it to calculate the next Catalan number c6 .
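
The reverse, multiply and add rule described above translates almost word for word into a short program. This is only a sketch under our own naming (`catalan_by_recursion` is not from the original text); it starts from the convention c0 = 1 and repeatedly pairs the list of known values with its own reverse.

```python
def catalan_by_recursion(count):
    """Return the list [c0, c1, ...] of the first `count` Catalan numbers."""
    c = [1]  # by convention, c0 = 1
    while len(c) < count:
        # Write the known values above their reverse, multiply each
        # vertical pair, and add up the products to get the next value.
        c.append(sum(a * b for a, b in zip(c, reversed(c))))
    return c[:count]


print(catalan_by_recursion(8))  # [1, 1, 2, 5, 14, 42, 132, 429]
```

Each new value needs only the values already computed, so the work grows far more slowly than writing out every arrangement of parentheses.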

Another interpretation
Here is another interesting interpretation of the Catalan numbers. Remember that when
we first introduced binomial coefficients, we described the binomial coefficient $\binom{k+\ell}{k}$ as the
number of possible paths leading from the lower left to the upper right corner of a k × ℓ
grid, where at each junction the path went either up (north) or to the right (east). The
reason was that we could describe such a path by a sequence of letters consisting of k N’s and
ℓ E’s: a sequence such as
E N E E N E N

would correspond to the directions, “go right, then up, then right and right again, then
up, then right and finally up,” or in other words to this path:

Now imagine that we have n pairs of parentheses appearing in a sentence—say

(()()())

Suppose we replace each open parenthesis with an “N” and each close parenthesis with an
“E”. The resulting sequence

N N E N E N E E

corresponds to a set of directions taking you from one corner to the opposite corner of a
4 × 4 grid:

But what does it mean to say that these directions correspond to a “grammatical”
sequence of parentheses? That’s simple enough: it means you can’t close a pair of
parentheses before you open it. That is, at every point in the sequence you must have had
at least as many open (left) parentheses as close (right) parentheses. In terms of the path
corresponding to the sequence, this just means that at every stage you must have gone at
least as far north as east. In other words, the path may touch the diagonal but must never
drop below it. We can thus say that the nth Catalan number cn counts the number of paths
from one corner of an n × n triangle to the other. For example, the number of paths through the grid

[Figure: a triangular grid with its two far corners labeled “start” and “finish”]

would be the 6th Catalan number c6 .
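
The path interpretation can also be checked numerically. The sketch below is our own illustration (the names `to_directions` and `paths_above_diagonal` are hypothetical): the first function performs the parenthesis-to-direction translation, and the second counts paths junction by junction, using the fact that every path reaching a junction arrives either from the south or from the west.

```python
def to_directions(parens):
    """Translate '(' into N and ')' into E, as in the text."""
    return " ".join("N" if ch == "(" else "E" for ch in parens)


def paths_above_diagonal(n):
    """Count N/E paths across an n x n grid that never go further east than north."""
    # ways[north][east] = number of admissible paths reaching that junction.
    ways = [[0] * (n + 1) for _ in range(n + 1)]
    ways[0][0] = 1
    for north in range(n + 1):
        for east in range(north + 1):  # only junctions with east <= north
            if north > 0:
                ways[north][east] += ways[north - 1][east]  # last step was north
            if east > 0:
                ways[north][east] += ways[north][east - 1]  # last step was east
    return ways[n][n]


print(to_directions("(()()())"))  # N N E N E N E E
print(paths_above_diagonal(6))    # 132
```

Junctions below the diagonal are never filled in, so they keep the value 0 and contribute nothing to the count.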

An actual formula
Consider for a moment all possible sequences of n left parentheses and n right
parentheses, without regard to logic or grammar. We know that the number of such sequences is
just the binomial coefficient $\binom{2n}{n}$. We might phrase the question: among all ways in which
n left parentheses and n right parentheses might appear in a sequence, what fraction of
these ways are grammatically possible? This suggests an experiment: let’s compare the
Catalan numbers 1, 1, 2, 5, 14, 42, . . . with the corresponding binomial coefficients, and see
if we can see a pattern in their ratios. Here are the results:

 n     $c_n$     $\binom{2n}{n}$     ratio

 0       1          1       1:1
 1       1          2       1:2
 2       2          6       1:3
 3       5         20       1:4
 4      14         70       1:5
 5      42        252       1:6

Well, that’s clear enough: based on the evidence so far, it seems natural to think that

the nth Catalan number cn is given by the formula

$$c_n = \frac{1}{n+1}\binom{2n}{n}$$

and so it is. We’re not going to prove that here, but it’s a fun problem to think about:
why should this formula hold?
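
Although no proof is given here, the formula is easy to test for as many values of n as you like. The sketch below assumes Python 3.8 or later, where `math.comb` computes binomial coefficients; the function name `catalan_direct` is our own.

```python
from math import comb


def catalan_direct(n):
    """Evaluate the direct formula: (2n choose n) divided by n + 1."""
    # The division always comes out exactly, so integer division is safe.
    return comb(2 * n, n) // (n + 1)


# Reproduce the comparison table and extend it a little further.
for n in range(8):
    print(n, catalan_direct(n), comb(2 * n, n), f"1:{n + 1}")
```

Agreement for the first several values is evidence rather than proof, but it does confirm that the pattern in the table continues past n = 5.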
One way to express this formula is to say that, of the $\binom{2n}{n}$ paths leading from the
lower left to the upper right corner of an n × n grid, the fraction of those that never drop
below the diagonal is $\frac{1}{n+1}$. If you think in those terms, you’ll be able to solve the following problem:

Exercise. Let’s say 20 people show up to go see the $5 matinee at the local theater.
Suppose that 10 of these people have exact change, but the other 10 have only a $10 bill,
and will require change. Unfortunately, on this particular day the cashier has forgotten to
stop at the bank and so he has no change to start with. What are the odds that he will
be able to sell all 20 people tickets without running out of change?
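
If you would like to check your answer, small versions of the problem can be worked out by brute force. The sketch below is our own and rests on an assumption the exercise leaves implicit: that every order in which the people might line up is equally likely. The name `chance_of_success` is hypothetical, and n stands for the number of people of each kind.

```python
from fractions import Fraction
from itertools import combinations


def chance_of_success(n):
    """Fraction of line orders in which the cashier never runs out of change."""
    good = total = 0
    # Choose which of the 2n places in line hold the exact-change customers.
    for exact_places in combinations(range(2 * n), n):
        exact = set(exact_places)
        fives_in_till = 0
        ok = True
        for place in range(2 * n):
            if place in exact:
                fives_in_till += 1   # pays with a $5 bill, which stays in the till
            elif fives_in_till > 0:
                fives_in_till -= 1   # pays with a $10 bill and receives $5 back
            else:
                ok = False           # a $10 bill arrives and there is no change
                break
        total += 1
        good += ok
    return Fraction(good, total)


print(chance_of_success(2))  # 1/3
```

Try n = 1, 2 and 3 by hand first, then compare with the program; calling `chance_of_success(10)` gives a value to set against your answer to the exercise.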
