
Probability of Winning at Tennis I.

Theory and Data


By Paul K. Newton and Joseph B. Keller

The probabilities of winning a game, a set, and a match in tennis are computed,
based on each player's probability of winning a point on serve, with points
treated as independent identically distributed (iid) random variables. Both
two out of three and three out of five set matches are considered, allowing a
13-point tiebreaker in each set, if necessary. As a by-product of these formulas,
we give an explicit proof that the probability of winning a set, and hence a
match, is independent of which player serves first. Then, the probability of each
player winning a 128-player tournament is calculated. Data from the 2002 U.S.
Open and Wimbledon tournaments are used both to validate the theory and
to show how predictions can be made regarding the ultimate tournament
champion. We finish with a brief discussion of evidence for non-iid effects in
tennis, and indicate how one could extend the current theory to incorporate
such features.

Address for correspondence: Paul K. Newton, Department of Aerospace and Mechanical Engineering
and Department of Mathematics, University of Southern California, Los Angeles, CA 90089-1191;
e-mail: newton@spock.usc.edu

STUDIES IN APPLIED MATHEMATICS 114:241–269
© 2005 by the Massachusetts Institute of Technology
Published by Blackwell Publishing, 350 Main Street, Malden, MA 02148, USA, and 9600 Garsington
Road, Oxford, OX4 2DQ, UK.

1. Introduction
We wish to calculate the probability that one player, A, wins a tennis match
against another player B. It is not enough to know the rankings of A and B,
because there is no unambiguous way to translate rankings into probabilities
of winning [1, 2]. However, it does suffice to know the probability $p_A^R$ that A
wins a rally when A serves, and the probability $p_B^R$ that B wins a rally when
B serves. Such probabilities have been used to calculate the probability of
winning a game in other racquet sports, such as racquetball [3], squash [4],
and badminton [5]. Models of this type for tennis were first considered by Hsi
and Burych [6], followed by Carter and Crews [7], and Pollard [8]. All of
these analyses, including ours, treat points in tennis as independent identically
distributed (iid) random variables, hence $p_A^R$ and $p_B^R$ are taken as constant
throughout a match. A recent statistical analysis of 4 years of Wimbledon data
[9] shows that although points in tennis are not iid, for most purposes this
is not a bad assumption as the divergence from iid is small. Other aspects
of tennis that have been analyzed using probabilistic models include optimal
serving strategies [10], the efficiency of various scoring systems [11], and the
question of which is the most important point [12]. Statistical methods have
also been used to study the effects of new balls [13], service dominance [14],
and the probabilities of winning the final set of a match [15].
Our formulation unifies and extends some of the previous treatments by the
use of hierarchical recurrence relations whose solutions yield the probability
that A wins a game, a set, or a match against B in terms of $p_A^R$ and $p_B^R$. We then
calculate the probability that a player in a 128-player single-elimination
tournament reaches the second, third, . . . , or final round, and the probability
that a player who has reached the nth round will win the tournament. We also
provide an explicit proof, based on the solutions of our recurrence relations,
that the probability of winning a set or match does not depend on which player
serves first.
Of course the probability $p_A^R$ that A wins a rally on serve depends upon the
opponent B as well as upon A. If data are not available for A serving to B, then
data for A playing against players similar to B can be used. We illustrate
this point with data from the 2002 U.S. Open Men's and Women's Singles
Tournaments, and from the 2002 Wimbledon Men's and Women's Singles
Tournaments. The data, shown in Tables 1 and 2, and in Figure 1, agree well
with our theoretical calculation of $p_A^G$, the probability that A wins a game
when A serves. In a companion paper (part II), we will compare the theory
with Monte Carlo simulations.
A game in tennis is played with one player serving. The game is won by the
first player to score four or more points and to be at least two points ahead of
the other player. In a set, the players serve alternate games until a player wins at
least six games and is ahead by at least two games. If the game score reaches
6–6, a 13-point tiebreaker is used to determine who wins the set, with the player
who started serving the set serving the first point of the tiebreaker.¹ Then, the
players alternate serves, each serving two consecutive points, until someone
wins at least seven points, and is ahead by at least two points. The winner of
the tiebreaker wins the set with seven games to the opponent's six games. To
win a match, a player must win two out of three sets (women's format), or win
three out of five sets (men's format), with the two players serving alternate
games throughout the match. The initial server in the match is determined by a
coin toss, with the winner given the choice of serving first or receiving first.

¹ In the U.S. Open, a tiebreaker is used in every set, whereas in Wimbledon, in the French Open, and in
the Australian Open, tiebreakers are not used in the third set of a two out of three set match (women's
format), or the fifth set in a three out of five set match (men's format).

Table 1
Data for the Semifinalists in the 2002 U.S. Open Tournament

Player           A     B     C     D     E      F      G
Women
S. Williams      240   349   52    57    0.69   0.91   0.89
V. Williams      270   428   56    70    0.63   0.80   0.79
L. Davenport     206   301   45    53    0.68   0.85   0.88
A. Mauresmo      287   457   58    75    0.63   0.77   0.79
Men
P. Sampras       573   781   124   130   0.73   0.95   0.93
A. Agassi        443   676   96    110   0.66   0.87   0.85
L. Hewitt        436   654   91    107   0.67   0.85   0.86
S. Schalken      519   768   107   119   0.68   0.90   0.88

Column A: points won on serve; Column B: total points served; Column C:
games won on serve; Column D: total games served; Column E: empirical
probability $p_A^R$ of winning a rally on serve = A/B; Column F: empirical
probability $p_A^G$ of winning a game on serve = C/D; Column G: theoretical
probability $p_A^G$ of winning a game on serve, given by (5), with $p_A^R$ from
Column E.

Table 2
Data for the Semifinalists in the 2002 Wimbledon Tournament

Player           A     B     C     D     E      F      G
Women
S. Williams      276   390   57    64    0.71   0.89   0.91
V. Williams      273   352   51    62    0.67   0.82   0.86
J. Henin         252   427   48    66    0.59   0.73   0.71
A. Mauresmo      241   378   50    57    0.64   0.88   0.81
Men
L. Hewitt        450   646   96    107   0.70   0.90   0.90
D. Nalbandian    516   847   94    128   0.61   0.73   0.76
T. Henman        457   683   92    110   0.67   0.84   0.86
X. Malisse       483   721   101   114   0.67   0.89   0.86

Column A: points won on serve; Column B: total points served; Column C:
games won on serve; Column D: total games served; Column E: empirical
probability $p_A^R$ of winning a rally on serve = A/B; Column F: empirical
probability $p_A^G$ of winning a game on serve = C/D; Column G: theoretical
probability $p_A^G$ of winning a game on serve, given by (5), with $p_A^R$ from
Column E.

[Figure 1: plot of $p_A^G$ (vertical axis) against $p_A^R$ (horizontal axis), each from 0 to 1.
Legend: 2002 Wimbledon semifinalists, with first round opponents (women) and (men);
2002 US Open semifinalists, with first round opponents (women) and (men).]

Figure 1. The probability $p_A^G$ of A winning a game when A serves, i.e., of holding
serve, as a function of $p_A^R$ based on (5). The open circles correspond to data from eight
semifinalists in the 2002 U.S. Open Men's and Women's Singles Tournaments and the open
stars correspond to data from eight semifinalists in the 2002 Wimbledon Men's and Women's
Singles Tournaments. The four leftmost data points represent the combined data from the
semifinalists' first round opponents in each tournament.
2. Probability of winning a game
Player A can win a game against player B by a score of (4, 0), (4, 1) or (4,
2), or else the score can become (3, 3), called "deuce." Then, A can win by
getting two points ahead of B, with a score of (n + 5, n + 3) with $n \ge 0$. To
calculate the probability $p_A^G$ that A wins a game when A serves, we assume
that $p_A^R$ is the probability that A wins a rally when A serves, and set
$q_A^R = 1 - p_A^R$, $q_A^G = 1 - p_A^G$. We also denote by $p_A^G(i, j)$ the probability that the
score reaches i points for A and j points for B when A serves. Upon summing
the probabilities of the different ways in which A can win, we get

$$ p_A^G = \sum_{j=0}^{2} p_A^G(4, j) + p_A^G(3, 3) \sum_{n=0}^{\infty} p_A^{DG}(n + 2, n). \tag{1} $$

Here, $p_A^{DG}(n + 2, n)$ is the probability that A wins by scoring n + 2 while B
scores n after deuce has been reached, with A serving. It is given by

$$ p_A^{DG}(n + 2, n) = \sum_{j=0}^{n} \left(p_A^R q_A^R\right)^{j} \left(q_A^R p_A^R\right)^{n-j} \frac{n!}{j!\,(n-j)!} \left(p_A^R\right)^2
 = \left(p_A^R\right)^2 \left(p_A^R q_A^R\right)^{n} 2^{n}. \tag{2} $$

Upon using (2) in (1), and summing the geometric series, we get

$$ p_A^G = \sum_{j=0}^{2} p_A^G(4, j) + p_A^G(3, 3) \left(p_A^R\right)^2 \left[1 - 2 p_A^R q_A^R\right]^{-1}. \tag{3} $$
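The summation behind (3) can be written out as a short worked step. It uses only (1) and (2); the series is geometric with ratio $2 p_A^R q_A^R \le 1/2$, so it always converges:

```latex
\sum_{n=0}^{\infty} p_A^{DG}(n+2, n)
  = \left(p_A^R\right)^2 \sum_{n=0}^{\infty} \left(2\, p_A^R q_A^R\right)^{n}
  = \frac{\left(p_A^R\right)^2}{1 - 2\, p_A^R q_A^R}.
```

As a quick check, at $p_A^R = 1/2$ this gives $(1/4)/(1/2) = 1/2$, which is the value expected by symmetry once deuce has been reached.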

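Elementary combinatorial analysis yields

$$ p_A^G(4, 0) = \left(p_A^R\right)^4, \qquad
   p_A^G(4, 1) = 4 \left(p_A^R\right)^4 q_A^R, \qquad
   p_A^G(4, 2) = \frac{5 \cdot 4}{2} \left(p_A^R\right)^4 \left(q_A^R\right)^2, \qquad
   p_A^G(3, 3) = \frac{6!}{(3!)^2} \left(p_A^R q_A^R\right)^3. \tag{4} $$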

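Now using (4) in (3) gives the probability that A wins a game when A serves,
i.e., that A holds serve:

$$ p_A^G = \left(p_A^R\right)^4 \left[1 + 4 q_A^R + 10 \left(q_A^R\right)^2\right]
        + 20 \left(p_A^R q_A^R\right)^3 \left(p_A^R\right)^2 \left[1 - 2 p_A^R q_A^R\right]^{-1}. \tag{5} $$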

This equation agrees with that given in [7]. Figure 1 shows $p_A^G$ as a function of
$p_A^R$, based upon (5). The open circles in the figure are data for the semifinalists
in the 2002 U.S. Open Men's and Women's Singles Tournaments, shown in
Table 1, while the stars are data for the semifinalists in the 2002 Wimbledon
Men's and Women's Singles Tournaments shown in Table 2. The leftmost four
points are totals for their first round opponents in both tournaments. They all
lie close to the theoretical curve.
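Formula (5) is easy to evaluate numerically. The following is a minimal sketch, not the authors' code; the function name `game_prob` is chosen here for illustration. It evaluates (5) and compares the result with a few Column G entries of Table 1.

```python
def game_prob(p_rally: float) -> float:
    """Probability that the server holds serve, evaluated from Eq. (5).

    p_rally is the server's probability of winning a rally on serve.
    """
    p = p_rally
    q = 1.0 - p
    # Games won before deuce: scores (4,0), (4,1) and (4,2).
    before_deuce = p**4 * (1.0 + 4.0 * q + 10.0 * q**2)
    # Probability of reaching deuce (3,3), then of winning from deuce.
    reach_deuce = 20.0 * (p * q) ** 3
    win_from_deuce = p**2 / (1.0 - 2.0 * p * q)
    return before_deuce + reach_deuce * win_from_deuce


if __name__ == "__main__":
    # Column E values from Table 1 (S. Williams, V. Williams, P. Sampras).
    for p in (0.69, 0.63, 0.73):
        print(f"p_R = {p:.2f}  ->  p_G = {game_prob(p):.2f}")
```

Rounded to two decimals, these three inputs give 0.89, 0.79 and 0.93, matching Column G. The denominator $1 - 2pq$ is never smaller than 1/2, so no special handling is needed at the endpoints.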


3. Probability of winning a set


3.1. Recursion equations
Let $p_A^S$ denote the probability that player A wins a set against player B, with A
serving first, and $q_A^S = 1 - p_A^S$. To calculate $p_A^S$ in terms of $p_A^G$ and $p_B^G$, we
define $p_A^S(i, j)$ as the probability that in a set, the score becomes i games for
A, j games for B, with A serving initially. Then,

$$ p_A^S = \sum_{j=0}^{4} p_A^S(6, j) + p_A^S(7, 5) + p_A^S(6, 6)\, p_A^T. \tag{6} $$

Here, $p_A^T$ is the probability that A wins a 13-point tiebreaker with A serving
initially, and $q_A^T = 1 - p_A^T$.
To calculate $p_A^S(i, j)$, needed in (6), we use the following recursion formulas
and initial conditions:

For $0 \le i, j \le 6$:

if $i - 1 + j$ is even:
$$ p_A^S(i, j) = p_A^S(i - 1, j)\, p_A^G + p_A^S(i, j - 1)\, q_A^G \tag{7} $$
omit the $i - 1$ term if $j = 6$, $i \le 5$;
omit the $j - 1$ term if $i = 6$, $j \le 5$.

if $i - 1 + j$ is odd:
$$ p_A^S(i, j) = p_A^S(i - 1, j)\, q_B^G + p_A^S(i, j - 1)\, p_B^G \tag{8} $$
omit the $i - 1$ term if $j = 6$, $i \le 5$;
omit the $j - 1$ term if $i = 6$, $j \le 5$.

Initial conditions:
$$ p_A^S(0, 0) = 1; \qquad p_A^S(i, j) = 0 \quad \text{if } i < 0 \text{ or } j < 0. \tag{9} $$

In terms of $p_A^S(6, 5)$ and $p_A^S(5, 6)$, we have

$$ p_A^S(7, 5) = p_A^S(6, 5)\, q_B^G; \qquad p_A^S(5, 7) = p_A^S(5, 6)\, p_B^G. \tag{10} $$

The explicit solution of (7)–(10) is given in the Appendix.
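The recursion (7)–(10) can also be iterated directly instead of using the closed-form solution. The sketch below is one possible implementation, assuming the hold probabilities $p_A^G$, $p_B^G$ and the tiebreaker probability $p_A^T$ are supplied as plain numbers; the function and argument names are illustrative and do not come from the paper.

```python
def set_score_probs(hold_a: float, hold_b: float) -> dict:
    """Return p_A^S(i, j) for every reachable set score, with A serving first.

    hold_a, hold_b: probabilities that A and B win a game on their own serve.
    """
    p = {(0, 0): 1.0}                      # initial condition (9)
    for games in range(1, 13):             # total games played, i + j
        a_served = (games - 1) % 2 == 0    # i - 1 + j even: A served this game
        a_wins_game = hold_a if a_served else 1.0 - hold_b
        for i in range(max(0, games - 6), min(6, games) + 1):
            j = games - i
            value = 0.0
            # Term from (i-1, j); omitted if the set had already ended there.
            if i >= 1 and not (j == 6 and i <= 5):
                value += p.get((i - 1, j), 0.0) * a_wins_game
            # Term from (i, j-1); omitted if the set had already ended there.
            if j >= 1 and not (i == 6 and j <= 5):
                value += p.get((i, j - 1), 0.0) * (1.0 - a_wins_game)
            p[(i, j)] = value
    # From 6-5 or 5-6 the twelfth game is served by B, Eq. (10).
    p[(7, 5)] = p[(6, 5)] * (1.0 - hold_b)
    p[(5, 7)] = p[(5, 6)] * hold_b
    return p


def set_win_prob(hold_a: float, hold_b: float, tiebreak_a: float) -> float:
    """Probability that A wins the set when serving first, Eq. (6)."""
    p = set_score_probs(hold_a, hold_b)
    return sum(p[(6, j)] for j in range(5)) + p[(7, 5)] + p[(6, 6)] * tiebreak_a


if __name__ == "__main__":
    print(set_win_prob(0.89, 0.85, 0.5))
```

A useful sanity check is that the probabilities of all final scores, (6, j) and (i, 6) for scores up to 4, together with (7, 5), (5, 7) and (6, 6), add up to one.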


3.2. Probability of winning a tiebreaker
To calculate $p_A^T$ in terms of $p_A^R$ and $p_B^R$, we define $p_A^T(i, j)$ to be the probability
that the score becomes i for A, j for B in a tiebreaker with A serving initially.
Then,

$$ p_A^T = \sum_{j=0}^{5} p_A^T(7, j) + p_A^T(6, 6) \sum_{n=0}^{\infty} p_A^T(n + 2, n). \tag{11} $$

Because the sequence of serves in a tiebreaker is A, BB, AA, BB, etc., we have

$$ p_A^T(n + 2, n) = \sum_{j=0}^{n} \left(p_A^R p_B^R\right)^{j} \left(q_A^R q_B^R\right)^{n-j} \frac{n!}{j!\,(n-j)!}\, p_A^R q_B^R
 = \left(p_A^R p_B^R + q_A^R q_B^R\right)^{n} p_A^R q_B^R. \tag{12} $$

Using (12) in (11) and summing yields

$$ p_A^T = \sum_{j=0}^{5} p_A^T(7, j) + p_A^T(6, 6)\, p_A^R q_B^R \left[1 - p_A^R p_B^R - q_A^R q_B^R\right]^{-1}. \tag{13} $$

To calculate $p_A^T(i, j)$, we use the recursion formulas:

For $0 \le i, j \le 7$:

if $i - 1 + j = 0, 3, 4, \ldots, 4n - 1, 4n, \ldots$
$$ p_A^T(i, j) = p_A^T(i - 1, j)\, p_A^R + p_A^T(i, j - 1)\, q_A^R \tag{14} $$
omit the $j - 1$ term if $i = 7$, $j \le 6$;
omit the $i - 1$ term if $j = 7$, $i \le 6$.

if $i - 1 + j = 1, 2, 5, 6, \ldots, 4n + 1, 4n + 2, \ldots$
$$ p_A^T(i, j) = p_A^T(i - 1, j)\, q_B^R + p_A^T(i, j - 1)\, p_B^R \tag{15} $$
omit the $j - 1$ term if $i = 7$, $j \le 6$;
omit the $i - 1$ term if $j = 7$, $i \le 6$.

Initial conditions:
$$ p_A^T(0, 0) = 1; \qquad p_A^T(i, j) = 0 \quad \text{if } i < 0 \text{ or } j < 0. \tag{16} $$

The solution of (14)–(16) is given in the Appendix.
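The tiebreaker recursion (14)–(16), combined with the closed form (13) for play beyond 6–6, can be sketched in the same way. This is an illustrative implementation under the stated serving order A, BB, AA, BB, ...; the function name `tiebreak_prob` and the choice to raise an error when the tiebreaker cannot end are assumptions made here.

```python
def tiebreak_prob(p_a: float, p_b: float) -> float:
    """Probability that A wins a 13-point tiebreaker when A serves first.

    p_a, p_b: probabilities that A and B win a rally on their own serve.
    """
    q_a, q_b = 1.0 - p_a, 1.0 - p_b
    stay_level = p_a * p_b + q_a * q_b     # a pair of points is split
    if stay_level >= 1.0:
        raise ValueError("the tiebreaker cannot end for these values")
    t = {(0, 0): 1.0}                      # initial condition (16)
    for points in range(1, 13):            # total points played, i + j
        # i - 1 + j = 0, 3, 4, 7, 8, ... means A served this point, Eq. (14).
        a_served = (points - 1) % 4 in (0, 3)
        a_wins_point = p_a if a_served else q_b
        for i in range(max(0, points - 7), min(7, points) + 1):
            j = points - i
            value = 0.0
            # Term from (i-1, j); omitted if B had already won at j = 7.
            if i >= 1 and not (j == 7 and i <= 6):
                value += t.get((i - 1, j), 0.0) * a_wins_point
            # Term from (i, j-1); omitted if A had already won at i = 7.
            if j >= 1 and not (i == 7 and j <= 6):
                value += t.get((i, j - 1), 0.0) * (1.0 - a_wins_point)
            t[(i, j)] = value
    # Eq. (13): wins by 7-j, plus wins after 6-6 via the geometric series.
    after_six_all = p_a * q_b / (1.0 - stay_level)
    return sum(t[(7, j)] for j in range(6)) + t[(6, 6)] * after_six_all


if __name__ == "__main__":
    print(tiebreak_prob(0.70, 0.60))
```

The guard on `stay_level` corresponds to the two corner cases in which neither player can ever get two points ahead, namely both rally probabilities equal to 0 or both equal to 1.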


Next we calculate $p_A^T$ by using the solution of (14)–(16) in (13). Now we
can calculate $p_A^S$ by using the solution of (7)–(9), and (10), with the result for
$p_A^T$, in (6).
Figure 2 shows the probability of player A winning a set against player B
plotted as a function of $p_A^R \in [0, 1]$ for the full range of values of $p_B^R$ in increments
of 0.1. The data shown are compiled from the 2002 U.S. Open Men's Singles
event. Of the 117 completed matches played, there were 9 matches in which a
player (designated player B) had a value of $p_B^R = 0.50 \pm 0.01$, 33 matches in
which $p_B^R = 0.60 \pm 0.01$, and 20 matches with $p_B^R = 0.70 \pm 0.01$. Because each
match involves three, four, or five sets, it is necessary to combine data from
several matches to get meaningful statistics. Hence, each data point shown in
the figure represents a compilation of several matches grouped according to
corresponding values of $p_A^R$. Each of the three data points associated with the
curve marked $p_B^R = 0.50$ represents three matches, each of the seven data points
associated with the curve marked $p_B^R = 0.60$ represents approximately five
matches, while each of the three data points associated with the curve marked
$p_B^R = 0.70$ represents a compilation of approximately seven matches. Given
the relatively small number of sets underlying each of the data points, the
data fit the theoretical curves reasonably well. Figure 3 shows the probability
of player A winning a tiebreaker against player B plotted as a function of
$p_A^R \in [0, 1]$ for the full range of values of $p_B^R$ in increments of 0.1.

[Figure 2: plot of $p_A^S$ (vertical axis) against $p_A^R$ (horizontal axis), with one curve for each
of $p_B^R = 0.1, 0.2, \ldots, 0.9$ and data markers for $p_B^R = 0.50 \pm 0.01$, $0.60 \pm 0.01$, and $0.70 \pm 0.01$.]

Figure 2. The probability $p_A^S$ of player A winning a set plotted as a function of $p_A^R$ for
various values of $p_B^R$. Compiled data from the 2002 U.S. Open Men's Singles event are shown
for the values $p_B^R = 0.50 \pm 0.01$, $p_B^R = 0.60 \pm 0.01$, and $p_B^R = 0.70 \pm 0.01$.
[Figure 3: plot of $p_A^T$ (vertical axis) against $p_A^R$ (horizontal axis), with one curve for each
of $p_B^R = 0.1, 0.2, \ldots, 0.9$.]

Figure 3. The probability $p_A^T$ of player A winning a tiebreaker plotted as a function of $p_A^R$ for
various values of $p_B^R$.

3.3. Serving or receiving first
In this section, we prove that there is no theoretical advantage to serving first by
showing that the probability of player A winning the set when serving first, $p_A^S$, is
equal to his probability of winning the set when receiving first, $q_B^S$. For this, we
need formula (6) for $p_A^S$, along with the corresponding formula for $q_B^S$ given by

$$ q_B^S = \sum_{j=0}^{4} p_B^S(j, 6) + p_B^S(5, 7) + p_B^S(6, 6)\, q_B^T. \tag{17} $$

We obtain the terms $p_B^S(j, i)$ in (17) from $p_A^S(i, j)$ given in the Appendix, by
interchanging $p_A^G \leftrightarrow q_B^G$, $p_B^G \leftrightarrow q_A^G$. From (A.1) and (A.6) it is immediate that

$$ p_A^S(6, 0) = p_B^S(0, 6) \tag{18} $$

$$ p_A^S(7, 5) = p_B^S(5, 7). \tag{19} $$

It is also clear from (A.7) that

$$ p_A^S(6, 6) = p_B^S(6, 6). \tag{20} $$

Thus, it remains to show that

$$ \sum_{j=1}^{4} p_A^S(6, j) = \sum_{j=1}^{4} p_B^S(j, 6) \tag{21} $$

and that

$$ p_A^T = q_B^T. \tag{22} $$

To prove (21), we show that

$$ p_A^S(6, 1) + p_A^S(6, 2) = p_B^S(1, 6) + p_B^S(2, 6) \tag{23} $$

and

$$ p_A^S(6, 3) + p_A^S(6, 4) = p_B^S(3, 6) + p_B^S(4, 6). \tag{24} $$

By using formulas (A.2)–(A.5), and replacing $q_A^G = 1 - p_A^G$, $q_B^G = 1 - p_B^G$,
we can write

$$ p_A^S(6, 2n - 1) + p_A^S(6, 2n) = \sum_{i=0}^{6+2n} \sum_{j=0}^{6+2n} a_{ij}^S(n) \left(p_A^G\right)^{i} \left(p_B^G\right)^{j} \tag{25} $$

$$ p_B^S(2n - 1, 6) + p_B^S(2n, 6) = \sum_{i=0}^{6+2n} \sum_{j=0}^{6+2n} b_{ij}^S(n) \left(p_A^G\right)^{i} \left(p_B^G\right)^{j} \tag{26} $$

for n = 1, 2. Then, it can be shown that the coefficients of each are equal,
i.e., $a_{ij}^S(n) = b_{ij}^S(n)$. The values are listed in the Appendix. Figure 4 shows
the probability of obtaining each of the scores that are independent of which
player serves first for the case of evenly matched players.
To prove that $p_A^T = q_B^T$, we use the formula (11) for $p_A^T$ and the corresponding
one for $q_B^T$

$$ q_B^T = \sum_{j=0}^{5} p_B^T(j, 7) + p_B^T(6, 6) \sum_{n=0}^{\infty} p_B^T(n, n + 2). \tag{27} $$

We obtain the terms $p_B^T(j, i)$ in (27) from $p_A^T(i, j)$ given in the Appendix, by
interchanging $p_A^R \leftrightarrow q_B^R$, $p_B^R \leftrightarrow q_A^R$. From (A.14) it is clear that
$p_A^T(6, 6) = p_B^T(6, 6)$. Furthermore, from the symmetry under exchanging
$p_A^R \leftrightarrow q_B^R$, $p_B^R \leftrightarrow q_A^R$ in (12), we have that

$$ p_A^T(n + 2, n) = p_B^T(n, n + 2). \tag{28} $$

Thus, it remains to show that

$$ \sum_{j=0}^{5} p_A^T(7, j) = \sum_{j=0}^{5} p_B^T(j, 7). \tag{29} $$

To prove this, we show that

$$ p_A^T(7, 0) + p_A^T(7, 1) = p_B^T(0, 7) + p_B^T(1, 7), \tag{30} $$

$$ p_A^T(7, 2) + p_A^T(7, 3) = p_B^T(2, 7) + p_B^T(3, 7), \tag{31} $$

$$ p_A^T(7, 4) + p_A^T(7, 5) = p_B^T(4, 7) + p_B^T(5, 7). \tag{32} $$

By using formulas (A.8)–(A.13) and replacing $q_A^R = 1 - p_A^R$, $q_B^R = 1 - p_B^R$, we
can write

$$ p_A^T(7, 2n) + p_A^T(7, 2n + 1) = \sum_{i=0}^{4} \sum_{j=0}^{4} a_{ij}^T(n) \left(p_A^R\right)^{i} \left(p_B^R\right)^{j} \tag{33} $$

$$ p_B^T(2n, 7) + p_B^T(2n + 1, 7) = \sum_{i=0}^{4} \sum_{j=0}^{4} b_{ij}^T(n) \left(p_A^R\right)^{i} \left(p_B^R\right)^{j} \tag{34} $$

for n = 0, 1, 2. Then, it can be shown that the coefficients are equal, i.e.,
$a_{ij}^T(n) = b_{ij}^T(n)$. The values are listed in the Appendix. Figure 5 shows the
probability of obtaining each of the tiebreaker scores that are independent of
which player serves first, for equally matched players.

[Figure 4: five curves, labeled (a)–(e), plotted against $p_A^R$ on the horizontal axis.]

Figure 4. Set scores that are independent of which player serves first, plotted for two equal
players $p_A^R = p_B^R$. (a) $p_A^S(6, 0)$, (b) $p_A^S(6, 1) + p_A^S(6, 2)$, (c) $p_A^S(6, 3) + p_A^S(6, 4)$, (d) $p_A^S(7, 5)$,
and (e) $p_A^S(6, 6)$.

[Figure 5: four curves, labeled (a)–(d), plotted against $p_A^R$ on the horizontal axis.]

Figure 5. Tiebreaker scores that are independent of which player serves first, plotted for two
equal players $p_A^R = p_B^R$. (a) $p_A^T(7, 0) + p_A^T(7, 1)$, (b) $p_A^T(7, 2) + p_A^T(7, 3)$, (c) $p_A^T(7, 4) +
p_A^T(7, 5)$, and (d) $p_A^T(6, 6)$.

The question of whether to serve or receive first has received some attention
in the literature. In an interesting combinatorial analysis by Kingston [16]
(followed by a note [17]), a simplified scoring system (which he calls a "short
set") is considered in which player A serves the first game of a match consisting
of the best N of 2N − 1 games. His striking result is that it does not matter
whether the rules are such that the players alternate serves after each game, or
whether the winner of the previous game continues to serve the next game.
In either case, player A has the same probability of winning. At the end of
the article, he asks how many games need to be played to give two equal
players a reasonably equal chance of winning, whoever starts serving. As a
consequence of the central limit theorem, player A's (approximate) probability
of winning a short set is
$\frac{1}{2} + \frac{1}{2}\left(p_A^R - \frac{1}{2}\right)\left[\pi\, p_A^R \left(1 - p_A^R\right)(N - 1)\right]^{-1/2}$.
Figure 2 in his paper shows the slow convergence to $\frac{1}{2}$ as $N \to \infty$, giving player A a
distinct advantage, for finite N, if he serves first and $p_A^R > 0.5$. Thus, for best
N of 2N − 1 scoring, there is a theoretical advantage to serving first. For
tennis scoring, the paper of Pollard [8] considers both classical scoring (no
tiebreakers) and tiebreaker scoring, and implicit in his calculations (see, for
example, his Tables 2 and 3) is the fact that $p_A^S = q_B^S$, although the result is not
proven. There are other ways of proving and generalizing the result that do not
rely on the explicit solutions for $p_A^S$ and $q_B^S$ as our proof does. In fact, one can
prove that as long as the scoring system is such that the number of games
served by player A minus the number of games served by player B is −1, 0,
or 1, there is no advantage or disadvantage to serving first. Such scoring
systems are termed "service neutral" and are discussed in [18].
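The identity $p_A^S = q_B^S$ can also be checked numerically without the explicit Appendix solutions. The sketch below is not the proof given in this section; it is an independent check that uses a backward recursion over scores (a different technique from the forward recursions (7)–(10) and (14)–(16)), with all function names chosen here for illustration.

```python
from functools import lru_cache


def hold(p: float) -> float:
    """Probability of winning a game on serve, Eq. (5)."""
    q = 1.0 - p
    return p**4 * (1 + 4 * q + 10 * q**2) + 20 * (p * q) ** 3 * p**2 / (1 - 2 * p * q)


def tiebreak(p_first: float, p_second: float) -> float:
    """Probability that the player serving the first point wins the tiebreaker."""
    @lru_cache(maxsize=None)
    def win(i: int, j: int) -> float:
        if i == 7:
            return 1.0
        if j == 7:
            return 0.0
        if i == 6 and j == 6:
            # Closed form for play beyond 6-6, as in Eq. (13).
            split = p_first * p_second + (1 - p_first) * (1 - p_second)
            return p_first * (1 - p_second) / (1 - split)
        first_serves = (i + j) % 4 in (0, 3)    # serving order A, BB, AA, ...
        w = p_first if first_serves else 1 - p_second
        return w * win(i + 1, j) + (1 - w) * win(i, j + 1)
    return win(0, 0)


def set_prob(p_first: float, p_second: float) -> float:
    """Probability that the player serving the first game wins the set."""
    g_first, g_second = hold(p_first), hold(p_second)

    @lru_cache(maxsize=None)
    def win(i: int, j: int) -> float:
        if i >= 6 and i - j >= 2:
            return 1.0
        if j >= 6 and j - i >= 2:
            return 0.0
        if i == 6 and j == 6:
            return tiebreak(p_first, p_second)
        first_serves = (i + j) % 2 == 0
        w = g_first if first_serves else 1 - g_second
        return w * win(i + 1, j) + (1 - w) * win(i, j + 1)
    return win(0, 0)


if __name__ == "__main__":
    p_a, p_b = 0.65, 0.60
    serving_first = set_prob(p_a, p_b)           # p_A^S
    receiving_first = 1.0 - set_prob(p_b, p_a)   # q_B^S
    print(serving_first, receiving_first)
```

For interior values of the rally probabilities, the two printed numbers should agree to within floating-point rounding, which is what the result of this section predicts.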

4. Probability of winning a match


We now calculate $p_A^M$, the probability that player A wins a match against player
B, with player A serving initially, and $q_A^M = 1 - p_A^M$. To do so we define
$p_{AB}^M(i, j)$ to be the probability that in a match, the score becomes i sets for A
and j sets for B, with A serving initially and B serving finally. We define
$p_{AA}^M(i, j)$ similarly, but with A serving initially and finally.
To formulate recursion equations for $p_{AB}^M(i, j)$ and $p_{AA}^M(i, j)$, we introduce
$p_{AB}^S$, $p_{AA}^S$, $p_{BA}^S$, and $p_{BB}^S$. Here, $p_{XY}^S$ is the probability that X wins a set when X
serves the first game and Y serves the last game, where X and Y are A or B.
To get an expression for $p_{AA}^S$ we note that when A serves the first and last
games, the total number of games must be odd. Then, by restricting the right
side of (6) to odd numbers of games, we get

$$ p_{AA}^S = \sum_{j=1,3} p_A^S(6, j) + p_A^S(6, 6)\, p_A^T. \tag{35} $$

Similarly when A serves the first game and B serves the last game, the total
number of games is even. For even numbers of games, the right side of (6) yields

$$ p_{AB}^S = \sum_{j=0,2,4} p_A^S(6, j) + p_A^S(7, 5). \tag{36} $$

Then, (6) is written

$$ p_A^S = p_{AA}^S + p_{AB}^S. \tag{37} $$

We also define $q_{AA}^S$ and $q_{AB}^S$ as

$$ q_{AA}^S = \sum_{j=1,3} p_A^S(j, 6) + p_A^S(6, 6)\, q_A^T, \tag{38} $$

$$ q_{AB}^S = \sum_{j=0,2,4} p_A^S(j, 6) + p_A^S(5, 7). \tag{39} $$

Then,

$$ q_A^S = q_{AA}^S + q_{AB}^S. \tag{40} $$

Table 3
Probability $p_A^M$ of Player A Winning a Match of Three Sets out of Five

                                          $p_A^R$
$p_B^R$   0.0     0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9     1.0
0.0     -       1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000
0.1     0.0000  0.5000  0.9060  0.9956  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000
0.2     0.0000  0.0940  0.5000  0.9136  0.9981  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000
0.3     0.0000  0.0045  0.0864  0.5000  0.9380  0.9993  1.0000  1.0000  1.0000  1.0000  1.0000
0.4     0.0000  0.0000  0.0019  0.0621  0.5000  0.9513  0.9995  1.0000  1.0000  1.0000  1.0000
0.5     0.0000  0.0000  0.0000  0.0007  0.0487  0.5000  0.9513  0.9993  1.0000  1.0000  1.0000
0.6     0.0000  0.0000  0.0000  0.0000  0.0005  0.0487  0.5000  0.9380  0.9981  1.0000  1.0000
0.7     0.0000  0.0000  0.0000  0.0000  0.0000  0.0007  0.0621  0.5000  0.9136  0.9956  1.0000
0.8     0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0019  0.0864  0.5000  0.9060  1.0000
0.9     0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0045  0.0940  0.5000  1.0000
1.0     0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  -

Values of $p_A^R$ are along the top row and values of $p_B^R$ are down the left column.
A hyphen (-) indicates that the match cannot end for these values.

To get $p_{BA}^S$, $p_{BB}^S$, $q_{BA}^S$, and $q_{BB}^S$, we interchange A and B in (35)–(40). Note that
because $p_A^S + q_A^S = 1$ and $p_B^S + q_B^S = 1$, we have

$$ p_{AA}^S + q_{AA}^S + p_{AB}^S + q_{AB}^S = 1, \tag{41} $$

$$ p_{BB}^S + q_{BB}^S + p_{BA}^S + q_{BA}^S = 1. \tag{42} $$

Now we can write the recursion equations satisfied by $p_{AB}^M(i, j)$ and
$p_{AA}^M(i, j)$ as follows, for $i + j > 1$:

$$ p_{AB}^M(i, j) = p_{AB}^M(i - 1, j)\, p_{AB}^S + p_{AA}^M(i - 1, j)\, q_{BB}^S
                 + p_{AB}^M(i, j - 1)\, q_{AB}^S + p_{AA}^M(i, j - 1)\, p_{BB}^S, \tag{43} $$

$$ p_{AA}^M(i, j) = p_{AB}^M(i - 1, j)\, p_{AA}^S + p_{AA}^M(i - 1, j)\, q_{BA}^S
                 + p_{AB}^M(i, j - 1)\, q_{AA}^S + p_{AA}^M(i, j - 1)\, p_{BA}^S. \tag{44} $$

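The initial conditions are

$$ p_{AA}^M(0, 0) = 1; \qquad p_{AA}^M(i, j) = 0 \quad \text{if } i < 0 \text{ or } j < 0 \tag{45} $$

$$ p_{AB}^M(0, 0) = 1; \qquad p_{AB}^M(i, j) = 0 \quad \text{if } i < 0 \text{ or } j < 0 \tag{46} $$

$$ p_{AB}^M(1, 0) = p_{AB}^S; \qquad p_{AB}^M(0, 1) = q_{AB}^S; \qquad
   p_{AA}^M(1, 0) = p_{AA}^S; \qquad p_{AA}^M(0, 1) = q_{AA}^S. \tag{47} $$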
For the men's format of three sets out of five, (43)–(47) must be solved for
i, j = 0, 1, 2, 3. When j = 3, the i − 1 terms must be omitted; when i = 3,
the j − 1 terms must be omitted. The probability that player A wins a three
out of five set match when serving first is given by

$$ p_A^M = \sum_{j=0}^{2} \left[ p_{AA}^M(3, j) + p_{AB}^M(3, j) \right]. \tag{48} $$

For a match of two sets out of three, (43)–(47) must be solved for i,
j = 0, 1, 2. When j = 2, the i − 1 terms must be omitted; when i = 2, the
j − 1 terms must be omitted. Then, the probability that player A wins a two
out of three set match when serving first is

$$ p_A^M = \sum_{j=0}^{1} \left[ p_{AA}^M(2, j) + p_{AB}^M(2, j) \right]. \tag{49} $$

By using the solutions of (43) and (44) for $p_{AA}^M(2, j)$ and $p_{AB}^M(2, j)$ and
taking advantage of (37) and (40), we can write (49) as

$$ \begin{aligned}
p_A^M ={}& p_{AA}^S q_B^S + p_{AB}^S p_A^S + p_{AA}^S p_{BA}^S q_B^S + p_{AA}^S p_{BB}^S p_A^S + p_{AB}^S q_{AA}^S q_B^S \\
        &+ p_{AB}^S q_{AB}^S p_A^S + q_{AA}^S q_{BA}^S q_B^S + q_{AA}^S q_{BB}^S p_A^S + q_{AB}^S p_{AA}^S q_B^S + q_{AB}^S p_{AB}^S p_A^S.
\end{aligned} \tag{50} $$
Note that because the probability of winning a set is independent of which
player serves first, the above formula (50) reduces to

$$ p_A^M = \left(p_A^S\right)^2 + 2 \left(p_A^S\right)^2 p_B^S \tag{51} $$

for the two out of three set format, and

$$ p_A^M = \left(p_A^S\right)^3 + 3 \left(p_A^S\right)^3 p_B^S + 6 \left(p_A^S\right)^3 \left(p_B^S\right)^2 \tag{52} $$

for the three out of five set format.
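Formulas (51) and (52) take only the set probability as input, so they are simple to evaluate. The following sketch assumes, as in the text, that the set probability does not depend on who serves first, so that $p_B^S = 1 - p_A^S$; the function names are illustrative.

```python
def match_prob_best_of_three(set_prob_a: float) -> float:
    """Probability that A wins a two out of three set match, Eq. (51)."""
    s = set_prob_a
    t = 1.0 - s          # probability that B wins any given set
    return s**2 + 2.0 * s**2 * t


def match_prob_best_of_five(set_prob_a: float) -> float:
    """Probability that A wins a three out of five set match, Eq. (52)."""
    s = set_prob_a
    t = 1.0 - s
    return s**3 + 3.0 * s**3 * t + 6.0 * s**3 * t**2


if __name__ == "__main__":
    for s in (0.5, 0.6, 0.7):
        print(s, match_prob_best_of_three(s), match_prob_best_of_five(s))
```

The coefficients 2, 3, and 6 count the orders in which the losing player's sets can fall before the final set. For any set probability above one half, the longer format gives the stronger player a larger match probability.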


Table 3 shows $p_A^M$ for a match of three sets out of five based upon (48), and
Table 4 shows $p_A^M$ for a match of two sets out of three based upon (49). In both
cases the results are shown as functions of $p_A^R$ and $p_B^R$, ranging from 0 to 1 at
intervals of 0.1. Figure 6 shows the data from the 2002 U.S. Open Men's
Singles event as well as the theoretical curves for $p_A^G$, $p_A^S$, and $p_A^M$ (three out of
five set format) corresponding to the value $p_B^R = 0.60$. To obtain meaningful
statistics for the three data points associated with the $p_A^M$ curve, the 33 matches
were grouped in clusters of approximately 11 matches per cluster.
5. Probability of winning a tournament
5.1. The 128-player tournament
We now consider a single elimination tournament of $128 = 2^7$ players numbered
i = 1, . . . , 128. We assume that we know the probability $p_{ij}^M$ for player i to
defeat player j in a match. We introduce the column vector of probabilities
$p^{(n)} \in \mathbb{R}^{1 \times 128}$:

$$ p^{(n)} = \begin{pmatrix} p_1^{(n)} \\ p_2^{(n)} \\ p_3^{(n)} \\ \vdots \\ p_{128}^{(n)} \end{pmatrix}. \tag{53} $$

Here, $p_i^{(n)}$ is the conditional probability that player i wins a match in the nth
round, provided that he or she survives to that round of the tournament. From
(48) or (49), we know $p_{ij}^M$, the probability that player i beats player j, which we
write more simply as $P_{ij}$.

Table 4
Probability $p_A^M$ of Player A Winning a Match of Two Sets out of Three

                                          $p_A^R$
$p_B^R$   0.0     0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9     1.0
0.0     -       1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000
0.1     0.0000  0.5000  0.8539  0.9820  0.9995  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000
0.2     0.0000  0.1461  0.5000  0.8624  0.9898  0.9999  1.0000  1.0000  1.0000  1.0000  1.0000
0.3     0.0000  0.0180  0.1376  0.5000  0.8909  0.9947  1.0000  1.0000  1.0000  1.0000  1.0000
0.4     0.0000  0.0005  0.0103  0.1091  0.5000  0.9079  0.9961  1.0000  1.0000  1.0000  1.0000
0.5     0.0000  0.0000  0.0001  0.0053  0.0922  0.5000  0.9079  0.9947  0.9999  1.0000  1.0000
0.6     0.0000  0.0000  0.0000  0.0000  0.0039  0.0922  0.5000  0.8909  0.9898  0.9995  1.0000
0.7     0.0000  0.0000  0.0000  0.0000  0.0000  0.0053  0.1091  0.5000  0.8624  0.9820  1.0000
0.8     0.0000  0.0000  0.0000  0.0000  0.0000  0.0001  0.0103  0.1376  0.5000  0.8539  1.0000
0.9     0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0005  0.0180  0.1461  0.5000  1.0000
1.0     0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  -

Values of $p_A^R$ are along the top row and values of $p_B^R$ are down the left column.
A hyphen (-) indicates that the match cannot end for these values.

[Figure 6: "2002 US Open Men's Singles", $p_B^R = .60 \pm .01$. Curves $p^G$, $p^S$, and $p^M$ plotted
against $p_A^R$ (horizontal axis), with markers for game data, set data, and match data.]

Figure 6. Theoretical curves for $p_A^G$ (dotted), $p_A^S$ (dashed), and $p_A^M$ (solid) corresponding to
values $p_B^R = 0.60$. Compiled data from the 2002 U.S. Open Men's Singles event are shown for
all matches in which $p_B^R = 0.60 \pm 0.01$.

$p^{(n)}$ satisfies the recursion formula

$$ p^{(n)} = P_n\, p^{(n-1)}, \qquad p^{(0)} = \begin{pmatrix} 1 \\ 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}, \qquad (n = 1, \ldots, 6). \tag{54} $$

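Here, $P_n$ is a $128 \times 128$ matrix with block diagonal structure made up of $2^{7-n}$
blocks. We label them $P_n^{(k)}$, $1 \le k \le 2^{7-n}$, and then $P_n$ is given by

$$ P_n = \begin{pmatrix}
P_n^{(1)} & 0 & 0 & \cdots & 0 \\
0 & P_n^{(2)} & 0 & \cdots & 0 \\
\vdots & 0 & P_n^{(3)} & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & 0 \\
0 & \cdots & \cdots & 0 & P_n^{(2^{7-n})}
\end{pmatrix}. \tag{55} $$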

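$P_n^{(k)}$ is a $2^n \times 2^n$ off-diagonal block matrix:

$$ P_n^{(k)} = \begin{pmatrix} 0 & P_{\alpha,\beta}^{(n,k)} \\ P_{\beta,\alpha}^{(n,k)} & 0 \end{pmatrix}, \tag{56} $$

where $\alpha = (k - 1)2^n + 1$, $\beta = k 2^n$. The $P_{\alpha,\beta}^{(n,k)}$ are $2^{n-1} \times 2^{n-1}$ matrices of the
form

$$ P_{\alpha,\beta}^{(n,k)} = \begin{pmatrix}
P_{\alpha,\,\beta+1-2^{n-1}} & \cdots & P_{\alpha,\,\beta-1} & P_{\alpha,\,\beta} \\
P_{\alpha+1,\,\beta+1-2^{n-1}} & \cdots & P_{\alpha+1,\,\beta-1} & P_{\alpha+1,\,\beta} \\
\vdots & & \vdots & \vdots \\
P_{\alpha+2^{n-1}-1,\,\beta+1-2^{n-1}} & \cdots & P_{\alpha+2^{n-1}-1,\,\beta-1} & P_{\alpha+2^{n-1}-1,\,\beta}
\end{pmatrix}. \tag{57} $$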
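The entries of this matrix, $P_{ij}$, are obtained from (48) or (49).
As an example, for n = 1, (55) becomes

$$ P_1 = \begin{pmatrix}
P_1^{(1)} & 0 & 0 & \cdots & 0 \\
0 & P_1^{(2)} & 0 & \cdots & 0 \\
\vdots & 0 & P_1^{(3)} & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & 0 \\
0 & \cdots & \cdots & 0 & P_1^{(64)}
\end{pmatrix}. \tag{58} $$

$P_1^{(k)}$ is a $2 \times 2$ matrix:

$$ P_1^{(k)} = \begin{pmatrix} 0 & P_{2k-1,2k} \\ P_{2k,2k-1} & 0 \end{pmatrix}. \tag{59} $$

Explicitly (59) yields

$$ P_1^{(1)} = \begin{pmatrix} 0 & P_{12} \\ P_{21} & 0 \end{pmatrix}, \quad
   P_1^{(2)} = \begin{pmatrix} 0 & P_{34} \\ P_{43} & 0 \end{pmatrix}, \quad \ldots, \quad
   P_1^{(64)} = \begin{pmatrix} 0 & P_{127,128} \\ P_{128,127} & 0 \end{pmatrix}. \tag{60} $$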
The probability that player i ultimately becomes the tournament champion,
which we denote $p_i^{TC}$, is the product of the conditional probabilities of winning
each of the rounds. In vector form, this is given by

$$ p^{TC} = \begin{pmatrix} p_1^{TC} \\ p_2^{TC} \\ p_3^{TC} \\ \vdots \\ p_{128}^{TC} \end{pmatrix}
 = \begin{pmatrix} \prod_{n=1}^{7} p_1^{(n)} \\ \prod_{n=1}^{7} p_2^{(n)} \\ \prod_{n=1}^{7} p_3^{(n)} \\ \vdots \\ \prod_{n=1}^{7} p_{128}^{(n)} \end{pmatrix}. \tag{61} $$

The factors in the last column are obtained by solving (54). Note that the
components of the vector $p^{TC}$ must sum to unity.
5.2. Predicting the fate of the semifinalists
Suppose that after the quarterfinal round, we wish to predict the probability of
each of the four semifinalists becoming the tournament champion. We use
the preceding recursion method, introducing the vectors $p^{(0)}$, $p^{(1)}$, and $p^{(2)}$ of
probabilities of winning the quarterfinal, semifinal, and final round

$$ p^{(0)} = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}, \qquad
   p^{(n)} = \begin{pmatrix} p_1^{(n)} \\ p_2^{(n)} \\ p_3^{(n)} \\ p_4^{(n)} \end{pmatrix}, \qquad (n = 1, 2). \tag{62} $$
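The matrices $P_1$ and $P_2$ are given by

$$ P_1 = \begin{pmatrix}
0 & P_{12} & 0 & 0 \\
P_{21} & 0 & 0 & 0 \\
0 & 0 & 0 & P_{34} \\
0 & 0 & P_{43} & 0
\end{pmatrix}, \tag{63} $$

$$ P_2 = \begin{pmatrix}
0 & 0 & P_{13} & P_{14} \\
0 & 0 & P_{23} & P_{24} \\
P_{31} & P_{32} & 0 & 0 \\
P_{41} & P_{42} & 0 & 0
\end{pmatrix}. \tag{64} $$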

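The probability that player i wins a semifinal match is the ith component of

$$ p^{(1)} = P_1 p^{(0)} = \begin{pmatrix} P_{12} \\ P_{21} \\ P_{34} \\ P_{43} \end{pmatrix}. \tag{65} $$

The probability that player i wins the final match if he or she plays in it is the
ith component of

$$ p^{(2)} = P_2 p^{(1)} = \begin{pmatrix}
P_{13} P_{34} + P_{14} P_{43} \\
P_{23} P_{34} + P_{24} P_{43} \\
P_{31} P_{12} + P_{32} P_{21} \\
P_{41} P_{12} + P_{42} P_{21}
\end{pmatrix}. \tag{66} $$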

The vector of probabilities that each semifinalist wins the tournament is
obtained by using (65) and (66) in (61):

$$ p^{TC} = \begin{pmatrix}
P_{12} \left(P_{13} P_{34} + P_{14} P_{43}\right) \\
P_{21} \left(P_{23} P_{34} + P_{24} P_{43}\right) \\
P_{34} \left(P_{31} P_{12} + P_{32} P_{21}\right) \\
P_{43} \left(P_{41} P_{12} + P_{42} P_{21}\right)
\end{pmatrix}. \tag{67} $$
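The four-player calculation (65)–(67) can be sketched in a few lines. In the example below, the function name and the 0-based indexing are choices made here, and the pairwise probabilities are generated from made-up "strength" values through $P_{ij} = s_i/(s_i + s_j)$ purely for illustration. That rule is an assumption of this sketch, not the paper's method, which obtains $P_{ij}$ from (48) or (49).

```python
def champion_probs(P):
    """Champion probabilities for four semifinalists, following Eqs. (65)-(67).

    P[i][j] is the probability that player i beats player j (0-based indices).
    Players 0 and 1 meet in one semifinal, players 2 and 3 in the other.
    """
    # Eq. (65): probability of winning the semifinal.
    semi = [P[0][1], P[1][0], P[2][3], P[3][2]]
    # Eq. (66): probability of winning the final, given that it is reached.
    final = [
        P[0][2] * semi[2] + P[0][3] * semi[3],
        P[1][2] * semi[2] + P[1][3] * semi[3],
        P[2][0] * semi[0] + P[2][1] * semi[1],
        P[3][0] * semi[0] + P[3][1] * semi[1],
    ]
    # Eq. (67): product of the two rounds.
    return [s * f for s, f in zip(semi, final)]


def pairwise_from_strengths(strengths):
    """Hypothetical P_ij = s_i / (s_i + s_j); used only to build an example."""
    n = len(strengths)
    return [[0.0 if i == j else strengths[i] / (strengths[i] + strengths[j])
             for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    P = pairwise_from_strengths([3.0, 2.0, 1.0, 2.0])   # made-up values
    probs = champion_probs(P)
    print([round(x, 4) for x in probs], "sum =", round(sum(probs), 12))
```

Whenever $P_{ij} + P_{ji} = 1$ for every pair, the four components sum to one, in line with the remark following (61).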
6. 2002 U.S. Open and Wimbledon data
We now use the results of the 2002 U.S. Open and 2002 Wimbledon Singles
events to show how the previous method can be applied to predict the
tournament champion after the quarterfinal round (n = 5), based on the
accumulated data through this round. Let $\alpha_i(n)$ be the total number of points
won on serve by player i in round n, and let $\beta_i(n)$ be the total number of
points served by player i in round n. Then, the empirical probability of player
i winning a point on serve in round n is $\alpha_i(n)/\beta_i(n)$. The corresponding
probability of winning a rally on serve in rounds 1 through n is

$$ p_i^R(n) = \sum_{j=1}^{n} \alpha_i(j) \Big/ \sum_{j=1}^{n} \beta_i(j). \tag{68} $$
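The running estimate (68) is a ratio of cumulative sums, as the short sketch below illustrates. The function name is chosen here. The sample numbers are derived by differencing the cumulative totals quoted for S. Williams in Figure 7 (158/225 after round 5, 208/301 after round 6, 240/349 after round 7), so the first entry lumps rounds 1 through 5 together; that per-round split is an inference from those totals rather than data listed in the paper.

```python
from itertools import accumulate


def cumulative_rally_prob(points_won, points_served):
    """Running estimate of the rally probability on serve, Eq. (68).

    points_won[k] and points_served[k] are the serve points won and played in
    the k-th block of play; the result has one cumulative ratio per block.
    """
    return [won / served
            for won, served in zip(accumulate(points_won),
                                   accumulate(points_served))]


if __name__ == "__main__":
    # Rounds 1-5 combined, then round 6, then round 7 (derived by differencing).
    won = [158, 50, 32]
    served = [225, 76, 48]
    for label, value in zip(("n = 5", "n = 6", "n = 7"),
                            cumulative_rally_prob(won, served)):
        print(label, round(value, 4))
```

The three printed values reproduce the ratios 158/225, 208/301, and 240/349.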

We use this with n = 5 in (5) for each player in the semifinals and then
compute their empirical probabilities of winning a match against any of the
other remaining players. This allows us to compute the entries of the matrices
$P_1$ and $P_2$ in (63) and (64), and arrive at values for $p^{TC}$ in round n = 6 for each of
the four semifinalists. To calculate $p^{TC}$ for the two finalists after the semifinal
round matches, we repeat the same steps using (68) with
n = 6. The same method of calculating $p^{TC}$ could be applied after round n = 1,
and after each subsequent round as the tournament progresses, to make running
projections regarding tournament outcomes. Other forecasting methods that
allow point-by-point updates as the match unfolds are described in [19].
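The pooling in (68) can be sketched in a few lines of Python. The function name `pooled_serve_probability` is a hypothetical choice for this example. The cumulative totals are those reported for S. Williams in Figure 7; the per-round increments for rounds 6 and 7 are obtained here by differencing those cumulative totals, and rounds 1 through 5 are entered as one pooled block because the individual rounds are not listed.

```python
def pooled_serve_probability(won, served):
    """Eq. (68): total points won on serve divided by total points served.

    won[k] and served[k] are the counts for the k-th round (or block of rounds).
    """
    return sum(won) / sum(served)


# S. Williams, 2002 U.S. Open (cumulative totals from Figure 7):
# 158/225 through round 5, 208/301 through round 6, 240/349 through round 7.
won = [158, 208 - 158, 240 - 208]        # rounds 1-5 pooled, round 6, round 7
served = [225, 301 - 225, 349 - 301]

for n_blocks, label in [(1, "p^R(5)"), (2, "p^R(6)"), (3, "p^R(7)")]:
    value = pooled_serve_probability(won[:n_blocks], served[:n_blocks])
    print(f"{label} = {value:.4f}")
```

Because (68) is a ratio of sums rather than an average of per-round ratios, rounds in which a player served more points carry proportionally more weight.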
6.1. Women's Tennis Association (WTA) data
Figure 7 shows the 2002 U.S. Open Women's Singles Draw from the semifinal
round. For each player, we show the value of $p_i^R(5)$, $p_i^R(6)$, and $p_i^R(7)$.
Next to each player's name is her empirical probability of winning the
upcoming match, $P_{ij}$, as well as her empirical probability of becoming
the tournament champion, $p_i^{TC}$. After the quarterfinal round matches, L.
Davenport would have been the slight favorite to win the tournament ($p_2^{TC} = 0.3599$),
followed by V. Williams ($p_4^{TC} = 0.3047$), S. Williams ($p_1^{TC} = 0.2872$),
and A. Mauresmo ($p_3^{TC} = 0.0482$), while after the semifinal round
matches, S. Williams ($p_1^{TC} = 0.6527$) was the favorite and ultimately won the
tournament. Figure 8 shows the 2002 Wimbledon Women's Singles Draw from
the semifinal round. Here, V. Williams ($p_1^{TC} = 0.4784$) was the favorite to win
the tournament after the quarterfinal round matches, followed by S. Williams
($p_4^{TC} = 0.3834$), A. Mauresmo ($p_3^{TC} = 0.1233$), and J. Henin ($p_2^{TC} = 0.0150$),
while S. Williams ($p_4^{TC} = 0.5866$) was the favorite after the semifinal round
matches and ultimately won the tournament.

Figure 7. The probability $P_{ij}$ of each of the four semifinalists in the 2002 U.S. Open
Women's Singles tournament winning her match, and her probability $p_i^{TC}$ of becoming the
tournament champion.

Semifinals (data through the quarterfinal round):
  S. Williams: $P_{12} = .4582$, $p_1^{TC} = .2872$, $p_1^R(5) = 158/225 = .7022$
  L. Davenport: $P_{21} = .5418$, $p_2^{TC} = .3599$, $p_2^R(5) = 170/239 = .7113$
  A. Mauresmo: $P_{34} = .2559$, $p_3^{TC} = .0482$, $p_3^R(5) = 237/374 = .6337$
  V. Williams: $P_{43} = .7441$, $p_4^{TC} = .3047$, $p_4^R(5) = 176/256 = .6875$
Final (data through the semifinal round):
  S. Williams: $P_{14} = p_1^{TC} = .6527$, $p_1^R(6) = 208/301 = .6910$
  V. Williams: $P_{41} = p_4^{TC} = .3473$, $p_4^R(6) = 235/357 = .6583$
Champion:
  S. Williams: $p_1^{TC} = 1$, $p_1^R(7) = 240/349 = .6877$
6.2. Association of Tennis Professionals (ATP) data
Figure 9 shows the 2002 U.S. Open Men's Singles Draw. After the quarterfinal
round matches, P. Sampras was the heavy favorite to win the tournament
($p_1^{TC} = 0.6747$), followed by L. Hewitt ($p_4^{TC} = 0.1457$), A. Agassi
($p_3^{TC} = 0.0945$), and S. Schalken ($p_2^{TC} = 0.0851$). Sampras' chances of winning the
tournament increased after his semifinal round match ($p_1^{TC} = 0.8856$) and
he ultimately won the tournament. Figure 10 shows the results from the
2002 Wimbledon Men's Singles event. After their quarterfinal round matches,
X. Malisse ($p_3^{TC} = 0.4573$) was favored to win the tournament, followed by
L. Hewitt ($p_1^{TC} = 0.3364$), T. Henman ($p_2^{TC} = 0.1815$), and D. Nalbandian
($p_4^{TC} = 0.0247$). After the semifinal round matches, it was L. Hewitt, the
ultimate tournament champion, who was the heavy favorite ($p_1^{TC} = 0.8698$).


Figure 8. The probability $P_{ij}$ of each of the four semifinalists in the 2002 Wimbledon
Women's Singles tournament winning her match, and her probability $p_i^{TC}$ of becoming the
tournament champion.

Semifinals (data through the quarterfinal round):
  V. Williams: $P_{12} = .8896$, $p_1^{TC} = .4784$, $p_1^R(5) = 162/230 = .7043$
  J. Henin: $P_{21} = .1104$, $p_2^{TC} = .0150$, $p_2^R(5) = 220/364 = .6044$
  A. Mauresmo: $P_{34} = .3184$, $p_3^{TC} = .1233$, $p_3^R(5) = 212/317 = .6688$
  S. Williams: $P_{43} = .6816$, $p_4^{TC} = .3834$, $p_4^R(5) = 202/285 = .7088$
Final (data through the semifinal round):
  V. Williams: $P_{14} = p_1^{TC} = .4134$, $p_1^R(6) = 200/286 = .6993$
  S. Williams: $P_{41} = p_4^{TC} = .5866$, $p_4^R(6) = 232/323 = .7183$
Champion:
  S. Williams: $p_4^{TC} = 1$, $p_4^R(7) = 276/390 = .7077$
Figure 9. The probability $P_{ij}$ of each of the four semifinalists in the 2002 U.S. Open Men's
Singles tournament winning his match, and his probability $p_i^{TC}$ of becoming the tournament
champion.

Semifinals (data through the quarterfinal round):
  P. Sampras: $P_{12} = .8257$, $p_1^{TC} = .6747$, $p_1^R(5) = 392/524 = .7481$
  S. Schalken: $P_{21} = .1743$, $p_2^{TC} = .0851$, $p_2^R(5) = 447/655 = .6824$
  A. Agassi: $P_{34} = .4386$, $p_3^{TC} = .0945$, $p_3^R(5) = 285/420 = .6786$
  L. Hewitt: $P_{43} = .5614$, $p_4^{TC} = .1457$, $p_4^R(5) = 370/537 = .6890$
Final (data through the semifinal round):
  P. Sampras: $P_{14} = p_1^{TC} = .8856$, $p_1^R(6) = 469/629 = .7456$
  A. Agassi: $P_{41} = p_4^{TC} = .1144$, $p_4^R(6) = 365/551 = .6624$
Champion:
  P. Sampras: $p_1^{TC} = 1$, $p_1^R(7) = 573/781 = .7337$


Figure 10. The probability $P_{ij}$ of each of the four semifinalists in the 2002 Wimbledon Men's
Singles tournament winning his match, and his probability $p_i^{TC}$ of becoming the tournament
champion.

Semifinals (data through the quarterfinal round):
  L. Hewitt: $P_{12} = .6039$, $p_1^{TC} = .3364$, $p_1^R(5) = 336/477 = .7044$
  T. Henman: $P_{21} = .3961$, $p_2^{TC} = .1815$, $p_2^R(5) = 405/590 = .6864$
  X. Malisse: $P_{34} = .8540$, $p_3^{TC} = .4573$, $p_3^R(5) = 389/553 = .7034$
  D. Nalbandian: $P_{43} = .1460$, $p_4^{TC} = .0247$, $p_4^R(5) = 389/614 = .6336$
Final (data through the semifinal round):
  L. Hewitt: $P_{14} = p_1^{TC} = .8698$, $p_1^R(6) = 399/567 = .7037$
  D. Nalbandian: $P_{41} = p_4^{TC} = .1302$, $p_4^R(6) = 477/758 = .6293$
Champion:
  L. Hewitt: $p_1^{TC} = 1$, $p_1^R(7) = 450/646 = .6966$

7. Capturing non-iid effects


There are several papers documenting effects that cannot be captured with the
assumption that points are independent and identically distributed. For example,
Magnus and Klaassen [20] analyze 90,000 points played at Wimbledon, and
find evidence of a "first game effect," i.e., that the first game of a match is
the hardest one to break. This indicates that it may be desirable to allow $p_A^G$
and $p_B^G$ to vary from game to game and perhaps depend on the specific pair
of players who are competing. Jackson and Mosurski [21] give compelling
evidence that points may not be independent. This includes
what is commonly called the "hot-hand" phenomenon, in which winning a
previous point, game, or set increases one's chances of winning the next, and
the opposite of this, called the "back-to-the-wall" effect, in which playing
from behind can sometimes be a psychological advantage. From the analysis
of Klaassen and Magnus [9], one can assume that although these effects
may be small when analyzing large heterogeneous data sets, they may be
more important when analyzing specific head-to-head match-ups between two
players, as, for example, the famous McEnroe–Borg series of matches [21], in
which a back-to-the-wall phenomenon seems to be present.


A more refined analysis than the one described in this paper could incorporate
these and other higher-order effects by allowing $p_A^R$ and $p_B^R$ to vary from point
to point as the match unfolds, depending on the point's importance [12],
or by taking into consideration more detailed player characteristics such as
rallying ability or strength of return of serve. For example, we could define the
probability that player A wins a point on serve as
\[
\hat{p}_A^R = p_A^R + \epsilon\, p_{AB}^R(i, j), \qquad 0 \le \hat{p}_A^R \le 1, \tag{69}
\]
where $p_A^R$ is constant throughout the match, $p_{AB}^R(i, j)$ represents player A's
probability of winning a point on serve against player B when the score is i
points for A and j points for B, and $\epsilon \ll 1$ is a small parameter reflecting
the fact that, in most cases, the deviation from iid is small. The goal then
would be to calculate the corresponding formulas for game, set, and match for
each player, i.e., $\hat{p}_A^G$, $\hat{p}_A^S$, $\hat{p}_A^M$, and $\hat{p}_B^G$, $\hat{p}_B^S$, $\hat{p}_B^M$. The leading-order theory
($\epsilon = 0$) is the one described in this paper based on the iid assumption, while
higher-order corrections could be treated perturbatively.
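A minimal numerical sketch of this idea, restricted to a single game, is given below in Python. It is not part of the theory developed in this paper: the function `game_win_probability`, the score-dependent term `delta(i, j)` standing in for $p_{AB}^R(i, j)$, and the choice to fold all scores beyond deuce onto the three states (3, 3), (4, 3), and (3, 4) are assumptions made for illustration. With eps = 0 the recursion reduces to the standard iid game probability.

```python
from functools import lru_cache


def game_win_probability(p, eps=0.0, delta=lambda i, j: 0.0):
    """Probability that the server wins a game when the point probability
    at score (i, j) is p + eps * delta(i, j), in the spirit of (69).

    i and j count points won by the server and the receiver. Scores past
    deuce are folded onto (3, 3), (4, 3), and (3, 4) (an assumption).
    """
    def point(i, j):
        value = p + eps * delta(i, j)
        if not 0.0 <= value <= 1.0:
            raise ValueError("perturbed point probability left [0, 1]")
        return value

    # Deuce cycle: D = pd * (pa + (1 - pa) * D) + (1 - pd) * pb * D, solved for D.
    pd, pa, pb = point(3, 3), point(4, 3), point(3, 4)
    deuce = pd * pa / (1.0 - pd * (1.0 - pa) - (1.0 - pd) * pb)

    @lru_cache(maxsize=None)
    def win(i, j):
        if i == 3 and j == 3:
            return deuce
        if i == 4:
            return 1.0
        if j == 4:
            return 0.0
        q = point(i, j)
        return q * win(i + 1, j) + (1.0 - q) * win(i, j + 1)

    return win(0, 0)


p = 0.6
q = 1.0 - p
# Standard iid closed form for comparison.
closed_form = p**4 * (1 + 4*q + 10*q**2) + 20 * p**3 * q**3 * p**2 / (1 - 2*p*q)
print(game_win_probability(p), closed_form)

# Hypothetical "back-to-the-wall" perturbation: the server plays better when behind.
behind = lambda i, j: 1.0 if j > i else 0.0
print(game_win_probability(p, eps=0.02, delta=behind))
```

Comparing the perturbed value with the eps = 0 value for a few choices of `delta` gives a rough sense of how large a score-dependent effect must be before it changes the game probability noticeably.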

Acknowledgments
This work is supported by the National Science Foundation grants NSF-DMS
9800797 and NSF-DMS 0203581. Useful comments and observations by
J. D'Angelo and G. H. Pollard on an early draft of the manuscript are gratefully
acknowledged. The first author also thanks Andres Figueroa for skillfully
performing Matlab calculations on the models developed in this manuscript as
part of a summer undergraduate research project.

Appendix
The solution of (7)–(10) is
\begin{align}
p_A^S(6, 0) &= (p_A^G)^3 (q_B^G)^3, \tag{A.1}\\
p_A^S(6, 1) &= 3 (p_A^G)^3 q_A^G (q_B^G)^3 + 3 (p_A^G)^4 p_B^G (q_B^G)^2, \tag{A.2}\\
p_A^S(6, 2) &= 12 (p_A^G)^3 q_A^G p_B^G (q_B^G)^3 + 6 (p_A^G)^2 (q_A^G)^2 (q_B^G)^4 \notag\\
&\quad + 3 (p_A^G)^4 (p_B^G)^2 (q_B^G)^2, \tag{A.3}\\
p_A^S(6, 3) &= 24 (p_A^G)^3 (q_A^G)^2 p_B^G (q_B^G)^3 + 24 (p_A^G)^4 q_A^G (p_B^G)^2 (q_B^G)^2 \notag\\
&\quad + 4 (p_A^G)^2 (q_A^G)^3 (q_B^G)^4 + 4 (p_A^G)^5 (p_B^G)^3 q_B^G, \tag{A.4}\\
p_A^S(6, 4) &= 60 (p_A^G)^3 (q_A^G)^2 (p_B^G)^2 (q_B^G)^3 + 40 (p_A^G)^2 (q_A^G)^3 p_B^G (q_B^G)^4 \notag\\
&\quad + 20 (p_A^G)^4 q_A^G (p_B^G)^3 (q_B^G)^2 + 5 p_A^G (q_A^G)^4 (q_B^G)^5 \notag\\
&\quad + (p_A^G)^5 (p_B^G)^4 q_B^G, \tag{A.5}\\
p_A^S(7, 5) &= 100 (p_A^G)^3 (q_A^G)^3 (p_B^G)^2 (q_B^G)^4 + 100 (p_A^G)^4 (q_A^G)^2 (p_B^G)^3 (q_B^G)^3 \notag\\
&\quad + 25 (p_A^G)^2 (q_A^G)^4 p_B^G (q_B^G)^5 + 25 (p_A^G)^5 q_A^G (p_B^G)^4 (q_B^G)^2 \notag\\
&\quad + p_A^G (q_A^G)^5 (q_B^G)^6 + (p_A^G)^6 (p_B^G)^5 q_B^G. \tag{A.6}
\end{align}
To obtain $p_A^S(i, j)$ from $p_A^S(j, i)$, we interchange $p_A^G \leftrightarrow q_A^G$ and $p_B^G \leftrightarrow q_B^G$ in
(A.1)–(A.6). Finally, $p_A^S(6, 6)$ in (6) is given by
\[
p_A^S(6, 6) = 1 - \left[ \sum_{i=0}^{4} \left( p_A^S(i, 6) + p_A^S(6, i) \right) + p_A^S(7, 5) + p_A^S(5, 7) \right]. \tag{A.7}
\]
The solution of (14)–(16) yields:
\begin{align}
p_A^T(7, 0) &= (p_A^R)^3 (q_B^R)^4, \tag{A.8}\\
p_A^T(7, 1) &= 3 (p_A^R)^3 q_A^R (q_B^R)^4 + 4 (p_A^R)^4 p_B^R (q_B^R)^3, \tag{A.9}\\
p_A^T(7, 2) &= 16 (p_A^R)^4 q_A^R p_B^R (q_B^R)^3 + 6 (p_A^R)^5 (p_B^R)^2 (q_B^R)^2 \notag\\
&\quad + 6 (p_A^R)^3 (q_A^R)^2 (q_B^R)^4, \tag{A.10}\\
p_A^T(7, 3) &= 40 (p_A^R)^3 (q_A^R)^2 p_B^R (q_B^R)^4 + 10 (p_A^R)^2 (q_A^R)^3 (q_B^R)^5 \notag\\
&\quad + 4 (p_A^R)^5 (p_B^R)^3 (q_B^R)^2 + 30 (p_A^R)^4 q_A^R (p_B^R)^2 (q_B^R)^3, \tag{A.11}\\
p_A^T(7, 4) &= 50 (p_A^R)^4 q_A^R (p_B^R)^3 (q_B^R)^3 + 5 (p_A^R)^5 (p_B^R)^4 (q_B^R)^2 \notag\\
&\quad + 50 (p_A^R)^2 (q_A^R)^3 p_B^R (q_B^R)^5 + 5 p_A^R (q_A^R)^4 (q_B^R)^6 \notag\\
&\quad + 100 (p_A^R)^3 (q_A^R)^2 (p_B^R)^2 (q_B^R)^4, \tag{A.12}\\
p_A^T(7, 5) &= 30 (p_A^R)^2 (q_A^R)^4 p_B^R (q_B^R)^5 + p_A^R (q_A^R)^5 (q_B^R)^6 \notag\\
&\quad + 200 (p_A^R)^4 (q_A^R)^2 (p_B^R)^3 (q_B^R)^3 + 75 (p_A^R)^5 q_A^R (p_B^R)^4 (q_B^R)^2 \notag\\
&\quad + 150 (p_A^R)^3 (q_A^R)^3 (p_B^R)^2 (q_B^R)^4 + 6 (p_A^R)^6 (p_B^R)^5 q_B^R. \tag{A.13}
\end{align}

To obtain $p_A^T(j, i)$ from $p_A^T(i, j)$, we interchange $p_A^R \leftrightarrow q_A^R$ and $p_B^R \leftrightarrow q_B^R$ in
(A.9)–(A.13). Finally, $p_A^T(6, 6)$ in (13) is given by
\[
p_A^T(6, 6) = 1 - \sum_{i=0}^{5} \left[ p_A^T(i, 7) + p_A^T(7, i) \right]. \tag{A.14}
\]
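The integer coefficients in the tiebreaker formulas are products of binomial coefficients, which the following Python sketch makes explicit. It assumes the usual 13-point tiebreaker serving order (A serves the first point, then the players alternate in blocks of two points); the function names `a_serves` and `tiebreak_score_probability` and the test values 0.7 and 0.65 are hypothetical, and the counting argument is a reconstruction for illustration rather than the derivation used for (14)–(16).

```python
from math import comb


def a_serves(k):
    """True if A serves point k (1-based) of a tiebreaker in which A serves first."""
    return k % 4 in (0, 1)               # A serves points 1, 4, 5, 8, 9, 12, ...


def tiebreak_score_probability(j, pA, pB):
    """Probability that A wins the tiebreaker 7 points to j, for 0 <= j <= 5.

    pA (pB) is the probability that A (B) wins a point on his or her own serve.
    """
    qA, qB = 1.0 - pA, 1.0 - pB
    n = 7 + j                            # total number of points played
    last = pA if a_serves(n) else qB     # A must win the final point
    nA = sum(a_serves(k) for k in range(1, n))   # earlier points served by A
    nB = (n - 1) - nA                            # earlier points served by B
    total = 0.0
    for k in range(j + 1):               # k points lost by A on A's serve
        m = j - k                        # m points lost by A on B's serve
        if k > nA or m > nB:
            continue
        total += (comb(nA, k) * comb(nB, m)
                  * pA**(nA - k) * qA**k * pB**m * qB**(nB - m))
    return last * total


pA, pB = 0.7, 0.65                       # hypothetical point probabilities
qA, qB = 1.0 - pA, 1.0 - pB
a10 = (16 * pA**4 * qA * pB * qB**3 + 6 * pA**5 * pB**2 * qB**2
       + 6 * pA**3 * qA**2 * qB**4)      # right side of (A.10)
print(tiebreak_score_probability(2, pA, pB), a10)
```

For the score (7, 2), for example, A and B each serve four of the first eight points, so the three terms carry the weights $\binom{4}{1}\binom{4}{1} = 16$, $\binom{4}{0}\binom{4}{2} = 6$, and $\binom{4}{2}\binom{4}{0} = 6$.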


The nonzero coefficients $a_{ij}^S(n) = b_{ij}^S(n)$ are given by
\[
\begin{array}{lllll}
a_{20}^S(1) = 6, & a_{21}^S(1) = 24, & a_{22}^S(1) = 36, & a_{23}^S(1) = 24, & a_{24}^S(1) = 6, \\
a_{30}^S(1) = 9, & a_{31}^S(1) = 51, & a_{32}^S(1) = 99, & a_{33}^S(1) = 81, & a_{34}^S(1) = 24, \\
a_{40}^S(1) = 3, & a_{41}^S(1) = 24, & a_{42}^S(1) = 60, & a_{43}^S(1) = 60, & a_{44}^S(1) = 21.
\end{array} \tag{A.15}
\]
\[
\begin{array}{llllll}
a_{10}^S(2) = 5, & a_{11}^S(2) = 25, & a_{12}^S(2) = 50, & a_{13}^S(2) = 50, & a_{14}^S(2) = 25, & a_{15}^S(2) = 5, \\
a_{20}^S(2) = 16, & a_{21}^S(2) = 124, & a_{22}^S(2) = 336, & a_{23}^S(2) = 424, & a_{24}^S(2) = 256, & a_{25}^S(2) = 60, \\
a_{30}^S(2) = 18, & a_{31}^S(2) = 198, & a_{32}^S(2) = 696, & a_{33}^S(2) = 1080, & a_{34}^S(2) = 774, & a_{35}^S(2) = 210, \\
a_{40}^S(2) = 8, & a_{41}^S(2) = 124, & a_{42}^S(2) = 560, & a_{43}^S(2) = 1060, & a_{44}^S(2) = 896, & a_{45}^S(2) = 280, \\
a_{50}^S(2) = 1, & a_{51}^S(2) = 25, & a_{52}^S(2) = 150, & a_{53}^S(2) = 350, & a_{54}^S(2) = 350, & a_{55}^S(2) = 126.
\end{array} \tag{A.16}
\]

The nonzero coefficients $a_{ij}^T(n) = b_{ij}^T(n)$ are given by
\[
\begin{array}{lllll}
a_{30}^T(1) = 4, & a_{31}^T(1) = 16, & a_{32}^T(1) = 24, & a_{33}^T(1) = 16, & a_{34}^T(1) = 4, \\
a_{40}^T(1) = 3, & a_{41}^T(1) = 16, & a_{42}^T(1) = 30, & a_{43}^T(1) = 24, & a_{44}^T(1) = 7.
\end{array} \tag{A.17}
\]
\[
\begin{array}{llllll}
a_{20}^T(2) = 10, & a_{21}^T(2) = 50, & a_{22}^T(2) = 100, & a_{23}^T(2) = 100, & a_{24}^T(2) = 50, & a_{25}^T(2) = 10, \\
a_{30}^T(2) = 24, & a_{31}^T(2) = 166, & a_{32}^T(2) = 424, & a_{33}^T(2) = 516, & a_{34}^T(2) = 304, & a_{35}^T(2) = 70, \\
a_{40}^T(2) = 18, & a_{41}^T(2) = 166, & a_{42}^T(2) = 530, & a_{43}^T(2) = 774, & a_{44}^T(2) = 532, & a_{45}^T(2) = 140, \\
a_{50}^T(2) = 4, & a_{51}^T(2) = 50, & a_{52}^T(2) = 200, & a_{53}^T(2) = 350, & a_{54}^T(2) = 280, & a_{55}^T(2) = 84.
\end{array} \tag{A.18}
\]
\[
\begin{array}{lllllll}
a_{10}^T(3) = 6, & a_{11}^T(3) = 36, & a_{12}^T(3) = 90, & a_{13}^T(3) = 120, & a_{14}^T(3) = 90, & a_{15}^T(3) = 36, & a_{16}^T(3) = 6, \\
a_{20}^T(3) = 25, & a_{21}^T(3) = 230, & a_{22}^T(3) = 775, & a_{23}^T(3) = 1300, & a_{24}^T(3) = 1175, & a_{25}^T(3) = 550, & a_{26}^T(3) = 105, \\
a_{30}^T(3) = 40, & a_{31}^T(3) = 510, & a_{32}^T(3) = 2200, & a_{33}^T(3) = 4500, & a_{34}^T(3) = 4800, & a_{35}^T(3) = 2590, & a_{36}^T(3) = 560, \\
a_{40}^T(3) = 30, & a_{41}^T(3) = 510, & a_{42}^T(3) = 2750, & a_{43}^T(3) = 6750, & a_{44}^T(3) = 8400, & a_{45}^T(3) = 5180, & a_{46}^T(3) = 1260, \\
a_{50}^T(3) = 10, & a_{51}^T(3) = 230, & a_{52}^T(3) = 1550, & a_{53}^T(3) = 4550, & a_{54}^T(3) = 6580, & a_{55}^T(3) = 5620, & a_{56}^T(3) = 1260, \\
a_{60}^T(3) = 1, & a_{61}^T(3) = 36, & a_{62}^T(3) = 315, & a_{63}^T(3) = 1120, & a_{64}^T(3) = 1890, & a_{65}^T(3) = 1512, & a_{66}^T(3) = 462.
\end{array} \tag{A.19}
\]
References
1. S. R. CLARKE and D. S. DYTE, Using official tennis ratings to estimate tournament chances, preprint, 2002.
2. R. T. STEFANI, Survey of the major world sports rating systems, J. Appl. Stat. 24(6):635–646 (1997).
3. J. B. KELLER, Probability of a shutout in racquetball, SIAM Rev. 26:267–268 (1984).
4. J. RENICK, Optimal strategies at decision points in singles squash, Res. Quart. Exercise Sport 47:562–568 (1976).
5. J. B. KELLER, Tie point strategies in badminton, preprint, 2003.
6. B. P. HSI and D. M. BURYCH, Games of two players, Appl. Stat.: J. R. Stat. Soc. C 22(1):86–92 (1971).
7. W. H. CARTER and S. L. CREWS, An analysis of the game of tennis, Am. Stat. 28(4):130–134 (1974).
8. G. H. POLLARD, An analysis of classical and tie-breaker tennis, Austr. J. Stat. 25:496–505 (1983).
9. F. J. G. M. KLAASSEN and J. R. MAGNUS, Are points in tennis independent and identically distributed? Evidence from a dynamic binary panel data model, J. Am. Stat. Assoc. 96(454):500–509 (2001).
10. S. L. GEORGE, Optimal strategy in tennis: A simple probabilistic model, Appl. Stat.: J. R. Stat. Soc. C 22(1):97–104 (1973).
11. R. E. MILES, Symmetric sequential analysis: The efficiencies of sports scoring systems (with particular reference to those of tennis), J. R. Stat. Soc. B 46(1):93–108 (1984).
12. C. MORRIS, The most important points in tennis, in Optimal Strategies in Sport (S. P. Ladany and R. E. Machol, Eds.), pp. 131–140, Amsterdam: North-Holland, 1977.
13. J. R. MAGNUS and F. J. G. M. KLAASSEN, The effect of new balls in tennis: Four years at Wimbledon, The Statistician 48:239–246 (1999).
14. F. J. G. M. KLAASSEN and J. R. MAGNUS, How to reduce the service dominance in tennis? Empirical results from four years at Wimbledon, preprint, 2003.
15. J. R. MAGNUS and F. J. G. M. KLAASSEN, The final set in a tennis match: Four years at Wimbledon, J. Appl. Stat. 26(4):461–468 (1999).
16. J. G. KINGSTON, Comparison of scoring systems in two-sided competitions, J. Comb. Theory A 20:357–362 (1976).
17. C. L. ANDERSON, Note on the advantage of first serve, J. Comb. Theory A 23:363 (1977).
18. P. K. NEWTON and G. H. POLLARD, Service neutral scoring strategies for tennis, in Proceedings of the Seventh Australasian Conference on Mathematics and Computers in Sport, 2004.
19. F. J. G. M. KLAASSEN and J. R. MAGNUS, Forecasting in tennis, preprint, 2003.
20. J. R. MAGNUS and F. J. G. M. KLAASSEN, On the advantage of serving first in a tennis set: Four years at Wimbledon, The Statistician 48:247–256 (1999).
21. D. JACKSON and K. MOSURSKI, Heavy defeats in tennis: Psychological momentum or random effects, Chance 10:27–34 (1997).
UNIVERSITY OF SOUTHERN CALIFORNIA
STANFORD UNIVERSITY
(Received July 21, 2004)
