Please cite this article in press as: Cyran, K.A., Kimmel, M., Alternatives to the Wright–Fisher model: The robustness of mitochondrial Eve dating. Theoretical Population
Biology (2010), doi:10.1016/j.tpb.2010.06.001
ARTICLE IN PRESS
2 K.A. Cyran, M. Kimmel / Theoretical Population Biology ( ) –
time to the MRCA, is a powerful tool based on the reverse-time analysis of the Wright–Fisher model of genetic drift. For large populations, coalescent models are equivalent to diffusion process approximations, which depend only on the mean and on the variance of the offspring number distribution. Consequently, coalescent models are robust for large populations, but during population bottlenecks the diffusion approximation may fail.

How relevant for the model predictions are departures from a panmictic population (in the case of the Wright–Fisher model) and from large population size (the case of the coalescent method)? To answer this question we compare three different models for calculating the distribution of the time to coalescence of a pair of chromosomes. These include

• the Wright–Fisher model with discrete generations,
• the coalescent-based model with continuous time scaled by the size of population variable in time, and
• the less known O’Connell model based on branching processes (see O’Connell, 1995).

All three models are applied to stochastic population growth approximated by a slightly supercritical Galton–Watson branching process. To be able to compare these methodologies reliably, we designed a computational framework to estimate the two-chromosome coalescence distribution in any of these models as well as in a model based on a full record of the population history. After simulating several thousand genealogies we estimated parameters with great accuracy.

There might be some doubt whether the use of the time to coalescence of two chromosomes is an adequate tool to estimate the age of the MRCA. Therefore, we also considered coalescence in a sample of n chromosomes randomly chosen from a population. However, the computational difficulties of the recursion involved as well as the difficulty of linking the results with the known genetic indices make the approach troublesome. Analysis of the aforementioned approach is beyond the scope of this article and will be discussed in a separate paper.

We consider models suitable for the analysis of data having the form of numbers of pairwise differences between DNA sequences in the sample. Although phylogenetic methods, attempting to use all genetic information contained in a sample to build the genealogy (e.g. Griffiths and Tavaré, 1995, Griffiths’ Genetree software, available from http://www.stats.ox.ac.uk/~griff/software.html), tend to give estimates with a smaller variance than those based on pairwise differences, they cannot be directly compared with the O’Connell model playing an important role in our paper.

This model was originally proposed by O’Connell (1995) for dating the mitochondrial Eve (mtEve) based on a sample of mtDNA of humans and chimpanzees. O’Connell’s limit results are based on the assumption that the population is growing as a slightly supercritical branching process with progeny distributions homogeneous in time. Though these are not quite realistic assumptions, especially that of time homogeneity, the model is important as an alternative to the Wright–Fisher model, since it does not assume any particular offspring distribution. Moreover, asymptotically, for a given expected number of offspring, the O’Connell model is independent of the shape of the progeny number distribution, and in particular of its variance as long as this variance is bounded. This property is interesting in the light of classical results in which the short-term inbreeding effective population size is proportional to the variance of the offspring number distribution, and therefore the variance influences the shape of the coalescence distribution. The invariance of the offspring number distribution in the O’Connell model is theoretically valid in a limit. It remains unknown how fast, in terms of the number of generations, the coalescence distributions in a real population converge to O’Connell’s limit. This could be answered only by time-forward simulation of the full branching process genealogy and then by comparing the actual distributions with limiting results. The growing interest in studies concerning genealogies of branching processes is reflected among others by the studies of Klebaner and Sagitov (2002) focused on the geometric distribution of progeny, and by the work of Lambert (2003) focused on subcritical cases. Still, we consider the O’Connell model as a standard because of its independence of the offspring number distribution and our interest in supercritical processes which can model the long-term growth of the human population size.

In contrast to the O’Connell model, the Wright–Fisher model is not limited to any specific growth pattern. Yet except for some early classical work, such as that of Nagylaki (1990), relatively little effort has been expended in analyzing its relationship to other models in terms of sensitivity to departures from the models’ assumptions. Addressing this problem, this paper compares coalescence distributions under a range of Wright–Fisher models (including those which arose from the time-continuous coalescent) to distributions obtained from the O’Connell model. Finally, using computer software designed by us for this purpose, the results of all these models are compared with the actual distributions obtained from simulations of several thousand full genealogies. As a real world application of our results, we report estimations of the time to mitochondrial MRCA of modern humans to show how sensitive these estimates are to the assumptions made by the various compared models.

2. Models

The two population genetics models we employ are defined further on in Sections 2.1 and 2.2. We use two approaches to study how sensitive the distribution of the time to coalescence is to specific model assumptions. The first approach requires storing the entire simulated genealogy of a population. Then, by averaging over genealogies, the experimental distribution of the times to coalescence can be found and then compared to those obtained in the Wright–Fisher and the coalescent models. This approach is very general since it is not limited to any generation-to-generation sampling scheme or assumption about large population size. In particular it can be applied for an arbitrary progeny distribution (possibly changing in time) used to model the evolution of the population as a branching process. However, with the exception of small population sizes, it requires a large amount of memory for storing information about each generation, and therefore it seems practically infeasible for simulations of the required number of generations.

In the alternative approach, the population history is simulated and only the population size is recorded as a function of time. Using this information and approaches such as in Bobrowski and Kimmel (2004) we compute the coalescence distributions of the pairs of sequences. The results of such experiments were reported by Cyran and Kimmel (2004) and Cyran (2007). They were also used in Cyran and Kimmel (2005) for conservative estimation of the parameter α (see Section 2.3 for the definition of how this parameter influences the expansion rate of the population) in a problem of hypothetical Neanderthal contribution to the modern human mtDNA gene pool. However, this methodology lacks one important feature which could be taken into consideration only in the first approach. Namely, by regarding only the sizes of the population and not its full genealogy, it is impossible to distinguish between the length of time of the entire simulation started from the ancestral individual, and the length of time to the MRCA. The problem becomes clear if we realize that the founder of the process, the common ancestor of the population evolved, is rarely the most recent common ancestor of the extant individuals. Having no
possibility to distinguish between the two, we assumed in earlier studies (Cyran and Kimmel, 2004, 2005) that the time between the founder and the MRCA is relatively short compared to the time of the whole process. Therefore we treated both as identical, not having information as to what extent this simplification can be justified.

But now, with the increase of computational power and mem-

[...]

descendants are presently alive. Assuming that $T_{MRCA(y)} = \lambda T_{MRCA}$ is the equivalent of $T_{MRCA}$ expressed in years, the moment-based estimate of $T_{MRCA(y)}$ is

\[
\hat{T}_{MRCA(y)} = \frac{d_{avg}}{\delta \, E\!\left(\frac{T_{2c}}{T_{MRCA}} \,\middle|\, K_0 = 1\right)}. \tag{2}
\]

In the Wright–Fisher model-based computations, we obtain the expectation $E(T_{2c}/T \mid K_0 = 1)$ by performing computer simulations of a branching process starting from one individual and calculating the required ratio of $T_{2c}$ and $T_{MRCA}$. After simulating several thousand processes we planned to obtain the expectation of the ratio. However, only in the model with the record of a full genealogy were the times $T_{2c}$ and $T_{MRCA}$ explicitly given, so that (2) could be applied directly. In other models, only the time $T_{2c}$ could be computed and the time $T_{MRCA}$ was not available directly. Instead we have at our disposal the time $T$, i.e., the duration of the process. Certainly, time $T$, being the time to the only individual initiating the branching process, is the time to the common ancestor of the whole evolved population. However, as mentioned, it is rarely also the time to the most recent common ancestor. Nevertheless, we were able to estimate the ratio of $T_{MRCA}$ and $T$ in simulations with a fully recorded genealogy, and moreover we confirmed that the limiting properties of the coalescence distribution in the O’Connell model are valid for as few as $10^2$ generations, for which we could perform the full genealogy simulations. In this way we could relate in the O’Connell model $T_{MRCA}$ to $T$ and $T_{2c}$, and applying the limit theorem we propagated this result to arbitrarily many generations using the equation

\[
\hat{T}_{MRCA(y)} = \frac{d_{avg}}{\delta \, E\!\left(\frac{T_{2c}}{T_{MRCA}} \,\middle|\, K_0 = 1\right)} = \frac{E\!\left(\frac{T_{MRCA}}{T} \,\middle|\, K_0 = 1\right) d_{avg}}{\delta \, E\!\left(\frac{T_{2c}}{T} \,\middle|\, K_0 = 1\right)}. \tag{3}
\]

[...]

where $T$ denotes the number of generations we consider, and for mathematical consistency we let $q_{-1} = 0$ and $p_{-1} = 1$. The average pairwise mutation difference within a sample, after scaling by the mutation rate $\mu$, corresponds to the expectation of the coalescence distribution (2), and moreover, the discrete nature of generations makes it easy to simulate the demography. Therefore, using Monte Carlo techniques it is possible to estimate the unconditional coalescence distribution by averaging the conditional one using a series of $N_t$ realizations required to compute parameters in (4).

2.2. Coalescent model

Let us assume a coalescent model with population size $N_\tau$ variable in time and continuous time $\tau$ measured backwards. Suppose also that $\lambda(\tau) = N_0/N_\tau$ and that $\tau_{2c}$ is the time to coalescence of a pair of chromosomes observed over $N_0$ generations. Then, the tail of the distribution of $\tau_{2c}$ is given by

\[
P(\tau_{2c} > \tau) = \exp\!\left(-\int_0^{\tau} \lambda(u)\, du\right), \tag{5}
\]

which is the continuous analog of (4). To ensure existence of a unique common ancestor, $\lambda(\tau)$ must satisfy

\[
\int_0^{\infty} \lambda(u)\, du = \infty. \tag{6}
\]

For the stochastic $N_t$, and therefore $\lambda(\tau)$, the right side of Eq. (5) should be averaged over the process realizations. In the context of our study it is also worth noticing that the continuous
coalescence model correctly approximates the discrete coalescent model as long as $1 - 1/N_\tau \approx \exp(-1/N_\tau)$, which certainly is not true in the early phase of the branching process, when $N_t$ is not large and undergoes stochastic fluctuations. This fact is reflected in the differences between experimental distributions of the time to coalescence in the coalescent model and the O’Connell branching process model.

2.3. O’Connell model

Consider a slightly supercritical time-homogeneous branching process with expected number of offspring $E(\xi_0) = 1 + \alpha/T + o(1/T)$ and variance $\mathrm{Var}(\xi_0) = \sigma^2 + O(1/T)$. For this model, the probability $P^x(Z_t > 0)$, where $P^x$ denotes probabilities for the process started by $x$ individuals, asymptotically satisfies the O’Connell (1995) formula

\[
P^x(Z_t > 0) \sim \frac{2 \alpha x}{\sigma^2 T \left[ 1 - \exp\!\left(-\alpha \frac{t}{T}\right) \right]}, \quad \text{as } T \to \infty. \tag{7}
\]

From this it follows (Cyran and Kimmel, 2004) that

\[
E(Z_T \mid Z_T > 0, Z_0 = x) \sim \frac{\sigma^2 T}{2 \alpha} \left[ \exp(\alpha) - 1 \right], \quad \text{as } T \to \infty, \tag{8}
\]

where the symbol $\sim$ denotes asymptotic equivalence. Let us express the time interval $[0, T]$ of variable $t$ as a unit interval $[0, 1]$ of variable $r = t/T$. Then (O’Connell, 1995), as corrected in Kimmel and Axelrod (2002), for times $T$ long enough, we have the following equation describing the tail of the distribution of $D_T$, the time of death of the last common ancestor of two individuals living at $T$, given that we start the population history from $x$ individuals having descendants at $T$:

\[
\lim_{T \to \infty} P\!\left(\frac{D_T}{T} > r \,\middle|\, K_0 = x\right) = \frac{2 q_r^{x}}{(x-1)!} \left[ (q_r - 1)^{-x} (x-1)! - F(x-1, 1 - q_r) \right], \tag{9}
\]

where

\[
q_r = \frac{e^{-r\alpha} - e^{-\alpha}}{1 - e^{-\alpha}} \tag{10}
\]

and $F : \mathbb{Z}_+ \times (0, 1) \to \mathbb{R}$ is defined as

\[
F(n, y) = \frac{\partial^n}{\partial y^n} \left[ \frac{\ln(1 - y)}{y^2} \right]. \tag{11}
\]

The original O’Connell distribution is continuous, but to compare it to the discrete empirical distributions described below, we consider the discretized version, specified by the tail of the original distribution computed at points $r$ corresponding to integer values of $t = rT$. For the sake of terminological simplicity, we will still refer to this discrete distribution as the O’Connell distribution.

In the O’Connell model,

\[
E\!\left(\frac{T_{2c}}{T} \,\middle|\, K_0 = 1\right) = \frac{1}{T} E\left[ (T - D_T) \mid K_0 = 1 \right], \tag{12}
\]

and therefore Eq. (3) becomes

\[
\hat{T}_{MRCA(y)} = E\!\left(\frac{T_{MRCA}}{T} \,\middle|\, K_0 = 1\right) \times \frac{d_{avg}}{\delta \left[ 1 - 2 \int_0^1 \hat{q}_r \, \frac{\hat{q}_r - 1 - \ln \hat{q}_r}{(1 - \hat{q}_r)^2} \, dr \right]}, \tag{13}
\]

where

\[
\hat{q}_r = \frac{e^{-r\hat{\alpha}} - e^{-\hat{\alpha}}}{1 - e^{-\hat{\alpha}}}. \tag{14}
\]

Moreover, the expectation of the ratio $T_{MRCA}/T$ in (13) is obtained from the simulations with recorded full genealogies, and $\hat{x}$ denotes the estimate of parameter $x$. Therefore, to calculate the estimated MRCA time $\hat{T}_{MRCA(y)}$ from the genetic variation data, we need $\hat{\alpha}$. However, from the simulation results concordant with the limiting properties of the O’Connell model we can obtain the ratio $E(T_{MRCA}/T \mid K_0 = 1)$. Therefore, we can simultaneously estimate $T_{MRCA(y)}$ and $\alpha$. From (8), if $Z_T$ is substituted as an estimate of its expected value, it follows that

\[
Z_T = E\!\left(\frac{T}{T_{MRCA}} \,\middle|\, K_0 = 1\right) \frac{\sigma^2 \, \hat{T}_{MRCA(y)} \left[ \exp(\hat{\alpha}) - 1 \right]}{2 \lambda \hat{\alpha}}, \tag{15}
\]

and the estimates of $T_{MRCA(y)}$ and $\alpha$ are solutions of the system of Eqs. (13) and (15) for a given short-term inbreeding effective population size of females $Z_T$, and genetic data $d_{avg}$ and $\delta$.

2.4. Simulations

Simulation mode 1

The first simulation mode implements the full genealogy recording in the branching process model, thus allowing explicit access to any desired feature of the model. In particular, it is possible to trace back the genealogy of a pair of individuals and to find their MRCA, and therefore the actual time of coalescence. By randomly choosing a sample of about 100 individuals from the simulated population and determining the coalescence of each pair in the sample we obtain a histogram $H_{T_{2c}}|\text{tree}$ of the times to coalescence, conditionally on the simulated tree. This histogram is the experimental approximation of the conditional coalescence distribution $P(T_{2c} = t \mid \text{tree})$ in the full genealogy model. Having obtained the distribution $P(T_{2c} = t \mid \text{tree})$ we also compute its expected value $E(T_{2c} \mid \text{tree})$, denoting it as $T_{2c\_avg}|\text{tree}$. Additionally, we trace back the lineages of the whole population to the MRCA, and therefore we have the time $T_{MRCA}|\text{tree}$ as well as the ratios $(T_{2c\_avg}/T_{MRCA})|\text{tree}$ and $(T_{MRCA}/T)|\text{tree}$. Finally, by simulating many branching processes and then by averaging over the trees generated, we obtain the corresponding unconditional characteristics. These characteristics include: the histogram $H_{T_{2c}}$, the distribution $P(T_{2c} = t)$ and its expectation $E(T_{2c})$, the distribution $P(T_{2c\_avg} = t)$ with the expectation $E(T_{2c\_avg})$, the distribution $P(T_{MRCA} = t)$ with the expected value $E(T_{MRCA})$, as well as the histograms and the expectations over genealogies of the ratios $T_{2c\_avg}/T_{MRCA}$ and $T_{MRCA}/T$. It is important to notice that $E(T_{2c\_avg}/T_{MRCA})$ obtained in the procedure described above can be used in this model in Eq. (2) instead of $E(T_{2c}/T_{MRCA})$, which yields a smaller-variance estimator. Such substitution is justified from the genetic point of view by linking the expectation $E(T_{2c\_avg})$ (scaled by the divergence rate $\delta = \mu/\lambda$ in (1)) with the average pairwise mutation difference in a sample, $d_{avg}$. Note also that the simulations which became extinct were excluded from computations, since problems similar to those of dating the MRCA of modern humans are always solved conditionally on non-extinction.

Simulation mode 2

The second simulation mode stores only the course of population size change described by the branching process. This mode is used for numerical computation of the distribution in the Wright–Fisher (4) or the coalescent (5) models, conditional on $N_t$. In the discrete Wright–Fisher model, Eq. (4) can be directly applied if the history of $N_t$ is known from simulation. However, in the continuous coalescent model, instead of using (4) we apply a Monte Carlo approach by generation of coalescence times from the distribution (5) conditional on $N_t$. This procedure was repeated $10^4$ times for one simulated branching process. The conditional histogram is used as the approximation of the conditional distribution $P(T_{2c} = t \mid N_t, \mathrm{CM})$, in which CM denotes conditioning on the coalescent model. As in the case of
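The two simulation modes of Section 2.4 can be illustrated with a minimal Python sketch. It is not the authors' software: all function names are hypothetical, the offspring law (0 or 2 children, with mean $1 + \alpha/T$) is an assumption chosen only for simplicity, and the discrete tail $\prod (1 - 1/N)$ is the standard Wright–Fisher pairwise form, assumed here in place of Eq. (4). Mode 1 stores parent links and traces lineages back; mode 2 keeps only the population sizes and evaluates the discrete tail and its continuous analog, in which each factor $1 - 1/N$ is replaced by $\exp(-1/N)$.

```python
import math
import random


def simulate_genealogy(T, alpha, rng):
    """Mode 1: Galton-Watson process over T generations with stored parent links.

    Assumed offspring law: 2 children with probability (1 + alpha/T)/2, else 0,
    so the mean offspring number is 1 + alpha/T. parents[g][i] is the index of
    the parent (in generation g) of individual i in generation g + 1.
    Returns None when the process goes extinct before generation T.
    """
    p_two = 0.5 * (1.0 + alpha / T)
    parents, size = [], 1
    for _ in range(T):
        links = []
        for parent in range(size):
            if rng.random() < p_two:
                links.extend((parent, parent))
        if not links:
            return None  # extinct runs are discarded, as in the text
        parents.append(links)
        size = len(links)
    return parents


def pair_coalescence_time(parents, i, j):
    """Generations back from the final generation until lineages i and j merge."""
    t = 0
    for links in reversed(parents):
        if i == j:
            break
        i, j, t = links[i], links[j], t + 1
    return t


def time_to_mrca(parents):
    """Generations back until all extant lineages descend from one individual."""
    alive, t = set(range(len(parents[-1]))), 0
    for links in reversed(parents):
        if len(alive) == 1:
            break
        alive, t = {links[i] for i in alive}, t + 1
    return t


def population_sizes(parents):
    """Mode 2 keeps only this: N_0 = 1 followed by the size of each generation."""
    return [1] + [len(links) for links in parents]


def wright_fisher_tail(sizes):
    """P(T2c > t) for t = 0..T given sizes N_0..N_T (assumed discrete form).

    Stepping back from generation g to g - 1, two distinct lineages avoid a
    common parent with probability 1 - 1/N_{g-1}.
    """
    tail, prob = [1.0], 1.0
    for n in reversed(sizes[:-1]):
        prob *= 1.0 - 1.0 / n
        tail.append(prob)
    return tail


def coalescent_tail(sizes):
    """Continuous analog in generation units: 1 - 1/N replaced by exp(-1/N)."""
    tail, hazard = [1.0], 0.0
    for n in reversed(sizes[:-1]):
        hazard += 1.0 / n
        tail.append(math.exp(-hazard))
    return tail


if __name__ == "__main__":
    rng = random.Random(0)
    tree = None
    while tree is None:  # condition on non-extinction
        tree = simulate_genealogy(T=50, alpha=2.0, rng=rng)
    sizes = population_sizes(tree)
    print("T_MRCA / T:", time_to_mrca(tree) / 50)
    print("WF tail at founder:", wright_fisher_tail(sizes)[-1])
    print("coalescent tail at founder:", coalescent_tail(sizes)[-1])
```

In this sketch the discrete tail reaches exactly zero at the founder (where $N_0 = 1$), while the continuous analog does not, which is one concrete way to see why the approximation $1 - 1/N \approx \exp(-1/N)$ is poor in the early, small-population phase of the process.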
3. Genetic data
Table 1
Expectations of the ratio $T_{2c}/T$ ± SD in the O’Connell and the full genealogy models.

Model                              $E(T_{2c}/T)$
O’Connell                          0.8054 ± 0.1591
Full genealogy with BF progeny     0.8097 ± 0.1585
Full genealogy with P progeny      0.8008 ± 0.1645
Full genealogy with LF progeny     0.8002 ± 0.1662

Table 2
Comparison of the expectations of $T_{2c}/T$ computed in the Wright–Fisher and the coalescent models for different progeny number distributions.

Progeny number distribution    $E(T_{2c}/T \mid \mathrm{WF})$    $E(T_{2c}/T \mid \mathrm{CM})$
BF                             0.7497                            0.7585
P                              0.8005                            0.8078
LF                             0.8454                            0.8550

Table 3
Expectations of different ratios of the coalescence times and their standard deviations computed in the full genealogy model for various distributions of the number of progeny.

Parameter    Binary fission    Poisson    Linear fractional

Table 4
Expected values of the time to the MRCA of modern humans computed in the O’Connell model, the branching process full genealogy models, the Wright–Fisher models, and the coalescent models.

Model    $\hat{T}_{MRCA(y)}$ (years × $10^{-3}$)
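The O’Connell expectation of $T_{2c}/T$ that enters Table 1 and the denominator of Eq. (13) is a one-dimensional integral over $r$, which can be evaluated numerically. The following Python sketch does this with a simple midpoint rule, using Eq. (10) for $q_r$. The function names and the number of integration steps are assumptions made for illustration, and no value of $\alpha$ is taken from the paper, so the sketch does not attempt to reproduce the tabulated figure.

```python
import math


def q(r, alpha):
    """Eq. (10): q_r = (exp(-r*alpha) - exp(-alpha)) / (1 - exp(-alpha))."""
    return (math.exp(-r * alpha) - math.exp(-alpha)) / (1.0 - math.exp(-alpha))


def expected_t2c_over_t(alpha, steps=20000):
    """E(T2c/T | K0 = 1) = 1 - 2 * int_0^1 q_r (q_r - 1 - ln q_r) / (1 - q_r)^2 dr.

    Midpoint rule: the endpoints r = 0 (q_r = 1) and r = 1 (q_r = 0), where the
    integrand is only defined as a limit, are never evaluated.
    """
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) / steps
        qr = q(r, alpha)
        total += qr * (qr - 1.0 - math.log(qr)) / (1.0 - qr) ** 2
    return 1.0 - 2.0 * total / steps


def tmrca_years(d_avg, delta, alpha, tmrca_over_t):
    """Eq. (13): estimate of T_MRCA in years.

    tmrca_over_t stands for E(T_MRCA/T | K0 = 1), which the paper obtains from
    full-genealogy simulations; here it is simply an input.
    """
    return tmrca_over_t * d_avg / (delta * expected_t2c_over_t(alpha))


if __name__ == "__main__":
    for a in (1.0, 5.0, 10.0):  # illustrative values only
        print(a, expected_t2c_over_t(a))
```

Under this formula the expectation increases with $\alpha$: faster expansion pushes pairwise coalescence closer to the founder, so $T_{2c}/T$ approaches 1.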
Table 5
Expected value and the 95% confidence interval of the estimates.
Parameter Lower confidence bound Expected value Upper confidence bound
5. Discussion

Until the last decade, estimation of the divergence rate in pre-modern and modern humans could rely only on human–chimpanzee divergence data. Methods used were based on phylogenetic trees constructed either by maximum likelihood or maximum parsimony and rooted using the chimpanzee as an outgroup. However, due to the relatively long time to this divergence, all estimates of this time were very inaccurate, ranging from 4 to 9 million years. Consequently, the estimated divergence rate and time to the MRCA of modern humans could not be accurate, with expected values ranging from 200,000 years ago (Wilson and Cann, 1992; Vigilant et al., 1991) to 300,000 years ago (Hasegawa and Horai, 1991). Additionally, many possible patterns of human population growth were assumed. The simplified exponential models were often used, but logistic growth of the human population also proved to be not inconsistent with the mtDNA variation data (Polanski et al., 1998).

The majority of these estimates were in agreement with the out-of-Africa scenario and in contradiction to the multiregional theory of the origin of modern humans, supported by some paleontologists (Thorne and Wolpoff, 1992). These researchers claimed that the time to the MRCA should be placed about a million years ago or even earlier. It should be emphasized that the genetic data did not necessarily contradict the multiregional theory, as shown by O’Connell (1995). He inferred, using the branching process model, that the genetic diversity of modern humans was consistent with estimates of the mtEve existing between 700 thousand and 1.5 million years ago. These estimates depended on an inaccurate inference of the human–chimpanzee divergence time and on the methods of the inference. To validate his conclusions, O’Connell (1995) also indicated the weak points of the outgroup methods when the genetic distance between the outgroup and the sample was large. Since the application of different methods to the same genetic data gave results differing by almost one order of magnitude, the multiregional hypothesis could not be rejected solely because it was in contradiction with the majority of genetic models, while there were still models supporting it.

The situation changed after 1997 (Krings et al., 1997), when, for the first time, mtDNA from Neanderthals, dated to have lived until about 40,000 years ago (Schmitz et al., 2002), was sequenced. However, fewer than 400 base pairs were sequenced; hence, any estimates based on these data were not very reliable. The next successful sequencings of Neanderthal mtDNA in 1999 (Krings et al., 1999, 2000; Ovchinnikov et al., 2000) confirmed the accuracy of the first experiment and radically changed the estimates of the time to the most recent common female ancestor of modern humans. Since it seems from the genetic data (Krings et al., 1999; Green et al., 2008; Briggs et al., 2009) that Neanderthals did not contribute mtDNA to the lineages of presently living modern humans, the time of the mtEve should be placed after the H. sapiens – H. neanderthalensis divergence. Even if later studies (Serré et al., 2004; Cyran and Kimmel, 2005) indicated that interbreeding between the two human forms could not be excluded, and moreover that there is evidence of a small-scale gene flow (Green et al., 2010), it remains true that the root of mtDNA of all currently living humans should be placed after that of humans and Neanderthals. A slight admixture of at most 25% (Serré et al., 2004) or 15% (Cyran and Kimmel, 2005) disappeared as a result of genetic drift. Therefore, even if the results of the Neanderthal Genome Project suggest possible interbreeding between the Neanderthals and the archaic Europeans yielding about 3% admixture of the nuclear DNA (Plagnol and Wall, 2006; Pennisi, 2006; Green et al., 2010), treating the Neanderthals as an mtDNA outgroup is justified.

In the paper we compared the distributions of the time to coalescence of a pair of chromosomes obtained by conceptually different methods. In particular, we showed that a branching process evolving for as few as $10^2$ generations is approximated by the O’Connell model to within 2%. Moreover, the result holds for three offspring number distributions, and due to the asymptotic character of the O’Connell results, it also remains true for the evolution of branching processes with an arbitrarily large number of generations. Having this result, we were able to obtain the estimate of the expected value of the ratio of the coalescence times of two individuals and that of all individuals in the population for generation $10^4$, even if it is infeasible to apply a full genealogy model in this case.

Finally, we applied our approach to estimate the age of the root of the mtDNA polymorphism of modern humans based on the genetic material belonging to contemporary humans and Neanderthal fossils, as reported by Krings et al. (1999), Green et al. (2008) and Briggs et al. (2009). For all stochastic trajectories we analyzed, the resulting time falls into a 95% confidence interval of the estimate based on phylogenetic trees (Krings et al., 1999). Moreover, the result depends mainly (in fact linearly) on the assumed time to the MRCA of Neanderthals and modern humans rather than on the method that was applied. Therefore, we conclude that the stochastic models based on branching processes provide similar estimates to those obtained using phylogenetic analysis, each supporting the other.

Hence, our results indicate that the estimates of the time to coalescence in the Wright–Fisher and the coalescent models with random population size are quite robust in terms of their insensitivity to the model assumptions. They deviate by less than 8% (see Table 4) from the O’Connell model predictions, and the asymptotic O’Connell prediction differs from the actual value computed in the full genealogy model by only 1.6%. Such small differences are likely to be negligible compared to the large range of confidence intervals obtained not only in the pairwise difference-based methods considered in the paper, but also in the phylogenetic studies (Krings et al., 1999; Green et al., 2008; Briggs et al., 2009). The greatest uncertainty of the mathematical expectations is caused by the scaling factors such as the divergence rate between species (the rate of the molecular clock) and not by deviations from the particular assumptions of the method used. This validates both the Wright–Fisher and the coalescent models with stochastic population sizes also for reproduction schemes not following the assumptions of these models. In particular, this provides support to the results about inferring population trajectory from the genetic diversity data, as reviewed in Wooding and Rogers (2002); results which implicitly relied on the Wright–Fisher model assumption but which remain valid for a much larger spectrum of possible demographies.
Acknowledgments

The authors would like to thank the anonymous reviewers for their suggestions and comments. KC was supported by a Grant from the Polish Ministry of Science and Higher Education No. NN519 31 9035 from funds for supporting science in 2008–2010. MK was supported by a CPRIT grant number RP101089 and by a grant from the Polish Ministry of Science and Higher Education No. N N519 579938. Assistance and collegiality of Dr. Jan Hewitt of Rice University in the editing of the final version of the manuscript is hereby gratefully acknowledged.

References

Bjorklund, M., 2003. Test for a population expansion after a drastic reduction in population size using DNA sequence data. Heredity 91, 481–486.
Bobrowski, A., Kimmel, M., 2004. Asymptotic behavior of joint distributions of characteristics of a pair of randomly chosen individuals in discrete-time Fisher–Wright models with mutations and drift. Theor. Popul. Biol. 66, 355–367.
Briggs, A.W., Good, J.M., Green, R.E., Krause, J., Maricic, T., Stenzel, U., Lalueza-Fox, C., Rudan, P., Brajkovic, D., Kucan, Z., Gusic, I., Schmitz, R., Doronichev, V.B., Golovanova, L.V., de la Rasilla, M., Fortea, J., Rosas, A., Pääbo, S., 2009. Targeted retrieval and analysis of five Neanderthal mtDNA genomes. Science 325, 318–321.
Burbano, H.A., Hodges, E., Green, R.E., Briggs, A.W., Krause, J., Meyer, M., Good, J.M., Maricic, T., Johnson, Ph.L.F., Xuan, Z., Rooks, M., Bhattacharjee, A., Brizuela, L., Albert, F.W., de la Rasilla, M., Fortea, J., Rosas, A., Lachmann, M., Hannon, G.J., Pääbo, S., 2010. Targeted investigation of the Neanderthal genome by array-based sequence capture. Science 328, 723–725.
Cyran, K.A., 2007. Simulating branching processes in the problem of Mitochondrial Eve dating based on coalescent distributions. Int. J. Math. Comput. Simul. 1, 268–274.
Cyran, K.A., Kimmel, M., 2004. Robustness of the dating of the most recent common female ancestor of modern humans. In: Proc. Tenth National Conference on Application of Mathematics in Biology and Medicine, Święty Krzyż, Poland. pp. 19–24.
Cyran, K.A., Kimmel, M., 2005. Interactions of Neanderthals and modern humans: what can be inferred from mitochondrial DNA? Math. Biosci. Eng. 2, 487–498.
Cyran, K.A., Myszor, D., 2008. New artificial neural network based test for the detection of past population expansion using microsatellite loci. Int. J. Appl. Math. Inform. 2, 1–9.
Green, R.E., Krause, J., Briggs, A.W., Maricic, T., Stenzel, U., Kircher, M., Patterson, N., Li, H., Zhai, W., Fritz, M.H.-Y., Hansen, N.F., Durand, E.Y., Malaspinas, A.-S., Jensen, J.D., Marques-Bonet, T., Alkan, C., Prüfer, K., Meyer, M., Burbano, H.A., Good, J.M., Schultz, R., Aximu-Petri, A., Butthof, A., Höber, B., Höffner, B., Siegemund, M., Weihmann, A., Nusbaum, Ch., Lander, E.S., Russ, C., Novod, N., Affourtit, J., Egholm, M., Verna, Ch., Rudan, P., Brajkovic, D., Kucan, Ž., Gušic, I., Doronichev, V.B., Golovanova, L.V., Lalueza-Fox, C., de la Rasilla, M., Fortea, J., Rosas, A., Schmitz, R.W., Johnson, Ph.L.F., Eichler, E.E., Falush, D., Birney, E., Mullikin, J.C., Slatkin, M., Nielsen, R., Kelso, J., Lachmann, M., Reich, D., Pääbo, S., 2010. A draft sequence of the Neanderthal genome. Science 328, 710–721.
Green, R.E., Krause, J., Ptak, S.E., Briggs, A.W., Ronan, M.T., Simons, J.F., Du, L., Egholm, M., Rothberg, J.M., Paunovic, M., Pääbo, S., 2006. Analysis of one million base pairs of Neanderthal DNA. Nature 444, 330–336.
Green, R.E., Malaspinas, A.-S., Krause, J., Briggs, A.W., Johnson, Ph.L.F., Uhler, C., Meyer, M., Good, J.M., Maricic, T., Stenzel, U., Pruefer, K., Siebauer, M., Burbano, H.A., Ronan, M., Rothberg, J.M., Egholm, M., Rudan, P., Brajkovic, D., Kucan, Z., Gusic, I., Wikstrom, M., Laakkonen, L., Kelso, J., Slatkin, M., Pääbo, S., 2008. A complete Neanderthal mitochondrial genome sequence determined by high-throughput sequencing. Cell 134, 416–426.
Griffiths, R.C., Tavaré, S., 1995. Unrooted genealogical tree probabilities in the infinitely-many-sites model. Math. Biosci. 127, 77–98.
Hasegawa, M., Horai, S., 1991. Time of the deepest root for polymorphism in human mitochondrial DNA. J. Mol. Evol. 32, 37–42.
Jobling, M., 2001. In the name of the father: surnames and genetics. Trends Genet. 17, 353–357.
Kimmel, M., Axelrod, D.E., 2002. Branching Processes in Biology. Springer-Verlag, New York.
Kimmel, M., Chakraborty, R., King, J., Bamshad, M., Watkins, W., Jorde, L., 1998. Signatures of population expansion in microsatellite repeat data. Genetics 148, 1921–1930.
King, J.P., Kimmel, M., Chakraborty, R., 2000. A power analysis of microsatellite-based statistics for inferring past population growth. Mol. Biol. Evol. 17, 1859–1868.
Klebaner, F.C., Sagitov, S., 2002. The age of a Galton–Watson population with a geometric offspring distribution. J. Appl. Probab. 39, 816–828.
Krings, M., Capelli, C., Tschentscher, F., Geisert, H., Meyer, S., von Haeseler, A., Grossschmidt, K., Possnert, G., Paunovic, M., Pääbo, S., 2000. A view of Neanderthal genetic diversity. Nat. Genet. 26, 144–146.
Krings, M., Geisert, H., Schmitz, R., Krainitzki, H., Pääbo, S., 1999. DNA sequence of the mitochondrial hypervariable region II from the Neanderthal type specimen. Proc. Natl. Acad. Sci. USA 96, 5581–5585.
Krings, M., Stone, A., Schmitz, R., Krainitzki, H., Stoneking, M., Pääbo, S., 1997. Neanderthal DNA sequences and the origin of modern humans. Cell 90, 19–30.
Laan, M., Wiebe, V., Khusnutdinova, E., Remm, M., Pääbo, S., 2005. X-chromosome as a marker for population history: linkage disequilibrium and haplotype study in Eurasian populations. Eur. J. Hum. Genet. 13, 452–462.
Lambert, A., 2003. Coalescence times for the branching process. Adv. Appl. Probab. 35, 1071–1098.
Nagylaki, T., 1990. Models and approximations for random genetic drift. Theor. Popul. Biol. 37, 192–212.
Noonan, J.P., Coop, G., Kudaravalli, S., Smith, D., Krause, J., Alessi, J., Chen, F., Platt, D., Pääbo, S., Pritchard, J.K., Rubin, E.M., 2006. Sequencing and analysis of Neanderthal genomic DNA. Science 314, 1113–1118.
O’Connell, N., 1995. The genealogy of branching processes and the age of our most recent common ancestor. Adv. Appl. Probab. 27, 418–442.
Ovchinnikov, I., Götherström, A., Romanova, G., Kharitonov, V., Lidén, K., Goodwin, W., 2000. Molecular analysis of Neanderthal DNA from the northern Caucasus. Nature 404, 490–493.
Pennisi, E., 2007. No sex please, we’re Neanderthals. Science 318, 967.
Pennisi, E., 2006. The dawn of the stone age genomics. Science 314, 1068–1071.
Plagnol, V., Wall, J.D., 2006. Possible ancestral structure in human populations. PLoS Genet. 2, 972–979.
Polanski, A., Kimmel, M., Chakraborty, R., 1998. Application of time-dependent coalescence process for inferring the history of population size changes from DNA sequence data. Proc. Natl. Acad. Sci. USA 95, 5456–5461.
Schmitz, R., Bonani, G., Smith, F.H., 2002. New research at the Neanderthal type site in the Neander Valley of Germany. J. Hum. Evol. 42, A32.
Serré, D., Langaney, A., Chech, M., Teschler-Nicola, M., Paunovic, M., Mennecier, P., Hofreiter, M., Possnert, G., Pääbo, S., 2004. No evidence of Neanderthal mtDNA contribution to early modern humans. PLOS Biol. 2, 313–317.
Thompson, R., Pritchard, J., Shen, P., Oefner, P., Feldman, M., 2000. Recent common ancestry of human Y chromosomes: evidence from DNA sequence data. Proc. Natl. Acad. Sci. USA 97, 7360–7365.
Thorne, A., Wolpoff, M.H., 1992. The multiregional evolution of humans. Scientific American 266, 76–83.
Vigilant, L., Stoneking, M., Harpending, H., Hawkes, K., Wilson, A.C., 1991. African populations and the evolution of human mitochondrial DNA. Science 253, 1503–1507.
Wilson, A.C., Cann, R.L., 1992. The recent African genesis of humans. Scientific American 266, 68–73.
Wooding, S., Rogers, A., 2000. A Pleistocene population X-plosion? Hum. Biol. 72, 693–695.
Wooding, S., Rogers, A., 2002. The matrix coalescence and an application to human single-nucleotide polymorphisms. Genetics 161, 1641–1650.
Yu, N., Zhao, Z., Fu, Y., Sambuughin, N., Ramsay, M., Jenkins, T., Leskinen, E., Patthy, L., Jorde, L., Kuromori, T., Li, W., 2001. Global patterns of human DNA sequence variation in a 10-kb region on chromosome 1. Mol. Biol. Evol. 18, 214–222.