Every individual is born with 100 new mutations. Most are silent.
• 4 non-synonymous mutations
• 3 detrimental mutations
Genetic Variation & Diversity
Xu et al. 2015 Sci.Rep. Janne Lempe & Detlef Weigel, Max Planck Institute for Developmental Biology
understandingrace.org
Population Genetics
• Primary questions
• What is the structure of the gene pool & how does it change over time?
• What biological characteristics impact the structure of the gene pool?
• What are the evolutionary forces that change the gene pool?
[Figure: alleles carried by Individuals 1 to 4 at loci 1 to 10]
en.wikipedia.org/wiki/MicrobesOnline
Genetic Diversity
…ATAGCTGCTCGATTT… …ATAGCTGGGTGCTCGATTT…
…ATAGCGGCTCGATTT… …ATAGC-----GCTCGATTT…
…ATAGCAGCTCGATTT…
• Polymorphism
• Co-occurrence of 2 or more alleles at a locus within a population
• Heterozygosity
• Probability or fraction of individuals expected to be heterozygous at a particular
locus given the allele frequencies at that locus
• Population statistic, not statement of genotype
• A locus can be polymorphic, have no heterozygous individuals, and still have
heterozygosity
• Heterozygosity ≠ Heterozygote
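The distinction between heterozygosity and heterozygotes can be checked numerically. The sketch below assumes the standard formula for expected heterozygosity at one locus, H = 1 - Σ p_i², and uses a made-up population of 50 AA and 50 aa individuals (the function name and the counts are illustrative, not from the slides).

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity at one locus: H = 1 - sum(p_i^2)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Hypothetical population: 50 AA and 50 aa individuals, no Aa individuals.
n_AA, n_Aa, n_aa = 50, 0, 50
n = n_AA + n_Aa + n_aa
p = (2 * n_AA + n_Aa) / (2 * n)   # frequency of A
q = (2 * n_aa + n_Aa) / (2 * n)   # frequency of a

observed_heterozygotes = n_Aa / n            # fraction of Aa individuals
heterozygosity = expected_heterozygosity([p, q])

print(observed_heterozygotes)  # 0.0
print(heterozygosity)          # 0.5
```

The locus is polymorphic (two alleles), no individual is a heterozygote, and yet the heterozygosity of the population is 0.5, because heterozygosity is computed from allele frequencies rather than from genotypes.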
Heterozygosity
• Fraction of individuals expected to be heterozygous for a particular locus given the
allele frequency at that locus
[Figure: polymorphism (y-axis, 0 to 0.7) plotted against heterozygosity (x-axis, 0 to 0.18) for Insects, Land Snails, Marine Inverts, Plants, Marine Snails, Reptiles, Fish, Mammals, Wasps, Rodents, Amphibians, and Birds. One region of the plot is labelled "Lower heterozygosity than expected given polymorphism", the other "Higher heterozygosity than expected given polymorphism".]
Heterozygosity in Haploids
Effective Heterozygosity
• Probability of sampling two different alleles from a haploid population
[Figure: alleles carried by Individuals 1 to 4 at loci 1 to 10, with a value per locus]
Locus            1     2     3     4     5     6     7     8     9     10
Heterozygosity   0.22  0.47  0.50  0.38  0.38  0.47  0.47  0.22  0.50  0.22
Population substructure
Heterozygosity
[Figure: alleles carried by Individuals 1 to 4 at loci 1 to 10, with a value per locus]
Locus            1     2     3     4     5     6     7     8     9     10
Heterozygosity   0.38  0.50  0.50  0.38  0.38  0.38  0.38  0.00  0.50  0.38
Polymorphism:
• Co-occurrence of 2 or more alleles at a locus within a population
• Assesses only the number of variable sites
• Does not assess how prevalent the variation is or how it is distributed
Heterozygosity:
• Probability or fraction of individuals expected to be heterozygous at a particular
locus given the allele frequencies at that locus
• Assesses number of variable sites, prevalence of variation, and how the variation
is partitioned within and among individuals and populations (more on that later)
Hardy-Weinberg (HW)
• Model for explaining how Mendelian principles influence the distribution and fate of
genetic variation over time
• Allele frequencies will not change over time in a population that fulfills HW assumptions
Hardy-Weinberg Equilibrium (HWE)
• Non-evolving population (null hypothesis)
Genotype frequencies
n diploid individuals, 3 genotypes: AA / Aa / aa
Genotype   Count   Frequency
AA         452     0.909
Aa         43      0.087
aa         2       0.004
Total      497     1.000

Allele frequencies
2n haploid gametes, 2 alleles: A / a
Allele     Count   Frequency
A          947     0.953
a          47      0.047
Total      994     1.000
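The step from genotype counts to allele counts can be sketched as follows, using the counts on this slide (452 AA, 43 Aa, 2 aa). The function name is illustrative; the only assumption is that each diploid individual contributes two alleles.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q), the frequencies of alleles A and a, from genotype counts."""
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("need at least one individual")
    total_alleles = 2 * n            # 2n haploid gametes
    count_A = 2 * n_AA + n_Aa        # each AA carries two A, each Aa carries one
    count_a = 2 * n_aa + n_Aa        # each aa carries two a, each Aa carries one
    return count_A / total_alleles, count_a / total_alleles


p, q = allele_frequencies(452, 43, 2)
print(round(p, 3), round(q, 3))  # 0.953 0.047
```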
Hardy-Weinberg
Individual
                 Mother
                 A      a
Father    A      AA     Aa
          a      aA     aa
Hardy-Weinberg
Population
                       Female gametes
                       A      a
Male gametes    A      AA     Aa
                a      aA     aa
Hardy-Weinberg
Population
Allele frequencies: freq(A) = p, freq(a) = q, p + q = 1
                         Female gametes
                         A (p)    a (q)
Male gametes    A (p)    AA       Aa
                a (q)    aA       aa
Lec 6, Fri, 9 Mar
BIO260 News
• Next tutorial
• Hardy-Weinberg and Hypothesis Testing
• Quiz on material from March 7 –12 (Wednesday – Monday)
Hardy-Weinberg
Population
Allele frequencies: freq(A) = p, freq(a) = q, p + q = 1
                         Female gametes
                         A (p)         a (q)
Male gametes    A (p)    AA: $p^2$     Aa: $pq$
                a (q)    aA: $qp$      aa: $q^2$
Genotype frequencies
$p^2 + 2pq + q^2 = 1$
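Why the three genotype frequencies sum to 1 can be written out in a few lines. The derivation below assumes only what the square shows: male and female gametes unite at random, and p + q = 1.

```latex
\begin{align*}
\mathrm{freq}(AA) + \mathrm{freq}(Aa) + \mathrm{freq}(aa)
  &= p \cdot p + (p \cdot q + q \cdot p) + q \cdot q \\
  &= p^2 + 2pq + q^2 \\
  &= (p + q)^2 = 1^2 = 1
\end{align*}
```

The heterozygote term is 2pq because Aa and aA are the same genotype reached in two ways.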
Table columns: pop. | observed counts (AA, Aa, aa) | n | observed frequencies (AA, Aa, aa) | p | q | expected frequencies (AA, Aa, aa) | expected counts (AA, Aa, aa)
p is the frequency of gametes carrying one specific allele (A) at a locus that has two alleles.
p = freq(AA) + ½ freq(Aa)
q = freq(aa) + ½ freq(Aa)
Hardy-Weinberg
Frequency of A
p = freq(AA) + ½ freq(Aa) = 0.3 + ½(0.0) = 0.3
Frequency of a
q = freq(aa) + ½ freq(Aa) = 0.7 + ½(0.0) = 0.7
Alternatively: q = 1 - p = 0.7
Hardy-Weinberg
Expected AA frequency = $p^2$
Expected Aa frequency = $2pq$
Expected aa frequency = $q^2$
Hardy-Weinberg
pop.   AA   Aa   aa   n
II     6    6    18   30
Genotype counts sampled from a population in an ecosystem can vary widely. The
Hardy-Weinberg calculation can be applied to small or large sample sizes.
Hardy-Weinberg
       Observed counts            Observed frequencies   Allele freq.   Expected frequencies   Expected counts
pop.   AA     Aa     aa     n     AA     Aa     aa       p      q       AA     Aa     aa       AA     Aa      aa
II     6      6      18     30    0.2    0.2    0.6      0.3    0.7     0.09   0.42   0.49     2.7    12.6    14.7
III    58     232    290    580   0.1    0.4    0.5      0.3    0.7     0.09   0.42   0.49     52.2   243.6   284.2
V      2.7    12.6   14.7   30    0.09   0.42   0.49     0.3    0.7     0.09   0.42   0.49     2.7    12.6    14.7
VI     52     244    284    580   0.09   0.42   0.49     0.3    0.7     0.09   0.42   0.49     52.2   243.6   284.2
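The columns of this table can be recomputed from the observed counts alone. The sketch below does so for populations II and III; the function name and the dictionary layout are illustrative choices, not part of the lecture.

```python
def hardy_weinberg_row(n_AA, n_Aa, n_aa):
    """Compute allele frequencies and HW expectations from genotype counts."""
    n = n_AA + n_Aa + n_aa
    obs_freq = (n_AA / n, n_Aa / n, n_aa / n)
    p = obs_freq[0] + 0.5 * obs_freq[1]     # p = freq(AA) + 1/2 freq(Aa)
    q = obs_freq[2] + 0.5 * obs_freq[1]     # q = freq(aa) + 1/2 freq(Aa)
    exp_freq = (p * p, 2 * p * q, q * q)    # p^2, 2pq, q^2
    exp_counts = tuple(f * n for f in exp_freq)
    return {"n": n, "observed_freq": obs_freq, "p": p, "q": q,
            "expected_freq": exp_freq, "expected_counts": exp_counts}


populations = {"II": (6, 6, 18), "III": (58, 232, 290)}
for name, counts in populations.items():
    row = hardy_weinberg_row(*counts)
    print(name, round(row["p"], 2),
          [round(c, 1) for c in row["expected_counts"]])
# II 0.3 [2.7, 12.6, 14.7]
# III 0.3 [52.2, 243.6, 284.2]
```

Both populations have p = 0.3, so they share the same expected frequencies even though their observed genotype frequencies differ.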
Implications of Hardy-Weinberg Equilibrium
• HWE is reached after just one generation of random mating, irrespective of the
genotype frequencies in the parental generation (if the HW assumptions hold)
• In the early days of Mendelian genetics it was assumed that dominant alleles
would become more frequent and recessive alleles would vanish. HW shows this
to be incorrect
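Both implications can be illustrated with a small calculation. The sketch below assumes all HW assumptions hold and starts from genotype frequencies 0.2 / 0.2 / 0.6, the observed frequencies of population II; the function name is illustrative.

```python
def next_generation(f_AA, f_Aa, f_aa):
    """Genotype frequencies after one generation of random mating."""
    p = f_AA + 0.5 * f_Aa       # frequency of A among gametes
    q = f_aa + 0.5 * f_Aa       # frequency of a among gametes
    return p * p, 2 * p * q, q * q


start = (0.2, 0.2, 0.6)         # not in HW proportions
gen1 = next_generation(*start)
gen2 = next_generation(*gen1)

print([round(f, 2) for f in gen1])  # [0.09, 0.42, 0.49]
print([round(f, 2) for f in gen2])  # [0.09, 0.42, 0.49]
```

The genotype frequencies reach HW proportions after one generation and then stay there. The allele frequency p = 0.3 is the same in every generation, so neither allele spreads or vanishes merely because it is dominant or recessive.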
HWE Distribution
$p^2 + 2pq + q^2 = 1$
Hardy-Weinberg and evolutionary processes
[Figure: genotypes AA, Aa, aa in generation t give rise to genotypes AA, Aa, aa in generation t+1. The transition is labelled HW equilibrium, together with mutation, drift, selection, and migration.]
Hardy-Weinberg
• HW assumptions
• Infinite population size
• Random mating
• No mutation
• No selection
• No migration
The dotted line (the Hardy-Weinberg expectation) is close to the mean, so the locus is in Hardy-Weinberg equilibrium.
[Figure: Human HapMap, first 10,000 SNPs on chromosome 1. Source: gcbias.org/2011/10/13]
Hardy-Weinberg
application
• Mutation
• Genetic Drift
• Selection
The Forces of Evolution
• Mutation
• Genetic Drift
• Selection
Mutation
Organism and genome               Point-mutation rate (per site per generation)
Plant chloroplast DNA             $1.0 \times 10^{-9}$
Gram-negative bacterial DNA       $1.0 \times 10^{-9}$
Mammalian nuclear DNA             $3.5 \times 10^{-9}$
Plant nuclear DNA                 $5.0 \times 10^{-9}$
Drosophila nuclear DNA            $1.5 \times 10^{-8}$
Mammalian mitochondrial DNA       $5.7 \times 10^{-8}$
HIV-1                             $6.6 \times 10^{-3}$
Influenza A virus                 $1.3 \times 10^{-2}$
Mutation
Numbers of somatic mutations per gigabase for different cancers by age
• 10,250 cancer genomes across 36 cancer types
• Correlations between age at cancer diagnosis and the mutation signature attributed to that cancer
• 2 alleles, A and a
• Frequency of A = $p_0$
• A mutates to a at rate μ per generation
$p_1 = p_0 - \mu p_0 = p_0(1 - \mu)$
$p_t = p_0(1 - \mu)^t$
Directional Mutation
$p_t = p_0(1 - \mu)^t$
[Figure: allele frequency p (y-axis, 0.0 to 0.8) plotted against generations (x-axis, 0 to 200,000)]
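The directional-mutation formula can be evaluated directly. In the sketch below the starting frequency 0.8 and the rate μ = 1e-5 are assumed values chosen only to give a visible decline over 200,000 generations; they are not taken from the slides, and the function names are illustrative.

```python
import math


def allele_freq_after(p0, mu, t):
    """Frequency of A after t generations: p_t = p0 * (1 - mu)^t."""
    return p0 * (1.0 - mu) ** t


def generations_to_halve(mu):
    """Number of generations for p to fall to half its starting value."""
    return math.log(0.5) / math.log(1.0 - mu)


p0, mu = 0.8, 1e-5   # assumed illustrative values
for t in (0, 50_000, 100_000, 150_000, 200_000):
    print(t, round(allele_freq_after(p0, mu, t), 3))

print(round(generations_to_halve(mu)))
```

Even at this assumed rate, which is far higher than the nuclear DNA rates in the mutation-rate table, halving the frequency of A takes tens of thousands of generations, so mutation alone changes allele frequencies slowly.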
Genetic Drift
• Why do we care?
• Stochastically changes allele frequencies
• Changes occur without respect to the fitness of alleles or individuals
• Decreases heterozygosity = Increases homozygosity
• Increased likelihood of exposing deleterious recessive alleles
• May lower fitness of population
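The stochastic behaviour described above can be explored with a small simulation. The sketch below uses Wright-Fisher-style binomial resampling, which is an assumption on my part and is not named in the lecture: each generation, 2N alleles are drawn at random from the previous generation's allele frequency. The population size, seed, and function names are illustrative.

```python
import random


def simulate_drift(p0, pop_size, generations, seed=0):
    """Track the frequency of allele A under random sampling of gametes."""
    rng = random.Random(seed)          # seeded so the run is repeatable
    n_alleles = 2 * pop_size           # diploid population, 2N alleles
    p = p0
    trajectory = [p]
    for _generation in range(generations):
        # Each allele in the next generation is A with probability p.
        count_A = sum(1 for _draw in range(n_alleles) if rng.random() < p)
        p = count_A / n_alleles
        trajectory.append(p)
    return trajectory


def heterozygosity(p):
    """Expected heterozygosity 2pq for a two-allele locus."""
    return 2 * p * (1 - p)


traj = simulate_drift(p0=0.5, pop_size=20, generations=50, seed=1)
print([round(p, 2) for p in traj[:10]])
print(round(heterozygosity(traj[0]), 3), round(heterozygosity(traj[-1]), 3))
```

No fitness values appear anywhere in the code, so any change in p is due to sampling alone. A single run may rise or fall; the loss of heterozygosity is a tendency that shows up when many runs are averaged, and once p reaches 0 or 1 it stays there.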