
Population Genetics

1 Classical Population Genetics Measuring genetic variation & Hardy Weinberg


2 Classical Population Genetics Hardy Weinberg cont., Mutation, Genetic Drift
3 Classical Population Genetics Genetic Drift cont.
4 Classical Population Genetics Gene Flow, Natural Selection
5 Classical Population Genetics Natural Selection cont.
6 Molecular Population Genetics Neutral Theory, Coalescent Theory
7 Molecular Population Genetics Linkage Disequilibrium, Haplotypes, Selection
Genetic variation is the rule, not the exception

We are all mutants.


But some of us are more mutant
than others.

Human genome = 3Gb


@ 99.9% identical = 3M differences

Every individual is born with 100 new mutations. Most are silent.
• 4 non-synonymous mutations
• 3 detrimental mutations
Genetic Variation & Diversity

From typological thinking …


There's one type that represents the entire species

Takes into account the diversity of individuals within each species


… to population thinking

Sarah Leen, National Geographic


Genetic Variation & Diversity

From wild type thinking …

Arabidopsis thaliana ecotype Col-0

Henslow and Darwin’s observation of variation of ecotypes/ phenotypes


… to population thinking
Many variations exist within species

Arabidopsis thaliana ecotypes

Xu et al. 2015 Sci.Rep. Janne Lempe & Detlef Weigel, Max Planck Institute for Developmental Biology
Genetic Variation & Diversity

• Why do we study genetic variation?


• Identify drivers of genotypic and phenotypic diversity
• Predict consequences & fate of diversity
• Understand significance of diversity

• How do we study genetic variation?


• Population Genetics

understandingrace.org
Population Genetics

• Study of genetic variation among individuals in a population.

• Population = group of interbreeding individuals


• Gene pool = the collection of genes shared by a population of individuals

• Primary questions
• What is the structure of the gene pool & how does it change over time?
• What biological characteristics impact the structure of the gene pool?
• What are the evolutionary forces that change the gene pool?

Biological characteristics Evolutionary forces


• Population structure • Mutation
• Breeding system • Migration
• Age structure • Natural selection
• Fecundity • Genetic drift
• Recombination
Genetic Diversity

Loci 1 2 3 4 5 6 7 8 9 10

Individual 1

Individual 2

Individual 3

Individual 4

• Locus: A specific location in the genome


Genetic Diversity

Variation of a single gene

en.wikipedia.org/wiki/MicrobesOnline
Genetic Diversity

Invariant: the same amino acid (AA)


Genetic Diversity

Single Nucleotide Polymorphisms (SNPs)
…ATAGCTGCTCGATTT…
…ATAGCGGCTCGATTT…
…ATAGCAGCTCGATTT…

Insertions/Deletions (InDels)
…ATAGCTGGGTGCTCGATTT…
…ATAGC-----GCTCGATTT…

Copy Number Variation (CNV)

Structural Variation (SV)
Genetic Diversity

Loci 1 2 3 4 5 6 7 8 9 10

Individual 1

Individual 2

Individual 3

Individual 4

• Locus: A specific location in the genome


• Monomorphic loci: locations with no variation
• Polymorphic loci: locations with multiple variants

• Allele: Different forms of the same locus


• Major allele: allele at highest frequency
• Minor allele: low frequency allele
Heterozygosity

Loci 1 2 3 4 5 6 7 8 9 10

Individual 1

Individual 2

Individual 3

Individual 4

• Polymorphism
• Co-occurrence of 2 or more alleles at a locus within a population

• Heterozygosity
• Probability or fraction of individuals expected to be heterozygous at a particular
locus given the allele frequencies at that locus
• Population statistic, not statement of genotype
• A locus can be polymorphic, have no heterozygous individuals, and still have
heterozygosity
• Heterozygosity ≠ Heterozygote
Heterozygosity
• Fraction of individuals expected to be heterozygous for a particular locus given the
allele frequencies at that locus

$H = 1 - \sum_{i=1}^{k} p_i^2$

• where $p_i$ is the frequency of the $i$th of $k$ alleles

$\bar{H} = \frac{1}{m} \sum_{j=1}^{m} H_j$

• where $m$ is the number of loci and $H_j$ is the heterozygosity at locus $j$

Allele Frequency (%)

Locus   Allele 1   Allele 2   Allele 3   Allele 4   Heterozygosity
1       100
2       99         1
3       90         10
4       80         20
5       70         30
6       60         40
7       50         50
8       50         40         10
9       50         30         20
10      50         25         25
11      50         25         12.5       12.5
Heterozygosity
• Worked example for locus 2 (allele frequencies 99% and 1%), using $H = 1 - \sum p_i^2$:

$H = 1 - (0.99^2 + 0.01^2)$
$H = 1 - (0.9801 + 0.0001)$
$H = 1 - 0.9802$
$H = 0.0198 \approx 0.020$
Heterozygosity
• Fraction of individuals expected to be heterozygous for a particular locus given the
allele frequencies at that locus

$H = 1 - \sum_{i=1}^{k} p_i^2$

• where $p_i$ is the frequency of the $i$th of $k$ alleles

$\bar{H} = \frac{1}{m} \sum_{j=1}^{m} H_j$

• where $m$ is the number of loci and $H_j$ is the heterozygosity at locus $j$

Allele Frequency (%)

Locus   Allele 1   Allele 2   Allele 3   Allele 4   Heterozygosity
1       100                                         0.000
2       99         1                                0.020
3       90         10                               0.180
4       80         20                               0.320
5       70         30                               0.420
6       60         40                               0.480
7       50         50                               0.500
8       50         40         10                    0.580
9       50         30         20                    0.620
10      50         25         25                    0.625
11      50         25         12.5       12.5       0.656

Total heterozygosity (mean across the 11 loci) = 0.400
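The per-locus values and the total in the table can be reproduced with a few lines of code. This is a minimal sketch, assuming heterozygosity is one minus the sum of squared allele frequencies and that the total is the plain average across loci (which matches the 0.400 shown). The function name `expected_heterozygosity` and the list `loci` are illustrative names, not part of the lecture.

```python
def expected_heterozygosity(freqs):
    """Return H = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    total = sum(freqs)
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"allele frequencies sum to {total}, expected 1")
    return 1.0 - sum(p * p for p in freqs)


# Allele frequencies for the 11 loci in the table, written as proportions.
loci = [
    [1.00],
    [0.99, 0.01],
    [0.90, 0.10],
    [0.80, 0.20],
    [0.70, 0.30],
    [0.60, 0.40],
    [0.50, 0.50],
    [0.50, 0.40, 0.10],
    [0.50, 0.30, 0.20],
    [0.50, 0.25, 0.25],
    [0.50, 0.25, 0.125, 0.125],
]

per_locus = [expected_heterozygosity(f) for f in loci]
mean_h = sum(per_locus) / len(per_locus)  # average over the m loci

for i, h in enumerate(per_locus, start=1):
    print(f"locus {i:2d}: H = {h:.3f}")
print(f"mean H across loci = {mean_h:.3f}")
```

The same function covers the haploid case: it gives the probability that two alleles sampled at random from the population differ.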


Genetic Diversity
Heterozygosity of 51 worldwide human populations from 1M SNPs

López et al. 2009. PLoS ONE 4(11): e7888.


Polymorphism & Heterozygosity

[Figure: scatter plot of polymorphism (y-axis, 0 to 0.7) against heterozygosity (x-axis, 0 to 0.18) for Insects, Land Snails, Marine Inverts, Plants, Marine Snails, Reptiles, Fish, Mammals, Wasps, Rodents, Amphibians, and Birds. Regions of the plot are labeled "lower heterozygosity than expected given polymorphism" and "higher heterozygosity than expected given polymorphism".]

Adapted from Russell, iGenetics, Table 22.3


Genetic Diversity

Heterozygosity in Haploids

How can haploid organisms be heterozygous?

Effective Heterozygosity
• Probability of sampling two different alleles from a haploid population

$Het = 1 - (0.6^2 + 0.4^2) = 0.48$


Heterozygosity

Loci 1 2 3 4 5 6 7 8 9 10

Individual 1

Individual 2

Individual 3

Individual 4

0.22 0.47 0.50 0.38 0.38 0.47 0.47 0.22 0.50 0.22

Total heterozygosity = 0.38

• High heterozygosity shared among all individuals

Single interbreeding population


Heterozygosity

Loci 1 2 3 4 5 6 7 8 9 10

Individual 1

Individual 2

Individual 3

Individual 4

0.22 0.47 0.50 0.38 0.38 0.47 0.47 0.22 0.50 0.22

Total heterozygosity = 0.38

• Same total heterozygosity, but subdivided into 2 populations

Population substructure
Heterozygosity

Loci 1 2 3 4 5 6 7 8 9 10

Individual 1

Individual 2

Individual 3

Individual 4

0.38 0.50 0.50 0.38 0.38 0.38 0.38 0.00 0.50 0.38

Total heterozygosity = 0.38

• Same total heterozygosity, but no heterozygosity within individuals

High inbreeding, or selfing mating system


Polymorphism & Heterozygosity

Loci 1 2 3 4 5 6 7 8 9 10

Individual 1

Individual 2

Individual 3

Individual 4

Polymorphism:
• Co-occurrence of 2 or more alleles at a locus within a population
• Assesses only the number of variable sites
• Does not assess how prevalent the variation is or how it is distributed

Heterozygosity:
• Probability or fraction of individuals expected to be heterozygous at a particular
locus given the allele frequencies at that locus
• Assesses number of variable sites, prevalence of variation, and how the variation
is partitioned within and among individuals and populations (more on that later)
Hardy-Weinberg (HW)

• Model for explaining how Mendelian principles influence the distribution and fate of
genetic variation over time

• HW is a very simple model with very strong assumptions


• Infinite population size
• Random mating
• No mutation
• No selection
• No migration

• Allele frequencies will not change over time in a population that fulfills HW assumptions
Hardy-Weinberg Equilibrium (HWE)
• Non-evolving population (null hypothesis)

• A population NOT in HW equilibrium must have violated a HW assumption


• Therefore, we can use the HW framework to identify what evolutionary forces are
acting on a population
Hardy-Weinberg

Genotype frequencies (n diploid individuals; 3 genotypes: AA / Aa / aa)

Genotype   Count   Frequency
AA         452     0.909
Aa         43      0.087
aa         2       0.004
Total      497     1.000

Allele frequencies (2n haploid gametes; 2 alleles: A / a)

Allele     Count   Frequency
A          947     0.953
a          47      0.047
Total      994     1.000

Next generation: n diploid individuals; 3 genotypes: AA / Aa / aa
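The step from genotype counts to allele frequencies is a gene-counting calculation: each diploid individual contributes two alleles, so AA carries two copies of A and Aa carries one. This is a minimal sketch using the counts from the slide (452, 43, 2); the function name `allele_frequencies` is a hypothetical helper, not something defined in the lecture.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q), the frequencies of alleles A and a, from genotype counts."""
    n = n_AA + n_Aa + n_aa           # number of diploid individuals
    total_alleles = 2 * n            # 2n haploid gametes
    count_A = 2 * n_AA + n_Aa        # AA has two A copies, Aa has one
    count_a = 2 * n_aa + n_Aa        # aa has two a copies, Aa has one
    return count_A / total_alleles, count_a / total_alleles


p, q = allele_frequencies(452, 43, 2)
print(f"p = freq(A) = {p:.3f}")
print(f"q = freq(a) = {q:.3f}")
```

Counting alleles directly gives the same result as p = freq(AA) + ½ freq(Aa), because dividing the counts by 2n is the same as halving the heterozygote frequency.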
Hardy-Weinberg

Individual

                  Mother
                  A      a
Father     A      AA     Aa
           a      aA     aa
Hardy-Weinberg

Population

                        Female Gametes
                        A      a
Male Gametes     A      AA     Aa
                 a      aA     aa
Hardy-Weinberg

Population

                          Female Gametes
                          A (p)    a (q)
Male Gametes     A (p)    AA       Aa
                 a (q)    aA       aa

Allele Frequencies
freq(A) = p
freq(a) = q
p + q = 1
Lec 6, Fri, 9 Mar

BIO260 News

• Next tutorial
• Hardy-Weinberg and Hypothesis Testing
• Quiz on material from March 7 –12 (Wednesday – Monday)

Hardy-Weinberg

Population

                          Female Gametes
                          A (p)         a (q)
Male Gametes     A (p)    AA ($p^2$)    Aa ($pq$)
                 a (q)    aA ($qp$)     aa ($q^2$)

Allele Frequencies
freq(A) = p
freq(a) = q
p + q = 1

Genotype Frequencies
$p^2 + 2pq + q^2 = 1$

Hardy-Weinberg Equilibrium Distribution


Hardy-Weinberg

       Observed Numbers           Observed Genotype Freq.  Allele Freq.   Expected Genotype Freq.  Expected Numbers
pop.   AA     Aa     aa     n     AA     Aa     aa         p     q        AA     Aa     aa         AA      Aa      aa
I      30     0      70     100   0.3    0.0    0.7

Observed AA frequency = AA/n = 30/100 = 0.3
Observed Aa frequency = Aa/n = 0/100 = 0.0
Observed aa frequency = aa/n = 70/100 = 0.7
Hardy-Weinberg

       Observed Numbers           Observed Genotype Freq.  Allele Freq.   Expected Genotype Freq.  Expected Numbers
pop.   AA     Aa     aa     n     AA     Aa     aa         p     q        AA     Aa     aa         AA      Aa      aa
I      30     0      70     100   0.3    0.0    0.7

p is the frequency of gametes carrying one specific allele (A) at this two-allele locus; q is the frequency of the other allele (a)

p = freq(AA) + ½ freq(Aa)
q = freq(aa) + ½ freq(Aa)
Hardy-Weinberg

       Observed Numbers           Observed Genotype Freq.  Allele Freq.   Expected Genotype Freq.  Expected Numbers
pop.   AA     Aa     aa     n     AA     Aa     aa         p     q        AA     Aa     aa         AA      Aa      aa
I      30     0      70     100   0.3    0.0    0.7        0.3   0.7

Frequency of A
p = freq(AA) + ½ freq(Aa) = 0.3 + ½(0.0) = 0.3
Frequency of a
q = freq(aa) + ½ freq(Aa) = 0.7 + ½(0.0) = 0.7
Alternatively: q = 1 - p = 0.7
Hardy-Weinberg

       Observed Numbers           Observed Genotype Freq.  Allele Freq.   Expected Genotype Freq.  Expected Numbers
pop.   AA     Aa     aa     n     AA     Aa     aa         p     q        AA     Aa     aa         AA      Aa      aa
I      30     0      70     100   0.3    0.0    0.7        0.3   0.7

Have the alleles assorted randomly?

Expected AA frequency = $p^2$
Expected Aa frequency = $2pq$
Expected aa frequency = $q^2$
Hardy-Weinberg

       Observed Numbers           Observed Genotype Freq.  Allele Freq.   Expected Genotype Freq.  Expected Numbers
pop.   AA     Aa     aa     n     AA     Aa     aa         p     q        AA     Aa     aa         AA      Aa      aa
I      30     0      70     100   0.3    0.0    0.7        0.3   0.7      0.09   0.42   0.49

Have the alleles assorted randomly?

Expected AA frequency = $p^2 = (0.3)^2 = 0.09$
Expected Aa frequency = $2pq = 2(0.3)(0.7) = 0.42$
Expected aa frequency = $q^2 = (0.7)^2 = 0.49$
Hardy-Weinberg

       Observed Numbers           Observed Genotype Freq.  Allele Freq.   Expected Genotype Freq.  Expected Numbers
pop.   AA     Aa     aa     n     AA     Aa     aa         p     q        AA     Aa     aa         AA      Aa      aa
I      30     0      70     100   0.3    0.0    0.7        0.3   0.7      0.09   0.42   0.49       9       42      49

Assuming constant population size

Expected AA numbers = 0.09 x 100 = 9
Expected Aa numbers = 0.42 x 100 = 42
Expected aa numbers = 0.49 x 100 = 49
Hardy-Weinberg

       Observed Numbers           Observed Genotype Freq.  Allele Freq.   Expected Genotype Freq.  Expected Numbers
pop.   AA     Aa     aa     n     AA     Aa     aa         p     q        AA     Aa     aa         AA      Aa      aa
I      30     0      70     100   0.3    0.0    0.7        0.3   0.7      0.09   0.42   0.49       9       42      49

Are the observed genotypes different from the expected genotypes?


• Is the population in HWE?

• Yes → the assumptions hold


• no evolutionary forces are acting on the population
• No → an assumption has been violated
• some evolutionary force is in play

We will determine if the observed deviation from expected is statistically significant in the hypothesis testing tutorial
Hardy-Weinberg

       Observed Numbers           Observed Genotype Freq.  Allele Freq.   Expected Genotype Freq.  Expected Numbers
pop.   AA     Aa     aa     n     AA     Aa     aa         p     q        AA     Aa     aa         AA      Aa      aa
I      30     0      70     100   0.3    0.0    0.7        0.3   0.7      0.09   0.42   0.49       9       42      49
II     6      6      18     30
III    58     232    290    580

When looking at a population in an ecosystem you can obtain wildly varying numbers. This
technique can be applied to small or large sample sizes.
Hardy-Weinberg

       Observed Numbers           Observed Genotype Freq.  Allele Freq.   Expected Genotype Freq.  Expected Numbers
pop.   AA     Aa     aa     n     AA     Aa     aa         p     q        AA     Aa     aa         AA      Aa      aa
I      30     0      70     100   0.3    0.0    0.7        0.3   0.7      0.09   0.42   0.49       9       42      49
II     6      6      18     30    0.2    0.2    0.6        0.3   0.7
III    58     232    290    580   0.1    0.4    0.5        0.3   0.7

p = freq(AA) + ½ freq(Aa)
q = freq(aa) + ½ freq(Aa)
Hardy-Weinberg

       Observed Numbers           Observed Genotype Freq.  Allele Freq.   Expected Genotype Freq.  Expected Numbers
pop.   AA     Aa     aa     n     AA     Aa     aa         p     q        AA     Aa     aa         AA      Aa      aa
I      30     0      70     100   0.3    0.0    0.7        0.3   0.7      0.09   0.42   0.49       9       42      49
II     6      6      18     30    0.2    0.2    0.6        0.3   0.7      0.09   0.42   0.49       2.7     12.6    14.7
III    58     232    290    580   0.1    0.4    0.5        0.3   0.7      0.09   0.42   0.49       52.2    243.6   284.2

Allele frequency is a separate question from whether a population is in Hardy-Weinberg equilibrium: all three populations have the same p and q but different genotype frequencies

Expected AA frequency = $p^2$
Expected Aa frequency = $2pq$
Expected aa frequency = $q^2$
Hardy-Weinberg

       Observed Numbers           Observed Genotype Freq.  Allele Freq.   Expected Genotype Freq.  Expected Numbers
pop.   AA     Aa     aa     n     AA     Aa     aa         p     q        AA     Aa     aa         AA      Aa      aa
I      30     0      70     100   0.3    0.0    0.7        0.3   0.7      0.09   0.42   0.49       9       42      49
II     6      6      18     30    0.2    0.2    0.6        0.3   0.7      0.09   0.42   0.49       2.7     12.6    14.7
III    58     232    290    580   0.1    0.4    0.5        0.3   0.7      0.09   0.42   0.49       52.2    243.6   284.2
IV     9      42     49     100   0.09   0.42   0.49       0.3   0.7      0.09   0.42   0.49       9       42      49
V      2.7    12.6   14.7   30    0.09   0.42   0.49       0.3   0.7      0.09   0.42   0.49       2.7     12.6    14.7
VI     52     244    284    580   0.09   0.42   0.49       0.3   0.7      0.09   0.42   0.49       52.2    243.6   284.2
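The table columns can be filled in mechanically from the observed counts. This is a minimal sketch, assuming a single autosomal locus with two alleles; the function `hardy_weinberg_expected` and the dictionary `populations` are illustrative names introduced here, and the counts are those of populations I to III above.

```python
def hardy_weinberg_expected(n_AA, n_Aa, n_aa):
    """From observed genotype counts, return (p, q, expected freqs, expected counts)."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)         # frequency of allele A
    q = 1.0 - p                             # frequency of allele a
    exp_freqs = (p * p, 2 * p * q, q * q)   # p^2, 2pq, q^2
    exp_counts = tuple(f * n for f in exp_freqs)
    return p, q, exp_freqs, exp_counts


# Observed (AA, Aa, aa) counts for populations I, II and III.
populations = {"I": (30, 0, 70), "II": (6, 6, 18), "III": (58, 232, 290)}

for name, counts in populations.items():
    p, q, exp_freqs, exp_counts = hardy_weinberg_expected(*counts)
    freqs = ", ".join(f"{f:.2f}" for f in exp_freqs)
    nums = ", ".join(f"{c:.1f}" for c in exp_counts)
    print(f"pop {name}: p={p:.1f} q={q:.1f} expected freq=({freqs}) expected n=({nums})")
```

All three populations give p = 0.3 and q = 0.7, so their expected genotype frequencies are identical even though their observed genotype counts differ. Whether each deviation is statistically significant is a separate step that this sketch does not cover.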
Implications of Hardy-Weinberg Equilibrium

• No change in allele frequencies over time (no loss of genetic variation)

• HWE is obtained after just one generation irrespective of the genotype frequencies
in the parental generation (if the HW assumptions hold)

• The mechanism of Mendelian inheritance (law of segregation) maintains genetic variation

• In the early days of Mendelian genetics it was assumed that dominant alleles
would become more frequent and recessive alleles would vanish. HW shows this
to be incorrect

HWE Distribution
$p^2 + 2pq + q^2 = 1$
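The claim that equilibrium is reached in a single generation can be checked numerically. This is a minimal sketch, assuming the HW conditions hold (random mating, no mutation, selection, migration or drift) at one autosomal locus with two alleles. The function name `next_generation` is hypothetical, and the starting point is population I from the earlier table, which has no heterozygotes.

```python
def next_generation(f_AA, f_Aa, f_aa):
    """Genotype frequencies after one round of random mating under HW assumptions."""
    p = f_AA + 0.5 * f_Aa     # allele A frequency in the gamete pool
    q = f_aa + 0.5 * f_Aa     # allele a frequency in the gamete pool
    return p * p, 2 * p * q, q * q


gen0 = (0.3, 0.0, 0.7)        # far from HW proportions
gen1 = next_generation(*gen0)
gen2 = next_generation(*gen1)

for label, gen in (("t", gen0), ("t+1", gen1), ("t+2", gen2)):
    p = gen[0] + 0.5 * gen[1]
    print(f"generation {label}: AA={gen[0]:.2f} Aa={gen[1]:.2f} aa={gen[2]:.2f} p={p:.2f}")
```

Generation t+1 already shows the proportions 0.09, 0.42 and 0.49, and generation t+2 is identical to it. The allele frequency p stays at 0.3 throughout.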
Hardy-Weinberg
and evolutionary processes

Genotypes, generation t: AA, Aa, aa
→ Alleles: A, a
→ Genotypes, generation t+1: AA, Aa, aa

Are these frequencies the same?
• yes → HW equilibrium
• no → mutation, drift, selection, migration
Hardy-Weinberg

• The Hardy-Weinberg law explains what happens to a population’s allelic and genotypic
frequencies as the alleles are passed from generation to generation in the absence of
evolutionary processes

• HW assumptions
• Infinite population size
• Random mating
• No mutation
• No selection
• No migration

• If the HW assumptions are met


• Alleles are expected to combine into genotypes based on simple Mendelian laws
• The population will reach a genetic equilibrium within one generation
• The population will remain in a genetic equilibrium so long as the assumptions are
maintained
• Allele frequencies will not change from one generation to the next
• Genetic diversity will not be lost
Hardy-Weinberg

• HW assumptions
• Infinite population size
• Random mating
• No mutation
• No selection
• No migration

• How can these ever be true???


• Assumptions only apply to locus under study
• Does the specific locus influence mating preference?
• Is the specific locus under selection?
• Is the specific locus strongly linked to population structure?

Most loci are in HWE


HW Equilibrium in Human Populations
A haplotype is a cluster of specific alleles expected to be inherited together (SNPs that occur together).
Each SNP is 1 locus with 2 alleles.

The dotted line (Hardy-Weinberg expectation) is close to the mean, so these loci are in Hardy-Weinberg equilibrium.

Human HapMap
1st 10,000 SNPs
Chromosome 1

gcbias.org/2011/10/13
Hardy-Weinberg
application

Normal Vision Red-Green Colorblind


Hardy-Weinberg
application

• Red-green color blindness


• Recessive mutation in X-linked gene for red & green color receptors
• Accounts for ~95% of all color vision variation
• Frequency of ~12% among men
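• q = freq(a) = 0.12 (males carry a single X, so the frequency of color-blind men equals the allele frequency)

What is the frequency among women?

Sex       Genotype   Frequency                        Phenotype
Males     A          $p = 0.88$                       Normal
          a          $q = 0.12$                       Color blind
Females   AA         $p^2 = 0.77$                     Normal
          Aa         $2pq = 0.21$                     Normal
          aa         $q^2 = 0.0144 \approx 0.01$      Color blind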
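The female frequencies follow directly from the allele frequency observed in males. This is a minimal sketch, assuming an X-linked recessive allele at Hardy-Weinberg proportions with the same allele frequency in both sexes; the function name `x_linked_frequencies` is illustrative only.

```python
def x_linked_frequencies(q):
    """Genotype frequencies for an X-linked locus given recessive allele frequency q."""
    p = 1.0 - q
    return {
        # Males carry one X chromosome, so genotype frequency = allele frequency.
        "males": {"A": p, "a": q},
        # Females carry two X chromosomes, so HW proportions apply.
        "females": {"AA": p * p, "Aa": 2 * p * q, "aa": q * q},
    }


freqs = x_linked_frequencies(0.12)
for sex, genotypes in freqs.items():
    for genotype, f in genotypes.items():
        print(f"{sex:8s} {genotype:3s} {f:.4f}")
```

With q = 0.12 the expected frequency of color-blind women is q² = 0.0144, roughly 1.4%, compared with 12% of men. About 21% of women are expected to be unaffected carriers.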


The Forces of Evolution

• Mutation

• Genetic Drift

• Migration (Gene Flow)

• Selection
Mutation

• The ultimate source of all genetic variability

• Occur at a clock-like rate


• Number is proportional to time
• Very high variance

Organism and Genome               Point-Mutation Rate (per site per generation)
Plant chloroplast DNA             1.0 x 10^-9
Gram-negative bacterial DNA       1.0 x 10^-9
Mammalian nuclear DNA             3.5 x 10^-9
Plant nuclear DNA                 5.0 x 10^-9
Drosophila nuclear DNA            1.5 x 10^-8
Mammalian mitochondrial DNA       5.7 x 10^-8
HIV-1                             6.6 x 10^-3
Influenza A virus                 1.3 x 10^-2
Mutation
Numbers of somatic mutations per gigabase for different cancers by age
• 10,250 cancer genomes across 36 cancer types
• Correlations between the age at cancer diagnosis and the mutation signature attributed to that cancer

Alexandrov 2015 Nat Genet. 47(12): 1402–1407


Directional Mutation
Mutation violates the Hardy-Weinberg assumptions

• 2 alleles, A and a

• Frequency of A = $p_0$

• Rate of mutational change from A → a = $\mu$

• Frequency of A after mutation ($p_1$)

= initial frequency ($p_0$) - frequency of mutated alleles ($\mu p_0$)

$p_1 = p_0 - \mu p_0 = p_0(1-\mu)$

$p_2 = p_1 - \mu p_1 = p_1(1-\mu) = p_0(1-\mu)(1-\mu) = p_0(1-\mu)^2$

$p_3 = p_2 - \mu p_2 = p_2(1-\mu) = p_0(1-\mu)^2(1-\mu) = p_0(1-\mu)^3$

$p_t = p_0(1-\mu)^t$
Directional Mutation

$p_t = p_0(1-\mu)^t$

since $1-\mu < 1$,

as $t \to \infty$, $p_t \to 0$

Recurrent directional mutation at $\mu = 10^{-5}$

[Figure: p (y-axis, 0.0 to 1.0) plotted against generations (x-axis, 0 to 200,000)]
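The decay curve can be tabulated from the closed-form expression. This is a minimal sketch, assuming one-way mutation from A to a at a constant rate and no other evolutionary force. The function name `freq_after_mutation` and the starting frequency of 1.0 are assumptions chosen for illustration; the mutation rate of 10^-5 and the 200,000-generation range come from the slide.

```python
def freq_after_mutation(p0, mu, t):
    """Frequency of allele A after t generations of one-way mutation A -> a."""
    return p0 * (1.0 - mu) ** t


p0 = 1.0      # assumed starting frequency of A (illustrative)
mu = 1e-5     # mutation rate per generation, as on the slide

for t in (0, 50_000, 100_000, 150_000, 200_000):
    print(f"t = {t:>7,d}  p_t = {freq_after_mutation(p0, mu, t):.3f}")

# Sanity check: stepping one generation at a time matches the closed form.
p = p0
for _ in range(1000):
    p = p - mu * p
print(abs(p - freq_after_mutation(p0, mu, 1000)) < 1e-9)
```

Even after 200,000 generations the frequency of A is still above 0.1, which shows how slowly mutation alone changes allele frequencies at realistic rates.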
The Forces of Evolution

• Mutation

• Genetic Drift

• Migration (Gene Flow)

• Selection
Genetic Drift

Change in allele frequencies due to random sampling variation between generations


• Stochastic sampling process

• Only significant in finite populations


• Relaxation of infinite population size assumption
• Magnitude is inversely related to the population size (N)

• Why do we care?
• Stochastically changes allele frequencies
• Changes occur without respect to the fitness of alleles or individuals
• Decreases heterozygosity = Increases homozygosity
• Increased likelihood of exposing deleterious recessive alleles
• May lower fitness of population
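Drift as random sampling between generations can be illustrated with a small simulation. This is a minimal sketch, assuming a Wright-Fisher-style model in which each generation's 2N gametes are drawn at random from the previous generation's allele frequency. That model choice, the function name `wright_fisher`, and all parameter values are assumptions made here, not taken from the slides.

```python
import random


def wright_fisher(p0, n_individuals, generations, seed=0):
    """Simulate the frequency of allele A under drift alone; returns the trajectory."""
    rng = random.Random(seed)           # seeded so that runs are repeatable
    n_gametes = 2 * n_individuals       # diploid population contributes 2N alleles
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Each gamete in the next generation is A with probability p.
        copies = sum(1 for _ in range(n_gametes) if rng.random() < p)
        p = copies / n_gametes
        trajectory.append(p)
    return trajectory


for n in (10, 1000):
    traj = wright_fisher(p0=0.5, n_individuals=n, generations=50, seed=1)
    biggest_step = max(abs(b - a) for a, b in zip(traj, traj[1:]))
    print(f"N = {n:4d}: final p = {traj[-1]:.3f}, largest one-generation change = {biggest_step:.3f}")
```

Comparing the two population sizes illustrates the inverse relationship with N: smaller populations tend to take larger random steps per generation. Once p reaches 0 or 1 it stays there, which is how drift removes heterozygosity.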
The Forces of Evolution

• Mutation

• Genetic Drift

• Migration (Gene Flow)

• Selection
