
Bioinformatics and MAS - an Indian Experience

N. K. Singh and T. R. Sharma

NRC Plant Biotechnology IARI, New Delhi

Outline
1. Informatics in MAS
2. Rice genome sequence information for mapping, tagging and map-based cloning of genes
3. Mining novel alleles of cloned genes in the germplasm
4. Synteny and colinearity - transferring rice genome information to wheat
5. Databases and web-based tools to assist breeders
6. High throughput genotyping to save time and cost
7. Proposed activities for the Indo-Aus wheat MAS network

1. Informatics in MAS

Different Phases of Plant Genomics Research

Figure: crops placed in four phases of genomics research along an arrow of time.

Phase 1: Chickpea, Pigeonpea, Mango, Banana
Phase 2: Maize, Wheat, Cotton, Sugarcane
Phase 3: Tomato, Soybean, Sorghum, Medicago, Brassica
Phase 4: Rice

Activities shown along the arrow of time:
- Basic resources: germplasm, mutant lines, knock outs, mapping populations, GSTs, ESTs, BAC libraries, BAC-end sequences, bioinformatics
- High density molecular genetic map
- Identify DNA markers linked with traits (simple/QTL)
- Map based cloning
- Gene based markers
- Pilot genome sequencing
- Large scale genome sequencing
- High throughput gene/marker discovery
- Functional genomics: genotype, transcriptome, proteome, metabolome, phenome
- Applications in crop improvement (MAS/transgenics)

Status of genome maps


Group      Completed                                                          In progress
Microbes   268                                                                543
Animals    8 (Human, Chimp, Mouse, Rat, Fruit fly, Mosquito, Fugu fish)       10
Plants     3 (Arabidopsis, Rice, Poplar)                                      5 (Medicago, Lotus, Soybean, Sorghum, Tomato)

Source: NCBI

Enormous size of crop genomes


1 pg = 1 billion base pairs (1000 Mbp)

- Microbes 5 Mb
- Arabidopsis 125 Mb
- Rice 390 Mb
- Sorghum 1000 Mb
- Maize 2500 Mb
- Human 3000 Mb
- Barley 6000 Mb
- Wheat 16000 Mb
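To put the genome sizes above on a common scale, the short sketch below converts each size to an approximate DNA content in picograms using the slide's rule of thumb (1 pg = 1000 Mbp) and expresses it as a multiple of the rice genome. The function names are illustrative only and are not part of any tool mentioned in this talk.

```python
# Genome sizes in Mb, as listed on the slide.
GENOME_SIZE_MB = {
    "Microbes": 5,
    "Arabidopsis": 125,
    "Rice": 390,
    "Sorghum": 1000,
    "Maize": 2500,
    "Human": 3000,
    "Barley": 6000,
    "Wheat": 16000,
}

# Rule of thumb from the slide: 1 pg of DNA = 1000 Mbp (an approximation).
MB_PER_PG = 1000


def dna_content_pg(size_mb):
    """Approximate DNA content in picograms for a genome of size_mb megabases."""
    return size_mb / MB_PER_PG


def fold_of_rice(crop):
    """How many times larger a genome is than the rice genome."""
    return GENOME_SIZE_MB[crop] / GENOME_SIZE_MB["Rice"]


# Print the genomes from smallest to largest.
for crop, size in sorted(GENOME_SIZE_MB.items(), key=lambda kv: kv[1]):
    print(f"{crop:12s} {size:6d} Mb  ~{dna_content_pg(size):6.3f} pg  "
          f"{fold_of_rice(crop):5.1f}x rice")
```

On these figures the wheat genome is about 41 times the size of the rice genome, which is why the talk turns to rice as the reference for wheat.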

What is needed?
(for Molecular Breeding in Orphan Crops)

1. De novo tools- Better maps, ESTs, BACs


(Sequencing, genotyping facility)

2. Comparative genomics- leverage information from model species


(Genome-informatics facility, human resource)

3. Create novel genetic variation- wide crosses, transgenics


(Green houses, tissue culture facility, gene constructs, IPR)

NRCPB Genoinformatics Centre

2. Rice genome sequence information for mapping, tagging and map-based cloning of genes

Maps of the 12 Sequenced Rice Chromosomes

Size = 388.8 Mb

Ref: IRGSP, Nature 11 August 2005

Mapping of QTLs/Genes for important traits in rice


- Basmati quality traits
- Grain number
- Salt tolerance
- Blast resistance

Gene Discovery: Efforts at NRCPB, IARI and CSSRI

Fine mapping of QTLs, and expression profiling (micro array/ proteomics) of genes for complex agronomic traits
Effect of 100 mM NaCl on salt susceptible (MI 48) and salt tolerant (CSR 27) varieties of rice

Figure: Distribution of grain length in RILs. Histogram of frequency (0 to 30) against grain length (in mm) for parents P1 and P2 and the RILs, with bins from 5.51-5.71 mm to 9.11-9.31 mm.

Overview of Quality QTLs in Pusa 1121

Salt tolerance

Effect of 100 mM NaCl on salt susceptible (MI 48) and salt tolerant (CSR 27) varieties of rice

Graphic display of detected QTLs for 17 salt tolerance parameters

Genetic and physical map of the Pi-kh locus - comparative genomics approaches

Figure: genetic map of the Pi-kh locus on chromosome 11 (distances in cM) aligned with the physical map of the Pi-kh locus in Nipponbare (BAC accession numbers and sizes in kb), based on a search for new SSR markers in Nipponbare.
Markers shown: RM536, RM202, CAPS100, RM6965, RM2190, RM206, TRS 26, TRS 33, RM224.
BAC accessions shown: AC104846, AC122143, AC145349, AC125782, AC118340, AC125780, AC121327, AC109832.
Interval labels shown: 28 Mb, ~1 Mb, 142 kb, 1.5 kb. Gene counts shown: 7400 genes, 18 genes.

Candidate gene identified

Cloning of Disease Resistance Gene in Rice

R-gene like sequences (Nipponbare)

Design PCR primers flanking the R-gene

Expected PCR product

Isolation & Structure of the Candidate Gene


Figure: structure of the candidate gene in Tetep and HP2216. Each diagram shows primers P1 and P2, upstream elements (WUN-1, MeJA-responsive element, CAAT box, TATA box, GT1 box and an AAcAAA motif, at positions between -343 and -47), the 990 bp ORF and the poly-A site. One position is labelled T1 in Tetep and G1 in HP2216.

PCR amplification of Pi-kh gene from Tetep and HP2216

Gene structure

Comparison of gene structure between NB & cloned gene

Sharma et al. Mol Gen Genomics: 274:569-578

Table: Growing number of genes for important agronomic traits cloned recently making use of the rice genome sequence information

S. no.  Trait                              Gene                                        Reference
1       Bacterial leaf blight resistance   Xa 21 (NBS-LRR type receptor kinase)        Song et al. (1995)
2       Plant height                       Sd 1 (gibberellin-20-oxidase)               Sasaki et al. (2002)
3       Amylose content                    Sbe 3 (starch branching enzymes)            Liu et al. (2004)
4       Grain number                       OsCKX2 (cytokinin oxidase)                  Ashikari et al. (2005)
5       Salt tolerance                     SKC1 (a HKT type transporter)               Ren et al. (2005)
6       Grain aroma                        BAD2 (betaine aldehyde dehydrogenase 2)     Bradbury et al. (2005)
7       Blast resistance                   Pikh (NBS-LRR type protein)                 Sharma et al. (2005)
8       Submergence tolerance              Sub1 (ethylene response factor-like)        Xu et al. (2006)
9       Lodging tolerance                  Lsi 1 (silicon transporter)                 Ma et al. (2006)
10      Seed shattering                    qSH 1 (BEL1-type homeobox)                  Konishi et al. (2006)

3. Mining novel alleles of cloned genes in the germplasm

Phenotypic analysis of O. sativa lines and species with M. grisea

Resistant: O. punctata, O. latifolia, O. officinalis, O. rhizomatis, Coloro, Jatto, K-60, Fukunishiki and Tetep

Susceptible: O. rufipogon, O. nivara, O. minuta, O. grandiglumis, HP2216, Co-39, Bhrigudhan

PCR amplification of Pi-kh gene from Different Rice Lines


Figure: gel of PCR products amplified with the forward (F) and reverse (R) primers; lanes M and 1 to 14, with size marker bands at 1500, 1000, 750 and 250 bp.

PCR amplification of Pi-kh allele from rice lines and wild Oryza species

Figure: gels with lanes M, UC and 1 to 8 (marker bands at 4361, 2322 and 2027 bp) and lanes M, pUC and 1 to 6.

Quantification (A) and restriction analysis (B) of plasmid to check presence of insert

Consensus sequence derived from individual reads

Mining blast resistance genes from wild species of Rice

Figure: alignment from 1 bp to 1765 bp of sequences from Tetep, HP2216, Coloro, K-60, Jatto, Co-39, Bhrigudhan, Fukunishiki, Nipponbare, O. nivara, O. rufipogon, O. latifolia, O. officinalis, O. rhizomatis, O. punctata, O. minuta and O. grandiglumis.

ORFs and number of exons predicted in Pi-kh gene isolated from different O. sativa lines and wild species

Number of transitions, transversions and indels in Pi-kh allele

Figure: bar chart of the number of SNPs (0 to 100), split into transitions, transversions and indels, for each line or species. Lines are grouped as indica, japonica and wild species.
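The three variant classes counted in the chart above can be told apart with a simple rule: a substitution within purines (A, G) or within pyrimidines (C, T) is a transition, a substitution between the two classes is a transversion, and a gap opposite a base is an indel. The sketch below applies that rule to a pair of aligned sequences. It is an illustration only, not the analysis pipeline used for the Pi-kh alleles, and it assumes the inputs contain only A, C, G, T and '-' and are already aligned.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_site(ref, alt):
    """Classify one aligned column as match, transition, transversion or indel.

    Assumes each character is A, C, G, T or '-' (gap).
    """
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        return "match"
    if ref == "-" or alt == "-":
        return "indel"
    pair = {ref, alt}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def count_variants(ref_seq, alt_seq):
    """Count variant columns between two aligned sequences of equal length.

    Each gap column counts as one indel, so a 3 bp deletion counts three times.
    """
    if len(ref_seq) != len(alt_seq):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"transition": 0, "transversion": 0, "indel": 0}
    for ref, alt in zip(ref_seq, alt_seq):
        kind = classify_site(ref, alt)
        if kind != "match":
            counts[kind] += 1
    return counts


# Toy example (made-up sequences, not Pi-kh data).
print(count_variants("ACGT-A", "GCTTAA"))
```

Counting indels per event rather than per column would need gap runs to be merged first; that choice is left open here.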

Allele mining for disease resistance genes

Phylogenetic relationship among Pi-kh genes amplified from different wild species of rice.

An overview of the sequence contigs of the badh1 gene of 16 rice varieties, based on sequence reads obtained using 16 pairs of primers and assembled using Phred/Phrap/Consed software

Screen shot of Consed window showing location of one of the 20 SNPs discovered by sequencing of the badh1 gene fragments from 16 rice varieties.

Location of PCR primers (reverse primer underlined) and 20 SNPs (highlighted) in the badh1 gene of rice. The gene has 15 exons (in bold) and 14 introns

Summary of the SNP alleles in 16 rice varieties and the reference variety Nipponbare at 20 different positions in the badh1 gene, starting from the ATG codon of Nipponbare.

Table columns: SNPs BADH1-S1 to BADH1-S20.

4. Synteny and colinearity - transferring rice information to wheat

Rice chromosome 11 long arm: comparison with wheat

Figure: number of genes (0 to 45) against wheat chromosome group (1 to 7).

Genome-wide analysis of homology between 56298 predicted rice gene CDS (from the IRGSP sequence) and 39,813 wheat EST contigs (from the wheat SNP consortium, build 3), plus 3792 bin-mapped wheat EST contigs (USDA-NSF wheat genome project, version Aug. 03)

Homology plot of 5840 rice genes mapped on 21 wheat chromosomes

Ancient duplications in the Rice Genome

Rice-Sorghum gene colinearity (figure: rice against sorghum)

Rice-Maize gene colinearity (figure: rice against maize)

Typical Patterns of Synteny between rice and wheat

Distribution of Copy Number among the 4659 Rice Gene Homologs of Wheat

Conserved Synteny of Single Copy Rice Genes with Wheat


A. Rice chr 1/ wheat chr 3 B. Rice chr 2/ wheat chr 6 C. Rice chr 6/ wheat chr 8

Transposition of genes concentrated near wheat centromeres

Rice-Wheat Colinearity based on 1063 single copy rice genes


Conclusions:
1. Seven wheat chromosomes seem to have evolved from the 12 ancestral rice chromosomes by 3 centric fusion and 6 translocation events:

W1 = R5 + R10
W2 = R7 + R4
W3 = R1
W4 = R3 + R11
W5 = R12 + R9 + R3
W6 = R2
W7 = R6 + R8

Figure: wheat chromosomes plotted against rice chromosomes (1 to 12).
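The chromosome relationships in the conclusion above can be held in a small lookup table, so that a gene's rice chromosome gives the wheat group or groups in which to look for its homolog first. The sketch below encodes only the seven equations from the slide; the function name is illustrative, and the lookup works at whole-chromosome level, not at the bin level used for the predictions that follow.

```python
# Wheat chromosome groups expressed as ancestral rice chromosomes,
# copied from the conclusions slide (W = wheat group, R = rice chromosome).
WHEAT_FROM_RICE = {
    "W1": ["R5", "R10"],
    "W2": ["R7", "R4"],
    "W3": ["R1"],
    "W4": ["R3", "R11"],
    "W5": ["R12", "R9", "R3"],
    "W6": ["R2"],
    "W7": ["R6", "R8"],
}


def wheat_groups_for(rice_chr):
    """Return the wheat groups that contain material from a rice chromosome."""
    return [wheat for wheat, rice in WHEAT_FROM_RICE.items() if rice_chr in rice]


# Rice chromosome 3 contributes to two wheat groups; rice chromosome 11 to one.
print(wheat_groups_for("R3"))
print(wheat_groups_for("R11"))
```

Every rice chromosome from R1 to R12 appears at least once in the table, and R3 is the only one that appears in two wheat groups.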

Predicting wheat bin location of 6178 unmapped single copy rice gene homologs


Validation of the predicted wheat bin location of unmapped single copy rice genes

1. Chinese Spring ; 2. CS del 6AL8; 3. CS del 6BL5; 4. CS del 6DL10

Experimental Validation of the Predicted Map Location of genes in Wheat

35% of the 213 single copy rice gene homologs, representing all 12 rice and all 7 wheat chromosomes, mapped to their predicted bin location

5. Databases and web-based tools to assist breeders

Containing Info on 56298 genes

Distribution of R-like Genes and Defense Response Genes in Rice Genome


R- gene like seq.= NBS-LRR, LZ-NBS-LRR, LRR-TM, Misc. (putative, known genes) Defense response genes = chitinases, glucanases, thaumatin like proteins

Chromosome no.  Total genes  R-genes  R-genes (%)  Def. response  Def. response (%)
1               7378         115      1.56         28             0.38
2               5770         84       1.46         10             0.17
3               6544         46       0.70         28             0.43
4               5831         38       0.65         4              0.07
5               4977         54       1.08         20             0.40
6               5071         63       1.24         16             0.32
7               4858         62       1.28         11             0.23
8               4536         62       1.37         13             0.29
9               3561         44       1.24         8              0.22
10              3933         39       0.99         8              0.20
11              4436         115      2.59         11             0.25
12              4355         53       1.22         10             0.23
Total           61250        775      15.38*       167            3.19*

* The totals in the percentage columns are the sums of the 12 per-chromosome percentages, not genome-wide percentages.
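The percentage columns in this table are each gene count divided by the total genes on that chromosome. The totals given for those columns (15.38 and 3.19) are the sums of the 12 per-chromosome percentages; the genome-wide share comes from dividing the summed counts instead. The sketch below recomputes both from the counts in the table. Variable and function names are illustrative.

```python
# Counts per rice chromosome (1 to 12), copied from the table.
TOTAL_GENES = [7378, 5770, 6544, 5831, 4977, 5071, 4858, 4536, 3561, 3933, 4436, 4355]
R_GENES = [115, 84, 46, 38, 54, 63, 62, 62, 44, 39, 115, 53]
DEF_RESPONSE = [28, 10, 28, 4, 20, 16, 11, 13, 8, 8, 11, 10]


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Per-chromosome percentages, rounded to two decimals as in the table.
r_pct = [round(percent(r, t), 2) for r, t in zip(R_GENES, TOTAL_GENES)]
def_pct = [round(percent(d, t), 2) for d, t in zip(DEF_RESPONSE, TOTAL_GENES)]

# Genome-wide percentages: divide the summed counts, do not sum the percentages.
r_overall = percent(sum(R_GENES), sum(TOTAL_GENES))
def_overall = percent(sum(DEF_RESPONSE), sum(TOTAL_GENES))

for chrom, (r, d) in enumerate(zip(r_pct, def_pct), start=1):
    print(f"Chr {chrom:2d}: R-genes {r:.2f}%  defense response {d:.2f}%")
print(f"Genome-wide: R-genes {r_overall:.2f}%  defense response {def_overall:.2f}%")
```

Computed this way, R-gene-like sequences make up about 1.27% of the 61250 genes in the table and defense response genes about 0.27%, with chromosome 11 carrying the highest R-gene share.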

Different types of R-Like Genes Distributed on Rice Chromosomes


Figure: number of R-genes (0 to 80) on each rice chromosome (1 to 12), by class: NBS-LRR, LZ-NBS-LRR, LRR-TM, Misc and Def. Resp.

Distribution of Defense Response Genes on Rice Chromosomes


Figure: number of defense response genes (0 to 25) on each rice chromosome (1 to 12), by class: thaumatin, glucanases and chitinases.

Mapping of R-like Genes on Rice Chromosomes

Figure: one map panel per chromosome, Chr 1 to Chr 12.

A total of 176 R- and DR-gene clusters identified

Mapping of R-like Genes on Rice Chromosome 11

To sum up
- VansanuDhan, a rice genome database created at NRCPB, contains information on 56298 genes. It is being used in functional and comparative genome analysis in rice.
- We found 942 R-gene and defense response gene-like sequences in the rice genome. The physical location and orientation of each gene were delineated.
- Comparative analysis of indica and japonica sequences helped us in mapping and cloning a new rice blast resistance gene, Pi-kh.
- Analysis of the Pi-kh locus in indica and japonica gave an insight into the presence of SSR elements in this region, which may play an important role in shuffling of genes in the genome.
- Extensive sequence variation was observed between the Pi-kh alleles amplified from wild species and land races of rice. These alleles are being used in functional validation experiments.

6. High throughput genotyping to save time and cost

Multiplex SNP assays using Sequenom MassARRAY system

Sequence of flanking pre-amplification primers (PCRP) and single nucleotide extension primers (UEP) for genotyping of 20 SNPs of badh1 gene

Assaying SNPs by MALDI-ToF MS


Once a SNP is identified, 3 primers are required to enable genotyping:
- Two PCR primers to amplify the region around the SNP
- One extension primer which anneals directly adjacent to the SNP

MALDI-TOF Mass Spectrometry


- DNA samples (384 in number) are mixed with a matrix, spotted onto a MALDI plate and loaded into the mass spectrometer.
- Each spot is subjected to pulses from a nitrogen laser (337 nm) in vacuum, which vaporise and ionise the sample.
- The matrix absorbs most of the laser energy, preventing degradation of the sample, and allows ionisation of some of the DNA substrate.
- An applied electric field accelerates the DNA ions into the flight tube and towards the mass detector.
- All ions gain the same kinetic energy, so larger ions take longer to reach the detector.
- The variation in time of flight allows separation based on size.
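The link between ion size and flight time follows from the equal kinetic energy: an ion of mass m and charge q accelerated through a voltage U gains energy qU = mv²/2, so its flight time over a tube of length L is t = L·sqrt(m / (2qU)), which grows with the square root of the mass. The sketch below evaluates this for idealised conditions. The tube length (1 m) and accelerating voltage (20 kV) are assumed values for illustration, not the settings of the instrument described here, and the function name is hypothetical.

```python
import math

DALTON_KG = 1.66053906660e-27          # mass of 1 Da in kilograms
ELEMENTARY_CHARGE_C = 1.602176634e-19  # charge of one proton in coulombs


def time_of_flight(mass_da, tube_length_m=1.0, voltage_v=20000.0, charge=1):
    """Idealised flight time in seconds for an ion of mass_da daltons.

    tube_length_m and voltage_v are assumed values for illustration only.
    """
    mass_kg = mass_da * DALTON_KG
    energy_j = charge * ELEMENTARY_CHARGE_C * voltage_v  # kinetic energy qU
    velocity = math.sqrt(2.0 * energy_j / mass_kg)       # from qU = m v^2 / 2
    return tube_length_m / velocity


# Two primer-sized ions: the heavier one arrives later.
for mass in (4300, 5000):
    print(f"{mass} Da: {time_of_flight(mass) * 1e6:.2f} microseconds")
```

Because the time depends on the square root of the mass, the ratio of flight times for two ions depends only on their mass ratio and not on the assumed tube length or voltage.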

Assaying SNPs by MALDI-ToF MS


Figure: mass spectrum (intensity 0 to 120, x-axis 4300 to 5000) showing the unextended SNP primer peak and the extended SNP primer peak.

Mass of the extension base: C = 273 Da, T = 288 Da, A = 297 Da, G = 313 Da

Difference in peak size reveals the SNP allele for that rice variety
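Allele calling from such a spectrum comes down to subtracting the mass of the unextended primer from the mass of the extended primer peak and matching the difference to one of the four base masses on the slide. The sketch below does this with a tolerance window. The tolerance of 3 Da and the example masses are assumptions for illustration, and this is not the vendor's calling software.

```python
# Mass added by single-base extension, in Da, as given on the slide.
EXTENSION_MASS_DA = {"C": 273, "T": 288, "A": 297, "G": 313}


def call_allele(unextended_mass, peak_mass, tolerance=3.0):
    """Return the base whose extension mass best explains the observed peak.

    Returns None if no base mass lies within `tolerance` Da of the observed
    shift. The default tolerance is an assumed value; it must stay below half
    of the smallest gap between base masses (T and A differ by 9 Da).
    """
    shift = peak_mass - unextended_mass
    base, expected = min(EXTENSION_MASS_DA.items(),
                         key=lambda item: abs(item[1] - shift))
    if abs(expected - shift) > tolerance:
        return None
    return base


# Hypothetical primer of 4400 Da with extended peaks at two masses.
print(call_allele(4400.0, 4688.0))   # shift of 288 Da
print(call_allele(4400.0, 4673.5))   # shift of 273.5 Da
```

A heterozygous sample would show two extended peaks for the same primer; calling each peak separately with this function would report both alleles.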

Genotyping

Genotyping spectra for one well (11 SNPs)

Genotyping of badh1_S5 SNP Using Sequenom MassARRAY


Figure: spectra for Pusa NPT11 and Pusa 1342, with peaks labelled as the badh1_S5 unextended primer, the badh1_S5 primer extended with T, the badh1_S5 primer extended with C, and other primers of the multiplex.

7. Proposed activities for the Indo-Australian wheat MAS project

Proposed Activities
- Database of markers and cloned genes: ESTs, GSS, HTGS, SSRs, SNPs, traits
- Allele mining for specified genes: GBSS-1, Glu-1, Amylase, Lr, Sr and Yr genes etc.
- High throughput genotyping: Sequenom MassARRAY, capillary sequencer fragment analysis

Thank You Very Much
