
reviews

Isolating plant genes


Susan Gibson and Chris Somerville
The genetic transformation of most agriculturally important plant species is now possible. However, the application of this technology to rational plant-improvement is currently limited by a shortage of cloned genes for important traits. Recent technological advances in plant-gene isolation and identification, such as map-based cloning, insertional mutagenesis and large-scale cDNA sequencing, have accelerated the rate of gene isolation and significantly expanded the opportunities for genetic engineering of crop plants.
No crop plant is perfectly suited to human needs. Plant breeders are routinely concerned with improving traits such as tolerance to disease or environmental stress, yield, quality of the commercial product, and plant architecture. In addition, since many plants that produce useful products are not suitable for large-scale cultivation, transfer of a wide variety of biosynthetic capabilities into species that are well adapted to modern agricultural practices could be useful. In those instances where the mechanistic basis for a particular trait is known, it is frequently possible to envisage strategies for the rational improvement of plants by the introduction of one or a few cloned genes. Even complex physiological traits may be modified in this way if the relevant genes have been identified and isolated. For example, introducing the gene for glycerol-3-phosphate acyltransferase from a chilling-tolerant species into a chilling-susceptible species resulted in altered membrane-lipid composition, due to the different substrate specificity of the introduced isozyme, and increased low-temperature viability of the transgenic plants (Ref. 2).

Many potentially useful modifications involve manipulating the amount of desirable or undesirable compounds by altering the expression of genes involved in key steps in biosynthetic pathways. For example, the expression of a gene for stearoyl-ACP desaturase in an antisense configuration in Brassica napus resulted in a significant increase in the amount of stearic acid in the seed oil (Ref. 3). In other instances, useful changes may be made by introducing genes that encode enzymes with altered substrate specificities or allosteric properties. For instance, a fatty acyl thioesterase gene from California bay that has an unusually high specificity for medium-chain fatty acids was used to increase the percentage of medium-chain fatty acids in Arabidopsis thaliana (Ref. 4). Similarly, introducing a feedback-insensitive gene for ADP-glucose pyrophosphorylase, which catalyses the first committed step of starch biosynthesis, into potato resulted in a commercially significant increase in the amount of starch accumulated by the potato tubers (Ref. 5). In addition, there are likely to be many opportunities to increase the levels of useful metabolites, proteins or storage products in plants by altering the expression of relevant transcriptional activators. In a recent demonstration of the potential of this approach, maize genes encoding transcription factors were used to alter the amount, and tissue-specificity of expression, of the anthocyanin biosynthetic pathway in A. thaliana and Nicotiana tabacum (Ref. 6).

In these, and most other examples of plant genetic engineering (Ref. 7), the rate-limiting step in developing modified plants has been the isolation of the relevant genes. However, several recent technological developments promise to facilitate greatly the identification and isolation of new plant genes. In this brief review we outline these methods and some of the new opportunities that may be afforded by applying them to rational plant-improvement.

S. Gibson and C. Somerville are at the Department of Energy Plant Research Laboratory, Michigan State University, East Lansing, MI 48824, USA.
TIBTECH JULY 1993 (VOL 11)

© 1993, Elsevier Science Publishers Ltd (UK)

General strategies for cloning genes from plants
All techniques for gene isolation exploit one or more of the four characteristics that define genes: they have a defined primary structure (sequence); they occupy a particular location within the genome; they encode an RNA with a particular expression pattern; and most genes have a function. Some techniques may permit the isolation of genes from any plant, while others are only applicable to one or a few plant species. However, flowering plants, which include almost all economically useful plants, evolved only ~150 million years ago. Consequently, genes isolated from one flowering plant can generally be used to isolate the corresponding genes from other plants by heterologous hybridization. Furthermore, promoters for many genes retain normal patterns of expression when introduced into distantly related plant species. Therefore, genes isolated from non-crop plants, such as Arabidopsis, may be useful in manipulating crop plants. For these reasons, an increasingly important tool for isolating plant genes has been the development of advanced genetic methods, genetic maps (see Glossary) and large collections of mutations in several model species such as Arabidopsis, maize and tomato.

Cloning methods based on gene function
The most widely used methods for isolating genes based on their function involve protein purification or complementation of mutant phenotypes. The major limitation to cloning genes based on the properties of the corresponding protein is that, for many genes, the product is not known, or cannot be purified in sufficient quantities to permit amino acid sequence determination or the preparation of antibodies.

Many plant genes can be isolated by their ability to complement mutations in bacteria and yeast. In one recent example, each of eight different auxotrophic mutants of yeast was functionally complemented by a cDNA library from Arabidopsis (Ref. 8). Similarly, an engineered yeast strain that allowed the phenotypic recognition of sucrose-carrier activity was used to isolate a cDNA encoding a sucrose carrier from spinach (Ref. 9). The cloning of several integral membrane proteins by heterologous expression is consistent with other evidence indicating that plants and yeast use similar structural motifs to direct proteins to the various organelles. Thus, microbial complementation promises to be a very powerful method for isolating the many genes that encode functions common to plants and yeast.

Although it is possible to transform and grow large numbers of plant cells in culture, isolating a gene by complementation of a plant mutation has not yet been accomplished. The feasibility of the approach was demonstrated by the rescue of an introduced antibiotic resistance marker from a transformed plant (Ref. 10), but the utility of this method appears to be limited by the relatively small number of mutations that are expressed in cell culture.

As a result of the proliferation in the number of cloned genes for which function has not been established, the elucidation of gene function by creating transgenic plants that express antisense mRNA for a transgene may become increasingly useful. Antisense constructs are prepared by cloning a cDNA for the gene of interest next to a highly expressed promoter, in an orientation such that RNA is transcribed from the DNA strand opposite to the strand from which mRNA is transcribed in the naturally occurring gene. When such a construct is introduced into a plant, this 'antisense' RNA can lower the steady-state level of the naturally occurring sense RNA, thereby decreasing the level of the gene product. In this respect, the use of antisense techniques is equivalent to creating a specific mutation. Transgenic plants expressing antisense constructs have been used to identify two genes involved in fruit ripening in tomato (Ref. 11).

Cloning methods based on mRNA or protein expression levels
Several techniques for isolating plant genes take advantage of the fact that many genes have characteristic patterns of expression. For example, many

Glossary

Antisense construct - A plasmid that contains the coding region of a gene placed adjacent to a promoter in such an orientation as to cause the gene to be transcribed from the DNA strand complementary to that from which RNA is normally transcribed.

Bulk segregant analysis - The division of the F2 progeny of a cross between two different varieties of the same species into two or three pools. The division is made on the basis of the phenotype of each of the F2 progeny for a particular locus that is segregating in the cross. On average, the different pools have an identical genotype for all markers (e.g. RAPDs, RFLPs), except in the region surrounding the locus that was scored to create the pools.

cDNA (complementary DNA) - DNA that is made by reverse-transcribing RNA.

Chromosome walking - The process of identifying a series of clones containing overlapping inserts (Fig. 4).

Dominant mutation - A mutation that causes an altered phenotype, even in an organism that contains a wild-type copy of the gene.

Ectopic insertion - The insertion of a DNA fragment at an abnormal position in the genome.

Enhancer - A regulatory DNA sequence that activates transcription from nearby promoters.

Functional complementation - The introduction into a mutant organism of a DNA fragment that restores the wild-type phenotype of the organism.

Genetic map - A map where the distances between genes are expressed in terms of the frequency of meiotic DNA recombination between the genes.

iPCR (inverse PCR) - A variation of PCR that makes possible the amplification of DNA segments of unknown sequence that flank DNA segments of known sequence. In brief, total DNA is digested to completion and the fragments are ligated under conditions that favor circularization of the fragments. A pair of PCR primers, designed from the known sequence, is used to prime the PCR from opposite strands, resulting in amplification of a fragment of unknown sequence.

PCR (polymerase chain reaction) - The exponential amplification of DNA fragment(s) using oligonucleotide(s), a thermostable DNA polymerase and repeated cycles of DNA denaturing, annealing and extension.

RAPD (randomly amplified polymorphic DNA) - A PCR product that is obtained from one strain, but not another, when genomic DNA is used in a PCR reaction. The primers used in RAPD mapping are typically arbitrary 10-mers (Fig. 3).

RFLP (restriction fragment length polymorphism) - A DNA fragment that, when used to probe Southern blots of restricted genomic DNA from different strains of the same species, allows the visualization of variations in the size and/or number of the restriction fragments generated from the different strains (Fig. 3).

T-DNA (transferred DNA) - The part of the Ti plasmid of Agrobacterium tumefaciens that is randomly inserted into the genome of a plant during infection of the plant by the bacterium.

Transgenic organism - An organism that contains artificially introduced DNA.

Transposon - A small DNA fragment that has the ability to move from one location in the genome to another.

YAC (yeast artificial chromosome) vector - A plasmid that contains all the sequences necessary for stable maintenance in yeast (a centromere, DNA replication origin and telomeres), as well as a yeast selectable marker.
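The strand relationship behind the Glossary definition of an antisense construct can be made concrete with a short sketch. It assumes a made-up nine-base coding sequence and treats transcription as a simple T-to-U substitution; the function names are illustrative only and are not taken from any published tool.

```python
# Illustrative sketch: sense and antisense transcripts of a hypothetical gene.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the opposite DNA strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def transcribe(strand_read_as_rna):
    """Write a DNA strand as the RNA that has the same sequence (T becomes U)."""
    return strand_read_as_rna.replace("T", "U")


def pairs_antiparallel(rna_a, rna_b):
    """True if the two RNAs can base-pair along their whole length."""
    if len(rna_a) != len(rna_b):
        return False
    return all(RNA_PAIRS[a] == b for a, b in zip(rna_a, reversed(rna_b)))


coding_strand = "ATGGCTTCA"  # hypothetical coding sequence, 5' to 3'
sense_rna = transcribe(coding_strand)
# Inverting the cDNA behind the promoter makes the opposite strand the transcript.
antisense_rna = transcribe(reverse_complement(coding_strand))

print(sense_rna, antisense_rna, pairs_antiparallel(sense_rna, antisense_rna))
```

Because the two transcripts are fully complementary, they can form a duplex, which is the property the antisense approach relies on to lower the level of the sense RNA.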

potentially valuable plant genes, such as those involved in the synthesis of valuable secondary metabolites, are expressed in specialized tissues. A commonly used method to enrich for such genes is differential screening. In this technique, mRNA is prepared from tissues from different plants that are distinguished by some criterion such as exposure to particular environmental conditions (for example, drought), or being derived from a different tissue or stage of development. A DNA library from the appropriate organism is then probed sequentially with labelled cDNA produced from each of the mRNA samples. Clones that are more highly labelled by the cDNA from one mRNA sample than from another contain genes that are differentially expressed in the two samples. The major limitation of this technique is that it can only be used to identify genes that are relatively highly expressed.

The new 'differential display' technique, which uses the polymerase chain reaction (PCR) to amplify rare cDNAs, makes the identification of differentially expressed genes of low abundance possible (Ref. 12) (Fig. 1). A subset of cDNA fragments is amplified from first-strand cDNA by using one of a set of 12 possible oligonucleotides of the formula 5'-(dT)n(dN)2-3' and an arbitrary 10-mer as PCR primers. Comparison of the reaction products from various sources of mRNA (different tissues, plants grown in different environments, etc.) on a DNA sequencing gel reveals the differences in gene expression between the tissue types. The same primers can then be used to amplify excised bands for subsequent cloning, or for use as probes to screen cDNA or genomic DNA libraries.

The obvious problem with attempting to identify a particular gene by these approaches is that many genes have similar patterns of expression. Thus, additional criteria, such as the construction of a transgenic plant expressing an antisense construct, are necessary to narrow the search to a single gene.

Cloning methods based on DNA insertions
Several techniques exploit the fact that certain kinds of mutations lead to relatively large alterations of chromosomal structure that can be used to isolate the corresponding genes. The most productive approach involves either transposon or T-DNA (transferred DNA from the Ti plasmid of Agrobacterium tumefaciens; see Glossary) tagging to insert foreign DNA into, or near, the gene of interest, thereby causing a mutation that 'marks' the gene. The chromosomal region surrounding the molecular tag can then be readily isolated by probing a genomic library of the tagged line with the tag DNA.

Transposon tagging has been confined largely to diploid self-fertilizing species, such as snapdragon and maize, that contain well-characterized endogenous transposable elements, where it has proven to be a very useful approach for isolating genes (Refs 13,14). The genes isolated using this approach include the maize genes viviparous-1 (involved in seed development) (Ref. 15) and opaque-2 (encoding a transcriptional regulator) (Ref. 16), and several genes involved in flower development in snapdragon (Ref. 17). In order to make transposon tagging feasible in other plant species, various derivatives of the maize Ac element have been introduced by transformation into several plant species where they have been shown to transpose (Refs 18,19). Although no mutations have, as yet, been attributed to the introduced Ac elements, additional development of these systems could extend significantly the utility of this approach.

T-DNA tagging exploits the fact that during the production of transgenic plants by A. tumefaciens-mediated transformation, one or more copies of the T-DNA are inserted into the genome at apparently random locations (Ref. 20). As the tissue culture procedures used during the transformation of most plant species are also mutagenic, this procedure is most practical for Arabidopsis, where a method for large-scale transformation of intact plants has been developed (Ref. 20). When this transformation procedure is used, 35-40% of the mutations generated are tagged by a T-DNA insert (Ken Feldmann, pers. commun.). Arabidopsis is also particularly useful for T-DNA tagging because its relative lack of non-coding DNA means that a reasonably high percentage (~19%) of the transformants have visible phenotypes. More than 10000 independent, transformed Arabidopsis lines have been produced, of which approximately half are available from the Arabidopsis Resource Center at Ohio State University, OH, USA. The major limitation of the approach is the relatively large amount of effort required to produce and propagate considerable numbers of intact, transformed plants. As a result, if a particular mutation is not present in the available collections, another method must be used to clone the gene of interest.

An important variation of the gene-tagging approach employs a T-DNA containing either a reporter or a strong transcriptional enhancer to transform large numbers of protoplasts or cultured cells (Refs 21,22). Ectopic insertion (see Glossary) of the reporter gene construct downstream from a promoter is detected by transcriptional activation of the reporter. The efficacy of this 'enhancer trapping' approach depends on identifying specific conditions for inducing gene expression in cultured cells. Similarly, insertion of a strong transcriptional enhancer adjacent to an otherwise quiescent gene may induce abnormal expression of the gene. The key to the success of this approach is the design of selection schemes based on gene activation. In an elegant demonstration of the potential of this approach, millions of tobacco protoplasts were transformed with a Ti plasmid containing the CaMV 35S enhancer adjacent to one border, followed by selection for the ability to grow in culture without the growth factor auxin (Ref. 22). The surviving colonies exhibited enhancer-mediated expression of a previously unidentified gene that, in some way, is presumably involved in auxin synthesis.

Figure 1
Cloning using the differential display technique. PolyA mRNA is isolated from two different sources (different tissues, organs, etc.). A small fraction of the polyA mRNA species is then amplified using PCR as described in steps 2 and 3. The amount of each PCR product will be proportional to the amount of the corresponding species of polyA mRNA present in the starting sample. Since some RNA species will be unique to one RNA source, or may vary in abundance between RNA sources, the amounts of some PCR products will vary between the different samples. These variations are visualized by separating the PCR products on a high-resolution polyacrylamide gel (such as a DNA sequencing gel). Differentially expressed genes are then isolated by excising bands of varying abundance from the gel, reamplifying them using PCR and cloning the resulting products.

[Figure 1 diagram. Steps shown: (1) Isolate polyA mRNA from source A and source B, and perform the following reactions in parallel with both mRNA samples. (2) Prime reverse transcription with one of twelve possible oligonucleotides of sequence T(11)XY; in this example, oligonucleotide T(11)GC is used, and priming occurs only on mRNAs in which the dinucleotide adjacent to the poly(A) tail is complementary to the 3' end of the primer. (3) Use a random 10-mer and the oligonucleotide used for reverse transcription, T(11)GC in this example, to amplify the reverse-transcribed DNA in a labelled PCR reaction; only the small fraction of DNA molecules to which the random 10-mer anneals will be amplified. (4) Separate the PCR products from sources A and B using high-resolution polyacrylamide gel electrophoresis; some PCR products are unique to one mRNA source, some vary in abundance, and some are present in equal amounts. (5) To clone differentially expressed genes, excise bands, reamplify using PCR and the same oligonucleotides as in step 3, ligate to vector and transform bacteria.]
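The primer arithmetic behind differential display (Fig. 1) can be sketched in a few lines. The sketch assumes, as is usual for anchored oligo(dT) primers, that the first anchor base X is never T (otherwise it would merely extend the T run), which gives 3 x 4 = 12 primers. Sequences are written in the DNA alphabet, and the example mRNA is invented.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def anchored_primers(t_length=11):
    """List the 12 primers 5'-T(n)XY-3' with X in A/C/G and Y in A/C/G/T."""
    return ["T" * t_length + x + y for x, y in product("ACG", "ACGT")]


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def primes(mrna, primer, t_length=11):
    """True if the primer can anneal at the junction with the poly(A) tail.

    The T run pairs with the tail, so the primer's two 3' bases must be
    complementary to the two bases just upstream of the tail.
    """
    body = mrna.rstrip("A")            # sequence upstream of the poly(A) tail
    tail_length = len(mrna) - len(body)
    if t_length > tail_length or 2 > len(body):
        return False
    return body[-2:] == reverse_complement(primer[t_length:])


primers = anchored_primers()
example_mrna = "ACGTGGC" + "A" * 13    # hypothetical mRNA ending ...GC + poly(A)
matching = [p for p in primers if primes(example_mrna, p)]
print(len(primers), matching)
```

Each mRNA is primed by only one of the twelve oligonucleotides, which is why a single reverse-transcription reaction samples roughly one twelfth of the mRNA population before the arbitrary 10-mer narrows the subset further.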
Subtractive hybridization
Subtractive cloning is the conceptual complement to gene tagging. In this technique, total genomic DNA from a line containing a deletion mutation is hybridized in excess to highly fragmented DNA from wild-type plants (Fig. 2). Fragments that do not hybridize to DNA from the mutant correspond to the deleted region. The utility of the method was demonstrated by the isolation of a gene (ga1) involved in gibberellin biosynthesis in Arabidopsis (Ref. 23). This technique can also be used at the cDNA level to isolate genes that are not expressed in plants containing certain types of mutations, or to isolate genes that are naturally differentially expressed (see section on cloning methods based on mRNA/protein expression levels). An example of this approach was the use of subtractive hybridization to isolate salt-stress-induced genes from wheatgrass (Ref. 24). As with most other methods, subtractive cloning is most practical for plants with small genomes. A problem with this technique is that methods have not been developed for reliably producing mutagenized plant populations with small deletions. In addition, once a mutation is isolated, extensive genetic analysis is required to determine whether it is due to a deletion or some other aberration.

Figure 2
Cloning by subtractive hybridization. Total chromosomal DNA from a wild-type plant is cut into small fragments with a restriction endonuclease such as Sau3A. Total chromosomal DNA from a plant containing a deletion mutation is randomly sheared and then biotinylated (indicated by hatching). The two DNA samples are then denatured and the non-biotinylated DNA is hybridized to a large excess of the biotinylated DNA in solution. The DNA is then applied to a column containing avidin-coated beads (indicated by stippled circles) that bind biotinylated DNA molecules. Most of the DNA from the wild-type plant hybridizes to the excess biotinylated DNA from the mutant plant and, therefore, is bound to the column. However, DNA from the wild-type plant (DNA fragment B) that corresponds to the DNA fragment missing from the mutant plant cannot hybridize to a biotinylated strand and so will be enriched in the column eluate. The eluted DNA is then rehybridized to an excess of biotinylated mutant DNA and the column enrichment process is repeated. After several cycles of enrichment the resulting DNA is amplified using PCR and then cloned. The clones are tested by hybridization to Southern blots of the mutant and wild-type to ensure that they hybridize to the wild-type, but not to the mutant DNA.

[Figure 2 diagram. Labels shown: wild type (fragments A, B, C); deletion mutant (fragments A, C); cut DNA with restriction enzyme; shear and biotinylate DNA; excess; denature and then reassociate; bind to avidin-coated beads; bound DNA; unbound DNA (fragment B); amplification; ligate to vector, transform bacteria; transformed bacterial colonies.]

Figure 3
(a) RFLPs. Three regions of chromosomes from two strains of the same species are indicated in the figure by horizontal lines. Recognition sequences for a particular restriction endonuclease are indicated by vertical lines. The asterisk indicates the presence of a mutation that has eliminated a restriction endonuclease recognition sequence from the genome of strain 2. The loss of this recognition site (or other changes such as deletions, insertions or the appearance of new restriction sites) can be visualized as shown in the figure. Genomic DNA from both strains is isolated, cut with the restriction endonuclease and size fractionated by gel electrophoresis. In practice, the number of recognition sites for a particular restriction endonuclease would be much greater than shown in the figure and, consequently, the number of bands on the gel would be so large that it would be impossible to distinguish individual bands. Therefore, in order to visualize the bands from a particular region of the genome, the DNA is transferred from the gel to a nitrocellulose filter and probed with a small piece of genomic DNA (indicated in the top part of the figure by a black box) in a Southern blot type experiment. In this example, the probe DNA will hybridize to two DNA restriction fragments from strain 1 but, due to the loss of a recognition site, it will hybridize to only one (larger) DNA restriction fragment from strain 2.
(b) RAPDs are conceptually similar to RFLPs. In this example, regions of three chromosomes from two strains of the same species are indicated by horizontal lines. Binding sites for a particular oligonucleotide on the chromosomes are indicated by arrows, with the direction of the arrow showing the orientation in which the oligonucleotide will bind at that site. When PCR is performed, DNA will only be amplified between pairs of oligonucleotides that bind close together and with the opposite orientation with respect to each other. The chromosomal regions that would be amplified in this example are indicated by numbers showing the distance, in kilobase pairs, between the relevant pairs of binding sites. The asterisk indicates the presence of a mutation that eliminates an oligonucleotide binding site from the genome of strain 2. The loss of this binding site leads, in turn, to the production of one less PCR product from the strain 2 genome than from the strain 1 genome. The loss of this PCR product can be visualized by size fractionating the PCR products on a gel.

[Figure 3 diagram. Panel (a), strain 1 and strain 2: 1. digest total genomic DNA with restriction endonuclease; 2. size fractionate DNA fragments on a gel; 3. transfer gel to nitrocellulose; 4. hybridize with labelled DNA (black box) from region spanning point mutation (asterisk); 5. perform autoradiography. Panel (b), strain 1 and strain 2: 1. perform polymerase chain reaction; 2. size fractionate PCR products on a gel; product sizes marked 0.5 kb, 0.3 kb and 0.2 kb.]
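The RFLP logic of Fig. 3a can be reproduced with a small simulation. All coordinates below (a 12 kb region, three restriction sites, a 1 kb probe) are invented for illustration; only the qualitative outcome, two bands in strain 1 and one larger band in strain 2, comes from the figure legend.

```python
def digest(length, sites):
    """Return (start, end) fragments from complete digestion of a linear region."""
    cuts = [0] + sorted(sites) + [length]
    return list(zip(cuts[:-1], cuts[1:]))


def detected_sizes(length, sites, probe):
    """Sizes of the restriction fragments that overlap the probe interval.

    These are the bands that would light up on a Southern blot.
    """
    probe_start, probe_end = probe
    return sorted(
        end - start
        for start, end in digest(length, sites)
        if probe_end > start and end > probe_start
    )


REGION_BP = 12000                      # hypothetical chromosomal region
STRAIN_1_SITES = [2000, 5000, 9000]    # hypothetical restriction sites
STRAIN_2_SITES = [2000, 9000]          # the site at 5000 is lost by mutation
PROBE = (4500, 5500)                   # probe spans the polymorphic site

print(detected_sizes(REGION_BP, STRAIN_1_SITES, PROBE))
print(detected_sizes(REGION_BP, STRAIN_2_SITES, PROBE))
```

With these assumed coordinates the probe detects 3 kb and 4 kb fragments in strain 1 and a single 7 kb fragment in strain 2, so the two strains can be told apart at this locus.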

Map-based cloning
Map-based cloning methods, such as 'chromosome walking' (see Glossary), theoretically permit the isolation of any gene for which a mutation can be identified. In the first stage, a mutation is mapped genetically relative to the map positions of closely linked cloned DNA fragments such as restriction fragment length polymorphisms (RFLPs), or randomly amplified polymorphic DNA markers (RAPDs) (Ref. 25) (Fig. 3). Typically, the most closely linked pair of flanking markers is then used as hybridization probes to isolate clones containing the region of the genome located between the markers (Fig. 4). Finally, the gene is identified within the cloned region by its ability to complement the mutation genetically.

Currently, an essential requirement for map-based cloning is the availability of comprehensive genomic libraries of relatively large DNA fragments, typically in yeast artificial chromosome (YAC) vectors (see Glossary). Extensively characterized YAC libraries are available for Arabidopsis (Refs 26-29), and complete, or almost complete, libraries are available for several important crop species (Refs 30,31). Thus, library construction is not a technical limitation. Another requirement, and the key to the successful application of these methods, is the availability of DNA probes that are closely linked to the gene of interest (ideally less than a few hundred kilobases apart). If the flanking markers are too far apart, it can be difficult to clone all of the DNA between the markers, and it may also be difficult to find the gene of interest in hundreds of kilobases of cloned DNA. For this reason, chromosome-walking efforts have focused on plants, such as Arabidopsis (Refs 32,33) and tomato, that have relatively small genomes, and for which high-density RFLP (Refs 34,35) and RAPD (Ref. 36) maps are available.

Once a region of DNA containing a gene has been cloned, a variety of strategies may be used to identify the gene. If the gene is preferentially expressed in a particular tissue or under specific conditions, identifying the relevant gene by probing an enriched cDNA library with the YACs covering the gene may be possible. This approach was used to clone a fatty-acid desaturase that was preferentially expressed in developing seeds (Ref. 32). If this approach is not feasible, the gene can be identified by testing subclones of the chromosomal region known to contain the gene for the ability to complement the mutation. A gene that complemented an abscisic-acid insensitive (abi3) mutation of Arabidopsis was cloned in this way (Ref. 33).

The number of genes isolated by map-based cloning methods is likely to increase dramatically in the next few years. The pending identification of a set of overlapping YAC clones of Arabidopsis should greatly facilitate map-based cloning in this species (Ref. 27). In addition, current research on the development of improved methods for rapid identification and mapping of large numbers of genetic polymorphisms may facilitate the application of these methods to many species of higher plants for which high-density genetic maps are not available. Together with the development of techniques such as bulk-segregant analysis (Ref. 37), which facilitates the identification of DNA markers near a gene of interest, these new markers should make map-based cloning possible in plants with large genomes for which genetic maps are not available.
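The reasoning behind bulk-segregant analysis can be sketched as a comparison of allele frequencies between two phenotypic pools. The sketch assumes codominant markers scored as the number of parent-1 alleles (0, 1 or 2) in each F2 plant; the marker names, genotypes and the 0.4 threshold are all hypothetical.

```python
def allele_frequency(pool, marker):
    """Frequency of the parent-1 allele at one marker across a pool of diploids."""
    return sum(plant[marker] for plant in pool) / (2 * len(pool))


def linked_markers(pool_a, pool_b, markers, threshold=0.4):
    """Markers whose allele frequency differs strongly between the two pools."""
    return [
        m for m in markers
        if abs(allele_frequency(pool_a, m) - allele_frequency(pool_b, m)) > threshold
    ]


# Hypothetical F2 plants, pooled by phenotype at the locus of interest.
MUTANT_POOL = [
    {"marker_near": 2, "marker_far": 1},
    {"marker_near": 2, "marker_far": 0},
    {"marker_near": 2, "marker_far": 2},
    {"marker_near": 2, "marker_far": 1},
]
WILD_TYPE_POOL = [
    {"marker_near": 0, "marker_far": 1},
    {"marker_near": 1, "marker_far": 2},
    {"marker_near": 1, "marker_far": 0},
    {"marker_near": 0, "marker_far": 1},
]

print(linked_markers(MUTANT_POOL, WILD_TYPE_POOL, ["marker_near", "marker_far"]))
```

An unlinked marker segregates independently of the phenotype, so both pools carry the two parental alleles at about equal frequency, whereas a marker close to the scored locus is skewed towards one parent in the mutant pool.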
Figure 4
Gene isolation by map-based cloning. In this illustration, a mutation in a gene has been determined by genetic mapping to lie between two particular RFLP markers. DNA from RFLP 1 is used to probe a genomic DNA library (a YAC library is used in this example) in order to identify a clone (indicated by thick line) that contains an overlapping DNA insert. DNA from the end of this clone (YAC-A) is then isolated by inverse PCR or plasmid rescue and used to screen the genomic DNA library to identify clones containing additional overlapping DNA fragments (Ref. 40) (YAC-B1 and YAC-B2, in this example). Each cycle of isolating the end of a clone and then using this end to identify clones containing overlapping DNA fragments represents one step in chromosome walking. The process is continued until clones spanning the gene of interest are identified. In this case, since RFLP 2 is known from genetic mapping data to lie on the far side of the gene of interest from RFLP 1, steps are taken until a clone (YAC-C2) that contains an insert that overlaps the DNA contained in RFLP 2 is identified. In practice, the walking process would usually be initiated from both RFLPs simultaneously. Once clones spanning the chromosomal region known to contain the gene have been isolated, the gene can be identified by determining which subclones of the original clones are able to complement the mutation.

[Figure 4 diagram. A chromosomal region with RFLP 1, the gene and RFLP 2 marked. Steps shown: 1. probe YAC library with RFLP 1 (identifies YAC-A); 2. isolate ends of YAC-A; 3. probe YAC library with ends of YAC-A (identifies YAC-B1 and YAC-B2); 4. isolate ends of YAC-B1 and YAC-B2; 5. probe YAC library with ends of YAC-B1 and YAC-B2 (identifies YAC-C1 and YAC-C2).]
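The walk described in the legend to Fig. 4 is essentially an iterative search through overlapping intervals. In the sketch below each clone is represented by the (left, right) coordinates of its insert in kilobases; the coordinates are invented, and only the clone names follow the figure. In a real experiment these positions are unknown and overlap is detected by hybridization with end probes.

```python
def chromosome_walk(library, start, target):
    """Walk from a start marker towards a target marker through overlapping clones.

    At each step the clones hit by the current probe are found, and the one
    that extends furthest towards the target supplies the next end probe.
    """
    path = []
    position = start
    while True:
        hits = {
            name: span for name, span in library.items()
            if position >= span[0] and span[1] > position
        }
        if not hits:
            raise ValueError(f"no clone in the library extends past {position} kb")
        best = max(hits, key=lambda name: hits[name][1])
        path.append(best)
        if hits[best][1] >= target:
            return path
        position = hits[best][1]       # the clone's far end is the next probe


# Hypothetical insert coordinates (kb) for the clones named in Fig. 4.
YAC_LIBRARY = {
    "YAC-A": (0, 300),
    "YAC-B1": (250, 520),
    "YAC-B2": (200, 450),
    "YAC-C1": (400, 600),
    "YAC-C2": (500, 800),
}
RFLP_1_KB, RFLP_2_KB = 50, 750         # hypothetical flanking marker positions

print(chromosome_walk(YAC_LIBRARY, RFLP_1_KB, RFLP_2_KB))
```

The number of steps, and therefore the effort involved, grows with the distance between the flanking markers, which is why closely linked probes and large-insert libraries matter so much for this method.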

Implications of random cDNA sequencing

Recent improvements in automated-sequencing and robotics technologies have made large-scale DNA sequencing practical and widely accessible. A laboratory equipped with two automatic DNA sequencers can obtain ~500 bp of sequence from each of 70 cDNA clones per day. By simply picking random cDNA clones, obtaining partial sequence from the 5' end and comparing the six possible translations of the partial cDNA sequences with the sequences of known proteins in the various databanks, ~18% of 5000 human clones were assigned a probable function (Ref. 38). Similarly, of the first 1000 partial sequences from rice (Ref. 39), and the first several thousand sequences from Arabidopsis (M. Caboche, INRA, France, pers. commun., and C. Somerville and T. Newman, unpublished), ~8% and 20%, respectively, were assigned a probable function, mostly by homology to non-plant genes.

In view of the efficiency of this approach as a mechanism for relating plant biology to the large amount of sequence information available for other organisms, the desirability of obtaining large numbers of partial sequences has become apparent. Several institutes, such as The Institute for Genomic Research (TIGR; Bethesda, MD, USA), have the capacity to process more than 1000 cDNAs per day. Therefore, most, or all, of the ~20000 to 40000 different cDNA sequences in Arabidopsis (see Box 1) and rice are likely to be partially sequenced by the end of this decade. A preliminary start has also been made on sequencing the Arabidopsis genome, but the resources necessary to complete this project are unlikely to be available in the foreseeable future.

Most genes of interest currently cannot be identified on the basis of their nucleotide sequence alone. One way of identifying the function of at least some unidentified clones is to determine their location on the genetic map so that a correlation can be made with mutations located at the same position. In the case of Arabidopsis, many cDNAs can be mapped by simply hybridizing the clones to the YAC libraries. Although not all of the Arabidopsis YACs have been aligned with the genetic map as yet (Ref. 27), the process of hybridizing cDNA clones to the YACs will eventually lead to the complete alignment of all the YACs with the genetic map.
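The 'six possible translations' compared against the protein databanks can be generated as in the following sketch. It assumes the standard genetic code and a toy nine-base sequence; the database search itself is not shown, and the function names are illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from the first base; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Three reading frames on the given strand, then three on the opposite strand."""
    strands = (dna, reverse_complement(dna))
    return [translate(strand[offset:]) for strand in strands for offset in range(3)]


print(six_frame_translations("ATGGCCTAA"))
```

Translating in all six frames matters because a partial sequence read from a random clone carries no information about which strand or which reading frame encodes the protein.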
TIBTECH JULY 1993 (VOL 11)

Box 1. How many genes to make a plant?
The genome size of Arabidopsis thaliana, a typical flowering plant, has been estimated to be between ~ 70000 kb (Ref. 41) and 145000 kb (Ref. 42), with ~ 10% of the genome consisting of highly repetitive sequences (Ref. 41). In tobacco, which, like Arabidopsis, is a dicotyledonous plant, the average mRNA is 1.34 kb long (Ref. 43). The average intron in dicotyledonous plants is 215 bp long (Ref. 44). Based on an analysis of the available Arabidopsis genes in GenBank release 70, the average Arabidopsis gene contains 3.13 introns and the average cDNA has 161 bp of untranslated sequence at the 5' end and 232 bp at the 3' end. Thus, a reasonable estimate for the average size of primary transcripts in Arabidopsis is 1.34 + (3.13 × 0.215) + 0.161 + 0.232 = 2.4 kb. If we assume that the average gene requires 0.4 kb at the 5' end for transcriptional regulation and 0.2 kb for 3' sequences, the average gene would be 3 kb long. Thus, with between 63000 kb and 130500 kb of unique sequence there are ~ 21000 to 43500 genes in Arabidopsis. If the average gene is separated from its neighbor by 1 kb, the number of genes drops to between ~ 16000 and 33000.
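The arithmetic in Box 1 can be retraced with a short Python sketch. The function name and parameter names below are illustrative assumptions, not part of the original article; the default values are the figures quoted in the box, and the intermediate rounding to 0.1 kb mirrors the rounded values (2.4 kb and 3 kb) that the box itself uses.

```python
def estimate_gene_count(genome_kb, spacer_kb=0.0, repetitive_fraction=0.10,
                        mrna_kb=1.34, introns_per_gene=3.13, intron_kb=0.215,
                        utr5_kb=0.161, utr3_kb=0.232,
                        upstream_kb=0.4, downstream_kb=0.2):
    """Estimate the number of genes in a genome, following Box 1.

    All lengths are in kb. Hypothetical helper for illustration only.
    """
    # Average primary transcript, rounded to 0.1 kb as in the box (2.4 kb).
    transcript_kb = round(
        mrna_kb + introns_per_gene * intron_kb + utr5_kb + utr3_kb, 1)
    # Add the assumed 5' and 3' flanking sequence (3 kb in total).
    gene_kb = round(transcript_kb + upstream_kb + downstream_kb, 1)
    # Only the non-repetitive part of the genome is assumed to carry genes.
    unique_kb = genome_kb * (1.0 - repetitive_fraction)
    return unique_kb / (gene_kb + spacer_kb)


if __name__ == "__main__":
    for genome_kb in (70000, 145000):
        packed = estimate_gene_count(genome_kb)
        spaced = estimate_gene_count(genome_kb, spacer_kb=1.0)
        print(f"{genome_kb} kb genome: {packed:.0f} genes with no spacing, "
              f"{spaced:.0f} genes with 1 kb between neighbors")
```

Because every assumption is a keyword argument, the sensitivity of the estimate to any single figure (for example the repetitive fraction or the intergenic spacing) can be checked by changing one value at a time.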
G. M. (1992) Science 258, 287-292
6 Lloyd, A. M., Walbot, V. and Davis, R. W. (1992) Science 258, 1773-1775
7 Willmitzer, L. and Töpfer, R. (1992) Curr. Opin. Biotechnol. 3, 176-180
8 Minet, M., Dufour, M-E. and Lacroute, F. (1992) Plant J. 2, 417-422
9 Riesmeier, J. W., Willmitzer, L. and Frommer, W. B. (1992) EMBO J. 11, 4705-4713
10 Klee, H. J., Hayford, M. B. and Rogers, S. G. (1987) Mol. Gen. Genet. 210, 282-287
11 Gray, J., Picton, S., Shabbeer, J., Schuch, W. and Grierson, D. (1992) Plant Mol. Biol. 19, 69-87
12 Liang, P. and Pardee, A. B. (1992) Science 257, 967-971
13 Balcells, L., Swinburne, J. and Coupland, G. (1991) Trends Biotechnol. 9, 31-37
14 Walbot, V. (1992) Ann. Rev. Plant Physiol. Plant Mol. Biol. 43, 49-82
15 McCarty, D. R., Carson, C. B., Stinard, P. S. and Robertson, D. S. (1989) Plant Cell 1, 523-532
16 Schmidt, R. J., Burr, F. A. and Burr, B. (1987) Science 238, 960-963
17 Bradley, D., Carpenter, R., Elliott, R., Simon, R., Romero, J., Hantke, S., Doyle, S., Mooney, M., Luo, D., McSteen, P., Copsey, L., Robinson, C. and Coen, E. (1993) Phil. Trans. Royal Soc. 339, 193-197
18 Dean, C., Sjodin, C., Page, T., Jones, J. and Lister, C. (1992) Plant J. 2, 69-81
19 Grevelding, C., Becker, D., Kunze, R., von Menges, A., Fantes, V., Schell, J. and Masterson, R. (1992) Proc. Natl Acad. Sci. USA 89, 6085-6089
20 Feldmann, K. (1991) Plant J. 1, 71-82
21 Kertbundit, S., de Greve, H., Deboeck, F., van Montagu, M. and Hernalsteens, J. P. (1991) Proc. Natl Acad. Sci. USA 88, 5212-5216
22 Hayashi, H., Czaja, I., Lubenow, H., Schell, J. and Walden, R. (1992) Science 258, 1350-1353
23 Sun, T., Goodman, H. M. and Ausubel, F. M. (1992) Plant Cell 4, 119-128
24 Gulick, P. J. and Dvorak, J. (1990) Gene 95, 173-177
25 Williams, J. G. K., Kubelik, A. R., Livak, K. J., Rafalski, J. A. and Tingey, S. V. (1990) Nucleic Acids Res. 18, 6531-6535
26 Grill, E. and Somerville, C.
(1991) Mol. Gen. Genet. 226, 484-490
27 Hwang, I., Kohchi, T., Hauge, B., Goodman, H., Schmidt, R., Cnops, G., Dean, C., Gibson, S., Iba, K., Lemieux, B., Arondel, V., Danhoff, L. and Somerville, C. R. (1991) Plant J. 1, 367-374
28 Matallana, E., Bell, C. J., Dunn, P. J., Lu, M. and Ecker, J. (1992) in Methods in Arabidopsis Research (Koncz, C., Chua, N-H. and Schell, J., eds), pp. 144-169, World Scientific
29 Ward, E. R. and Jen, G. C. (1990) Plant Mol. Biol. 14, 561-568
30 Martin, G. B., Ganal, M. W. and Tanksley, S. D. (1992) Mol. Gen. Genet. 233, 25-32
31 Edwards, K. J., Thompson, H., Edwards, D., de Saizieu, A., Sparks, C., Thompson, J. A., Greenland, A. J., Eyers, M. and Schuch, W. (1992) Plant Mol. Biol. 19, 299-308
32 Arondel, V., Lemieux, B., Hwang, I., Gibson, S., Goodman, H. and Somerville, C. R. (1992) Science 258, 1353-1355
33 Giraudat, J., Hauge, B., Valon, C., Smalle, J., Parcy, F. and Goodman, H. M. (1992) Plant Cell 4, 1251-1261
34 Hauge, B. M., Hanley, S. M., Cartinhour, S., Cherry, J. M., Goodman, H. M., Koornneef, M., Stam, P., Chang, C., Kempin, S., Medrano, L. and Meyerowitz, E. M. (1993) Plant J. 3, 745-754
35 Tanksley, S. D., Ganal, M. W., Prince, J. P., de Vicente, M. C., Bonierbale, M. W., Broun, P., Fulton, T. M., Giovannoni, J. J., Grandillo, S., Martin, G. B., Messeguer, R., Miller, J. C., Miller, L., Paterson, A. H., Pineda, O., Röder, M. S., Wing, R. A., Wu, W. and Young, N. D. (1992) Genetics 132, 1141-1160
36 Reiter, R. S., Williams, J. G. K., Feldmann, K. A., Rafalski, J. A., Tingey, S. V. and Scolnik, P. A. (1992) Proc. Natl Acad. Sci. USA 89, 1477-1482
37 Michelmore, R. W., Paran, I. and Kesseli, R. V. (1991) Proc. Natl Acad. Sci. USA 88, 9828-9832
38 Adams, M. D., Kelley, J. M., Gocayne, J. D., Dubnick, M., Polymeropoulos, M. H., Xiao, H., Merril, C. R., Wu, A., Olde,

map. Overlaps between YAC clones that cover the same region of the genome can be identified by the fact that they hybridize to the same cDNA. The genetic map location of such families of contiguous clones ('contigs') can be determined by hybridization to previously mapped RFLP markers. In theory, hybridization of ~ 2000 cDNA clones to the existing YAC libraries should be sufficient to produce a set of ordered YACs for the entire Arabidopsis genome 28.
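The logic of grouping YACs into contigs from shared cDNA hybridization, and then placing those contigs with mapped RFLP markers, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used by the authors: the clone and marker names are hypothetical, hybridization results are treated as error-free, and the code only recovers which YACs belong together and where a group is anchored, not the physical order of clones within a contig.

```python
from collections import defaultdict


def build_contigs(hybridizations):
    """Group YACs that share at least one cDNA probe into contigs.

    hybridizations: iterable of (cdna, yac) pairs, one per positive signal.
    Returns a sorted list of contigs, each a sorted list of YAC names.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path compression.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    yacs_by_probe = defaultdict(list)
    for cdna, yac in hybridizations:
        find(yac)  # register the YAC even if it shares no probe
        yacs_by_probe[cdna].append(yac)

    # Two YACs hit by the same cDNA are taken to overlap.
    for yacs in yacs_by_probe.values():
        for other in yacs[1:]:
            parent[find(other)] = find(yacs[0])

    groups = defaultdict(set)
    for yac in list(parent):
        groups[find(yac)].add(yac)
    return sorted(sorted(group) for group in groups.values())


def anchor_contigs(contigs, marker_hits, marker_positions):
    """Attach genetic map positions to contigs via mapped RFLP markers.

    marker_hits: iterable of (marker, yac) pairs.
    marker_positions: dict of marker -> (chromosome, map position).
    Returns a dict of contig index -> list of positions.
    """
    contig_of = {yac: i for i, contig in enumerate(contigs) for yac in contig}
    anchors = defaultdict(list)
    for marker, yac in marker_hits:
        if yac in contig_of and marker in marker_positions:
            anchors[contig_of[yac]].append(marker_positions[marker])
    return dict(anchors)


if __name__ == "__main__":
    # Hypothetical hybridization results, for illustration only.
    hits = [("cDNA_1", "YAC_A"), ("cDNA_1", "YAC_B"),
            ("cDNA_2", "YAC_B"), ("cDNA_2", "YAC_C"),
            ("cDNA_3", "YAC_D")]
    contigs = build_contigs(hits)
    print(contigs)
    print(anchor_contigs(contigs, [("RFLP_x", "YAC_C")],
                         {"RFLP_x": (1, 12.5)}))
```

Connected components are used because overlap is transitive for this purpose: if YAC_A and YAC_B share one cDNA and YAC_B and YAC_C share another, all three fall in one contig even though YAC_A and YAC_C share no probe.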

Conclusions
The ability to isolate, at least from certain plants, any gene for which genetic variation can be unambiguously scored represents a qualitative technical advance that has created entirely new scientific and technological opportunities. Many interesting or useful genes that have been known for years on the basis of mutant phenotypes can now be isolated and characterized. This ability to isolate genes for which mutations are available provides a substantial new motivation to expand the collection of mutations that affect characters of scientific or agronomic importance. In addition, the large-scale sequencing efforts will provide a flood of new genes for which a hypothetical function can be proposed. In order to exploit this wealth of new information, the development of simple methods to test experimentally the function of cloned genes will be necessary. In this respect, the development of a method for using cloned genes to inactivate endogenous genes is very important.

Acknowledgements
We thank Ruth Wilson for artistic assistance.

References
1 Gasser, C. and Fraley, R. T. (1989) Science 244, 1293-1299
2 Murata, N., Ishizaki-Nishizawa, O., Higashi, S., Hayashi, H., Tasaka, Y. and Nishida, I. (1992) Nature 356, 710-713
3 Knutzon, D. S., Thompson, G. A., Radke, S. E., Johnson, W. B., Knauf, V. C. and Kridl, J. C. (1992) Proc. Natl Acad. Sci. USA 89, 2624-2628
4 Voelker, T. A., Worrell, A. C., Anderson, L., Bleibaum, J., Fan, C., Hawkins, D. J., Radke, S. E. and Davies, H. M. (1992) Science 257, 72-74
5 Stark, D. M., Timmerman, K. P., Barry, G. F., Preiss, J. and Kishore,
B., Moreno, R. F., Kerlavage, A. R., McCombie, W. R. and Venter, J. C. (1991) Science 252, 1651-1653
39 Uchimiya, H., Kidou, S., Shimazaki, T., Aotsuka, S., Takamatsu, S., Nishi, R., Hashimoto, H., Matsubayashi, Y., Kidou, N., Umeda, M. and Kato, A. (1992) Plant J. 2, 1005-1009
40 Gibson, S. I. and Somerville, C. (1992) in Methods in Arabidopsis Research (Koncz, C., Chua, N-H. and Schell, J., eds), pp. 119-143, World Scientific
41 Meyerowitz, E. M. (1989) Cell 56, 263-269
42 Arumuganathan, K. and Earle, E. D. (1991) Plant Mol. Biol. Rep. 9, 208-218
43 Goldberg, R. B., Hoschek, G. and Kamalay, J. (1978) Cell 14, 123-131
44 Goodall, G. J. and Filipowicz, W. (1990) Plant Mol. Biol. 14, 727-733

book reviews
The chapters on synthetic glycoconjugates and glycosyltransferase inhibitors explore some of the uses of these compounds in preparative work and for probing biosynthetic pathways. The importance of glycolipids in the central nervous system is discussed, and follows a useful summary of the structure and synthesis of these molecules. The ubiquity of proteoglycans and the efforts which are being made to unravel the significance of their levels of glycosylation are also addressed. The article on glycoconjugate turnover is full of stimulating ideas and concludes with the hypothesis of galactosyl homeostasis. This theory suggests that terminal GalGalNAc receptors, which are important during development, may be so deleterious to the mature organism that there are several pathways which ensure that molecules with such exposed residues are rapidly eliminated to avoid disruption to normal cell growth and organization. There is comprehensive coverage of microbial, vertebrate and invertebrate lectins. The section on plant lectins could usefully have been allowed space to include a table of specificities and more discussion of the protein-sugar interactions which have been explored recently, for example in studies of wheat germ agglutinin and Lathyrus ochrus lectin. The value of molecular biology in the glycoconjugate field is rapidly being recognized, and several sections discuss such approaches; in particular there are examples of glycosyl transferases and glycosidases which have been successfully cloned. A noticeable omission in this otherwise excellent book, is the

Glycoconjugates from A-Z
Glycoconjugates - Composition, Structure and Function edited by H. J. Allen and A. C. Kisailus, Marcel Dekker, 1992. US$195.00 (viii + 685 pages) ISBN 0 8247 8431 6

This book maintains an excellent balance between stimulating, well-written essays covering most facets of this rapidly expanding field, and factual information (including a chapter on nomenclature) to which the specialist will turn for reference. Both experts and beginners will enjoy this book, which is remarkably free of jargon and written in a challenging style which encourages the reader to look up more details in the references supplied. Throughout the book, one author after another succeeds in taking the non-specialist quickly through the basic theory so that it is possible to appreciate the excitement being generated at the forefront of the field. The reader is then subtly drawn to relate particular events to those which precede and follow. For example, the enzymatic modification of sugars leads naturally to a need to understand the factors which control site occupancy, on the one hand, and receptor binding of mature glycoproteins, on the other. Before long, the whole book becomes compelling reading, since it deals with glycoconjugates from biosynthesis to remodelling.

The lucid introduction moves quickly through the historical contributions of the early part of this century to discuss the overall direction that the field is taking today. It begins with biosynthesis and transport, continues with a discussion of function, and addresses the need to isolate and analyse individual glycoforms. It succeeds in leaving a tantalizing trail of unanswered questions to be addressed later by other authors.

The survey of enzymatic and chemical methods used to release and sequence oligosaccharides is a valuable introduction to a veritable minefield; the examples discussed highlight some of the pitfalls. There is the problem of low levels of contaminating enzymes, which can become significant during long digestions, and the sobering possibility that the full range of specificity of some enzymes may not yet be known. Sections on nuclear magnetic resonance spectroscopy (NMR) and mass spectrometry give an insight into their value for probing the structure and conformation of released oligosaccharides. Currently, there is much interest in the application of these techniques directly to glycoconjugates, and this will be something to review in the future.

The chapters dealing with N- and O-glycosylated proteins are full of interesting information, drawing on examples from many different species, and constantly focusing on factors which may influence glycosylation. The function of glycosylation is, predictably, a recurrent theme throughout the book and especially in the review of secretory glycoconjugates.
