Array Vision Genomics Software - Rapid and Automated Analysis of Genome Arrays

Contents
Introduction
Analyzing Arrays
MicroArrays
Macro Arrays
Genomics Imaging Systems and ArrayVision Software
Typical High Throughput Genetic Analyses
Library Screening
Gene Expression and Functional Genomics
Summary of the ArrayVision System
References
ArrayVision
Introduction
Array-based genetic analyses start with cDNAs or oligonucleotides immobilized on a substrate. The array elements are hybridized with a single labeled sequence (Fig. 1), or with a labeled complex mixture derived from the messenger RNA of a tissue or cell line.
Figure 1: Array containing 2304 rectangles, each populated with 25 discrete elements from a YAC library, giving a total of 57,600 array elements. The entire membrane (24 x 24 cm) was hybridized with a simple mRNA labeled with 32P, and was exposed on a storage phosphor imaging plate. The intense dark spots arranged in pairs are used for positioning. The few probes which hybridized to the labeled sequence are seen as fainter spots. Most of the membrane is almost clear, indicating a low background. Specimen courtesy of Genome Systems, Inc.
Typical arrays vary from about 500 to 60,000 or more elements, with each element representing a discrete hybridization assay. The rapid, simultaneous analysis of large numbers of hybridization assays gives array-based genetic analysis its high throughput characteristics.
Amersham Biosciences
Analyzing Arrays
Array quantification involves a number of steps. The array of hybridization assays (array elements) is imaged. A template consisting of a regularly spaced matrix of circles or squares (template elements) is placed over the array. The template elements are aligned with the array elements. Data are reported.
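The steps above can be sketched in a few lines of code. This is only an illustrative sketch, not ArrayVision's implementation: the function names (`make_template`, `read_element`, `quantify`), the use of the mean pixel value as the reported quantity, and the tiny synthetic image are all assumptions made for the example. Alignment is skipped here, so the template is assumed to fit the array already.

```python
import math

def make_template(rows, cols, origin, pitch):
    """Build a regularly spaced template: one (row, col, cx, cy) per element."""
    x0, y0 = origin
    return [(r, c, x0 + c * pitch, y0 + r * pitch)
            for r in range(rows) for c in range(cols)]

def read_element(image, cx, cy, radius):
    """Mean pixel value inside one circular template element (hypothetical measure)."""
    height, width = len(image), len(image[0])
    total, count = 0.0, 0
    for y in range(max(0, math.floor(cy - radius)), min(height - 1, math.ceil(cy + radius)) + 1):
        for x in range(max(0, math.floor(cx - radius)), min(width - 1, math.ceil(cx + radius)) + 1):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                total += image[y][x]
                count += 1
    return total / count if count else 0.0

def quantify(image, template, radius):
    """Report one value per array position, keyed by (row, col)."""
    return {(r, c): read_element(image, cx, cy, radius) for r, c, cx, cy in template}

# Synthetic 8 x 8 image: uniform background of 10 with one labeled spot at the top left.
image = [[10.0] * 8 for _ in range(8)]
for y in range(1, 4):
    for x in range(1, 4):
        image[y][x] = 100.0

template = make_template(rows=2, cols=2, origin=(2, 2), pitch=4)
report = quantify(image, template, radius=1)
```

Here `report` holds one value for each of the four template elements, whether or not the element is labeled, which mirrors the idea that every position in the array is read.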
The alignment step needs some explanation. Ideally, all of the template elements would align with the array elements. Unfortunately, most arrays exhibit some geometric error, so template elements are not perfectly aligned with their targets. This results in two types of error. First, hybridization values can be assigned to incorrect positions within the array. Second, hybridization values will be in error wherever a template element does not fit precisely over an array element.
To avoid these errors, we must align the template. We could move each discrete template element to its proper position by eye, but this type of manual definition is so tedious as to be impractical for any but the smallest arrays. It is also dangerous: after staring at thousands of dots for a few minutes, a human observer is prone to errors.
The alternative is to align templates automatically. There are various procedures for this, including simple thresholding (finding array elements on the basis of intensity, e.g. Nguyen et al., 1995; Pietu et al., 1996), and more complex spot finding algorithms (as in ArrayVision). The success of an array analysis system is, in large part, dependent upon how well it succeeds in automated alignment. If alignment is inaccurate, a great deal of editing is required, and this is not much better than manual definition. To be really useful, an array analysis package must align a template with minimal user editing, and across a variety of specimen formats (isotopic, luminescent/fluorescent, macro and micro arrays).
ArrayVision uses a fuzzy logic algorithm (patent pending) to place each discrete template element over the best fit location. It does this by evaluating the image around each template element. It uses signal intensity to determine if there is an array element that is a likely fit to that template element. Then comes the fuzzy part. If array elements exhibit strong and distinct signals, the software is quite ready to move template elements to new positions over those array elements. If array elements are weakly labeled or unlabeled, the software tends to leave template elements in their original (predefined) locations within the template. That is, the software uses confidence weighting to align template elements to array elements. This allows each and every array element to be read, including those that fail to exhibit label above background.
Unlabeled array elements will be read at the original position specified by the template, adjusted to fit within the context of more clearly defined elements. Although the algorithmic exercise is a bit complex for the computer, it is simplicity itself for the user. Click the mouse and the template aligns with the array. With most arrays, very little or no editing is required. Click again and data from thousands of array elements are reported, quickly and accurately (Fig. 2).
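The idea of confidence weighting can be illustrated with a small sketch. This is not the ArrayVision algorithm, which is proprietary and not described in detail here. It is a simple stand-in that assumes a hypothetical confidence score (peak signal above background divided by an assumed "strong signal" level, capped at 1) and moves each template element toward the local intensity-weighted centroid in proportion to that score. The function names, the window size, and the threshold values are all assumptions, and the adjustment of unlabeled elements to fit the context of their neighbors is omitted.

```python
def local_centroid(image, cx, cy, half_window, background):
    """Intensity-weighted centroid of above-background signal around (cx, cy)."""
    sx = sy = weight = peak = 0.0
    for y in range(max(0, cy - half_window), min(len(image), cy + half_window + 1)):
        for x in range(max(0, cx - half_window), min(len(image[0]), cx + half_window + 1)):
            signal = max(0.0, image[y][x] - background)
            sx += signal * x
            sy += signal * y
            weight += signal
            peak = max(peak, signal)
    if weight == 0.0:
        # No label above background: there is nothing to move toward.
        return float(cx), float(cy), 0.0
    return sx / weight, sy / weight, peak

def align_element(image, cx, cy, half_window=2, background=10.0, strong_signal=100.0):
    """Move a template element toward the measured centroid, scaled by confidence."""
    mx, my, peak = local_centroid(image, cx, cy, half_window, background)
    confidence = min(1.0, peak / strong_signal)  # assumed confidence score in [0, 1]
    return cx + confidence * (mx - cx), cy + confidence * (my - cy), confidence

def demo_image(spot_value):
    """Synthetic 7 x 7 image with background 10 and one pixel set at x=4, y=3."""
    img = [[10.0] * 7 for _ in range(7)]
    img[3][4] = spot_value
    return img

# The template element starts at (3, 3); the array element actually sits at (4, 3).
strong = align_element(demo_image(110.0), 3, 3)  # strongly labeled element
weak = align_element(demo_image(20.0), 3, 3)     # weakly labeled element
blank = align_element(demo_image(10.0), 3, 3)    # unlabeled element
```

In this toy case the strongly labeled element pulls the template element fully onto the spot, the weakly labeled one moves it only a tenth of the way, and the unlabeled one leaves it at its predefined position, where it can still be read.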
Figure 2: Comparison of manual and automated alignment and detection. A 33P-labeled expression array was imaged (phosphor imager) and displayed within ArrayVision. A template was generated to fit roughly over the array. In the manual condition, the operator moved individual template elements to match the array. In the automated condition, the alignment was performed entirely by the computer. Manual and automated alignment are in excellent agreement, over the entire range of hybridization intensities.
[Scatter plot: Hybridization Intensity, Manual vs. Auto Alignment, 33P Array. X axis: Intensity, Manual Alignment (0 to 45,000). Y axis: Intensity, Auto Alignment (0 to 45,000). R2 = 0.9993.]
MicroArrays
Microfabricated arrays have been in use for some time (e.g. Eggers et al., 1994; Fodor et al., 1991; Lamture et al., 1994; Maskos and Southern, 1992; Mason, Rampal and Coassin, 1994; Pearson and Tonucci, 1995; Pease et al., 1994; Saiki et al., 1989; Southern, Maskos and Elder, 1992). The use of microfabricated arrays is growing, and the recent availability of commercial instruments for creating and detecting ad hoc microarrays (e.g. Molecular Dynamics Avalanche) underlies a rapidly expanding use of nonfabricated specimens. Microarrays have advantages in achieving higher signal to noise with rare mRNAs. The proportion of total mRNA represented by a particular mRNA species is not necessarily related to its functional importance. Species which are present in low copy numbers may be of interest, but are difficult to detect (e.g. Wan et al., 1996). Therefore, a goal of miniaturization is to concentrate mRNA molecules into a smaller area, where they can be detected more easily (Fig. 3).
Figure 3: Two array formats (panels: membrane macroarray; glass microarray). The isotopic macroarray uses elements 1 mm in diameter. It is deposited on a membrane and detected using a phosphor imager. The Cy3-labeled microarray uses 250 µm diameter elements, deposited on a microscope slide (microarray specimen courtesy of M. Erlander).
Because nylon (Anchordoguy et al., 1996) or glass (Wittrup, Westerman and Desai, 1994) substrates are, along with nonspecific hybridization, a major source of nonspecific background, using smaller and more highly concentrated assay sites can yield higher sensitivity. For example, consider distributing 4,000,000 molecules of a fluorescein-labeled mRNA over a target area of 1 x 1 mm. This would achieve a concentration of 4 molecules/µm2. This very low concentration would probably not be visible above the substrate background fluorescence. In contrast, distributing the same 4,000,000 molecules over a target area of 20 x 20 µm (400 µm2) would achieve a concentration of 10,000 molecules/µm2, comparable to concentrations that have been used in evaluating the performance of microfabricated devices (Chee et al., 1996). Scanning the small target with a tightly focused laser, or viewing it at high magnification under a fluorescence microscope, would allow very small amounts of signal to be detected. Problems remain with hybridization to such small targets, but maturation of this technology will lead to routine imaging of many thousands of microscopic targets within small areas (e.g. DeRisi et al., 1996; de Saizieu et al., 1998; Khrapko et al., 1991; Schena et al., 1995; Shalon, Smith and Brown, 1996).
In addition to increasing the concentration of mRNA molecules per element by miniaturization, microarrays benefit from the use of fluorescent labels.
a) Fluorescent labels with higher extinction coefficients and higher quantum yields have better signal relative to background (Oi, Glazer and Stryer, 1982).
b) We can attach more label molecules to the probe molecule.
c) Longer wavelengths of excitation lead to less autofluorescence (Tsien and Waggoner, 1989).
d) Time resolved fluorescence can be used to minimize nonspecific background (e.g. Hennink et al., 1996; Jovin and Arndt-Jovin, 1989; Seveus et al., 1992).
It has been suggested that an array of 1 µm2 elements occupying 4 cm2 (about 4 million sites) would be sufficient to query the 100,000 gene content of the entire human genome. Of course, the technology to create, hybridize to, and detect such high density arrays remains to be developed. In the push to ever higher array densities, microfabricated devices (which are readily miniaturized) will define the state of the art. However, microfabrication is not very suitable for low-volume research applications, in that advanced manufacturing procedures (photolithography, light-directed combinatorial synthesis, etc.) are required. Therefore, microfabrication is only economical for large numbers of devices, targeted at specific sequences. Research laboratories or small companies cannot easily create novel microfabricated devices for their own genes of interest.
As an alternative to the relatively inflexible microfabricated arrays, non-fabricated arraying technologies are developing rapidly. Tools for creating fluorescent microarrays (typically with probes 100-250 µm in diameter), improved substrates, and high resolution detectors are now available. These innovations combine with image analysis to yield a rapidly evolving capability for creating ad hoc microarrays.
Macro Arrays
The macro format (Figs. 1, 4) was introduced some years ago (e.g. Gress et al., 1992; Guo et al., 1994; Khrapko et al., 1991; Saiki et al., 1989; Zhao et al., 1995) and is in fairly widespread use. Typically, arrays are laid down on membranes as spots of about 1 mm in diameter. These large spots are easily produced with robots, and are well suited to isotopic labeling because the spread of ionizing radiation from an energetic label molecule (e.g. 32P) precludes the use of small, closely-spaced elements. Detection is most commonly performed using storage phosphor imagers.
Figure 4: A 33P-labeled Clontech Atlas Array, imaged with a Molecular Dynamics phosphor imager and analyzed with ArrayVision. This expression array contains 1,176 discrete probes organized as 6 blocks of 196 probes. Each block represents a different class of cellular function. Most probes exhibit some degree of hybridization to the complex test sample.
Although most macro array studies have used isotopic label, laboratories are also working with fluorescent labels on membranes or glass. These fluorescent arrays can be made with the same types of technologies used in creating isotopic arrays (e.g. non-contact dispensers, pin tools). Detection can use scanning fluorescence imagers (such as the MD FluorImager), or CCD-based low light imaging systems. Scanning laser systems are easy to use and can be quite cost-effective. CCD-based systems have the advantage that they are not limited to fixed laser lines. Rather, they can use any wavelengths produced by interference filters.
Genomics Imaging Systems and ArrayVision Software

ArrayVision software is used for rapid and automated analyses of images generated from any macro or micro imaging devices. Contact us for details regarding complete systems (Fig. 5).
Figure 5: ArrayVision accepts data from almost any detection system.
[Diagram labels: macro array imaging system (camera); imaging plate reader; scanning laser; scanning microscope; ArrayVision software.]
Typical High Throughput Genetic Analyses

Library Screening
The defining feature of library screening is that the probe is simple, containing only one or a few complements. The use of large arrays provides the best likelihood that hybridization will occur, even with allelic variation in the target and limited sequence lengths in the array elements. A typical screening image contains many thousands of unlabeled elements, and a few points where elements have hybridized. Analysis is less concerned with quantification of hybridization intensity than with localizing labeled array elements to their proper locations in the array. Accurate localization requires alignment of the template to the entire array, even though most of that array is blank. ArrayVision uses specific anchor spots on the array, to provide spatial points of reference (see Fig. 1). Using these anchors, the software performs accurate alignment of the template and localization of array elements. Even though hybridization intensity is not the main objective of a screen, ArrayVision does generate quantitative data for every element (Fig. 6). Therefore, objective statistical methods can be applied to identify hits. For example, we select those elements whose hybridization intensity is more than four standard deviation units away from the mean. The software reports these elements as numerical data, and as a graphical display of hit locations.
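The four standard deviation rule described above can be sketched as follows. This is an illustrative example only: the function name `find_hits`, the use of the sample standard deviation, and the synthetic counts (a 10 x 10 array with a single labeled element) are assumptions, not details taken from ArrayVision.

```python
import statistics

def find_hits(intensities, n_sd=4.0):
    """Return {position: z-score} for elements more than n_sd SDs from the mean."""
    values = list(intensities.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (an assumption)
    if sd == 0:
        return {}  # every element is identical, so nothing stands out
    return {pos: (value - mean) / sd
            for pos, value in intensities.items()
            if abs(value - mean) > n_sd * sd}

# Synthetic screening result: 99 unlabeled elements and one hit at row 3, column 7.
counts = {(row, col): 2000.0 for row in range(10) for col in range(10)}
counts[(3, 7)] = 30000.0

hits = find_hits(counts)
for (row, col), z in hits.items():
    print(f"hit at row {row}, col {col}: {z:.1f} SD from the mean")
```

Because every element has a quantitative value, the same table can feed both a numerical report and a map of hit locations, keyed by the row and column of each element.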
Figure 6: Frequency histogram of 57,576 elements, with a mean of 2,124 counts, and a standard deviation of 1,172. The Y axis is plotted logarithmically, so that we can see small numbers of data points lying in the higher intensity bins. For example, there are fewer than twenty elements lying above 27,500 counts. It is likely (probability better than 99.999%) that these probes form their own, unique distribution - the distribution of hits.
[Histogram: X axis, Counts/mm2, in bins from 2,500 to 47,500 plus "More". Y axis, number of elements, logarithmic scale from 1 to 100,000.]
Gene Expression and Functional Genomics

In expression analysis, libraries of cDNAs or oligonucleotides are hybridized with total genomic mRNA. When everything is working correctly, the expression level of a gene is reflected in the number of mRNA copies that it contributes to the mixture, and is proportional to the signal detected at complementary elements in the high-density array. The key difference between library screening and gene expression is that expression analysis does not look at just a few hits on a clear background. Rather, there is hybridization to almost every array element (Fig. 7).
Figure 7: A small portion of an expression array, showing that there is hybridization at almost every element.
Expression levels can be analyzed within a single sample, or across multiple samples. Typically, expression is compared across tissues or cell lines (Fig. 8). This is done by using replicate arrays, exposed to different mRNA conditions.
Figure 8: A macro array containing 1,536 discrete probes, hybridized to a complex mixture of mRNAs. At left, we see the image without a template. The image at right shows the aligned template over the array. Expression data are taken from the entire array. ArrayVision placed and aligned the template and read data, all automatically.
A problem in studying gene expression is the sheer volume of the data. Expression arrays can contain tens of thousands of targets replicated across multiple conditions. It is not unusual for the computer to be juggling matrices containing >100,000 numbers. Therefore, it is important that the image analysis software be designed to handle such large data sets. ArrayVision handles large data matrices as efficiently as possible, and can export the results of analyses directly to your own data structures.
A good analysis system should go beyond just reporting the data. It should also give you procedures for making objective comparisons of gene expression across conditions. This issue of comparing expression is non-trivial. Each array specimen will differ from the others in the absolute intensity of signal, so we cannot simply compare signal strength across specimens. Rather, irrelevant inter-specimen variation must be minimized to allow valid comparisons.
The most common method for minimizing irrelevant variation is normalization within arrays, and subsequent comparison of normalized values across arrays. Because we are dealing with ratios during normalization, background is subtracted prior to the normalization process. There are many methods for normalization, including selection of specific reference elements (e.g. housekeeping genes), dividing by the mean of all elements (e.g. Pietu et al., 1996), or using some other parameter (such as the median) that seeks to define an internal reference for the array.
ArrayVision provides various forms of normalization, with flexible definition of background. It allows the use of methods appropriate to both additive and proportional error variance (proportional is typical of hybridization arrays). ArrayVision is designed to allow flexible data analysis. It also shows alterations in expression across specimens, in easily understood form.
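A minimal sketch of within-array normalization, covering the three reference choices mentioned above (mean of all elements, median, or specific reference elements), might look like the following. The function name `normalize`, its parameters, the single global background value, and the example intensities are all hypothetical; this is not ArrayVision's interface.

```python
import statistics

def normalize(raw, background, method="mean", reference_ids=None):
    """Subtract background, then divide every element by an internal reference."""
    corrected = {gene: value - background for gene, value in raw.items()}
    if method == "mean":
        reference = statistics.mean(corrected.values())
    elif method == "median":
        reference = statistics.median(corrected.values())
    elif method == "reference":
        # For example, a set of housekeeping genes chosen by the user.
        reference = statistics.mean(corrected[gene] for gene in reference_ids)
    else:
        raise ValueError(f"unknown normalization method: {method}")
    if reference <= 0:
        raise ValueError("reference signal is not above background")
    return {gene: value / reference for gene, value in corrected.items()}

# Two synthetic arrays: identical expression, but array B is twice as bright overall.
array_a = {"g1": 110.0, "g2": 210.0, "g3": 310.0}
array_b = {"g1": 220.0, "g2": 420.0, "g3": 620.0}

norm_a = normalize(array_a, background=10.0)
norm_b = normalize(array_b, background=20.0)
by_reference = normalize(array_a, background=10.0, method="reference", reference_ids=["g2"])
```

After normalization the two arrays give the same values for every gene, even though their absolute intensities differ by a factor of two. Subtracting background first matters because a constant offset would otherwise distort the ratios.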
ArrayVision includes elemental display functions, which create easily understood graphics that summarize the results (e.g. up or down regulation) of complex expression studies. At present, ArrayVision provides accurate "raw" data (hybridization intensities, hybridization intensities corrected for background, ratios, differences). It provides statistical tests which describe how a particular array element relates to the distribution of all elements. Export these data, and perform further analyses in your own informatics software. We are developing more sophisticated statistical procedures for high level analyses of expression data (Statistical Informatics). Statistical informatics includes more tests for determining whether a given element is different from others in a single array. It also provides quality metrics for each array element, and allows us to state (with confidence estimates) whether elements (including low expressors) alter their expression across arrays. Contact us for details.
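To make the ratio and difference outputs concrete, here is a small sketch that compares normalized values from two arrays and labels each element as up regulated, down regulated, or unchanged. The two-fold threshold, the function name `compare_arrays`, and the example values are assumptions chosen for illustration. They are not ArrayVision defaults and are not a substitute for the statistical tests described above.

```python
def compare_arrays(norm_a, norm_b, fold_threshold=2.0):
    """Return {gene: (ratio, difference, call)} for genes present on both arrays."""
    results = {}
    for gene in sorted(norm_a.keys() & norm_b.keys()):
        a, b = norm_a[gene], norm_b[gene]
        difference = b - a
        if a <= 0 or b <= 0:
            # A ratio is not meaningful when either value is at or below background.
            results[gene] = (None, difference, "undefined")
            continue
        ratio = b / a
        if ratio >= fold_threshold:
            call = "up"
        elif ratio <= 1.0 / fold_threshold:
            call = "down"
        else:
            call = "unchanged"
        results[gene] = (ratio, difference, call)
    return results

# Synthetic normalized values for the same three elements under two conditions.
condition_a = {"g1": 1.0, "g2": 1.0, "g3": 2.0}
condition_b = {"g1": 3.0, "g2": 1.2, "g3": 0.5}

comparison = compare_arrays(condition_a, condition_b)
for gene, (ratio, difference, call) in comparison.items():
    print(gene, ratio, difference, call)
```

A table like this is the kind of output that could be exported for further analysis, or mapped to colors to summarize up and down regulation across the array.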
Summary of the ArrayVision System

Large arrays are analyzed quickly and automatically.
Macro or micro arrays can be analyzed, using fluorescent, luminescent, or isotopic labels.
Automatic array propagation, alignment, and background correction.
Provides objective statistical methods for detection of alterations in expression.
Comparative expression analyses across two or more arrays.
Elemental displays are easy-to-understand graphics, which display hybridization parameters as color-coded dots.
References
Anchordoguy, T.L., Crawford, D.L., Hardewig, I. and Hand, S.C. Heterogeneity of DNA binding to membranes used in quantitative dot blots, BioTechniques 20:754-756 (1996).
Chee, M.S., Yang, R.Y., Hubbell, E., Berno, A., Huang, X.C., Stern, D., Winkler, J., Lockhart, D.J., Morris, M.S. and Fodor, S.P.A. Accessing genetic information with high-density oligonucleotide arrays, Science 274:610-614 (1996).
DeRisi, J., Penland, L., Brown, P.O., Bittner, M.L., Meltzer, P.S., Ray, M., Chen, Y., Yan, A.S. and Trent, J.M. Use of a cDNA microarray to analyse gene expression patterns in human cancer, Nature Genetics 14:457-460 (1996).
de Saizieu, A., Certa, U., Warrington, J., Gray, C., Keck, W. and Mous, J. Bacterial transcript imaging by hybridization of total RNA to oligonucleotide arrays, Nature Biotechnology 16:45-48 (1998).
Eggers, M., Hogan, M., Reich, R.K., Lamture, J., Ehrlich, D., Hollis, M., et al. A microchip for quantitative detection of molecules utilizing luminescent and radioisotope reporter groups, BioTechniques 17:516-525 (1994).
Fodor, S.P.A., Read, L.J., Pirrung, M.C., Stryer, L., Lu, A.M. and Solas, D. Light-directed, spatially addressable parallel chemical synthesis, Science 251:767-773 (1991).
Gress, T.M., Hoheisel, J.D., Lennon, G.G., Zehetner, G. and Lehrach, H. Hybridization fingerprinting of high-density cDNA library arrays with cDNA pools derived from whole tissues, Mammalian Genome 3:609-619 (1992).
Guo, Z., Guilfoyle, R.A., Thiel, A.J., Wang, R. and Smith, L.M. Direct fluorescence analysis of genetic polymorphisms by hybridization with oligonucleotide arrays on glass supports, Nucleic Acids Research 22:5456-5465 (1994).
Hennink, E.J., de Haas, R., Verwoerd, N.P. and Tanke, J.J. Evaluation of a time-resolved fluorescence microscope using a phosphorescent Pt-porphine model system, Cytometry 24:312-320 (1996).
Jovin, T.M. and Arndt-Jovin, D.J. Luminescence digital imaging microscopy, Annual Review of Biophysical Chemistry 18:271-308 (1989).
Khrapko, K.R., Lysov, Y.P., Khorlin, A.A., Ivanov, I.B., Yershov, G.M., Vasilenko, S.K., Florentiev, V.L. and Mirzabekhov, A.D. A method for DNA sequencing by hybridization with oligonucleotide matrix, DNA Sequence - Journal of DNA Sequencing and Mapping 1:375-388 (1991).
Lamture, J.B., Beattie, K.L., Burke, B.E., Eggers, M.D., Ehrlich, D.J., Fowler, R., Hollis, M.A., Kosicki, B.B., Reich, R.K., Smith, S.R., Varma, R.S. and Hogan, M.E. Direct detection of nucleic acid hybridization on the surface of a charge coupled device, Nucleic Acids Research 22:2121-2125 (1994).
Maskos, U. and Southern, E.M. Parallel analysis of oligodeoxyribonucleotide (oligonucleotide) interactions. I. Analysis of factors influencing oligonucleotide duplex formation, Nucleic Acids Research 20:1675-1678 (1992).
Mason, R.S., Rampal, J.B. and Coassin, P.J. Biopolymer synthesis on polypropylene supports. I. Oligonucleotides, Analytical Biochemistry 217:306-310 (1994).
Nguyen, C., Rocha, D., Granjeaud, S., Baldit, M., Bernard, K., Naquet, P. and Jordan, B.R. Differential gene expression in the murine thymus assayed by quantitative hybridization of arrayed cDNA clones, Genomics 29:207-216 (1995).
Oi, V.T., Glazer, A.N. and Stryer, L. Fluorescent phycobiliprotein conjugates for analyses of cells and molecules, Journal of Cell Biology 93:981-986 (1982).
Pearson, D.H. and Tonucci, R.J. Nanochannel glass replica membranes, Science 270:68-69 (1995).
Pease, A.C., Solas, D., Sullivan, E.J., Cronin, M.T., Holmes, C.P. and Fodor, S.P.A. Light-generated oligonucleotide arrays for rapid DNA sequence analysis, Proceedings of the National Academy of Sciences USA 91:5022-5026 (1994).
Pietu, G., Alibert, O., Guichard, V., Lamy, B., Bois, F., Leroy, E., Mariage-Smason, R., Houlgatte, R., Soulare, P. and Auffray, C. Novel gene transcripts preferentially expressed in human muscles revealed by quantitative hybridization of a high density cDNA array, Genome Research 6:492-503 (1996).
Saiki, R.K., Walsh, P.S., Levenson, C.H. and Erlich, H.A. Genetic analysis of amplified DNA with immobilized sequence-specific oligonucleotide probes, Proceedings of the National Academy of Sciences USA 86:6230-6234 (1989).
Schena, M., Shalon, D., Davis, R.W. and Brown, P.O. Quantitative monitoring of gene expression patterns with a complementary DNA microarray, Science 270:467-470 (1995).
Seveus, L., Väisälä, M., Syrjänen, S., Sandberg, M., Kuusisto, A., Harjo, R., Salo, J., Hemmilä, J., Kojola, H. and Soini, E.J. Time-resolved fluorescence imaging of europium chelate label in immunohistochemistry and in situ hybridization, Cytometry 13:329-338 (1992).
Shalon, D., Smith, S.J. and Brown, P.O. A DNA microarray system for analyzing complex DNA samples using two-color fluorescent probe hybridization, Genome Research 6:639-645 (1996).
Southern, E.M., Maskos, U. and Elder, J.K. Analyzing and comparing nucleic acid sequences by hybridization to arrays of oligonucleotides: Evaluation using experimental models, Genomics 13:1008-1017 (1992).
Tsien, R.Y. and Waggoner, A. Fluorophores for confocal microscopy: Photophysics and photochemistry, In Pawley, G.P. (ed.) The Handbook of Biological Confocal Microscopy, IMR Press, pp 153-161 (1989).
Wan, J.S., Sharp, S.J., Poirier, G.M.-C., Wagaman, P.C., Chambers, J., Pyati, J., Hom, Y.-L., Galindo, J.E., Huvar, A., Peterson, P.A., Jackson, M.R. and Erlander, M.G. Cloning differentially expressed mRNAs, Nature Biotechnology 14:1685-1691 (1996).
Wittrup, K.D., Westerman, R.J. and Desai, R. Fluorescence array detector for large-field quantitative fluorescence cytometry, Cytometry 16:206-213 (1994).
Zhao, N., Hashida, H., Takahashi, N., Misumi, Y. and Sakaki, Y. High-density cDNA filter analysis: a novel approach for large-scale, quantitative analysis of gene expression, Gene 156:207-213 (1995).
