Abstract
Computing Science is in the middle of a major paradigm shift, driven by Molecular Biology. Adleman, in his breathtaking paper, announced the arrival of computers based on biochemical operations and showed that a large class of computationally hard problems may be better solved not by pushing electrons through wires in a computing laboratory, but by mixing solutions in test tubes in a molecular biology laboratory. As computationally hard problems are the stumbling blocks for contemporary von Neumann computers, DNA-based computation is poised to play a greater role in computing. This article discusses this novel idea of DNA-based computation.
1. Introduction
Today's computers are millions of times more powerful than their crude ancestors of the 1940s and 1950s. Almost every two years, computers have become twice as fast while their components have shrunk to half the space. However, it has also been realized that integrated-circuit technology is running against its limits, and it has been hotly debated whether computers of an entirely new kind, quantum-mechanical computers or computers based on Molecular Biology, are in the offing. One of the recently introduced unconventional paradigms, which promises to have a tremendous influence on the theoretical and practical progress of computer science, is DNA computing, which under some circumstances might be an elegant alternative to the classical Turing/von Neumann notion of computing. The union of two of science's most fertile fields, molecular biology and computer science, is sure to produce some remarkable offspring. In 1994, Adleman invented a method for solving a small instance of the Directed Hamiltonian Path (DHP) problem by an in vitro DNA-recombination assay, which he performed experimentally using hybridization, several agarose-gel separations, and PCR, handling DNA sequences in a test tube. Before discussing this experiment, here is an overview of DNA molecules, which pave the way for this innovative computing model.
A single strand of DNA can be viewed as a string of nucleotides of some length n. The four DNA nucleotides are adenine, guanine, cytosine and thymine, commonly abbreviated to A, G, C and T respectively. Each strand has, according to chemical convention, a 5' end and a 3' end; thus any single strand has a natural orientation. The classical double helix of DNA is formed when two separate strands bond together. Bonding occurs by the pairwise attraction of bases: A bonds with T, and G bonds with C. The pairs (A,T) and (G,C) are therefore known as Watson-Crick complementary base pairs. Thus a hypothetical double-stranded DNA molecule is
AACGCGTACGTACAAGTGTCCGAATGGCCAATG
TTGCGCATGCATGTTCACAGGCTTACCGGTTAC
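This pairing rule is easy to capture in code. The following is a minimal sketch (the function name `complement` is illustrative, not from the text), applied to the hypothetical molecule above:

```python
# Watson-Crick base pairing: A bonds with T, G bonds with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Base-by-base Watson-Crick complement of a DNA strand."""
    return "".join(PAIR[base] for base in strand)

# The second strand of the hypothetical molecule is the
# base-by-base complement of the first.
top = "AACGCGTACGTACAAGTGTCCGAATGGCCAATG"
print(complement(top))  # TTGCGCATGCATGTTCACAGGCTTACCGGTTAC
```

Complementation is an involution: applying it twice returns the original strand, which mirrors the symmetry of the (A,T) and (G,C) pairs.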
For example, the strands coding for cities 1 and 2 could be AATGCCGG and TTTAAGCC respectively. A road from city 1 to city 2 is encoded so that its first part is the complement of the second half of the strand for city 1, and its second part is the complement of the first half of the strand for city 2, i.e. GGCCAAAT. That is, GGCC is the complement of the last four bases of city 1, and AAAT is the complement of the first four bases of city 2. Thus the edge joining cities 1 and 2 is encoded as follows:
GGCCAAAT
AATGCCGGTTTAAGCC
Similarly, DNA strands can be formed for all the nodes and edges, representing all possible routes in the directed graph in the test tube. The first stage of Adleman's computation was to extract those long strands which start with city 1 and store them in a separate test tube. The second stage was to extract those strands of a certain length, signifying that exactly 5 cities were passed through: if each city is represented by 8 DNA bases, all strands of 40 bases would be extracted and stored in a separate test tube. The third stage was to extract all those strands containing the DNA sequence for city 1, then those containing the DNA sequence for city 2, and so on. If there is a solution to this route problem, it will be found among the strands extracted for the last city, city 5.
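The edge encoding and the extraction stages can be sketched in a few lines of Python. This is a simulation of the logic only, not of the chemistry, and the helper names (`edge_strand`, `survives_extractions`) are illustrative, not from Adleman's paper:

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Base-by-base Watson-Crick complement of a DNA strand."""
    return "".join(PAIR[base] for base in strand)

def edge_strand(city_i: str, city_j: str) -> str:
    """Edge i -> j: complement of the second half of city i followed by
    the complement of the first half of city j -- a 'splint' that
    anneals across the junction of the two city strands."""
    half = len(city_i) // 2
    return complement(city_i[half:]) + complement(city_j[:half])

def survives_extractions(candidate: str, cities: list[str]) -> bool:
    """Mimic the three extraction stages on one candidate strand: it
    must start with city 1, have the length of exactly len(cities)
    cities, and contain every city's sequence."""
    n = len(cities[0])
    return (candidate.startswith(cities[0])
            and len(candidate) == n * len(cities)
            and all(city in candidate for city in cities))

city1, city2 = "AATGCCGG", "TTTAAGCC"
print(edge_strand(city1, city2))  # GGCCAAAT, as in the example
```

In the test tube the filtering happens massively in parallel over all candidate strands at once; the sequential check above is only the per-strand predicate.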
computers, their parallelism would strongly prevail, so that in certain models the number of operations per second could be of the order of 10^18, which is at least 100,000 times faster than the fastest supercomputers existing today. Energy efficiency: 10^19 operations per joule, about a billion times more energy-efficient than today's electronic devices. Research in this field has both experimental and theoretical aspects. The experiments that have actually been carried out are not numerous so far. P. Kaplan replicated Adleman's experiment; a Wisconsin team of computer scientists and biochemists made partial progress in solving a 5-variable instance of the SAT problem, another NP-complete problem, using a surface-based approach; and F. Guarnieri and his team used a horizontal chain reaction for DNA-based addition. Also, various aspects of the implementability of DNA computing have been experimentally investigated: the effect of good encodings on the solutions of Adleman's problem was addressed; the complications raised by using the Polymerase Chain Reaction (PCR) were studied; the usability of self-assembly of DNA was studied; the experimental gap between the design and assembly of unusual DNA structures was pointed out; joining and rotating data with molecules was reported; concatenation with PCR was studied; evaluating simple Boolean formulas was started; and ligation experiments in computing with DNA were conducted. The theoretical work on DNA computing consists, on one side, of designing potential experiments for solving various problems by means of DNA manipulation. Descriptions of such experiments include the Satisfiability problem, breaking the Data Encryption Standard (DES), expansion of symbolic determinants, matrix multiplication, graph connectivity and the knapsack problem using dynamic programming, the road coloring problem, exascale computer algebra problems, and simple Horn clause computation.
On the other side, the theoretical research on DNA computing comprises attempts to model the process in general and to give it a mathematical foundation. To this end, models of DNA computing have been proposed and studied from the point of view of their computational power and their in vitro feasibility. Some of them are given below.
to other NP-complete problems. Another model is the splicing operation proposed by Head and vigorously followed by many researchers using formal language theory; it has been shown that the generative power of finite extended splicing systems is equal to that of Turing machines. Afterwards, Paun and others introduced the so-called sticker model. Unlike previous models, the sticker model has a memory that can be both read from and written to, and employs reusable DNA. There is also a proposal to use the tendency of DNA structures to self-assemble as a computational tool: the self-assembly of complex branched structures known as double crossovers into two-dimensional sheets or three-dimensional solids is shown to be a computationally powerful model. However, there are some impediments to effective computation by these models. It is a common feature of all the proposed implementations that the biological operations to be used are assumed to be error-free. An operation central to and frequently employed in most models is the extraction of DNA strands containing a certain sequence (known as removal by DNA hybridization). The most important problem with this method is that extraction is not 100% efficient and may at times inadvertently remove strands that do not contain the specified sequence. Especially for a large problem, the number of extractions required may run into hundreds or even thousands, resulting in a high probability of incorrect hybridization. Thus, a novel error-resistant model of DNA computation has been proposed by Alan Gibbons and his team that obviates the need for hybridization extraction within the main body of the computation. Like previous models, this model is particularly effective for algorithmic description.
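The cumulative effect of imperfect extractions is easy to quantify. Assuming, purely for illustration, a 95% per-extraction success rate (a hypothetical figure, not from the text), the probability that a long series of independent extractions all behave correctly decays geometrically:

```python
# Hypothetical per-extraction success probability (illustrative only).
p_success = 0.95

# Probability that n independent extractions all behave correctly.
for n in (10, 100, 1000):
    print(f"{n:4d} extractions: {p_success ** n:.3g}")
```

Even at 95% reliability per step, a computation needing 100 extractions succeeds with probability below 1%, which is why an error-resistant model that avoids hybridization extraction is attractive.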
It is sufficiently strong to solve any of the problems in the class NC, and the authors have given DNA algorithms for the 3-vertex-colorability problem, the permutations problem, the Hamiltonian path problem, the subgraph isomorphism problem, and the maximum clique and maximum independent set problems. There are two general formats in which complex combinatorial sets of DNA molecules may be manipulated.
These are the solution-phase format and the solid-phase format.
The solid-phase format possesses many important advantages over the solution-phase format:
- Facilitated sample handling. With the DNA molecules attached to a support, the experimental manipulations are very simple: addition of a solution to the support, and removal (washing) of a solution from the support. These steps are readily automated.
- Decreased losses during sample handling.
- Reduction of interference between oligonucleotides.
- Facile purification: solid-phase chemistry permits purification of the DNA molecules at every step of the process.
of automating, miniaturizing and integrating them into a general-purpose desktop DNA computer may take a much longer time.