
BY K.V.S GANESH KUMAR (kantiganesh@ymail.com) and B.VIVEK CHAITANYA (bvivekchaitanya@yahoo.com)

Department of Computer Science Engineering

MVGR COLLEGE OF ENGINEERING

Abstract:
In this paper we present a brief overview of bioinformatics. First, we describe the role of computers in bioinformatics, illustrated by the identification of an unknown virus. Next, we present the creation of databases of protein structures. We then move to the core of the paper, which deals with the similarity searching techniques FASTA and PSI-BLAST. For FASTA, we describe its application and the sequence format used for multiple alignments, with an example that determines which two of the horse, minke whale and red kangaroo are most closely related. For PSI-BLAST, we present the flowchart of a PSI-BLAST search and an example that explains why pig insulin, or animal-derived insulin in general, can be used to treat diabetes. We then discuss the application of bioinformatics, mainly in the field of drug discovery. Finally, we conclude by describing why bioinformatics has become such a popular career option. Hence, bioinformatics is

Involving computers to know the blueprint of life

Introduction:
Bioinformatics, as the term suggests, has two components, bio and informatics, and as such primarily depicts the convergence of two fields: biology and information technology. Bioinformatics is the application of computer technology to the management of biological information. Specifically, it is the science of developing computer databases and algorithms to facilitate and expedite biological research. The Human Genome Project (HGP) has been the biggest achievement of bioinformatics to date. Other areas of bioinformatics are sequence alignment, protein structure prediction, systems biology, protein-protein interactions and virtual evolution. [SR04]

Computers in Bioinformatics:
When scientists talk about bioinformatics, or about doing bioinformatics, they mean the use of computers to store, retrieve, analyze, predict and simulate the nature and properties of biological macromolecules such as nucleic acids (e.g. deoxyribonucleic acid, or DNA) and proteins (the products encoded by DNA). Computers, networked via the internet, help biologists not only to store large volumes of data but also to retrieve data quickly from anywhere in the world. Scientists and software enthusiasts have designed innumerable computational tools for analyzing the physicochemical and structural properties of biomolecules.

For example, imagine a crisis, sometime in the future, in which a new biological virus creates an epidemic of fatal disease in humans or animals. Laboratory scientists will isolate its genetic material and determine the sequence. Computer programs will then take over. Viruses contain protein molecules that are suitable targets for drugs that interfere with viral structure or function. From the viral DNA sequences, computer programs will derive the amino acid sequences, and other programs will compute the structures and functional properties of these proteins.

Thus, knowing the viral protein structure and function will make it possible to design therapeutic agents. [http://bioinformatics.oupjournals.org/]
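The step of deriving amino acid sequences from DNA sequences can be illustrated with a minimal Python sketch. It assumes the standard genetic code, a known reading frame starting at the first base, and a plain coding sequence with no introns; the function name `translate` is our own choice for illustration, and real viral genomes (which may be RNA, and may use overlapping reading frames) need more care.

```python
# Standard genetic code, built compactly: codons are enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence into one-letter amino acid codes.

    Assumes reading frame 0. Translation stops at the first stop codon,
    and a trailing partial codon is ignored.
    """
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[pos:pos + 3], "X")  # X marks an unrecognized codon
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTAA"))  # ATG -> M, GCC -> A, TAA -> stop
```

The resulting protein string is the kind of input that structure prediction and similarity search programs would then work on.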

Similarity searching techniques:


Theoretical scientists have derived new and sophisticated algorithms that allow sequences to be readily compared using probability theory. These comparisons become the basis for determining gene function, establishing phylogenetic relationships and simulating protein models. The two most popular database similarity searching techniques are FASTA and BLAST. [HLC-95] Scripting languages such as Perl and Python are often used to interface with biological databases.

FASTA:
A very common format for sequence data is derived from the conventions of FASTA, a program for FAST Alignment by W. R. Pearson.

A Sequence in FASTA format:


Begins with a single-line description. A > must appear in the first column.
Subsequent lines contain the sequence, one character per residue.
Uses the one-letter codes for nucleotides or amino acids specified by IUB and IUPAC.
Lines can have different lengths, i.e., ragged right margins.
Most programs will accept lower case letters as amino acid codes.
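The format rules above are simple enough to parse by hand. The following Python sketch is one possible reader, not the parser of any particular tool; the function name `parse_fasta` and the two example records are hypothetical, made up purely for illustration.

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Read FASTA-formatted lines into a {description: sequence} dictionary."""
    records: Dict[str, List[str]] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A '>' in the first column starts a new record.
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            # Sequence lines may have ragged lengths and lower-case letters.
            records[header].append(line.upper())
    return {name: "".join(chunks) for name, chunks in records.items()}


# Hypothetical records, not taken from any database.
example = """>seq1 hypothetical example record
MKTAYIAKQR
QISFVK
>seq2 another hypothetical record
mktayiakqr
""".splitlines()

for description, sequence in parse_fasta(example).items():
    print(description, len(sequence))
```

Joining the chunks at the end means that line length in the file has no effect on the stored sequence, which matches the ragged-right-margin rule.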

PSI-BLAST:
BLAST is the Basic Local Alignment Search Tool. The program has several variants, each of which checks every entry in the databank independently against the query sequence. It is so commonly used that your first encounter with bioinformatics tools and biological databases will probably be through the National Center for Biotechnology Information's (NCBI) BLAST web interface.

Figure 1-1. Form for submitting a BLAST search against nucleotide databases at NCBI

Often the databank contains close matches to the query sequence. Less sensitive but faster programs are quite capable of identifying these close matches, and they suffice if that is all that is required. The method used by BLAST goes back, in a sense, to the dot plot approach, checking for well-matching local regions. For each entry in the database, it checks for short contiguous regions that match a short contiguous region in the query sequence, using a substitution scoring matrix but allowing no gaps. An approach in which candidate regions of fixed length are identified first can be made very fast by the use of lookup tables. The BLAST family, including PSI-BLAST, was originally designed to address the problem that full dynamic programming methods are rather slow for complete searches of a large databank.
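The idea of finding fixed-length candidate regions through a lookup table and then extending them without gaps can be sketched in Python as follows. This is a simplified illustration and not the actual BLAST implementation: it assumes exact word matches as seeds, a flat match/mismatch score in place of a real substitution matrix, and a simple drop-off rule to stop extension. All function names, parameter values and test sequences are our own.

```python
from collections import defaultdict


def build_lookup(query, k):
    """Map every length-k word of the query to the positions where it occurs."""
    table = defaultdict(list)
    for i in range(len(query) - k + 1):
        table[query[i:i + k]].append(i)
    return table


def score(a, b, match=2, mismatch=-1):
    """Toy scoring scheme (an assumption); real searches use a substitution matrix."""
    return match if a == b else mismatch


def extend_ungapped(query, subject, qi, si, k, drop=3):
    """Extend a seed in both directions without gaps.

    Returns (best_score, query_start, subject_start, length).
    Extension stops once the running score falls more than `drop` below the best.
    """
    best = total = sum(score(query[qi + t], subject[si + t]) for t in range(k))
    right = steps = 0
    q, s = qi + k, si + k
    while q < len(query) and s < len(subject) and best - total <= drop:
        total += score(query[q], subject[s])
        steps += 1
        if total > best:
            best, right = total, steps
        q += 1
        s += 1
    total = best
    left = steps = 0
    q, s = qi - 1, si - 1
    while q >= 0 and s >= 0 and best - total <= drop:
        total += score(query[q], subject[s])
        steps += 1
        if total > best:
            best, left = total, steps
        q -= 1
        s -= 1
    return best, qi - left, si - left, k + left + right


def find_hits(query, subject, k=3):
    """Scan the subject once, look up each word, and extend every seed found."""
    table = build_lookup(query, k)
    hits = set()
    for si in range(len(subject) - k + 1):
        for qi in table.get(subject[si:si + k], []):
            hits.add(extend_ungapped(query, subject, qi, si, k))
    return sorted(hits, reverse=True)


print(find_hits("ACDEFGHIK", "WWACDEFGYIKWW"))
```

The lookup table is built once per query, so each database entry is scanned in a single pass. That is what makes this approach much faster than running full dynamic programming against every entry.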

A flowchart for PSI-BLAST:


1. Probe each sequence in the chosen database independently for local regions of similarity to the query sequence, using a BLAST-type search but allowing gaps.
2. Collect significant hits. Construct a multiple sequence alignment table between the query sequence and the significant local matches.
3. Form a profile from the multiple sequence alignment.
4. Reprobe the database with the profile, still looking only for local matches.
5. Decide which hits are statistically significant and retain these only.
6. Go back to step 2, until a cycle produces no change.

Below is an example of a BLAST search. The following pictures show why pig insulin, or animal-derived insulin in general, can be used to treat diabetes. As can be seen in these figures, the amino acid sequences of the animal insulins are very similar to the human form. Amino acids, symbolized using the one-letter code, that match exactly are displayed in the middle row, marked with horizontal lines. Those that do not match are marked with vertical bars. The sequence identity is 94% for rabbits, 89% for pigs, and 87% for cows.

[Figure: image of human insulin]
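Sequence identity figures of this kind come from counting matching positions in a pairwise alignment. The Python sketch below shows one common way to do the arithmetic. It assumes the two sequences are already aligned to the same length with `-` marking gaps, and it counts identity over all aligned columns; other conventions for the denominator exist. The sequences used are toy strings, not real insulin sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # a column of two gaps carries no information
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned sequences that differ at two of ten positions.
print(percent_identity("ACDEFGHIKL", "ACDEYGHIKV"))
```

A column in which one sequence has a gap counts as a mismatch here, so insertions and deletions lower the reported identity.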

PSI-BLAST, using iterated pattern search, is much more powerful than simple pairwise BLAST at picking up distant relationships. PSI-BLAST correctly identifies three times as many homologues as BLAST. It is therefore a better method for analyzing genomes. [BI03, http://ncbi.nlm.nih.gov/BLAST/]
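The profile at the heart of the iterated search can be pictured as a table of position-specific scores built from a multiple sequence alignment. The Python sketch below is a rough illustration under simplifying assumptions: uniform background frequencies, a fixed pseudocount, no sequence weighting and no gaps. PSI-BLAST itself uses more refined estimates. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_profile(alignment, pseudocount=1.0):
    """Turn equal-length aligned sequences into per-column log-odds scores."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(ALPHABET)  # uniform background (an assumption)
    profile = []
    for col in range(length):
        counts = Counter(row[col] for row in alignment if row[col] in ALPHABET)
        total = sum(counts.values()) + pseudocount * len(ALPHABET)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in ALPHABET
        })
    return profile


def score_window(profile, segment):
    """Sum the position-specific scores of a segment as long as the profile."""
    return sum(column.get(aa, 0.0) for column, aa in zip(profile, segment))


def best_local_match(profile, sequence):
    """Slide the profile along a sequence and return (best_score, start), ungapped."""
    width = len(profile)
    scored = [
        (score_window(profile, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored) if scored else None


toy_alignment = ["ACDE", "ACDE", "ACDQ"]  # hypothetical hits from an earlier round
profile = build_profile(toy_alignment)
print(best_local_match(profile, "WWACDEWW"))
```

Because the profile rewards residues seen at each position across all earlier hits, a sequence can score well against it even when it resembles no single earlier hit closely. This is the intuition behind the improved sensitivity for distant relationships.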

Application of Bioinformatics:
Companies in the business of developing drugs, agricultural chemicals, hybrid plants, plastics and other petroleum derivatives, and biological approaches to environmental remediation, among others, are developing bioinformatics divisions and looking to bioinformatics to provide new targets and to help replace scarce natural resources.

Conclusion:
Bioinformatics is first and foremost a component of the biological sciences. It is therefore imperative to appreciate that, to work in bioinformatics, one must understand the essentials of molecular biology on the one hand and the fundamental principles of computer and information technology on the other. Bioinformatics has generated considerable excitement as a career option. Biologists with knowledge of computers, computer enthusiasts who have refreshed their biology basics, mathematics enthusiasts with a penchant for computational wizardry, and biostatisticians and physicists who can handle computers are all being welcomed into the big new world of bioinformatics. Bioinformatics is a tool and not an end in itself.

References:

Books:

[SR04] Science Reporter, Nov 2004
[HLC-95] Holm, L. and Sander, Network tools for protein structure, 1995
[BI03] Arthur, Bioinformatics, 2003

Internet resources:

Human Genome Project and Bioinformatics (http://www.ornl.gov/TechResources/Human_Genome/research/informatics.html)
Bioinformatics journal (http://bioinformatics.oupjournals.org/)
BLAST (http://ncbi.nlm.nih.gov/BLAST/)
