Introduction
Bioinformatics is an interdisciplinary research area at the interface between computer
science and biological science. It involves the technology that uses computers for storage,
retrieval, manipulation and distribution of information related to biological
macromolecules such as DNA, RNA and proteins. Bioinformatics is limited to sequence,
structural, and functional analysis of genes and genomes and their corresponding
products and is often considered computational molecular biology. It consists of
two subfields: the development of computational tools and databases and the application
of these tools and databases in generating biological knowledge to better understand
living systems. These tools are used in three areas of genomic and molecular biological
research: molecular sequence analysis, molecular structural analysis and molecular
functional analysis. The areas of sequence analysis include sequence alignment, sequence
database searching, motif and pattern discovery, gene and promoter finding,
reconstruction of evolutionary relationships, and genome assembly and comparison.
Structural analyses include protein and nucleic acid structure analysis, comparison,
classification and prediction. Functional analysis includes gene expression profiling,
protein-protein interaction prediction, protein subcellular localization prediction,
metabolic pathway reconstruction, and simulation. The three aspects of bioinformatics
analysis are not isolated but often interact to produce integrated results. For example,
protein structure prediction depends on sequence alignment data; clustering of gene
expression profiles requires the use of phylogenetic tree construction methods derived
in sequence analysis. Sequence-based prediction is related to functional analysis of
co-expressed genes. The first major bioinformatics project was undertaken by Margaret
Dayhoff in 1965, who developed a first protein sequence database called Atlas of Protein
Sequence and Structure. Subsequently, in the early 1970s, the Brookhaven national
laboratory established the Protein Data Bank for archiving three-dimensional protein
structures. At its onset, the database stored less than a dozen protein structures, compared
to more than 30,000 structures today. The first sequence alignment algorithm was
developed by Needleman and Wunsch in 1970. This was a fundamental step in the
development of the field of bioinformatics, which paved the way for the routine sequence
comparisons and database searching practiced by modern biologists.
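To make the idea of global sequence alignment concrete, the following is a minimal sketch of Needleman-Wunsch dynamic programming. The scoring values (match +1, mismatch -1, linear gap -1) and the function name are illustrative assumptions, not parameters taken from this work.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of strings a and b; returns (score, aligned_a, aligned_b).

    The scoring scheme is an illustrative assumption (linear gap penalty).
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Calling needleman_wunsch("GATTACA", "GATACA") aligns the two toy sequences end to end, inserting a single gap in the shorter one.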
A recent advance in bioinformatics is molecular modeling, which aims at
understanding structure-function and structure-property relationships in physico-chemical
processes and pharmaceuticals, and has thus become increasingly important for finding and
designing new drugs. Computers now play an important role in new drug
discovery and drug design.
HEPATITIS:-
Hepatitis (plural hepatitides) implies injury to
liver tissue. The term derives from the ancient Greek hepar or hepato-, meaning 'liver', and
the suffix -itis, denoting 'inflammation'. The condition can be self-limiting, healing on its
own. The liver performs, among other things, screening of harmful substances, regulation of
blood composition, and production of bile to help digestion.
Causes
Acute hepatitis
Chronic hepatitis
Autoimmune: Autoimmune hepatitis
Alcohol
Drugs: Methyldopa, Nitrofurantoin, Isoniazid, Ketoconazole
Non-alcoholic steatohepatitis
Heredity: Wilson's disease, alpha 1-antitrypsin deficiency
Primary biliary cirrhosis and primary sclerosing
cholangitis occasionally mimic chronic hepatitis[4]
Viral hepatitis
A virus is a particle which is smaller than bacteria, and contains complex genetic
information called DNA or RNA. This genetic material allows the virus to infect bacteria
or living cells, set up the machinery to reproduce itself, leading to destruction of the cell
in which it resides. To date, five viruses, labeled A through E, have been identified which
appear to cause viral hepatitis. Viruses A and E can be contracted from contaminated
water or food (by mouth), while viruses B, C and D are transmitted by direct injection
into the bloodstream (through any method of injection under the skin). The term viral
hepatitis describes any one of the illnesses caused by the five viruses mentioned, and
consists of an infection of liver cells which leads to damage of the liver over days in
some cases, but over many years in others. Thirty years ago, none of the hepatitis viruses
had been identified. In the 1960s, transfusion-related viral hepatitis was extremely
common, with 30% of patients receiving blood products becoming infected. By 1970, a
blood test for the Australia antigen had been developed, which appeared to identify those
infected with one hepatitis virus, now called hepatitis B. The
investigator who discovered the Australia antigen, the protein that makes up the coat of
the virus and is now called the hepatitis B surface antigen (HBsAg), was awarded
the Nobel Prize. Our understanding of viral hepatitis has grown tremendously since the
discovery of the Australia antigen.
Currently 11 viruses are recognized as causing hepatitis. Two are
herpesviruses (cytomegalovirus [CMV] and Epstein-Barr virus [EBV]) and nine are
hepatotropic viruses.
EBV and CMV cause mild, self-resolving forms of hepatitis with no permanent
hepatic damage. Both viruses cause the typical infectious mononucleosis syndrome of fatigue,
nausea, and malaise.
Of the nine human hepatotropic viruses, only five are well characterized;
hepatitis G and TTV (transfusion-transmitted virus) are newly discovered viruses.
Hepatitis A (sometimes called infectious hepatitis) and hepatitis E (formerly called
enterically transmitted NANB hepatitis) are transmitted by fecal-oral contamination. The most
important types include hepatitis B (sometimes called serum hepatitis), hepatitis C (formerly
called non-A, non-B hepatitis), and hepatitis D (formerly called delta hepatitis).
Hepatitis A
Hepatitis E
Incubation period 30-40 days
Acute, self limiting hepatitis, no chronic carrier state
Age: predominantly young adults, 15-40 years. Fulminant hepatitis in pregnant women.
Mortality rate is high (up to 40%). Similar to hepatitis A: the virus replicates in the gut
initially, before invading the liver, and virus is shed in the stool prior to the onset of
symptoms. Viraemia is transient. A large inoculum of virus is needed to establish
infection. Little is known yet. The incidence of infection appears to be low in first world
countries.
Hepatitis C
Putative togavirus related to the flaviviruses and pestiviruses.
Thus probably enveloped. Has an ssRNA genome.
Does not grow in cell culture, but can infect chimpanzees.
Incubation period 6-8 weeks.
Causes a milder form of acute hepatitis than does hepatitis B,
but 50% of individuals develop chronic infection following exposure.
2) Hepatocellular carcinoma
Hepatitis D
Defective virus which requires hepatitis B as a helper virus in order to replicate.
Infection therefore only occurs in patients who are already infected with hepatitis B.
Increased severity of liver disease in hepatitis B carriers.
Virus particle 36 nm in diameter, encapsulated with HBsAg derived from HBV.
Delta antigen is associated with virus particles.
ssRNA genome.
Identified in intravenous drug abusers.
Hepatitis G
A virus originally cloned from the serum of a surgeon with non-A, non-B, non-C
hepatitis, has been called Hepatitis G virus. It was implicated as a cause of parenterally
transmitted hepatitis, but is no longer believed to be a major agent of liver disease. It has
been classified as a flavivirus.
Hepatitis B
Since the identification of the hepatitis B virus, several other nearly identical
viruses have been identified in Eastern woodchucks, ground squirrels and Peking
ducks. The members of this virus family, termed the 'Hepadna' viruses, have life
cycles similar to that observed in humans and can serve as animal models, allowing further study of
these unique disease-causing agents.
Family: Hepadnaviridae
Size: 42 nm. Virions (also known as "Dane particles") contain a circular dsDNA genome.
HBV Antigens
HBsAg = surface (coat) protein produced in excess as small spheres and tubules
HBcAg = inner core protein
HBeAg = secreted protein; function unknown.
Clinical Features
Incubation period 2 - 5 months
Insidious onset of symptoms. Tends to cause a more severe disease than Hepatitis A.
Asymptomatic infections occur frequently.
Pathogenesis
Infection is parenterally transmitted. The virus replicates in the liver and virus
particles, as well as excess viral surface protein, are shed in large amounts into the blood.
Viraemia is prolonged and the blood of infected individuals is highly infectious.
Complications
1) Persistent infection:
Following acute infection, approximately 5% of infected individuals fail to eliminate the
virus completely and become persistently infected.
The virus persists in the hepatocytes and on-going liver damage occurs because of the
host immune response against the infected liver cells.
Chronic Active Hepatitis - There is aggressive destruction of liver tissue and rapid
progression to cirrhosis or liver failure. Patients who become persistently infected are at
risk of developing hepatocellular carcinoma (HCC).
HBV is thought to play a role in the development of this malignancy because:
3) Fulminant Hepatitis
Rare; accounts for 1% of infections.
Epidemiology
1) Blood:
2) Sexual intercourse
become infected at between three and nine years of age.
Horizontal transmission also occurs in children's institutions and mental homes.
Diagnosis: Serology
Viral antigens:
1) Surface antigen (HBsAg) is secreted in excess into the blood as 22 nm spheres and
tubules. Its presence in serum indicates that virus replication is occurring in the liver
2) 'e' antigen (HBeAg) secreted protein is shed in small amounts into the blood. Its
presence in serum indicates that a high level of viral replication is occurring in the liver
3) core antigen (HBcAg) core protein is not found in blood
Antibody response:
1) Surface antibody (anti-HBs) becomes detectable late in convalescence, and indicates
immunity following infection. It remains detectable for life and is not found in chronic
carriers (see below).
2) e antibody (anti-HBe) becomes detectable as viral replication falls. It indicates low
infectivity in a carrier.
3) Core IgM rises early in infection and indicates recent infection
4) Core IgG rises soon after IgM, and remains present for life in both chronic carriers and
those who clear the infection. Its presence indicates exposure to HBV.
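The marker-by-marker interpretations listed above can be collected into a small lookup, shown here as a minimal sketch. The function name and argument names are hypothetical, the returned phrases simply restate the text, and the sketch is an illustration of the reasoning rather than a diagnostic tool.

```python
def interpret_hbv_serology(hbsag=False, hbeag=False, anti_hbs=False,
                           anti_hbe=False, core_igm=False, core_igg=False):
    """Return the textbook meaning of each positive HBV serological marker.

    Each flag is True when the marker is detected in serum. The wording of
    the interpretations follows the list in the text; nothing else is inferred.
    """
    meanings = [
        (hbsag, "HBsAg: virus replication is occurring in the liver"),
        (hbeag, "HBeAg: a high level of viral replication is occurring"),
        (anti_hbs, "anti-HBs: immunity following infection"),
        (anti_hbe, "anti-HBe: low infectivity in a carrier"),
        (core_igm, "core IgM: recent infection"),
        (core_igg, "core IgG: exposure to HBV"),
    ]
    # Keep only the statements whose marker was detected.
    return [text for detected, text in meanings if detected]
```

For example, a serum sample positive only for anti-HBs and core IgG would yield the two statements on immunity and past exposure.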
Fig. Hepatitis B virus in serum.
Prevention
1) Active Immunization
Both vaccines are equally safe and effective. The administration of three doses induces
protective levels of antibodies in 95% of vaccine recipients.
Universal immunization of infants was introduced in April 1995. Infants receive 3 doses
at 6, 10 and 14 weeks of age.
2) Passive Antibody
Hepatitis B immune globulin should be administered to non-immune individuals
following single-episode exposure to HBV-infected blood, for example after needlestick
injuries.
When most individuals become infected with the hepatitis B virus, they are not aware of
the infection for several weeks, until they develop symptoms of acute hepatitis, such as
nausea, fatigue and jaundice (yellowing of the eyes). The acute hepatitis phase may last
for several weeks and occasionally leads to hospitalization, but acute hepatitis B resolves
completely in 95% of those infected.
Others who do not develop significant symptoms following exposure
may not be aware of the infection. These individuals may also overcome the infection
completely and develop immunity, but frequently become chronic carriers.
The outcome of hepatitis B infection depends to a great extent on
the status of the person's immune system at the time of exposure. Most chronic carriers or
those with chronic hepatitis B are not aware of their on-going infection, although some
have persistent fatigue.
Molecular virology
Fig. hepatitis B virus genome
organization, with four overlapping reading frames running in one direction and no
noncoding regions. The minus strand is unit length and has a protein covalently attached
to the 5' end. The other strand, the plus strand, is variable in length but shorter than unit
length, and has an RNA oligonucleotide at its 5' end. Thus neither DNA strand is closed,
and circularity is maintained by cohesive ends (Strauss, 2002). The four overlapping open
reading frames (ORFs) in the genome are responsible for the transcription and expression
of seven different hepatitis B proteins. The transcription and translation of these proteins
is achieved through the use of multiple in-frame start codons. The HBV genome also contains
elements that regulate transcription, determine the site of polyadenylation, and mark a specific
transcript for encapsidation into the nucleocapsid.
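The idea that one open reading frame can yield several proteins through multiple in-frame start codons can be sketched as follows. The function name and the toy sequence used in the usage note are illustrative assumptions; the sequence is not taken from the HBV genome.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def in_frame_products(dna, frame=0):
    """List (start, end) spans for every ATG-to-stop stretch in one reading frame.

    Spans that share the same end coordinate come from in-frame start codons
    of a single ORF, so they encode nested proteins with a common C-terminus.
    Coordinates are 0-based, and end is exclusive (it includes the stop codon).
    """
    dna = dna.upper()
    open_starts = []   # ATG positions seen since the last stop codon
    products = []
    for pos in range(frame, len(dna) - 2, 3):
        codon = dna[pos:pos + 3]
        if codon == "ATG":
            open_starts.append(pos)
        elif codon in STOP_CODONS:
            products.extend((start, pos + 3) for start in open_starts)
            open_starts = []
    return products
```

On the toy sequence "ATGAAAATGCCCATGGGGTAA", the three in-frame ATG codons give three nested products ending at the same stop codon, which mirrors how a single ORF can encode large, middle and small variants of a protein.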
Life cycle
In order to reproduce, the hepatitis B virus must first attach to a cell that is capable of
supporting its replication. Although hepatocytes are known to be the most effective cell type for
replicating HBV, other types of cells in the human body have been found to support
replication to a lesser degree.
The initial steps following HBV entry are not clearly defined, although it is
known that the virion initially attaches to a susceptible hepatocyte through recognition of a cell
surface receptor that has yet to be identified (Garces, HBVP). The DNA then enters the
nucleus, where it is known to form a covalently closed circular form called cccDNA.
At early times after infection, the DNA is recirculated to the nucleus,
where the process is repeated, resulting in the accumulation of 10 to 30 molecules of cccDNA
and an increase in viral mRNA concentrations (Flint et al., 765).
Fig. HBV life cycle
The hepatitis B virion, also known as the Dane particle, is the one infectious particle
found within the body of an infected patient. This virion has a diameter of 42 nm and its
outer envelope contains a high quantity of hepatitis B surface proteins. The envelope
surrounds the inner nucleocapsid, which is made up of 180 hepatitis B core proteins
in an icosahedral arrangement. The nucleocapsid also contains at least one
hepatitis B polymerase protein (P) along with the HBV genome.
In infected people, virions actually compose a small minority of HBV-derived particles.
Large numbers of smaller subviral particles are also present, and these usually outnumber the
virions in a ratio of 100:1. The two subviral particles, the hepatitis B filament and the
hepatitis B sphere, are often referred to as a group named surface antigen particles. The
sphere contains both middle and small surface proteins, whereas the filament also
includes the large hepatitis B surface protein.
The absence of the hepatitis B core, polymerase, and genome makes these particles
non-infectious. High levels of these non-infectious particles can be found
during the acute phase of the infection. Since the non-infectious particles present the
same sites as the virion, they induce a significant immune response and are thought to be
disadvantageous for the virus. However, it is also believed that the presence of high
levels of non-infectious particles may allow the infectious viral particles to travel
undetected by antibodies through the bloodstream (Garces, HBVP).
Hepatitis B Antigens:
There are three different types of hepatitis B antigens encoded by the HBV genome:
Hepatitis B Surface Antigen (HBsAg)- There are three different types of hepatitis B
surface antigens: small hepatitis B surface antigen (HBsAg or SHBsAg), middle hepatitis
B surface antigen (MHBsAg), and large hepatitis B surface antigen (LHBsAg). HBsAg
is the smallest of the hepatitis B surface proteins and has historically been known
as the Australia antigen (Au antigen). It is very hydrophobic, containing four
transmembrane-spanning regions. This protein is the prime constituent of all hepatitis B
particle forms and appears to be manufactured by the virus in high quantities. It also
contains a highly antigenic epitope which may be responsible for triggering the immune
response. Despite the high antigenicity and prevalence of these particles, the
immune system appears largely oblivious to their presence.
Hepatitis B Core Antigen (HBcAg)- The only HBV antigen that cannot be detected
directly by blood test; this antigen can only be isolated by analyzing an infected
hepatocyte. It is a 185 amino acid protein expressed in the cytoplasm of infected cells and is
closely associated with nucleocapsid assembly (Strauss, 2002).
Hepatitis B e Antigen (HBeAg)- The e antigen is named for its "early" appearance
during an acute HBV infection. Thought to be located in the core structure of the virus
particle, this antigen can be detected by blood test. If found, it is usually indicative of
complete virus particles in circulation (Strauss, 2002).
REVIEW OF LITERATURE
Approximately 5% of the world population is infected by the hepatitis B virus (HBV) that
causes a necroinflammatory liver disease of variable duration and severity. Chronically
infected patients with active liver disease carry a high risk of developing cirrhosis and
hepatocellular carcinoma.
the development of DNA damage that can cause hepatocellular carcinoma. Elucidation of
the immunological and virological basis for
Antibody
1) Surface antibody (anti-HBs) becomes detectable late in convalescence, and indicates
immunity following infection. It remains detectable for life and is not found in chronic
carriers .
2) e antibody (anti-HBe) becomes detectable as viral replication falls. It indicates low
infectivity in a carrier.
3) Core IgM rises early in infection and indicates recent infection
4) Core IgG rises soon after IgM, and remains present for life in both chronic carriers and
those who clear the infection. Its presence indicates exposure to HBV. [4]
The quality of the alignment of the query to the template sequence is a major factor in
determining the quality of homology models. This is one of the sources of the 30% rule,
because alignment quality usually decreases dramatically below about 30% sequence
identity. (A structural explanation for this observation has been offered by Chung and
Subbiah, 1996). Advances in the accuracy of sequence alignments using structure-based
profile methods such as those described above should result in continuing improvements
in the quality of homology models. [5,6]
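As a rough illustration of the 30% rule, the sketch below computes percent sequence identity from a pairwise alignment. Identity is counted here over columns where neither sequence has a gap, which is one of several conventions in use; the function name and the example strings are assumptions made for illustration.

```python
def percent_identity(aligned_query, aligned_template):
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs are rows of the same alignment, with '-' marking gaps.
    Other conventions (e.g. dividing by the shorter sequence length) exist.
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(q, t) for q, t in zip(aligned_query, aligned_template)
             if q != "-" and t != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for q, t in pairs if q == t)
    return 100.0 * matches / len(pairs)

def above_30_percent_rule(aligned_query, aligned_template, threshold=30.0):
    """True when identity reaches the (approximate) 30% rule of thumb."""
    return percent_identity(aligned_query, aligned_template) >= threshold
```

For the toy rows "ACDE-FG" and "ACDQWFG", five of six ungapped columns match, so the pair sits well above the threshold.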
With the number of protein-ligand complexes available in the Protein Data Bank
constantly growing, structure-based approaches to drug design and screening have
become increasingly important. Alongside this explosion of structural information, a
number of molecular docking methods have been developed over the last years with the
aim of maximally exploiting all available structural and chemical information that can be
derived from proteins, from ligands, and from protein-ligand complexes. In this respect,
the term 'guided docking' is introduced to refer to docking approaches that incorporate
some degree of chemical information to actively guide the orientation of the ligand into
the binding site. To reflect the focus on the use of chemical information, a classification
scheme for guided docking approaches is proposed. In general terms, guided docking
approaches can be divided into indirect and direct approaches. Indirect approaches
incorporate chemical information implicitly, having an effect on scoring but not on
orienting the ligand during sampling. In contrast, direct approaches incorporate chemical
information explicitly, thus actively guiding the
orientation of the ligand during sampling. Direct approaches can be further divided into
protein-based, mapping-based, and ligand-based approaches to reflect the source used to
derive the features capturing the chemical information inside the protein cavity. Within
each category, a representative list of docking approaches is discussed. In view of the
limitations of current scoring functions, it was generally found that making optimal use of
chemical information represents an efficient knowledge-based strategy for improving
binding affinity estimations, ligand binding-mode predictions, and virtual screening
enrichments obtained from protein-ligand docking. [7]
This review gives an introduction into ligand - receptor docking and illustrates the basic
underlying concepts. An overview of different approaches and algorithms is provided.
Although the application of docking and scoring has led to some remarkable successes,
there are still some major challenges ahead, which are outlined here as well. Approaches
to address some of these challenges and the latest developments in the area are presented.
Some aspects of the assessment of docking program performance are discussed. A
number of successful applications of structure-based virtual screening are described. [8]
Material and methods
1. NCBI-
Established in 1988 as a national resource for molecular biology information, NCBI
creates public databases, conducts research in computational biology, develops
software tools for analyzing genome data, and disseminates biomedical information -
all for the better understanding of molecular processes affecting human health and
disease.
2. Swiss-Prot-
A curated protein sequence database which strives to provide a
high level of annotation (such as the description of the function of a
protein, its domain structure, post-translational modifications,
variants, etc.), a minimal level of redundancy and a high level of
integration with other databases.
3. FASTA
FASTA is a DNA and Protein sequence alignment software package first described (as
FASTP) by David J. Lipman and William R. Pearson in 1985 in the article Rapid and
sensitive protein similarity searches. The original FASTP program was designed for
protein sequence similarity searching. FASTA, described in 1988 (Improved Tools for
Biological Sequence Comparison) added the ability to do DNA:DNA searches, translated
protein:DNA searches, and also provided a more sophisticated shuffling program for
evaluating statistical significance. There are several programs in this package that allow
the alignment of protein sequences and DNA sequences. FASTA is pronounced "FAST-Aye"
and stands for "FAST-All", because it works with any alphabet; it is an extension of
"FAST-P" (protein) and "FAST-N" (nucleotide) alignment.
In addition to rapid heuristic search methods, the FASTA package provides SSEARCH,
an implementation of the optimal Smith-Waterman algorithm. A major focus of the
package is the calculation of accurate similarity statistics, so that biologists can judge
whether an alignment is likely to have occurred by chance, or whether it can be used to
infer homology. The FASTA package is available from fasta.bioch.virginia.edu.
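The Smith-Waterman algorithm mentioned above finds the best-scoring local alignment. The following is a minimal sketch that returns only the optimal local score; it uses a simple linear gap penalty and illustrative scoring values, and is not the SSEARCH implementation.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between strings a and b.

    Scores are clamped at zero so an alignment may start anywhere.
    The scoring scheme (linear gaps, fixed match/mismatch values) is an
    illustrative assumption.
    """
    best = 0
    prev = [0] * (len(b) + 1)          # previous row of the DP matrix
    for i in range(1, len(a) + 1):
        cur = [0]                      # first column is always zero
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(0,
                       prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                       prev[j] + gap,       # gap in b
                       cur[j - 1] + gap)    # gap in a
            cur.append(cell)
            best = max(best, cell)
        prev = cur
    return best
```

Unlike a global alignment, the score here depends only on the best-matching region, so unrelated flanking residues do not lower it.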
4. BLAST
• extinction coefficient
• half-life
• instability index
• aliphatic index
Using SOPMA for secondary structure analysis
Recently a new method called the self-optimized prediction method (SOPM) has been
described to improve the success rate in the prediction of the secondary structure of
proteins. In this paper we report improvements brought about by predicting all the
sequences of a set of aligned proteins belonging to the same family. This improved SOPM
method (SOPMA) correctly predicts 69.5% of amino acids for a three-state description of
the secondary structure (α-helix, β-sheet and coil) in a whole database containing 126
chains of non-homologous (less than 25% identity) proteins. Joint prediction with
SOPMA and a neural network method (PHD) correctly predicts 82.2% of residues for
74% of co-predicted amino acids. Predictions are available by email to deleage@ibcp.fr
or on a Web page (http://www.ibcp.fr/predict.html).
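The percentages quoted above are per-residue accuracies over three states. A minimal sketch of that measure, assuming the common one-letter coding H (helix), E (sheet) and C (coil), is given below; the function name and the example strings are hypothetical.

```python
def three_state_accuracy(predicted, observed):
    """Percentage of residues whose predicted state equals the observed state.

    Both arguments are equal-length strings over the assumed alphabet
    H (helix), E (sheet), C (coil), one character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(1 for p, o in zip(predicted, observed) if p == o)
    return 100.0 * correct / len(observed)
```

For instance, a prediction "HHHEECCC" against an observed "HHHEEECC" agrees at seven of eight positions.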
PROTOCOL FOLLOWED
Obtained the receptor (target protein) for the HBV strain from literature references,
journals available online and PubMed literature.
Retrieved the FASTA sequence of the protein HBeAg from the Swiss-Prot
database.
Retrieved the PDB ID of the template structure using a BLAST similarity
search: PDB ID 2B8N.
Loaded the target sequence in PDB format into SWISS-MODEL as a raw sequence and
modeled the receptor.
Verified the model using different parameters, such as the Ramachandran plot and other
checks available in SAVS.
Selected the best ligand for HBV disease from the KEGG database.
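The sequence retrieval step in the protocol above yields a FASTA-format text record. As a minimal sketch of how such a record can be read programmatically, the parser below uses only the Python standard library. The header and residues in the usage note are placeholders invented for illustration and are not the HBeAg entry.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}.

    A header line starts with '>'; all following lines up to the next
    header are joined into one sequence string.
    """
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records
```

Given the placeholder text ">example_entry hypothetical protein" followed by the lines "MKTAYIAK" and "QRQISFVK", the parser returns a single record whose sequence is the two lines joined together.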
In protein struct
modeling, is a cla
from its amino ac
(29)
conserved, which may in turn lead to experiments to test those hypotheses. For example,
the spatial arrangement of conserved residues may suggest whether a particular residue is
conserved to stabilize the folding, to participate in binding some small molecule, or to
foster association with another protein or nucleic acid.
Figure: First, the known template 3D structures are aligned with the target sequence to be
modelled. Second, spatial features, such as Cα-Cα distances, hydrogen bonds, and main chain and
side chain dihedral angles, are transferred from the templates to the target. Thus, a number of
spatial restraints on its structure are obtained. Third, the 3D model is obtained by satisfying all the
restraints as well as possible.
Homology modeling can produce high-quality structural models when the target and
template are closely related, which has inspired the formation of a structural genomics
consortium dedicated to the production of representative experimental structures for all
classes of protein folds. The chief inaccuracies in homology modeling, which worsen
with lower sequence identity, derive from errors in the initial sequence alignment and
from improper template selection. Like other methods of structure prediction, current
practice in homology modeling is assessed in a biennial large-scale experiment known as
the Critical Assessment of Techniques for Protein Structure Prediction, or CASP.
The critical first step in homology modeling is the identification of the best template
structure, if indeed any are available. The simplest method of template identification
relies on serial pairwise sequence alignments aided by database search techniques such as
FASTA and BLAST. More sensitive methods based on multiple sequence alignment, of
which PSI-BLAST is the most common example, iteratively update their position-specific
scoring matrix to successively identify more distantly related homologs. This
family of methods has been shown to produce a larger number of potential templates and
to identify better templates for sequences that have only distant relationships to any
solved structure. Protein threading, also known as fold recognition or 3D-1D alignment,
can also be used as a search technique for identifying templates to be used in traditional
homology modeling methods. When performing a BLAST search, a reliable first
approach is to identify hits with a sufficiently low E-value, which are considered
sufficiently close in evolution to make a reliable homology model. Other factors may tip
the balance in marginal cases; for example, the template may have a function similar to
that of the query sequence, or it may belong to a homologous operon. However, a
template with a poor E-value should generally not be chosen, even if it is the only one
available, since it may well have a wrong structure, leading to the production of a
misguided model. A better approach is to submit the primary sequence to fold-recognition
servers or, better still, consensus meta-servers, which improve upon individual
fold-recognition servers by identifying similarities (consensus) among independent
predictions.
Often several candidate template structures are identified by these approaches. Although
some methods can generate hybrid models from multiple templates, most methods rely
on a single template. Therefore, choosing the best template from among the candidates is
a key step, and can affect the final accuracy of the structure significantly. This choice is
guided by several factors, such as the similarity of the query and template sequences, of
their functions, and of the predicted query and observed template secondary structures.
Perhaps most important are the coverage of the aligned regions, that is, the fraction of the query
sequence structure that can be predicted from the template, and the plausibility of the
resulting model. Thus, sometimes several homology models are produced for a single
query sequence, with the most likely candidate chosen only in the final step.
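The template selection logic described above (discard hits with a poor E-value, then weigh coverage and similarity) can be sketched as a simple filter and ranking. The E-value cutoff of 1e-5, the ranking order, the Hit fields and the template names in the usage note are all assumptions chosen for illustration, not values prescribed by this work.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Hit:
    template_id: str   # hypothetical identifier of a candidate template
    e_value: float     # E-value reported by the database search
    identity: float    # percent sequence identity to the query
    coverage: float    # fraction of the query covered by the alignment (0..1)

def choose_template(hits: List[Hit], max_e_value: float = 1e-5) -> Optional[Hit]:
    """Pick one template: filter by E-value, then prefer coverage, then identity.

    Returns None when no hit passes the E-value cutoff, reflecting the advice
    that a template with a poor E-value should generally not be used.
    """
    candidates = [h for h in hits if h.e_value <= max_e_value]
    if not candidates:
        return None
    return max(candidates, key=lambda h: (h.coverage, h.identity))
```

With hypothetical hits tmplA (E-value 1e-40, 45% identity, coverage 0.60), tmplB (1e-30, 40%, 0.95) and tmplC (0.5, 90%, 1.00), the sketch rejects tmplC for its poor E-value and returns tmplB for its higher coverage. In practice the final choice would also consider function and predicted secondary structure, as the text notes.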
It is possible to use the sequence alignment generated by the database search technique as
the basis for the subsequent model production; however, more sophisticated approaches
have also been explored.
7. Molecular Docking
Introduction to Docking
Docking studies are molecular modeling studies aiming at finding a proper fit between a
ligand and its binding site.
There are two classes of protein docking:
1) Protein-protein docking
2) Protein receptor-ligand docking
binding and ease movement. Conformational changes are limited by steric constraint and
thus are said to be rigid.
Protein receptor-ligand motifs fit together tightly, and are often referred to as a lock and
key mechanism. There is both high specificity and induced fit within these interfaces with
specificity increasing with rigidity. Protein receptor-ligand can either have a rigid ligand
and a flexible receptor, or a flexible ligand with a rigid receptor.
The native structure of the rigid ligand-flexible receptor complex often maximizes the interface
area between the molecules. They move with respect to one another in a direction
perpendicular to the interface. This allows a receptor to bind a larger
than usual ligand. Normally, when there is ligand overlap in the docking interface, energy
penalties are incurred. If the van der Waals forces can be decreased, energy loss in the system
will be minimized. This can be accomplished by allowing flexibility in the receptor.
Flexible receptors allow for docking of a larger ligand than would be possible with
a rigid receptor.
When the fit between the ligand and receptor does not need to be induced, the receptor
can retain its rigidity while maintaining the free energy of the system. For successful
docking, the parameters of the ligand need to be maintained and the ligand must be
slightly smaller than the receptor interface. No docking is completely rigid,
though; there is intrinsic movement which allows for small conformational adaptation for
ligand binding. When the six degrees of freedom for protein movement are taken into
consideration (three rotational, three translational), the amount of inherent flexibility
allowed the receptor is even greater. This further offsets any energy penalty between the
receptor and ligand, allowing for easier, more energetically favorable binding between the
two.
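The six rigid-body degrees of freedom mentioned above (three rotations, three translations) can be illustrated with a short coordinate transformation. This is a minimal sketch in plain Python; the function name, the x-then-y-then-z rotation order and the use of radians are assumptions made for the example, and docking programs may parameterize orientation differently.

```python
import math

def rigid_transform(points, rx, ry, rz, tx, ty, tz):
    """Apply a rigid-body move to a list of (x, y, z) coordinates.

    The points are rotated about the x, y and z axes (in that order, angles
    in radians) and then translated by (tx, ty, tz). Inter-atomic distances
    are unchanged, which is what makes the move 'rigid'.
    """
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    moved = []
    for x, y, z in points:
        y, z = y * cx - z * sx, y * sx + z * cx      # rotation about x
        x, z = x * cy + z * sy, -x * sy + z * cy     # rotation about y
        x, y = x * cz - y * sz, x * sz + y * cz      # rotation about z
        moved.append((x + tx, y + ty, z + tz))       # translation
    return moved
```

A rigid docking search samples these six parameters for the ligand; flexible docking adds internal torsions on top of them.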
Aim of docking
The aim of docking is to find new drug targets, which will open new vistas for further
drug development. The findings of our docking will be useful in finding a cure for the
infectious disease bird flu, and will also open new avenues for finding other possible drug
targets in influenza A virus. The docking results can be used to design new lead
compounds and hence can aid in the new drug discovery process.
Receptor
A residue on the surface of the cell that serves as a recognition or binding site for
antigens, antibodies or other cellular or immunological components. It is a molecule within
a cell surface to which a substance (such as a hormone or a drug) selectively binds, causing
a change in the activity of the cell.
Ligand
The molecule which binds to a protein molecule (e.g., a receptor). As a ligand binds through
the interaction of many weak, noncovalent bonds formed to the binding site of a protein,
the tight binding of a ligand depends upon a precise fit to the surface-exposed amino acid
residues on the protein.
Active Site
The active site of a protein/enzyme is the region that binds the substrates (and the
cofactor, if any). It also contains the residues that directly participate in the making and
breaking of bonds. These residues are called the catalytic groups. In essence, the
interaction of the enzyme and substrate at the active site promotes the formation of the
transition state. The active site is the region of the enzyme that most directly lowers the
activation free energy (ΔG‡) of the reaction, which results in the rate enhancement
characteristic of enzyme action.
Amino acids in protein active sites:
RAMACHANDRAN PLOT
A Ramachandran plot (also known as a Ramachandran map or a Ramachandran diagram),
developed by Gopalasamudram Narayana Ramachandran, is a way to visualize the dihedral
angles phi (φ) against psi (ψ) of amino acid residues in a protein structure. It shows the
possible conformations of the φ and ψ angles for a polypeptide. In a polypeptide, the main chain
N-Cα and Cα-C bonds are relatively free to rotate; φ is the torsion angle about the N-Cα
bond and ψ is the torsion angle about the Cα-C bond. Ramachandran used computer models of
small polypeptides to systematically vary φ and ψ with the objective of finding stable
conformations. For each conformation, the structure was examined for close contacts between
atoms. Atoms were treated as hard spheres with dimensions corresponding to their van der
Waals radii. The angles that cause spheres to collide correspond to sterically disallowed
conformations of the polypeptide backbone.
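As an illustration of how a backbone torsion angle such as φ or ψ can be obtained from atomic coordinates, the following Python sketch computes the dihedral angle defined by four points. The function name `dihedral` and the example coordinates are hypothetical and are not taken from any of the tools used in this work. For φ the four atoms would be C(i-1), N(i), Cα(i), C(i); for ψ they would be N(i), Cα(i), C(i), N(i+1).

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Return the torsion angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond.
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates: the central bond lies along the z axis.
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(round(angle, 1))
```

Plotting the (φ, ψ) pair of every residue obtained this way gives the scatter of points that a Ramachandran plot displays.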
SAVS is a server for analyzing protein structures for validity and assessing how correct
they are. Depending on how many programs one selects, the server can take several
minutes to run. The run time also depends on how many residues there are in the protein
that is submitted.
PROCHECK
The aim of PROCHECK is to assess how normal, or conversely how unusual, the
geometry of the residues in a given protein structure is, as compared with stereochemical
parameters derived from well-refined, high-resolution structures. The checks also make
use of ‘ideal’ bond lengths and bond angles, as derived from a recent and comprehensive
analysis of small molecule structures in the Cambridge Structural Database (CSD).
INPUT
The input to PROCHECK is a single file containing the coordinates of the protein
structure. One of the by-products of running PROCHECK is that the coordinate file will be
“cleaned up” by the first of the programs. The cleaning-up process corrects any
mislabelled atoms and creates a new coordinate file which has a file extension of
.new. The .new file will have the atoms labelled in accordance with the IUPAC naming
convention.
OUTPUT
The output comprises a number of plots, together with a detailed residue-by-residue listing. It
generates a number of output files in the default directory which have the same name as the
original PDB file, but with different extensions.
The residue-by-residue listing has a .out extension and lists all the computed
stereochemical properties, by residue, in a printable ASCII text file.
ENERGY MINIMIZATION
Energy is a function of the degrees of freedom in a molecule (i.e. bonds, angles, and
dihedrals). Energy minimization can repair distorted geometries by moving atoms to release
internal constraints. Energy minimization is good at releasing local constraints for a
residue, but it will not pass through high energy barriers and will stop in a local minimum.
The potential energy calculated by summing the energies of various interactions is a
numerical value for a single conformation. This number can be used to evaluate a
particular conformation, but it may not be a useful measure of a conformation because it
can be dominated by a few bad interactions. For instance, a large molecule with an
excellent conformation for nearly all atoms can have a large overall energy because of a
single bad interaction, for instance two atoms too near each other in space and having a
huge van der Waals repulsion energy. It is often preferable to carry out energy minimization
on a conformation to find the best nearby conformation. Energy minimization is usually
performed by gradient optimization: atoms are moved so as to reduce the net forces on
them. The minimized structure has small forces on each atom and therefore serves as an
excellent starting point for molecular dynamics simulations.
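The idea of gradient optimization can be sketched on the simplest possible case: two atoms interacting through a Lennard-Jones (van der Waals) potential. This is only a toy model under stated assumptions, not the force field or minimizer used by any program in this work; the function names, the reduced units (ε = σ = 1), the step size and the cap on the displacement per step are all illustrative choices.

```python
def lj_energy(r, eps=1.0, sigma=1.0):
    """Lennard-Jones energy of two atoms separated by distance r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)

def lj_gradient(r, eps=1.0, sigma=1.0):
    """Derivative dE/dr; the force on the pair is its negative."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (-12.0 * sr6 * sr6 + 6.0 * sr6) / r

def minimize(r, step=0.01, max_move=0.05, tol=1e-8, max_iter=10000):
    """Steepest descent: move downhill along the gradient until the force vanishes."""
    for _ in range(max_iter):
        g = lj_gradient(r)
        if abs(g) < tol:
            break
        # Cap the displacement so a huge repulsive force cannot throw the atoms apart.
        move = max(-max_move, min(max_move, -step * g))
        r += move
    return r

# Start from a "bad contact": the two atoms are too close together.
r_start = 0.9
r_min = minimize(r_start)
print(lj_energy(r_start), lj_energy(r_min), r_min)
```

Starting from the clashing distance, the energy drops from a large positive value to the bottom of the nearest well, at r = 2^(1/6) σ for this potential. The minimizer only ever moves downhill, which is why it ends in the nearest local minimum rather than crossing energy barriers.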
Result and discussion
Entry Information
Entry name GLCTK_HUMAN
Primary accession number Q8IVS8
Blat result:-
pdb 2QIJ-C Chain C, Hepatitis B Capsid Protein With An N-Termina... 197 2e-51
pdb 2G33-C CAPSD_HBVD1 Chain C,Human T4 Capsid, Strain Ad... 192 6e-50
pdb 1AW9-A Chain A, Structure Of Glutathione S-Transferase Iii I 27 6.0
By ProtParam:
GLCTK_HUMAN (Q8IVS8)
Molecular weight: 55252.6
Atomic composition:
Carbon C 2435
Hydrogen H 3967
Nitrogen N 711
(41)
Oxygen O 719
Sulfur S 17
Formula: C2435H3967N711O719S17
Total number of atoms: 7849
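The ProtParam figures above can be cross-checked with a few lines of Python. The sketch below parses the reported formula, sums the atom counts, and estimates the molecular weight from standard average atomic weights. The atomic weight values are assumptions taken from common reference tables, not from ProtParam itself, so only approximate agreement with the reported molecular weight should be expected.

```python
import re

formula = "C2435H3967N711O719S17"

# Average atomic weights (assumed standard values, in g/mol).
atomic_weight = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

# Split the formula into (element, count) pairs.
counts = {el: int(n) for el, n in re.findall(r"([A-Z][a-z]?)(\d+)", formula)}

total_atoms = sum(counts.values())
approx_mw = sum(atomic_weight[el] * n for el, n in counts.items())

print(counts)
print(total_atoms)
print(round(approx_mw, 1))
```

The atom total reproduces the value listed by ProtParam, and the estimated molecular weight falls within about one mass unit of the reported 55252.6.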
Extinction coefficients:
        10        20        30        40        50        60        70
         |         |         |         |         |         |         |
MAAALQVLPRLARAPLHPLLWRGSVARLASSMALAEQARQLFESAVGAVLPGPMLHRALSLDPGGRQLKV
hhhhhhhhhhhccccccceeetcchhhhhhhhhhhhhhhhhhhhhhhhccc​thhhhhhhhhcttcceeee
RDRNFQLRQNLYLVGFGKAVLGMAAAAEELLGQHLVQGVISVPKGIRAAMERAGKQEMLLKPHSRVQVFE
ccccccccceeeeeeccchhhhhhhhhhhhhhhhcctteeeecccccccccht​tchheeeccccceeeee
GAEDNLPDRDALRAALAIQQLAEGLTADDLLLVLISGGGSALLPAPIPPVTLEEKQTLTRLLAARGATIQ
eccccccccchhhhhhhhhhhhhhccttceeeeeetttcceeeeccccccchhhhhhhhhhhhhttcchh
ELNTIRKALSQLKGGGLAQAAYPAQVVSLILSDVVGDPVEVIASGPTVASSHNVQDCLHILNRYGLRAAL
hhhhhhhhhhhhttcchhhhccchhheeeeeeccttccceeeecccccccccchhhhhhhhhhhtccccc
PRSVKTVLSRADSDPHGPHTCGHVLNVIIGSNVLALAEAQRQAEALGYQAVVLSAAMQGDVKSMAQFYGL
chhhhhhhhhtcccccccccchhhhheeehcchhhhhhhhhhhhhttcceeeeehhhhtchhhhhhhhhh
LAHVARTRLTPSMAGASVEEDAQLHELAAELQIPDLQLEEALETMAWGRGPVCLLAGGEPTVQLQGSGRG
hhhhhhcttcccccccchhhhhhhhhhhhhhccchhhhhhhhhhhhccccceeeeettcceeeeeccccc
GRNQELALRVGAELRRWPLGPIDVLFLSGGTDGQDGPTEAAGAWVTPELASQAAAEGLDIATFLAHNDSH
ccchhhhhhhhhhhttccccccceeeeeccccccccccchhhheecthhhhhhhhttcchhhhhhccccc
TFFCCLQGGAHLLHTGMTGTNVMDTHLLFLRPR
hhhhhhhttcheeeecccccccchheeeeecct
(42)
Sequence length : 523
SOPMA :
Alpha helix (Hh) : 235 is 44.93%
310 helix (Gg) : 0 is 0.00%
Pi helix (Ii) : 0 is 0.00%
Beta bridge (Bb) : 0 is 0.00%
Extended strand (Ee) : 80 is 15.30%
Beta turn (Tt) : 36 is 6.88%
Bend region (Ss) : 0 is 0.00%
Random coil (Cc) : 172 is 32.89%
Ambigous states (?) : 0 is 0.00%
Other states : 0 is 0.00%
(43)
Parameters :
Window width : 17
Similarity threshold : 8
Number of states : 4
ClustalW2 Results
1. Number of sequences 10
4. Sequence type aa
8. Your input file clustalw2-20080510-09552541.input
Scores Table
(45)
SeqA  Name                 Len(aa)  SeqB  Name                 Len(aa)  Score
2     Q64896|HBEAG_ASHV    217      8     P17099|HBEAG_HBVA4   214      65
2     Q64896|HBEAG_ASHV    217      9     Q81105|HBEAG_HBVA5   214      65
2     Q64896|HBEAG_ASHV    217      10    Q91C37|HBEAG_HBVA6   214      65
3     P03154|HBEAG_DHBV1   305      4     P0C6J9|HBEAG_DHBV3   305      97
3     P03154|HBEAG_DHBV1   305      5     P03153|HBEAG_GSHV    217      24
3     P03154|HBEAG_DHBV1   305      6     P0C692|HBEAG_HBVA2   214      26
3     P03154|HBEAG_DHBV1   305      7     P0C625|HBEAG_HBVA3   214      27
3     P03154|HBEAG_DHBV1   305      8     P17099|HBEAG_HBVA4   214      25
3     P03154|HBEAG_DHBV1   305      9     Q81105|HBEAG_HBVA5   214      26
3     P03154|HBEAG_DHBV1   305      10    Q91C37|HBEAG_HBVA6   214      26
4     P0C6J9|HBEAG_DHBV3   305      5     P03153|HBEAG_GSHV    217      25
4     P0C6J9|HBEAG_DHBV3   305      6     P0C692|HBEAG_HBVA2   214      26
4     P0C6J9|HBEAG_DHBV3   305      7     P0C625|HBEAG_HBVA3   214      27
4     P0C6J9|HBEAG_DHBV3   305      8     P17099|HBEAG_HBVA4   214      25
4     P0C6J9|HBEAG_DHBV3   305      9     Q81105|HBEAG_HBVA5   214      26
4     P0C6J9|HBEAG_DHBV3   305      10    Q91C37|HBEAG_HBVA6   214      26
5     P03153|HBEAG_GSHV    217      6     P0C692|HBEAG_HBVA2   214      70
5     P03153|HBEAG_GSHV    217      7     P0C625|HBEAG_HBVA3   214      69
5     P03153|HBEAG_GSHV    217      8     P17099|HBEAG_HBVA4   214      69
5     P03153|HBEAG_GSHV    217      9     Q81105|HBEAG_HBVA5   214      69
5     P03153|HBEAG_GSHV    217      10    Q91C37|HBEAG_HBVA6   214      69
6     P0C692|HBEAG_HBVA2   214      7     P0C625|HBEAG_HBVA3   214      98
6     P0C692|HBEAG_HBVA2   214      8     P17099|HBEAG_HBVA4   214      98
6     P0C692|HBEAG_HBVA2   214      9     Q81105|HBEAG_HBVA5   214      98
6     P0C692|HBEAG_HBVA2   214      10    Q91C37|HBEAG_HBVA6   214      98
7     P0C625|HBEAG_HBVA3   214      8     P17099|HBEAG_HBVA4   214      98
7     P0C625|HBEAG_HBVA3   214      9     Q81105|HBEAG_HBVA5   214      97
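The pairwise scores in the table can be read as percent identities between two aligned sequences. A minimal sketch of such a calculation is given below, assuming the score is the number of identical residues divided by the number of aligned positions where neither sequence has a gap. The function name `percent_identity` is hypothetical, and the two fragments are the first aligned residues of HBEAG_HBVA4 and HBEAG_HBVA5 as they appear in the alignment of this section, so the result is for that fragment only and not for the full sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)

# First aligned fragment of two HBV sequences from the alignment block.
hbva4 = "MQLFHLCLIISCT-CPTVQASKLCLGWLWG"
hbva5 = "MQLFHLCLIISCT-CPTFQASKLCLGWLWG"

print(round(percent_identity(hbva4, hbva5), 2))
```

Only one of the 29 compared positions differs in this fragment (V against F), which is consistent with the very high scores reported between the HBV strains.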
Alignment
P17099|HBEAG_HBVA4
------------------------------------------------------------
Q91C37|HBEAG_HBVA6
------------------------------------------------------------
P0C692|HBEAG_HBVA2
------------------------------------------------------------
P0C625|HBEAG_HBVA3
------------------------------------------------------------
Q81105|HBEAG_HBVA5
------------------------------------------------------------
Q64896|HBEAG_ASHV
------------------------------------------------------------
P03153|HBEAG_GSHV
------------------------------------------------------------
P03154|HBEAG_DHBV1
------------------------------------------------------------
P0C6J9|HBEAG_DHBV3
------------------------------------------------------------
Q8IVS8|GLCTK_HUMAN
MAAALQVLPRLARAPLHPLLWRGSVARLASSMALAEQARQLFESAVGAVLPGPMLHRALS 60
P17099|HBEAG_HBVA4 ----------------------MQLFHLCLIISCT-
CPTVQASKLCLGWLWG-------M 30
Q91C37|HBEAG_HBVA6 ----------------------MQLFHLCLIISCT-
CPTVQASKLCLGWLWG-------M 30
P0C692|HBEAG_HBVA2 ----------------------MQLFHLCLIISCT-
CPTVQASKLCLGWLWG-------M 30
P0C625|HBEAG_HBVA3 ----------------------MQLFHLCLIISCT-
CPTVQASKLCLGWLWG-------M 30
Q81105|HBEAG_HBVA5 ----------------------MQLFHLCLIISCT-
CPTFQASKLCLGWLWG-------M 30
Q64896|HBEAG_ASHV
----------------------MYLFHLCLVFACVSCPTVQASKLCLGWLWD-------M 31
P03153|HBEAG_GSHV
----------------------MYLFHLCLVFACVPCPTVQASKLCLGWLWD-------M 31
P03154|HBEAG_DHBV1
----------------------MWNLRITPLSFGAACQGIFTSTLLLSCVTVPLVCTIVY 38
P0C6J9|HBEAG_DHBV3
----------------------MWNLRITPLSFGAACQGIFTSTLLLSCVTVPLVCTIVY 38
Q8IVS8|GLCTK_HUMAN
LDPGGRQLKVRDRNFQLRQNLYLVGFGKAVLGMAAAAEELLGQHLVQGVISVPKGIRAAM 120
: : : . . . . * . :
P17099|HBEAG_HBVA4 DIDP------------------------
YKEFGATVELLSF------------------- 47
Q91C37|HBEAG_HBVA6 DIDP------------------------
YKEFGATVELLSF------------------- 47
P0C692|HBEAG_HBVA2 DIDP------------------------
YKEFGATVELLSF------------------- 47
P0C625|HBEAG_HBVA3 DIDP------------------------
YKEFGATVELLSF------------------- 47
Q81105|HBEAG_HBVA5 DIDP------------------------
YKEFGATVELLSF------------------- 47
Q64896|HBEAG_ASHV DIDP------------------------
YKEFGSSYQLLNF------------------- 48
P03153|HBEAG_GSHV DIDP------------------------
YKEFGSSYQLLNF------------------- 48
P03154|HBEAG_DHBV1 DSCL------------------------
YMDINASRALANVYD----------------- 57
P0C6J9|HBEAG_DHBV3 DSCL------------------------
YMDINASRALANVYD----------------- 57
Q8IVS8|GLCTK_HUMAN
ERAGKQEMLLKPHSRVQVFEGAEDNLPDRDALRAALAIQQLAEGLTADDLLLVLISGGGS 180
: : :: : ..
P17099|HBEAG_HBVA4 --LPSDFFPSVRDLLDTASALYREALES--------------------
PEHCSPHHTALR 85
Q91C37|HBEAG_HBVA6 --LPSDFFPSVRDLLDTASALYREALES--------------------
PEHCSPHHTALR 85
P0C692|HBEAG_HBVA2 --LPSDFFPSVRDLLDTASALYREALES--------------------
PEHCSPHHTALR 85
P0C625|HBEAG_HBVA3 --LPSDFFPSVRDLLDTASALYREALES--------------------
PEHCSPHHTALR 85
Q81105|HBEAG_HBVA5 --LPSDFFPSVRDLXDTASALYREALES--------------------
PEHCSPHHTALR 85
Q64896|HBEAG_ASHV --LPLDFFPELNALVDTATALYEEELTG--------------------
REHCSPHHTAIR 86
P03153|HBEAG_GSHV --LPLDFFPDLNALVDTAAALYEEELTG--------------------
REHCSPHHTAIR 86
P03154|HBEAG_DHBV1 --LPDDFFPKIDDLVRDAKDALEPYWKSDSIK-----------
KHVLIATHFVDLIEDFW 104
P0C6J9|HBEAG_DHBV3 --LPDDFFPKIDDLVRDAKDALEPYWRSDSIK-----------
KHVLIATHFVDLIEDFW 104
Q8IVS8|GLCTK_HUMAN
ALLPAPIPPVTLEEKQTLTRLLAARGATIQELNTIRKALSQLKGGGLAQAAYPAQVVSLI 240
** : *
:
P17099|HBEAG_HBVA4
QAILCWGELMTLATWVGNNLEDPASRDLVVNY---------------------------- 117
Q91C37|HBEAG_HBVA6
ETILCWGELMTLATWVGNNLEDPASRDLVVNY---------------------------- 117
P0C692|HBEAG_HBVA2
QAILCWGELMTLATWVGNNLQDPASRDLVVNY---------------------------- 117
P0C625|HBEAG_HBVA3
QAILCWGELMTLATWVGNNLEDPASRDLVVNY---------------------------- 117
Q81105|HBEAG_HBVA5
QAILCWGKLMTLATWVGNNLEDPASRDLVVNY---------------------------- 117
Q64896|HBEAG_ASHV
QALVCWEELTRLIAWMSANINSEEVRRVIVAH---------------------------- 118
P03153|HBEAG_GSHV QALVCWEELTRLITWMSENT-
TEEVRRIIVDH---------------------------- 117
P03154|HBEAG_DHBV1
QTTQGMHEIAESLRAVIPPTTTPVPPGYLIQHEEAEEIPLGDLFKHQEERIVSFQPDYPI 164
P0C6J9|HBEAG_DHBV3
QTTQGMHEIAEALRAVIPPTTTPVPQGYLIQHDEAEEIPLGDLFKHQEERIVSFQPDYPI 164
Q8IVS8|GLCTK_HUMAN
LSDVVGDPVEVIASGPTVASSHNVQDCLHILNRYGLRAALPRSVKTVLSRADSDPHGPHT 300
: : :
P17099|HBEAG_HBVA4
-------------VNTNMGLKIRQLLWFRISYLTFGRETVLEYLVSFGVWIRTPPAYRPP 164
Q91C37|HBEAG_HBVA6
-------------VNTNMGLKIRQLLWFHISCLTFGRETVLEYLVSFGVWIRTPPAYRPP 164
P0C692|HBEAG_HBVA2
-------------VNTNMGLKIRQLLWFHISCLTFGRETVLEYLVSFGVWIRTPPAYRPP 164
P0C625|HBEAG_HBVA3
-------------VNTNVGLKIRQLLWFHISCLTFGRETVLEYLVSFGVWIRTPPAYRPP 164
Q81105|HBEAG_HBVA5
-------------VNTNMGLKIRQLLWFHISCLTFGRETVLEYLVSFGVWIRTPPAYRPP 164
Q64896|HBEAG_ASHV
-------------VNDTWGLKVRQNLWFHLSCLTFGQHTVQEFLVSFGVRIRTPAPYRPP 165
P03153|HBEAG_GSHV
-------------VNNTWGLKVRQTLWFHLSCLTFGQHTVQEFLVSFGVWIRTPAPYRPP 164
P03154|HBEAG_DHBV1
TARIHAHLKAYAKINEESLDRARRLLWWHYNCLLWGEAQVTNYISRLRTWLSTPEKYRGR 224
P0C6J9|HBEAG_DHBV3
TARIHAHLKAYAKINEESLDRARRLLWWHYNCLLWGEANVTNYISRLRTWLSTPERYRGR 224
Q8IVS8|GLCTK_HUMAN
CGHVLNVIIGSNVLALAEAQRQAEALGYQAVVLSAAMQGDVKSMAQFYGLLAHVARTRLT 360
: : . * :: * . : : : :
*
P17099|HBEAG_HBVA4 NAPILSTLPETTVVRRRDRG-----------------------------
RSPRRRTPSPR 195
Q91C37|HBEAG_HBVA6 NAPILSTLPETTVVRRRDRG-----------------------------
RSPRRRTPSPR 195
P0C692|HBEAG_HBVA2 NAPILSTLPETTVVRRRDRG-----------------------------
RSPRRRTPSPR 195
P0C625|HBEAG_HBVA3 NAPILSTLPETTVVRRRDRG-----------------------------
RSPRRRTPSPR 195
Q81105|HBEAG_HBVA5 NAPILSTLPETTVVRRRDRG-----------------------------
RSPRRRTPSPR 195
Q64896|HBEAG_ASHV NAPILSTLPEHTVIRRRGSARVV--------------------------
RSPRRRTPSPR 199
P03153|HBEAG_GSHV NAPILSTLPEHTVIRRRGGSRAA--------------------------
RSPRRRTPSPR 198
P03154|HBEAG_DHBV1
DAPTIEAITRPIQVAQGGRKTTTGTRKPRGLEPRRRKVKTTVVYGRRRSKSRERRAPTPQ 284
P0C6J9|HBEAG_DHBV3
DAPTIEAITRPIQVAQGGRKTTSGTRKPRGLEPRRRKVKTTVVYGRRRSKSRERRAPTPQ 284
Q8IVS8|GLCTK_HUMAN
PSMAGASVEEDAQLHELAAELQIPDLQLEEALETMAWGRGPVCLLAGGEPTVQLQGSGRG 420
: :: . : . : . :
.
P17099|HBEAG_HBVA4
RRRSQSPRRRRSQSRESQC----------------------------------------- 214
Q91C37|HBEAG_HBVA6
RRRSQSPRRRRSQSRESQC----------------------------------------- 214
P0C692|HBEAG_HBVA2
RRRSQSPRRRRSQSRESQC----------------------------------------- 214
P0C625|HBEAG_HBVA3
RRRSPSPRRRRSQSRESQC----------------------------------------- 214
Q81105|HBEAG_HBVA5
RRRSQSPRRRRSQSRESQC----------------------------------------- 214
Q64896|HBEAG_ASHV RRRSQSPRRR-
PQSPASNC----------------------------------------- 217
P03153|HBEAG_GSHV
RRRSQSPRRRRSQSPASNC----------------------------------------- 217
P03154|HBEAG_DHBV1
RAGSPLPRSSSSHHRSPSPRK--------------------------------------- 305
P0C6J9|HBEAG_DHBV3
RAGSPLPRSSSSHHRSPSPRK--------------------------------------- 305
Q8IVS8|GLCTK_HUMAN
GRNQELALRVGAELRRWPLGPIDVLFLSGGTDGQDGPTEAAGAWVTPELASQAAAEGLDI 480
. . ..
P17099|HBEAG_HBVA4 -------------------------------------------
Q91C37|HBEAG_HBVA6 -------------------------------------------
P0C692|HBEAG_HBVA2 -------------------------------------------
P0C625|HBEAG_HBVA3 -------------------------------------------
Q81105|HBEAG_HBVA5 -------------------------------------------
Q64896|HBEAG_ASHV -------------------------------------------
P03153|HBEAG_GSHV -------------------------------------------
P03154|HBEAG_DHBV1 -------------------------------------------
P0C6J9|HBEAG_DHBV3 -------------------------------------------
Q8IVS8|GLCTK_HUMAN ATFLAHNDSHTFFCCLQGGAHLLHTGMTGTNVMDTHLLFLRPR 523
Guide Tree
(
(
(
(
(
(
Q8IVS8|GLCTK_HUMAN:0.59519,
(
P03154|HBEAG_DHBV1:0.01176,
P0C6J9|HBEAG_DHBV3:0.01119)
:0.36054)
:0.21341,
(
Q64896|HBEAG_ASHV:0.05849,
P03153|HBEAG_GSHV:0.02446)
:0.12844)
:0.14364,
Q81105|HBEAG_HBVA5:0.01168)
:0.00175,
P0C625|HBEAG_HBVA3:0.00818)
:0.00110,
P0C692|HBEAG_HBVA2:0.00445)
:0.00022,
P17099|HBEAG_HBVA4:0.00942,
Q91C37|HBEAG_HBVA6:0.00927);
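The guide tree above is written in Newick notation, where each leaf is given as name:branch_length. As a rough sketch, the terminal branch lengths can be pulled out with a regular expression; this is not a full Newick parser, and the variable names are illustrative. The tree string is the one listed above with the line breaks removed.

```python
import re

newick = (
    "((((((Q8IVS8|GLCTK_HUMAN:0.59519,"
    "(P03154|HBEAG_DHBV1:0.01176,P0C6J9|HBEAG_DHBV3:0.01119):0.36054):0.21341,"
    "(Q64896|HBEAG_ASHV:0.05849,P03153|HBEAG_GSHV:0.02446):0.12844):0.14364,"
    "Q81105|HBEAG_HBVA5:0.01168):0.00175,"
    "P0C625|HBEAG_HBVA3:0.00818):0.00110,"
    "P0C692|HBEAG_HBVA2:0.00445):0.00022,"
    "P17099|HBEAG_HBVA4:0.00942,Q91C37|HBEAG_HBVA6:0.00927);"
)

# A leaf name follows "(" or ","; internal branch lengths follow ")" and are skipped.
leaf_pattern = re.compile(r"[(,]([^(),:;]+):([0-9.]+)")
leaves = {name: float(length) for name, length in leaf_pattern.findall(newick)}

longest = max(leaves, key=leaves.get)
for name, length in sorted(leaves.items(), key=lambda item: item[1]):
    print(f"{name:22s} {length:.5f}")
print("longest terminal branch:", longest)
```

The human glycerate kinase sequence sits on by far the longest terminal branch, in line with its low pairwise scores against the HBeAg sequences.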
Phylogram
pdb 1QGT-C was selected as template which showed around 85.6% identity with
target sequence and the template structure was downloaded from the PDB.
Swiss-PdbViewer was launched and the following procedure was carried out.
Open Swiss model and select the load raw sequence option to load the target molecule.
Perform magic fit and iterative fit, provided under FIT, in order to fit the two
sequences.
Select “submit modeling request” under Swiss model to submit it for modeling.
Homology modeling:
SWISS MODEL WORKSPACE
Model information
Evalue: 2.70e-52
Alignment
TARGET 83 LV GFGKAVLGMA AAAEELLGQH
2b8nA 4 peslkklaie ivkksieavf pdravk--et lpklnldrvi lvavgkaawr
TARGET hh sss sssss hhh
2b8nA hhhhhhhh hhhhhhh hhhhhh hh sss sssss hhh
2b8nA sssssss ssss s hhhhhhh hh hhhh hh hhhhh
Model Validation:
INTRODUCTION
The Structure Analysis and Validation Server greatly simplifies computational analysis of the
molecular structure and sequence of proteins. The stereochemical validation of model
structures of proteins is an important part of the comparative molecular modeling
process. The Ramachandran plot is a way to visualize the dihedral angles φ against ψ of amino
acid residues in a protein structure. It shows the possible conformations of the φ and ψ angles
for a polypeptide and displays the φ and ψ backbone
conformational angles for each residue in a protein. The distance between two successive
alpha carbon atoms in the backbone chain and the angles between the two bonds of such
atoms in the desired protein can be determined using this plot.
Software
SAVS: http://nihserver.mbi.ucla.edu/SAVS/
Procedure
The target protein structure obtained after homology modeling using DeepView and
Modeller is given as input for SAVS.
SAVES results for proj_gunjan.pdb
Procheck summary
RAMACHANDRAN PLOT:
Result -----
Plot statistics                                          SCORE    %age
Residues in most favoured regions [A,B,L]                  990   85.6%
Residues in additional allowed regions [a,b,l,p]           104    9.0%
Residues in generously allowed regions [~a,~b,~l,~p]        11    1.0%
Residues in disallowed regions                              51    4.4%
                                                          ----  ------
Number of non-glycine and non-proline residues            1156  100.0%
Number of end-residues (excl. Gly and Pro)                   8
Number of glycine residues (shown as triangles)            127
(59)
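The percentages in the PROCHECK plot statistics follow directly from the residue counts, as the short Python sketch below shows. The 90% threshold used in the last line is the rule of thumb quoted in the PROCHECK documentation for a good quality model; it is included here as an assumption for illustration, not as a criterion stated in this report.

```python
# Residue counts per Ramachandran region, as reported by PROCHECK above.
counts = {
    "most favoured": 990,
    "additional allowed": 104,
    "generously allowed": 11,
    "disallowed": 51,
}

total = sum(counts.values())  # non-glycine, non-proline residues
percentages = {region: 100.0 * n / total for region, n in counts.items()}

for region, pct in percentages.items():
    print(f"{region:20s} {counts[region]:5d} {pct:5.1f}%")

# Rule of thumb (assumed): a good model has over 90% in the most favoured regions.
meets_rule_of_thumb = percentages["most favoured"] > 90.0
print("meets 90% rule of thumb:", meets_rule_of_thumb)
```

With 85.6% of residues in the most favoured regions and 4.4% in disallowed regions, the model falls short of that rule of thumb, which suggests that some regions of the structure would benefit from further refinement.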
Fig. After docking
*Warning* Can't add all hydrogens to incomplete residue: B 62:LYS
*Warning* Can't add all hydrogens to incomplete residue: B 65:ARG
*Warning* Can't add all hydrogens to incomplete residue: B 66:LYS
*Warning* Can't add all hydrogens to incomplete residue: B 316:HIS
*Warning* Can't add all hydrogens to incomplete residue: B 370:LYS
*Warning* Can't add all hydrogens to incomplete residue: B 380:TYR
*Warning* Can't add all hydrogens to incomplete residue: B 404:THR
PDB structure has crystal symmetry elements.
PDB structure has biological symmetry elements.
Loaded PDB file: C:\Program Files\Hex 5.0/examples\2B8N.pdb, (927 residues, 7597 atoms, 1 models)
*Warning* Fractional charge (0.35) for non-terminal residue: A 52:MSE
MSE:N Radius = 1.40, Charge = -0.52
MSE:CA Radius = 1.50, Charge = 0.14
MSE:C Radius = 1.40, Charge = 0.53
MSE:O Radius = 1.50, Charge = -0.50
MSE:CB Radius = 1.70, Charge = 0.04
MSE:CG Radius = 1.70, Charge = 0.09
MSE:SE Radius = 1.90, Charge = 0.32
MSE:CE Radius = 1.90, Charge = 0.01
MSE:H Radius = 0.00, Charge = 0.25
*Warning* Fractional charge (0.41) for non-terminal residue: A 82:ASP
ASP:N Radius = 1.40, Charge = -0.52
ASP:CA Radius = 1.50, Charge = 0.25
ASP:C Radius = 1.40, Charge = 0.53
ASP:O Radius = 1.50, Charge = -0.50
ASP:CB Radius = 1.70, Charge = -0.21
ASP:CG Radius = 1.40, Charge = 0.62
ASP:H Radius = 0.00, Charge = 0.25
*Warning* Fractional charge (0.35) for non-terminal residue: A 291:MSE
MSE:N Radius = 1.40, Charge = -0.52
MSE:CA Radius = 1.50, Charge = 0.14
MSE:C Radius = 1.40, Charge = 0.53
MSE:O Radius = 1.50, Charge = -0.50
MSE:CB Radius = 1.70, Charge = 0.04
MSE:CG Radius = 1.70, Charge = 0.09
MSE:SE Radius = 1.90, Charge = 0.32
MSE:CE Radius = 1.90, Charge = 0.01
MSE:H Radius = 0.00, Charge = 0.25
*Warning* Fractional charge (0.34) for non-terminal residue: A 318:LYS
LYS:N Radius = 1.40, Charge = -0.52
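LYS:CA Radius = 1.50, Charge = 0.23
LYS:C Radius = 1.40, Charge = 0.53
LYS:O Radius = 1.50, Charge = -0.50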
LYS:CB Radius = 1.70, Charge = 0.04
LYS:CG Radius = 1.70, Charge = 0.05
LYS:CD Radius = 1.70, Charge = 0.05
LYS:CE Radius = 1.70, Charge = 0.22
LYS:H Radius = 0.00, Charge = 0.25
*Warning* Fractional charge (0.35) for non-terminal residue: A 375:MSE
MSE:N Radius = 1.40, Charge = -0.52
MSE:CA Radius = 1.50, Charge = 0.14
MSE:C Radius = 1.40, Charge = 0.53
MSE:O Radius = 1.50, Charge = -0.50
MSE:CB Radius = 1.70, Charge = 0.04
MSE:CG Radius = 1.70, Charge = 0.09
MSE:SE Radius = 1.90, Charge = 0.32
MSE:CE Radius = 1.90, Charge = 0.01
MSE:H Radius = 0.00, Charge = 0.25
*Warning* Fractional charge (0.12) for non-terminal residue: B 8:LYS
LYS:N Radius = 1.40, Charge = -0.52
LYS:CA Radius = 1.50, Charge = 0.23
LYS:C Radius = 1.40, Charge = 0.53
LYS:O Radius = 1.50, Charge = -0.50
LYS:CB Radius = 1.70, Charge = 0.04
LYS:CG Radius = 1.70, Charge = 0.05
LYS:CD Radius = 1.70, Charge = 0.05
LYS:H Radius = 0.00, Charge = 0.25
*Warning* Fractional charge (0.12) for non-terminal residue: B 9:LYS
LYS:N Radius = 1.40, Charge = -0.52
LYS:CA Radius = 1.50, Charge = 0.23
LYS:C Radius = 1.40, Charge = 0.53
LYS:O Radius = 1.50, Charge = -0.50
LYS:CB Radius = 1.70, Charge = 0.04
LYS:CG Radius = 1.70, Charge = 0.05
LYS:CD Radius = 1.70, Charge = 0.05
LYS:H Radius = 0.00, Charge = 0.25
*Warning* Fractional charge (0.34) for non-terminal residue: B 17:LYS
LYS:N Radius = 1.40, Charge = -0.52
LYS:CA Radius = 1.50, Charge = 0.23
LYS:C Radius = 1.40, Charge = 0.53
LYS:O Radius = 1.50, Charge = -0.50
LYS:CB Radius = 1.70, Charge = 0.04
LYS:CG Radius = 1.70, Charge = 0.05
LYS:CD Radius = 1.70, Charge = 0.05
LYS:CE Radius = 1.70, Charge = 0.22
LYS:H Radius = 0.00, Charge = 0.25
*Warning* Fractional charge (0.35) for non-terminal residue: B 52:MSE
MSE:N Radius = 1.40, Charge = -0.52
MSE:CA Radius = 1.50, Charge = 0.14
MSE:C Radius = 1.40, Charge = 0.53
MSE:O Radius = 1.50, Charge = -0.50
MSE:CB Radius = 1.70, Charge = 0.04
MSE:CG Radius = 1.70, Charge = 0.09
MSE:SE Radius = 1.90, Charge = 0.32
MSE:CE Radius = 1.90, Charge = 0.01
MSE:H Radius = 0.00, Charge = 0.25
*Warning* Fractional charge (0.12) for non-terminal residue: B 62:LYS
LYS:N Radius = 1.40, Charge = -0.52
LYS:CA Radius = 1.50, Charge = 0.23
LYS:C Radius = 1.40, Charge = 0.53
LYS:O Radius = 1.50, Charge = -0.50
LYS:CB Radius = 1.70, Charge = 0.04
LYS:CG Radius = 1.70, Charge = 0.05
LYS:CD Radius = 1.70, Charge = 0.05
LYS:H Radius = 0.00, Charge = 0.25
*Warning* Fractional charge (0.34) for non-terminal residue: B 66:LYS
LYS:N Radius = 1.40, Charge = -0.52
LYS:CA Radius = 1.50, Charge = 0.23
LYS:C Radius = 1.40, Charge = 0.53
LYS:O Radius = 1.50, Charge = -0.50
LYS:CB Radius = 1.70, Charge = 0.04
LYS:CG Radius = 1.70, Charge = 0.05
LYS:CD Radius = 1.70, Charge = 0.05
LYS:CE Radius = 1.70, Charge = 0.22
LYS:H Radius = 0.00, Charge = 0.25
*Warning* Fractional charge (-0.21) for non-terminal residue: B 81:ASP
ASP:N Radius = 1.40, Charge = -0.52
ASP:CA Radius = 1.50, Charge = 0.25
ASP:C Radius = 1.40, Charge = 0.53
ASP:O Radius = 1.50, Charge = -0.50
ASP:CB Radius = 1.70, Charge = -0.21
ASP:H Radius = 0.00, Charge = 0.25
*Warning* Fractional charge (0.35) for non-terminal residue: B 291:MSE
MSE:N Radius = 1.40, Charge = -0.52
MSE:CA Radius = 1.50, Charge = 0.14
MSE:C Radius = 1.40, Charge = 0.53
MSE:O Radius = 1.50, Charge = -0.50
MSE:CB Radius = 1.70, Charge = 0.04
MSE:CG Radius = 1.70, Charge = 0.09
MSE:SE Radius = 1.90, Charge = 0.32
MSE:CE Radius = 1.90, Charge = 0.01
MSE:H Radius = 0.00, Charge = 0.25
*Warning* Fractional charge (0.34) for non-terminal residue: B 370:LYS
LYS:N Radius = 1.40, Charge = -0.52
LYS:CA Radius = 1.50, Charge = 0.23
LYS:C Radius = 1.40, Charge = 0.53
LYS:O Radius = 1.50, Charge = -0.50
LYS:CB Radius = 1.70, Charge = 0.04
LYS:CG Radius = 1.70, Charge = 0.05
LYS:CD Radius = 1.70, Charge = 0.05
LYS:CE Radius = 1.70, Charge = 0.22
LYS:H Radius = 0.00, Charge = 0.25
*Warning* Fractional charge (0.35) for non-terminal residue: B 375:MSE
MSE:N Radius = 1.40, Charge = -0.52
MSE:CA Radius = 1.50, Charge = 0.14
MSE:C Radius = 1.40, Charge = 0.53
MSE:O Radius = 1.50, Charge = -0.50
MSE:CB Radius = 1.70, Charge = 0.04
MSE:CG Radius = 1.70, Charge = 0.09
MSE:SE Radius = 1.90, Charge = 0.32
MSE:CE Radius = 1.90, Charge = 0.01
MSE:H Radius = 0.00, Charge = 0.25
*Warning* Fractional charge (0.23) for non-terminal residue: B 404:THR
THR:N Radius = 1.40, Charge = -0.52
THR:CA Radius = 1.50, Charge = 0.27
THR:C Radius = 1.40, Charge = 0.53
THR:O Radius = 1.50, Charge = -0.50
THR:CB Radius = 1.50, Charge = 0.21
THR:H Radius = 0.00, Charge = 0.25
Counted 104 +ve and 114 -ve formal charged residues: Net formal charge: -10
*Warning* Using PDB CONECT records to define non-standard bonds.
>2B8N A
PESLKKLAIEIVKKSIEAVFPDRAVKETLPKLNLDRVILVAVGKAAWRMAKAAY
EVLGKKIRKGVVVTKYGHSEGPIDDFEIYEAGHPVPDENTIKTTRRVLELVDQLN
ENDTVLFLLSG
GGSSLFELPLEGVSLEEIQKLTSALLKSGASIEEINTVRKHLSQVKGGRFAERVFPA
KVVALVLSDVLGDRLDVIASGPAWPDSSTSEDALKVLEKYGIETSESVKRAILQE
TPKHLSNV
EIHLIGNVQKVCDEAKSLAKEKGFNAEIITTSLDCEAREAGRFIASIMKEVKFKDR
PLKKPAALIFGGETVVHVKGNGIGGRNQELALSAAIALEGIEGVILCSAGTDGTD
GPTDAAGGI
VDGSTAKTLKAMGEDPYQYLKNNDSYNALKKSGALLITGPTGTNVNDLIIGLIV
>2B8N B
PESLKKLAIEIVKKSIEAVFPDRAVKETLPKLNLDRVILVAVGKAAWRMAKAAY
EVLGKKIRKGVVVTKYGHSEGPIDDFEIYEAGHPVPDENTIKTTRRVLELVDQLN
ENDTVLFLLSG
GGSSLFELPLEGVSLEEIQKLTSALLKSGASIEEINTVRKHLSQVKGGRFAERVFPA
KVVALVLSDVLGDRLDVIASGPAWPDSSTSEDALKVLEKYGIETSESVKRAILQE
TPKHLSNV
EIHLIGNVQKVCDEAKSLAKEKGFNAEIITTSLDCEAREAGRFIASIMKEVKFKDR
PLKKPAALIFGGETVVHVKGNGIGGRNQELALSAAIALEGIEGVILCSAGTDGTD
GPTDAAGGI
VDGSTAKTLKAMGEDPYQYLKNNDSYNALKKSGALLITGPTGTNVNDLIIGLIV
Assuming C:\Program Files\Hex 5.0/examples\2B8N.pdb is a PDB file...
Found 223 MB main memory: setting N_MAX=33.
Check threefold = 0
Docking search mode = 6D rotation + translation (optimal).
Culling reduced surface complexity by 75 per cent (81770 triangles, 40885 vertices).
Total contouring time: 6.14 seconds.
------------------------------------------------------------------------------
Docking 1 pair of starting orientations...
Coefficient rotations done in 0.91 seconds.
3D search found 0/1616412672 within threshold but NOT including start guess.
Done 21924 3D FFTs for 1616412672 orientations in 7 min, 0 sec (3848574/s).
Coefficient rotations done in 0.00 seconds.
Main pass found 0 minima within threshold but NOT including start guess.
Main pass done in 0 min, 0 sec (1761/s).
------------------------------------------------------------------------------
Saving top 500 orientations.
------------------------------------------------------------------------------
---- ---- ------- ------- ------- ------- ------- ------- ------ --- -----
Clst Soln  Models  Etotal  Eshape  Eforce    Eair  Vshape Vclash Bmp   RMS
---- ---- ------- ------- ------- ------- ------- ------- ------ --- -----
   1    1 000:000     0.0     0.0     0.0     0.0     0.0    0.0  -1 -1.00
---- ---- ------- ------- ------- ------- ------- ------- ------ --- -----
   1    1 000:000     0.0     0.0     0.0     0.0     0.0    0.0  -1 -1.00
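The charge summary printed near the top of the docking log (104 positive and 114 negative formally charged residues, net formal charge -10) is a simple tally over the residues. A minimal sketch of such a tally is given below. It assumes that lysine and arginine each count as +1 and aspartate and glutamate as -1, and it ignores histidine, chain termini and non-standard residues. The docking program's own counting rules may differ, so the function name and the charge sets here are illustrative assumptions only.

```python
from collections import Counter

# Assumed charge classes: Lys/Arg positive, Asp/Glu negative.
POSITIVE = frozenset("KR")
NEGATIVE = frozenset("DE")


def formal_charge_tally(sequence):
    """Return (positive, negative, net) formal charge counts for a
    one-letter amino acid sequence under the assumptions above."""
    counts = Counter(sequence.upper())
    positive = sum(counts[residue] for residue in POSITIVE)
    negative = sum(counts[residue] for residue in NEGATIVE)
    return positive, negative, positive - negative


if __name__ == "__main__":
    # First line of chain A as it appears in the log above.
    fragment = "PESLKKLAIEIVKKSIEAVFPDRAVKETLPKLNLDRVILVAVGKAAWRMAKAAY"
    pos, neg, net = formal_charge_tally(fragment)
    print(f"+ve: {pos}, -ve: {neg}, net: {net}")
```

Applying the function to the full sequence of every chain and summing the results gives a quick sanity check against the figure reported by the docking program.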
Conclusion
After analysing the protein sequences of Hepatitis B virus, we conclude that although the strains
are closely related, the proteins play an important role in survival in different species. It would be
interesting to examine this more closely at the gene level, where a phylogenetic analysis can be very
helpful in understanding the evolutionary pattern.
We noticed that the same genes are present in all strains, which suggests that the strains
evolved together.
Once the ongoing HBV gene sequencing project is finished, we hope it will be possible to
draw firm conclusions about the true picture of evolution in the near future and to identify the
genes responsible for pathogenesis.
A complete inference can only be drawn from a comprehensive list of the gene products
and their functions.
To determine the unknown structure of a protein present in the different species, we used
homology modelling, and we went a step further by presenting a theoretical model built with
available online modelling tools.
Our study indicates that the HBeAg (glycerate kinase) protein encoded by the gene is a
secondary contributor to the pathogenicity of HBV. We therefore tried to dock this protein with an
appropriate ligand in order to inhibit its activity, as a basis on which drugs could be
developed.
Future prospects
The work presented in this report may serve as a stepping stone towards such discoveries; it is
a small finding within a much larger problem.
Phylogenetics is the field of biology that deals with identifying and understanding the
relationships between the many different kinds of life on Earth. It includes methods for
collecting and analysing data, as well as interpreting the results as new biological
information.
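As an illustration of the kind of data analysis phylogenetics relies on, the sketch below computes the p-distance (the proportion of differing sites) between already aligned sequences, which is one common starting point for distance-based tree building. The function names and the toy sequences are hypothetical and are not taken from this study; positions where either sequence has a gap are skipped, which is one of several possible conventions.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences,
    ignoring positions where either sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return mismatches / compared


def distance_matrix(aligned):
    """Pairwise p-distances for a dict mapping name -> aligned sequence."""
    names = list(aligned)
    return {(x, y): p_distance(aligned[x], aligned[y])
            for x in names for y in names}


if __name__ == "__main__":
    # Toy aligned sequences with hypothetical strain names.
    toy = {"strain_1": "ACGTAC", "strain_2": "ACGTTC", "strain_3": "AC-TTG"}
    for pair, dist in distance_matrix(toy).items():
        print(pair, round(dist, 3))
```

A matrix of this kind would normally be passed to a tree-building method such as neighbour joining; corrected distance models are generally preferred for divergent sequences.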
The purpose of modelling is to help drug developers and biotechnologists develop drugs
more efficiently and effectively in future by analysing the modelled structure
of the protein.
As new drug targets are identified, new vistas will open for further drug
development. The findings of our docking study will be useful in the search for a cure for the
infectious disease bird flu, and will also open new avenues for finding other possible drug targets
in influenza A virus. The docking results can be used to design new lead compounds and can
therefore aid the drug discovery process.
Finally, a similar process can be applied to other pathogens and other infectious diseases to
identify possible therapeutic sites in them, so we can look forward to a better, disease-free world.
The work presented is only a small part of a large problem, and a great deal of work remains to be
done to establish a sound phylogenetic relationship and a full-fledged cure for bird flu. We hope
that these findings will go a long way and prove fruitful to anyone working in a similar area.
BIBLIOGRAPHY
[1] - Lansing M. Prescott, John P. Harley and Donald A. Klein, Microbiology, 6th edition,
McGraw-Hill Higher Education. Human diseases caused by viruses.
[4] - PubMed
[6]- Al-Lazikani, B., Sheinerman, F.B., and Honig, B. 2001. Combining multiple
structure and sequence alignments to improve sequence detection and alignment:
Application to the SH2 domains of Janus kinases. Proc. Natl. Acad. Sci. 98: 14796–14801. [PubMed].
Aloy, P., Querol, E., Aviles, F.X., and Sternberg, M.J. 2001. Automated structure-based
prediction of functional sites in proteins: Applications to assessing the validity of
inheriting protein function from homology in genome annotation and to protein docking.
J. Mol. Biol. 311: 395–408. [PubMed].
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and
Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: A new generation of protein
database search programs. Nucleic Acids Res. 25: 3389–3402. [PubMed].
Apweiler, R., Attwood, T.K., Bairoch, A., Bateman, A., Birney, E., Biswas, M., Bucher,
P., Cerutti, L., Corpet, F., Croning, M.D., et al. 2000. InterPro—An integrated
documentation resource for protein families, domains and functional sites. Bioinformatics
Abbreviation
• CSA: Catalytic Site Atlas