Ref: https://www.researchgate.net/profile/Gary_Beane/publication/264048370/figure/fig1/AS:296562925293568@1447717519197/Figure-2-
Comparison-of-the-two-common-helices-that-polyproline-shown-is-a-10-proline.png
2. In the context of the Ramachandran map, define allowed, partially allowed and disallowed regions.
The Ramachandran map is a 2-D representation of the sterically allowed torsion angles φ and ψ of amino acid
residues in a protein. The torsion angles φ and ψ describe rotations of the polypeptide backbone and are
defined by the atoms C(i-1)-N(i)-Cα(i)-C(i) and N(i)-Cα(i)-C(i)-N(i+1) respectively. Using theoretical
estimates of the limiting van der Waals contact distances between different atoms, the conformations
available to a pair of linked peptide units can be deduced. The allowed region consists of conformations
where all interatomic distances are greater than the normal van der Waals limit. Partially allowed regions
consist of conformations where some distances lie between the normal and outer limits, while the disallowed
region consists of conformations where one or more distances are less than the outer van der Waals limit
(short contact).
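The two torsion angles can be computed directly from four consecutive backbone atom positions. The following is a minimal sketch in plain Python, assuming coordinates are given as (x, y, z) tuples in Å; the function name `dihedral` and its helpers are illustrative and not taken from any particular library.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond.

    0 is cis, +/-180 is trans; the sign follows the usual IUPAC convention.
    """
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b0, b1)              # normal to the plane through p0, p1, p2
    n2 = _cross(b1, b2)              # normal to the plane through p1, p2, p3
    b1_len = math.sqrt(_dot(b1, b1))
    b1_hat = tuple(c / b1_len for c in b1)
    m1 = _cross(b1_hat, n1)          # completes an orthogonal frame with n1 and b1_hat
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))
```

With this helper, φ for residue i would be `dihedral(C_prev, N, CA, C)` and ψ would be `dihedral(N, CA, C, N_next)`, where the argument names stand for the coordinates of the corresponding backbone atoms.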
http://www.cryst.bbk.ac.uk/PPS95/course/3_geometry/rama.gif
3. What is the bridge region in the R-map? This region was not considered an allowed region in initial R-maps.
Why is the bridge region observed to be an allowed region?
The allowed regions of the Ramachandran map consist of the α-region and the β-region (which include
conformations giving rise to α-helices and β-strands respectively) and a smaller αL-region; in these
regions the peptide atoms do not collide when given standard van der Waals radii. The bridge region consists
of conformations between the α- and β-regions that become allowed if the atoms are given smaller
van der Waals radii, representing the smallest values that could be considered possible. One of the
assumptions made during the construction of the Ramachandran map was that the angle τ (the N-Cα-C backbone
angle) took a constant value of 110°. Experimental studies have shown that τ has a greater degree of freedom,
leading to a larger allowed region at the junction of the α-helix and β-sheet regions.
Ref: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3149190/figure/fig01/
A structural domain is a conserved part of a protein structure that can fold and exist independently of the rest
of the protein chain. Depending on whether sequence, function or folding is conserved, domains can be
described as sequence, functional or folding domains.
5. Discuss the STRIDE algorithm (provide salient features) and its applications.
STRIDE (Structural identification) is an algorithm for assigning protein secondary structure elements
from the atomic coordinates of the protein. STRIDE considers the weighted contribution of
secondary-structure-forming hydrogen bonds and statistically derived backbone torsion angles. It uses a
hydrogen-bond energy criterion similar to that of DSSP, which assigns a hydrogen bond between peptide units
if the electrostatic interaction energy between the C=O of one residue and the N-H of another residue is
< -0.5 kcal/mol. It also looks at the propensities of amino acids with given torsion angles to form
secondary structures such as α-helices and β-sheets.
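The DSSP-style electrostatic criterion mentioned above can be sketched as follows. This is a minimal illustration assuming the Kabsch-Sander partial charges (0.42e on C=O, 0.20e on N-H) and the conversion factor 332 Å·kcal/(mol·e²); the function names are hypothetical, and STRIDE's own weighting of hydrogen-bond energy and torsion angles is not reproduced here.

```python
import math

Q1 = 0.42      # partial charge on the C=O group, in units of e (DSSP assumption)
Q2 = 0.20      # partial charge on the N-H group, in units of e (DSSP assumption)
F = 332.0      # conversion factor, Å·kcal/(mol·e^2)
CUTOFF = -0.5  # kcal/mol; energies below this count as a hydrogen bond

def hbond_energy(c, o, n, h):
    """Electrostatic energy (kcal/mol) between a C=O group and an N-H group.

    Arguments are (x, y, z) coordinates in Å of the carbonyl C and O atoms
    and the amide N and H atoms.
    """
    d = math.dist
    return Q1 * Q2 * F * (1 / d(o, n) + 1 / d(c, h) - 1 / d(o, h) - 1 / d(c, n))

def is_hbond(c, o, n, h):
    """True if the pair satisfies the DSSP-style energy cutoff."""
    return hbond_energy(c, o, n, h) < CUTOFF
```

For a linear arrangement with an O...H distance of about 1.9 Å the energy comes out well below the cutoff, while groups several Å further apart give an energy close to zero and are not counted.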
6. Why is structural classification of protein domains important? Define the various hierarchical levels of
domain classification.
Levels of CATH:
7. Take at least 2 PDB structures having resolution <1.5 Å. Generate the Ramachandran map and find the
number of points outside the allowed region. Use the WHATIF server for the Ramachandran map. Justify, if you
can, the points outside the allowed region of the Ramachandran map.
5GV7
Structure of NADH-cytochrome b5 reductase refined with the multipolar atomic model at 0.80 Å
5IYQ
Protruding domain of GII.4 human norovirus CHDC2094 in complex with HBGA type B (triglycan)
The points in the disallowed region could be due to:
Compensatory interaction between residues
Low resolution
In an alpha-helix, the first four >N-H groups and last four >C=O groups necessarily lack intrahelical
hydrogen bonds. To compensate for this lack of intrahelical hydrogen bonds, there are specific patterns of
hydrogen bonding and hydrophobic interactions found near the ends of helices in proteins called helix
capping motifs. The amide hydrogens at the helix N-terminus predominantly interact with side-chain
H-bond acceptors, while carbonyl oxygens at the C-terminus are satisfied primarily by backbone >N-H groups
from the turn following the helix.
Nomenclature
N-terminal motifs
C-terminal motifs
9. Discuss different models of helix packing observed in protein tertiary structures.
Ridges into grooves helix packing: In this model, rows of residues on the helix surface form, in effect,
ridges separated by shallow grooves, and helices pack with the ridges of one fitting into the grooves of the
other and vice versa. The ridges and grooves are formed by residues whose separation in the sequence is
usually four and occasionally three.
http://www.cryst.bbk.ac.uk/PPS2/course/section9/9_helhel.html
Knobs into holes packing: In this motif, each residue of one helix at the helix-helix interface (a knob) is
surrounded by four residues of the other helix (a hole), so the pair looks much like two springs slammed
together.
Ref: http://caps.ncbs.res.in/coilcheck/img/knob1.gif
10. What is amino acid propensity?
Amino acid propensity represents the intrinsic tendency of an amino acid to form a particular secondary
structure. These values are derived statistically from experimentally determined structures and are used in
structure prediction methods.
P_ij = (n_ij / n_i) / (N_j / N_T)
where n_ij is the number of residues of type i in structure of type j, n_i is the total number of residues of
type i, N_j is the total number of residues in structure of type j, and N_T is the total number of residues
used for the calculation.
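The propensity formula can be evaluated directly from per-residue counts. The sketch below assumes the input is a pair of equal-length strings, one with one-letter amino acid codes and one with secondary-structure labels; the function name `propensities` and the toy data in the usage note are made up for illustration.

```python
from collections import Counter

def propensities(residues, states):
    """Return {(residue type i, structure type j): propensity}.

    residues: sequence of amino acid codes, one per position
    states:   sequence of secondary-structure labels, one per position
    """
    if len(residues) != len(states):
        raise ValueError("residues and states must have the same length")
    n_total = len(residues)                 # total residues used for the calculation
    n_i = Counter(residues)                 # residues of each type i
    n_j = Counter(states)                   # residues in each structure type j
    n_ij = Counter(zip(residues, states))   # residues of type i in structure type j
    return {
        (aa, ss): (n_ij[(aa, ss)] / n_i[aa]) / (n_j[ss] / n_total)
        for aa in n_i
        for ss in n_j
    }
```

For example, `propensities("AAAG", "HHEE")` gives alanine a helix propensity above 1 (it is over-represented in H relative to the overall helix fraction) and glycine a helix propensity of 0. Real propensity scales are computed over large sets of known structures, not four residues.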
Anfinsen's hypothesis postulates that, under given physiological conditions (temperature, solvent
concentration and composition, etc.), the native structure of a polypeptide is unique, stable and has
minimal free energy.
13. What is Levinthal's paradox? How can this paradox be resolved to explain protein folding?
Levinthal's paradox states that if a protein were to sample all possible conformations before folding, it
would require more time than the universe has existed (about 4 × 10^17 seconds) to explore them all and
choose the appropriate one; since real proteins fold far faster than this, by reductio ad absurdum folding
cannot proceed by exhaustive search. By the same reasoning, it is not computationally feasible to predict
protein structures by enumerating every conformation. The paradox can be resolved by assuming that the
protein doesn't sample all conformations, but follows a guided path towards the global minimum of free energy.
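The scale of the problem can be checked with a short calculation. The numbers below are illustrative assumptions, not values from the text: a 100-residue chain, 3 conformations per residue, and 10^13 conformations sampled per second, compared against an age of the universe of roughly 4.35 × 10^17 seconds.

```python
AGE_OF_UNIVERSE_S = 4.35e17  # about 13.8 billion years, in seconds

def exhaustive_search_seconds(n_residues, states_per_residue, samples_per_second):
    """Time in seconds to visit every conformation once (illustrative model)."""
    return states_per_residue ** n_residues / samples_per_second

# Assumed, illustrative parameters
t = exhaustive_search_seconds(n_residues=100, states_per_residue=3,
                              samples_per_second=1e13)
print(f"search time: {t:.2e} s")
print(f"ratio to age of universe: {t / AGE_OF_UNIVERSE_S:.2e}")
```

Even with these modest assumptions the search time exceeds the age of the universe by many orders of magnitude, which is the core of the paradox.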
14. What is a contact map? Give the significance of contact maps in protein structure prediction and
analysis.
A contact map is a binary two-dimensional matrix representing the proximity of amino acid residue pairs in a
3-D protein structure. For two residues i and j, the ij element of the matrix is 1 if the two residues are
closer than a predetermined threshold, and 0 otherwise. Various thresholds have been proposed, e.g. the
distance between Cα-Cα atoms with a threshold of 6-12 Å.
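A contact map of this kind can be built in a few lines once the Cα coordinates are available. The sketch below assumes a list of (x, y, z) tuples in Å and a default threshold of 8 Å, chosen here only as one value inside the 6-12 Å range; the function name `contact_map` is illustrative.

```python
import math

def contact_map(ca_coords, threshold=8.0):
    """Binary contact matrix from Cα coordinates.

    Element [i][j] is 1 if residues i and j are closer than `threshold` (Å),
    otherwise 0. The diagonal is always 1 because each residue is at
    distance 0 from itself.
    """
    n = len(ca_coords)
    return [
        [1 if math.dist(ca_coords[i], ca_coords[j]) < threshold else 0
         for j in range(n)]
        for i in range(n)
    ]

# Toy example: four points spaced 3.8 Å apart on a line (made-up coordinates)
toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
for row in contact_map(toy):
    print(row)
```

The matrix is symmetric, so in practice only one triangle needs to be stored or plotted.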
Ref: https://www.researchgate.net/profile/Antonio_Rey/publication/233722434/figure/fig8/AS:294769948413957@1447290040184/Contact-
map-for-the-GB1-protein-The-lower-right-triangle-shows-the-native-contacts-in.png
De novo structure prediction, i.e. predicting tertiary structure from an amino acid sequence alone, is an
extremely demanding procedure. Here, you don't have a template on which to base the structure. To determine
the structure, we have to obtain the positional coordinates of all the atoms, i.e. if you have N atoms, you
have 3N parameters to determine based on an appropriate force field. This remains computationally very
expensive even after applying the stereochemical restraints.
16. What are the 2 broad categories of protein structure prediction methods? Highlight the main differences
between them.
Protein structure prediction methods can be categorised as either ab initio (template-free) methods, like
Rosetta, or template-based methods, like Phyre2. Template-based methods can be of two types: threading,
where templates are sampled from a library of known folds, and comparative modelling, where the template is
homologous to the query sequence. Ab initio methods are used when one cannot identify any likely homologs of
known structure, so no templates are used. They usually search conformational space guided by an energy
function, which is often knowledge-based.
17. What is the Monte Carlo method? What is the Metropolis condition of acceptance?
Monte Carlo simulations use stochastic methods to generate new configurations of a system of interest. They
can be used for statistical prediction of protein structures. Monte Carlo simulations are free from the
restriction of solving Newton's equations of motion and hence don't provide dynamic information. The
following is done to generate a protein structure: A random conformation is assumed as the starting point.
The moves are chosen so that every point that is accessible in configuration space can be reached from any
other point in a finite number of Monte Carlo moves. The position of one of the atoms is changed randomly
and the new energy is calculated. If the energy is lower, the new conformation is accepted, and the
conformation is changed again until an optimum structure is formed. The Metropolis condition comes into play
when we have to decide about the transition to a new conformation based on the energies of the current and
next states. If the energy of the new conformation is lower, the new conformation is accepted. If the
energy is higher, the new conformation is accepted with a probability equal to the Boltzmann factor
exp(-ΔE/kT), where ΔE is the increase in energy, k is the Boltzmann constant and T is the temperature.
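The acceptance rule can be sketched in a few lines. This is a minimal illustration that replaces the protein with a hypothetical one-dimensional energy function, so that the loop stays short; the function names, the step size and the value of kT are all assumptions chosen for the example.

```python
import math
import random

def metropolis_accept(e_old, e_new, kT, rng=random):
    """Metropolis condition: always accept a lower energy, otherwise accept
    with probability exp(-(e_new - e_old) / kT)."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / kT)

def monte_carlo_search(energy, x0, n_steps, step_size, kT, seed=0):
    """Random-walk Monte Carlo over a single coordinate x (toy stand-in for a
    conformation). Returns the lowest-energy state visited and its energy."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step_size, step_size)  # random trial move
        e_new = energy(x_new)
        if metropolis_accept(e, e_new, kT, rng):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

# Toy energy surface with its minimum at x = 2 (made up for illustration)
best_x, best_e = monte_carlo_search(lambda x: (x - 2.0) ** 2, x0=10.0,
                                    n_steps=5000, step_size=0.5, kT=0.1)
print(best_x, best_e)
```

Because uphill moves are sometimes accepted, the walk can escape shallow local minima; lowering kT makes the search greedier, while raising it makes the search explore more widely.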