$$\sum_{j \neq i} \mathbf{F}(\mathbf{r}_{ij}) \tag{2.1}$$
with accuracy and in such a way that the pairwise additive interactions do not scale as $N^2$. The evaluation of both the short-range and the long-range forces can be sped up, and several techniques exist for that purpose. Following the standard procedure of calculating the force as the derivative of a potential, we can integrate Newton's equations of motion, for which many algorithms are available. A very simple one is the leap-frog integrator [32, 33], a Verlet-like second-order algorithm that evaluates the velocities at half-integer time steps and uses these velocities to compute the new positions:
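$$\mathbf{r}(t+\Delta t) = \mathbf{r}(t) + \Delta t\,\mathbf{v}(t+\Delta t/2) \tag{2.2}$$
$$\mathbf{v}(t+\Delta t/2) = \mathbf{v}(t-\Delta t/2) + \Delta t\,\frac{\mathbf{f}(t)}{m} \tag{2.3}$$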
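The two update rules of the leap-frog scheme can be sketched in a few lines of Python. This is only a minimal one-dimensional illustration, not the integrator of any particular MD package: the harmonic test force, the function names and the start-up half step are assumptions introduced here.

```python
import math


def leapfrog_step(r, v_half, force, m, dt):
    """One leap-frog step: r is the position at t, v_half the velocity at t - dt/2."""
    v_new = v_half + dt * force(r) / m  # eq. 2.3: velocity at t + dt/2
    r_new = r + dt * v_new              # eq. 2.2: position at t + dt
    return r_new, v_new


def harmonic_force(r, k=1.0):
    """Hypothetical test force f = -k r (not part of the original text)."""
    return -k * r


def run(n_steps=1000, dt=0.01, r0=1.0, v0=0.0, m=1.0):
    """Integrate a 1D harmonic oscillator and return (r, v_half) after n_steps."""
    # Assumed start-up: shift v(0) back by half a step to obtain v(-dt/2).
    v_half = v0 - 0.5 * dt * harmonic_force(r0) / m
    r = r0
    for _ in range(n_steps):
        r, v_half = leapfrog_step(r, v_half, harmonic_force, m, dt)
    return r, v_half
```

For this test force the exact trajectory is $r(t) = \cos t$, so the result of `run()` can be compared with `math.cos(n_steps * dt)` to check the second-order accuracy.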
The more sophisticated Gear predictor-corrector algorithm follows the general finite-difference pattern, in which estimates of the positions, velocities, etc. at time $t+\Delta t$ are obtained by Taylor expansion about time $t$. These predicted values are only estimates and do not represent the true trajectory. After calculating the forces at the predicted position $\mathbf{r}^p(t+\Delta t)$, the trajectory is corrected: the new information is fed back into the predicted step, and the corrected position $\mathbf{r}^c(t+\Delta t)$ is a better approximation to the true position.
2.2 Atomistic Models
As described in the previous section, the key point of MD simulations is to solve Newton's equations of motion. Usually, however, the systems are defined by the potential energy rather than by the forces, which are then easily calculated as the negative gradient of the potential: $\mathbf{F}(\mathbf{r}_{ij}) = -\nabla V(\mathbf{r}_{ij})$. For that purpose, many potential energy functions, the so-called force fields (FF) [35], were developed to simulate protein systems. The basic idea of an FF is to map all the possible physical interactions in the system and to put them into a potential, as presented in eqs. 2.4 to 2.9:
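$$V = V_{noncov} + V_{cov} = (V_{LJ} + V_{C}) + (V_{bond} + V_{bend} + V_{dih}) \tag{2.4}$$
$$V_{LJ}(r_{ij}) = 4\epsilon_{ij}\left[ C^{(12)}_{ij}\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - C^{(6)}_{ij}\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6} \right] \tag{2.5}$$
$$V_{C}(r_{ij}) = \frac{1}{4\pi\epsilon_0}\,\frac{q_i q_j}{\epsilon_r r_{ij}} \tag{2.6}$$
$$V_{bond}(r_{ij}) = \frac{1}{2} k^{(bond)}_{ij} \left( r_{ij} - b_{ij} \right)^2 \tag{2.7}$$
$$V_{bend}(\theta_{ijk}) = \frac{1}{2} k^{bend}_{ijk} \left( \theta_{ijk} - \theta^{0}_{ijk} \right)^2 \tag{2.8}$$
$$V_{dih}(\phi_{ijkl}) = \frac{1}{2}\left[ C_1(1+\cos\phi) + C_2(1-\cos 2\phi) + C_3(1+\cos 3\phi) + C_4(1-\cos 4\phi) \right] \tag{2.9}$$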
where the potential is divided into bonded (or covalent) and non-bonded (non-covalent) parts. The non-bonded interactions contain a repulsion term, a dispersion term and a Coulomb term. The repulsion and dispersion terms are combined in the Lennard-Jones (or 6-12) interaction. In addition, (partially) charged atoms interact through the Coulomb term. Bonded interactions are based on a fixed list of atoms. They are not exclusively pair interactions, but include 3- and 4-body interactions as well. There are bond stretching (2-body), bond angle (3-body) and dihedral angle (4-body) interactions, given by eqs. 2.7, 2.8 and 2.9, respectively.
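As an illustration of how the individual terms of eqs. 2.5 to 2.9 are evaluated, the following Python sketch implements each of them as a plain function. The function names, the default coefficients and the numerical value used for $1/(4\pi\epsilon_0)$ (approximately 138.935 kJ mol$^{-1}$ nm e$^{-2}$) are assumptions made for this example and are not taken from any specific force field.

```python
import math

# Approximate electric conversion factor 1/(4 pi eps0) in kJ mol^-1 nm e^-2
# (an assumed unit system for this sketch).
KE = 138.935


def v_lj(r, eps, sigma, c12=1.0, c6=1.0):
    """Lennard-Jones term, eq. 2.5."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (c12 * sr6 * sr6 - c6 * sr6)


def v_coulomb(r, qi, qj, eps_r=1.0):
    """Coulomb term, eq. 2.6."""
    return KE * qi * qj / (eps_r * r)


def v_bond(r, k, b):
    """Harmonic bond stretching, eq. 2.7."""
    return 0.5 * k * (r - b) ** 2


def v_bend(theta, k, theta0):
    """Harmonic bond-angle bending, eq. 2.8."""
    return 0.5 * k * (theta - theta0) ** 2


def v_dih(phi, c1, c2, c3, c4):
    """Dihedral term, eq. 2.9 (phi in radians)."""
    return 0.5 * (c1 * (1 + math.cos(phi)) + c2 * (1 - math.cos(2 * phi))
                  + c3 * (1 + math.cos(3 * phi)) + c4 * (1 - math.cos(4 * phi)))
```

The total potential of eq. 2.4 is then the sum of these terms over the corresponding lists of pairs, triplets and quadruplets of atoms.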
Many force fields are available nowadays; the most common are AMBER [36], CHARMM [37], GROMOS [38] and OPLS-AA [39]. Their potential energy functions are parametrized against experiments and ab initio quantum mechanical calculations.
2.2.1 Setup of the atomistic simulations
Atomistic simulations can reveal many details of the system, as they treat both the protein under consideration and the water. However, it is very difficult to reach long time and length scales within this framework. In this work, the atomistic simulations were carried out only to extract the parameters needed to fit the coarse-grained (CG) model. A 30-residue collagen-like sequence was also constructed for comparison with the CG model, but a simulation of the whole collagen-like sequence would not be affordable with the computational resources currently available. To extract the required properties of the degrees of freedom of the amino acid sequence, we divided the collagen-like protein into short peptides and simulated them separately to achieve a good sampling of the phase space.
The first part of the work consists of dividing the collagen-like sequence into small blocks of 8 to 12 amino acids (see table 2.1). By checking the periodicity of the sequence and looking for blocks that repeat often, we minimized the number of short sequences and, consequently, the computational effort and the amount of data to be analyzed. We ended up with 12 unit blocks that together can reconstruct the whole sequence. All the sequences were constructed using the MOLMOL [40] software. The 12 blocks are labeled in such a way that the original sequence (fig. 2.1) can be reconstructed by simply joining the short peptides in the following order: A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A2 A3 A4 A5 A6 A7 A11 A12.
Figure 2.1: The whole collagen sequence defined by the 20-letter amino acid code. Different colors relate to each one of the 12 short peptides.
The peptides (including the 30-residue one) were solvated with water molecules in a rhombic dodecahedral periodic box. Details on the number of atoms in the peptides, the number of water molecules and the box sizes can be found in table 2.1. The presence of lysine (K) and glutamine (Q) induces a positive net charge, as these amino acids have a cationic form at low pH [41]. To neutralize the system, some water molecules were randomly selected and replaced by a
Cl$^-$ [...] Å and the minimum image convention was used for the short-range non-bonded interaction terms.
Energy was minimized with the conjugate gradient algorithm [46], which uses the gradient information from previous steps to bring the system very close to the local minimum. After energy minimization, a short run with position restraints on the protein was performed to let the solvent soak in around the peptide. The sample was then equilibrated for 1 ns at a constant pressure of 1 bar using the Parrinello-Rahman barostat [47, 48] with a coupling constant of 1 ps, and at a temperature of 298 K using the Nosé-Hoover thermostat [49, 50] with a coupling constant of 0.2 ps. After this straightforward procedure [52], all simulations were performed in the NVT ensemble at room pressure and a temperature of 298 K, with a box size that depends on the short peptide and is listed in table 2.1.
Label   Sequence   Volume (nm$^3$)   Number of atoms: Protein   Solvent   Ions (Cl$^-$)
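$$\sum_{\theta} \frac{1}{2} k_\theta (\theta - \theta_0)^2 + \sum_{\phi} \left[ A(1+\cos\phi) + B(1-\cos\phi) + C(1+\cos 3\phi) + D\left(1+\cos\left(\phi + \frac{\pi}{4}\right)\right) \right] + \sum_{i,\,j \ge i+3} 4\epsilon_H S_1 \left[ \left(\frac{\sigma}{r_{ij}}\right)^{12} - S_2 \left(\frac{\sigma}{r_{ij}}\right)^{6} \right] \tag{2.10}$$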
where $\theta$ is the bond angle between three consecutive beads, with $\theta_0 = 105^\circ$ and $k_\theta = 20\,\epsilon_H/\mathrm{rad}^2$. The dihedral angle $\phi$ between four consecutive beads can assume different conformations depending on the region to be described, with the constants A, B, C and D defining the shape of the distribution. The LJ potential determines the attraction-repulsion between beads of size $\sigma$ of the three flavors, with beads $i$ and $j$ separated by $r_{ij}$: B-B interactions are attractive and represented by $S_1 = S_2 = 1$; $S_1 = 1/3$ and $S_2 = 0$ apply for L-L and L-B interactions; and N-L, N-B and N-N interactions have the constants $S_1 = 1$ and $S_2 = 0$. In the original HG model the bond lengths are constrained by the RATTLE algorithm [54]. The non-bonded potentials are plotted in the figure below:
Figure 2.3: Non-bonded potential between neutral, hydrophilic, hydrophobic and proline (treated as
neutral) amino acids.
2.4 Adapted HG model
Based on the original Head-Gordon model and on the version adapted for the silk part [14] of the block copolymer, we developed a four-letter minimalist model for the collagen-like block: hydrophilic (L), hydrophobic (B), neutral (N) and proline (P), where we defined proline as a separate flavour due to its key role in the stiffness of the dihedral angles. Table 2.2 below shows the sequence mapping between the 20-letter amino acid code and the adapted CG four-letter code, which generates the minimalist full sequence shown in fig. 2.4.
Figure 2.4: Full collagen sequence transcription from the 20-letter amino acid code to the adapted four-letter minimalist HG code based on table 2.2.
Name            20-letter code   4-letter code
Glycine         GLY / G          N
Alanine         ALA / A          B
Glutamic acid   GLU / E          L
Lysine          LYS / K          L
Asparagine      ASN / N          L
Proline         PRO / P          P
Glutamine       GLN / Q          L
Serine          SER / S          N
Table 2.2: Sequence mapping between the 20-letter amino acid code and the adapted CG four-letter code.
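The mapping of table 2.2 amounts to a simple lookup. The Python sketch below transcribes a one-letter amino acid string into the four-letter code and lists the consecutive four-bead windows that define the dihedral angles. The function names and the short example fragment are illustrative assumptions; residues not listed in table 2.2 are rejected rather than guessed.

```python
# One-letter amino acid code -> four-letter CG flavour, following table 2.2.
AA_TO_CG = {
    "G": "N",  # glycine: neutral
    "A": "B",  # alanine: hydrophobic
    "E": "L",  # glutamic acid: hydrophilic
    "K": "L",  # lysine: hydrophilic
    "N": "L",  # asparagine: hydrophilic
    "P": "P",  # proline: its own flavour
    "Q": "L",  # glutamine: hydrophilic
    "S": "N",  # serine: neutral
}


def to_four_letter(sequence):
    """Transcribe a one-letter amino acid sequence into the CG code."""
    cg = []
    for aa in sequence.upper():
        if aa not in AA_TO_CG:
            raise ValueError(f"residue {aa!r} is not covered by table 2.2")
        cg.append(AA_TO_CG[aa])
    return "".join(cg)


def dihedral_quadruplets(cg_sequence):
    """Return all windows of four consecutive beads (one per dihedral angle)."""
    return [cg_sequence[i:i + 4] for i in range(len(cg_sequence) - 3)]
```

For example, the fragment `GNEGQP` is transcribed to `NLLNLP`, which contains the dihedral sequences `NLLN`, `LLNL` and `LNLP`.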
The FF also needs to be modified to account for the changes in the dihedral angles, the bond distances (which are no longer constrained) and the non-local interactions between beads. The new FF is given by eqs. 2.11, 2.12 and 2.13 below:
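$$H_{adap} = \sum_{b} \frac{1}{2} k_b (b - b_0)^2 + \sum_{\theta} \frac{1}{2} k_\theta (\theta - \theta_0)^2 + \sum_{\phi} \left[ V_{weak}(\phi) + V_{stiff}(\phi) \right] + \sum_{i,\,j \ge i+3} 4\epsilon_H S_1 \left[ \left(\frac{\sigma_0}{r_{ij}}\right)^{12} - S_2 \left(\frac{\sigma_0}{r_{ij}}\right)^{6} \right] \tag{2.11}$$
$$V_{weak}(\phi) = \epsilon_h \sum_{k=0}^{6} \left[ A_k \cos^k(\phi) \right] \tag{2.12}$$
$$V_{stiff}(\phi) = B_0 - \frac{1}{2} \epsilon_h B_1 (\phi - \phi_0)^2 \tag{2.13}$$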
where the bond distances are now explicitly described by a spring potential, since the CM3D package used for the CG simulations employs a reversible multiple time-step integrator. The stiffness is given by $k_b = 33\,\epsilon_h$ and $b_0 = 3.84$ Å, with $k_\theta = 20\,\epsilon_h$ and $\theta_0 = 105^\circ$
$$\mathrm{RMSD} = \left[ \frac{1}{M} \sum_{i=1}^{N} m_i \left| \mathbf{r}_i - \mathbf{r}_i^{ref} \right|^2 \right]^{1/2} \tag{2.14}$$
The radius of gyration is a second-order moment about the mean chain position. It describes the overall spread of the molecule and is defined as the root mean square distance of the collection of atoms from their common centre of mass:
$$R_g = \left[ \frac{1}{M} \sum_{i=1}^{N} m_i \left| \mathbf{r}_i - \mathbf{r}_{cm} \right|^2 \right]^{1/2} \tag{2.15}$$
where $\mathbf{r}_{cm}$ denotes the position of the center of mass of the protein. This measure gives a valuable way to compare our CG method with the experimental data available for the collagen-like system.
The root mean square fluctuation (RMSF) is a measure of the deviation between the position of particle $i$ and some reference position:
$$\mathrm{RMSF} = \sqrt{ \frac{1}{T} \sum_{t_j=1}^{T} \left( \mathbf{r}_i(t_j) - \tilde{\mathbf{r}}_i \right)^2 } \tag{2.16}$$
where $T$ is the time over which one wants to average and $\tilde{\mathbf{r}}_i$ is the reference position of particle $i$. Typically this reference position is the time-averaged position of the same particle $i$, i.e. $\bar{\mathbf{r}}_i$. Note that, instead of averaging over the particles (as in RMSD), RMSF averages over the simulation time, giving a value for each particle $i$, usually the Cα atoms.
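The three order parameters defined above can be computed directly from lists of Cartesian coordinates. The sketch below uses only the Python standard library. It assumes that the structures have already been superimposed on the reference (no fitting step is included) and that the RMSF reference is the time-averaged position; all function names are introduced here for illustration.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def center_of_mass(coords, masses):
    """Mass-weighted mean position."""
    total = sum(masses)
    return [sum(m * r[k] for m, r in zip(masses, coords)) / total for k in range(3)]


def rmsd(coords, ref, masses):
    """Mass-weighted RMSD with respect to a reference structure, eq. 2.14."""
    total = sum(masses)
    return math.sqrt(sum(m * sq_dist(r, q) for m, r, q in zip(masses, coords, ref)) / total)


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration, eq. 2.15."""
    cm = center_of_mass(coords, masses)
    total = sum(masses)
    return math.sqrt(sum(m * sq_dist(r, cm) for m, r in zip(masses, coords)) / total)


def rmsf(trajectory):
    """Per-particle RMSF about the time-averaged position, eq. 2.16.

    trajectory[t][i] is the (x, y, z) position of particle i in frame t.
    """
    n_frames = len(trajectory)
    n_particles = len(trajectory[0])
    values = []
    for i in range(n_particles):
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        msf = sum(sq_dist(frame[i], mean) for frame in trajectory) / n_frames
        values.append(math.sqrt(msf))
    return values
```

Note that `rmsd` and `radius_of_gyration` return one number per frame, whereas `rmsf` returns one number per particle, which mirrors the distinction drawn in the text.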
3 Results
The results of the MD simulations described in chapter 2 are presented here. We analyse the bond distances, the bend and dihedral angles and the order parameters obtained from the atomistic simulations and use them to fit the adapted CG model for the collagen-like block copolymer. We then present the improved CG model and compare its results with the atomistic simulations. Lastly, we summarize the results obtained from the adapted model by analysing the order parameters.
3.1 Atomistic Simulations
Atomistic simulations of the short peptides provided enough information about the bond, bend and dihedral distributions, while the 30-residue collagen-like simulation revealed several details about the dynamics of the protein. The bond distance distributions (see fig. 3.1 (left)), strongly peaked around 3.84 ± 0.12 Å, the distance between the Cα atoms of subsequent amino acids, justify the use of a stiff harmonic potential for the bonds. Also based on the distributions calculated from the atomistic simulations, the rather narrow flexibility of the bend angles (see fig. 3.1 (right)) justifies the same treatment as in the original HG model. The dihedral angles between four subsequent Cα atoms show either periodic behaviour with two minima (flexible) or harmonic behaviour with one minimum (stiff), which led us to extend the model so that these details can be taken into account.
Figure 3.1: Bond distances (left) and bend angles (right) distributions obtained from an all-atom
simulation of the short peptide A1.
The dihedral distributions shown in fig. 3.2 provide three examples of rather different potentials, which were fitted either with a cosine expansion or with a harmonic potential, depending on the position of proline in the dihedral angle.
Figure 3.2: Negative logarithm of the dihedral distributions of the sequences LNPL (left), PNLP (center) and LLNL (right), obtained from all-atom simulations of the short peptides, in the minimalist four-letter description. The distribution changes markedly with the relative position of the proline amino acid in the sequence.
After obtaining all the required parameters for the CG model, we present the results for the 30-residue peptide, which was simulated for 60 ns to provide a reference system for comparison with the new minimalist model. This sequence was simulated following the same procedure adopted for the short pieces of the collagen-like sequence. We chose an intermediate sequence, GNEGQPGQPGQNGQPGEPGSNGPQGSQGNP, to sample as many different amino acids as possible, and calculated the order parameters RMSD, $R_g$ and RMSF (see fig. 3.3).
Figure 3.3: RMSD (left), $R_g$ (center) and RMSF (right) calculated for a 60 ns simulation of the 30-residue sequence. The RMSD reaches a plateau after 45 ns, and $R_g$ remains flat during the whole simulation. The x-axis of the RMSF plot indexes the Cα atoms; none of the atoms finds a markedly more stable position than the others.
3.2 Development of the adapted CG model
We fitted the dihedral potentials according to the distributions generated by the 10 ns simulations of the short peptides A1-A12. An effective potential can be extracted by taking the negative logarithm of the average distribution, $V(\phi) = -\ln P(\phi)$. In the atomistic simulations we found distributions rather different from those of the adapted HG model for the silk part, but all of them could be fitted either with a cosine expansion (weak) or with a parabolic function (stiff). The parameters $A_k$, $B$ and $C$ assume a wide range of values owing to the specific characteristics of each dihedral distribution; they were chosen to reproduce the measured dihedral angles of the short peptides while remaining consistent with the average dihedral angles in the collagen-like sequence. Table 3.1 summarizes the characteristics of all possible dihedral angles between four beads in the collagen-like sequence, with their corresponding fitting functions.
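The extraction of an effective potential from a sampled distribution, $V(\phi) = -\ln P(\phi)$, can be sketched as follows. This is a minimal illustration, assuming dihedral angles given in degrees in the range $[-180, 180)$, a uniform histogram, and $P$ taken as the probability per bin; the function name and the bin count are choices made for this example, not the analysis settings used in this work.

```python
import math


def dihedral_potential(angles_deg, n_bins=36):
    """Return (bin_centers, potential) with potential = -ln P for populated bins.

    P is the fraction of samples in each bin, so the potential is in units of kT
    and defined only up to an additive constant.
    """
    width = 360.0 / n_bins
    counts = [0] * n_bins
    for angle in angles_deg:
        # Shift to [0, 360) so that the bin index is non-negative.
        index = int(((angle + 180.0) % 360.0) // width)
        counts[index] += 1
    total = len(angles_deg)
    centers, potential = [], []
    for i, count in enumerate(counts):
        if count == 0:
            continue  # -ln(0) is undefined: unsampled bins are skipped
        centers.append(-180.0 + (i + 0.5) * width)
        potential.append(-math.log(count / total))
    return centers, potential
```

The resulting points can then be fitted with a cosine expansion or a parabola, depending on the shape of the distribution.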
We treat proline as neutral for the purposes of the non-bonded interactions. Alanine is treated as hydrophobic ($S_1 = S_2 = 1$), with an attractive potential. Glycine and serine are treated as neutral (repulsive) beads and interact with the hydrophobic and hydrophilic beads through the same potential ($S_1 = 1$; $S_2 = 0$). The lysine, asparagine, glutamine and glutamic acid residues are treated as hydrophilic (repulsive) ($S_1 = 1/3$; $S_2 = -1$), and the same constants apply when they interact with hydrophobic beads.
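The selection rules for the non-bonded constants can be written as a small lookup. The sketch below assumes that hydrophilic pairs use $S_2 = -1$, which makes them purely repulsive as described, and uses reduced units ($\epsilon_H = \sigma = 1$) by default; the function names are hypothetical.

```python
def lj_constants(a, b):
    """Return (S1, S2) for two bead flavours out of 'L', 'B', 'N', 'P'."""
    # Proline is treated as neutral for the non-bonded interactions.
    a = "N" if a == "P" else a
    b = "N" if b == "P" else b
    if "N" in (a, b):
        return 1.0, 0.0          # any pair involving a neutral bead: repulsive
    if a == "B" and b == "B":
        return 1.0, 1.0          # hydrophobic-hydrophobic: attractive
    return 1.0 / 3.0, -1.0       # L-L and L-B pairs: repulsive (assumed S2 = -1)


def v_nonbonded(r, a, b, eps_h=1.0, sigma=1.0):
    """Non-bonded pair energy 4 eps_H S1 [(sigma/r)^12 - S2 (sigma/r)^6]."""
    s1, s2 = lj_constants(a, b)
    sr6 = (sigma / r) ** 6
    return 4.0 * eps_h * s1 * (sr6 * sr6 - s2 * sr6)
```

With these constants only the B-B pair has an attractive well, of depth $\epsilon_H$ at $r = 2^{1/6}\sigma$.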
To summarize, based on all-atom simulations of short peptides in solution, we have optimized the CG model to reproduce the structural properties of the collagen-like sequence. Our model is given by eqs. 2.11, 2.12 and 2.13. In fig. 3.4 we compare the dihedral distributions from the atomistic simulations and the fitting functions with those from a 30-residue CG simulation of a piece of the collagen-like sequence. The harmonic function for the LNPL sequence (fig. 3.4 (left)) agrees well with the atomistic model, as expected given the strong confinement of the potential. The weak potential represented by PNLP (fig. 3.4 (center)), with proline on the flanks, also follows the target potential closely. The cosine fit of the LLNL sequence (fig. 3.4 (right)) deviates more, possibly because of steric hindrance or because the phase space was not sampled enough, but this did not interfere with the final behaviour of the collagen-like block, as will be shown in the final results. Indeed, the results for the radius of gyration are already in good agreement with the theory, showing that even a rough fit of some dihedral distributions does not interfere with the mesoscopic result, as proline governs the dynamics with its stiff potential. A list of all the dihedral distributions with their fitting parameters is presented in appendix A.
Figure 3.4: Comparison between the atomistic short-peptide simulations and the 30-residue CG simulations for the negative logarithm of the dihedral distributions of the sequences LNPL (left), PNLP (center) and LLNL (right). The fits for the PNLP and LNPL sequences are good, but the agreement for the LLNL potential is poorer, possibly because of steric hindrance or because the phase space was not sampled enough.
3.3 Analysis of the collagen-like block
The analysis of all the dihedral angles in the atomistic simulations showed a strong sequence dependence of the distributions, which led us to adapt the original Head-Gordon model to achieve a more accurate description of our collagen-like proteins. From our simulations, we concluded that the high concentration of proline spread randomly through the sequence plays an important role in the stiffness of the dihedral angles between four consecutive Cα atoms, and therefore proline must be taken into account as a separate flavour. We thus characterized the dihedral distributions using four flavours: hydrophilic (L), hydrophobic (B), neutral (N) and proline (P). The relation between the amino acids present in the collagen-like block and their four-letter minimalist codes is given in table 2.2.
After taking the negative logarithm of the dihedral distributions, we fitted them either with a cosine expansion ($\sum_{k=0}^{6} A_k \cos^k(\phi)$) or with a parabolic function ($B_0 - \frac{1}{2} B_1 (\phi - \phi_0)^2$), according to the position of the proline. The observations on the dihedral stiffness are summarized in the table below, where the distributions are divided into groups according to their main characteristics. The first group, in which proline is not present, shows a flexible and smooth logarithmic distribution of the dihedral angles. The second group has a proline at the third position, which makes the dihedral angles stiff; the logarithmic distribution is therefore very stiff, with one minimum. The third group has a proline at the second position and, despite its stiffness, can still be fitted by a cosine expansion. The fourth and last group has proline on one or both flanks, which makes the dihedral angles very flexible, so the distribution is smooth.
Sequence                        Characteristics               Fitting function
NNLN, LLNL, NLNB, BNLN, NLLN    flexible and smooth           Cosine
NNPN, LNPL, NNPL, NBPN          very stiff with one minimum   Parabola
BPNL, LPNL, LPNN                flexible                      Cosine
NLNP, LLNP, PNLP, PBNL, LNLP    very flexible                 Cosine
Table 3.1: List of dihedral sequences and their fitting functions in the minimalist four-letter model. The sequences fall into four categories, according to the exact position of proline in the chain.
From table 3.1 and fig. 3.4 above, we can draw some conclusions about the role that proline plays in the dihedral distributions:
1. Proline makes the dihedral angles stiffer when it is at the second or third position from the first α-carbon, as can easily be seen in fig. 3.4 (left).
2. Without proline (see fig. 3.4 (right)), the distribution has a periodic behaviour that can easily be fitted with a cosine expansion.
3. When proline is positioned on the flanks (see fig. 3.4 (center)), the periodic distribution of dihedral angles is in addition very smooth and is easily fitted with a weak potential in a cosine expansion.
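The grouping of table 3.1 can be expressed as a rule on the position of proline within the four-bead window. The following Python sketch reproduces the four categories for the sequences listed in the table. The order in which the positions are tested (third, then second, then flanks) is an assumption for windows containing more than one inner proline, which do not occur in the table.

```python
def classify_dihedral(seq):
    """Return (characteristics, fitting_function) for a four-letter dihedral sequence."""
    if len(seq) != 4:
        raise ValueError("a dihedral sequence has exactly four beads")
    if seq[2] == "P":
        return "very stiff with one minimum", "Parabola"
    if seq[1] == "P":
        return "flexible", "Cosine"
    if seq[0] == "P" or seq[3] == "P":
        return "very flexible", "Cosine"
    return "flexible and smooth", "Cosine"
```

For instance, `classify_dihedral("LNPL")` selects the parabolic fit, while `classify_dihedral("PNLP")` selects the weak cosine expansion.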
The radius of gyration was calculated for the 30-residue sequence in the adapted HG model (see fig. 3.5 (left)). The two snapshots presented in fig. 3.5 (right) show that the collagen-like block changes its conformation in a very flexible way, going from an almost completely extended state to collapsed conformations, in agreement with the experimental observations [11].
Figure 3.5: Radius of gyration for the 30-residue sequence calculated from the adapted HG model (left) and snapshots (right) of the 30-residue minimalist sequence at 300 K, showing the extended and collapsed (bottom left) conformations reached by the collagen-like block in the 50 ns simulations. The red beads represent proline, the green beads are hydrophilic and the blue beads are neutral.
3.4 Dependence of $R_g$ on the number of residues
We then calculated the radius of gyration from the output of CG simulations run in CM3D for many collagen sizes and plotted $R_g$ as a function of the number of residues on a logarithmic scale in fig. 3.6. There is very good agreement with the experimental value for 400 residues within the statistical error, which validates the adapted HG model for the collagen part of the block copolymer. From the slope of the curve, the dependence of the radius of gyration on the number of residues is $R_g = 1.391\,N^{0.528}$, in good agreement with Flory's exponent (0.583) [62] in a good solvent (which means that the particles effectively repel each other), where $N$ denotes the number of residues.
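A scaling law of the form $R_g = a\,N^{\nu}$ becomes a straight line on a log-log plot, so the prefactor and the exponent follow from a linear least-squares fit of $\ln R_g$ against $\ln N$. The sketch below shows this fit with the Python standard library only. The function name is hypothetical, and any input used to try it out should be understood as synthetic test data, not the simulation results of this work.

```python
import math


def fit_power_law(n_values, rg_values):
    """Fit R_g = a * N**nu by linear regression in log-log space; return (a, nu)."""
    xs = [math.log(n) for n in n_values]
    ys = [math.log(rg) for rg in rg_values]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    nu = sxy / sxx                        # slope of the log-log line
    a = math.exp(mean_y - nu * mean_x)    # intercept gives the prefactor
    return a, nu
```

Feeding the function with points generated from a known power law returns the same prefactor and exponent, which is a simple way to check the fitting procedure before applying it to simulation output.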
Figure 3.6: Log-log dependence of the radius of gyration (in nm) on the number of residues in the collagen-like sequence. The simulation values are plotted together with the experimental result for the 400-residue sequence and fitted with a linear function.
4 Conclusions and Future Perspectives
We conclude that the adapted HG model developed for the collagen-like protein can predict the experimentally observed value of the order parameter. This confirms, as stated at the beginning, that the high concentration of proline and the charged/hydrophilic residues in the sequence play an important role in preventing the structure from folding into any specific state, so that it retains its randomness. The radius of gyration is in very good agreement with experiments, follows a power-law dependence on the number of residues (linear on a logarithmic scale) and provides a good value for the Flory exponent.
In the future, this adapted collagen-like CG model will be combined with the adapted CG model previously developed for the silk part and applied to the whole collagen-silk-collagen block copolymer. This will enable us to study the effect of the collagen-like block on the folding and self-assembly of the silk-like block.
Acknowledgements
Many people contributed to the accomplishment of this work. I pay special attention here to some of them, not necessarily the most important, but the ones who were essential, at precise moments, to consolidate this achievement.
First of all I thank God, for having given me the ability to learn and understand, always supplying me with vitality to face the challenges and keep achieving my goals, never allowing me to surrender, but instead keeping me humble.
I thank my supervisor Peter Bolhuis for giving me the opportunity to start this MSc project at the University of Amsterdam. I also thank Marieke Schor, who co-supervised me during the project and made my life much easier with her expertise on protein folding and coarse-grained simulation. I also thank the Molsim group, Bernd, Francesco, Anna, Grisell, Murat, Wolfgang, Rosanne and Zerihum, my friend, with whom I had a great time in the course of this project. I also acknowledge the Sara cluster for the computer power provided for the simulations.
I also thank my friends in Amsterdam, Pedro, Dimas, Max, Igor, Raquel, Vinicius, Girry, Anthony, Adrien and many other students; thank you all for the great time we had here in Amsterdam. I thank my friends in Lyon, Roberto, Diego, Rodrigo, Franck, Dorian, Jakub, Alex, Aion and Jana, who made that short stay in France one of the best periods of my life.
I would like to thank the coordinators of the AtoSim Programme for accepting my application to this course, and Erasmus Mundus for the scholarship.
Finally, I thank all those who contributed directly or indirectly to the accomplishment of this project. Thank you very much!
APPENDIX A -- Dihedral Angles
Here we present all the dihedral angle distributions analyzed in the collagen-like sequence, together with the fitting parameters. Some of them show almost the same behaviour and can be grouped into four categories defined by the position of the proline. These groups are shown in table 3.1. All the sequences are listed in the following figures, with the fitting parameters in the captions.
Figure A.1: The fitting coefficients are NNLN: $A_0=-3.34$, $A_1=1.51$, $A_2=2.19$, $A_3=-1.22$, $A_4=-3.34$, $A_5=-0.69$; LLNL: $A_0=-3.31$, $A_1=-1.99$, $A_2=-0.64$, $A_3=2.17$, $A_4=-0.14$, $A_5=-0.69$; NLLN: $A_0=-4.25$, $A_1=0.69$, $A_2=2.27$, $A_3=-0.29$, $A_4=-1.32$, $A_5=0.36$.
Figure A.2: The fitting coefficients are BNNL: $A_0=-3.98$, $A_1=0.63$, $A_2=-0.42$, $A_3=-1.98$, $A_4=0.70$, $A_5=0.89$; NLNB: $A_0=-1.61$, $A_1=-0.33$, $A_2=-4.59$, $A_3=2.08$, $A_4=1.73$, $A_5=-0.81$; PBNL: $A_0=-2.68$, $A_1=0.08$, $A_2=1.72$, $A_3=0.55$, $A_4=0.33$, $A_5=-0.11$.
Figure A.3: The fitting coefficients are LNPL - NNPL: $B_0=-5.81$, $B_1=0.051$, $\phi_0=-50.24$; NNPN: $B_0=-6.73$, $B_1=0.23$, $\phi_0=-109.86$; NBPN: $B_0=-6.54$, $B_1=0.28$, $\phi_0=-115.27$.
Figure A.4: The fitting coefficients are BPNL: $A_0=-2.34$, $A_1=-2.73$, $A_2=-1.08$, $A_3=10.75$, $A_4=-12.81$, $A_5=5.07$; LPNL - LPNN: $A_0=-2.33$, $A_1=-0.23$, $A_2=-2.84$, $A_3=-0.02$; LNLP: $A_0=-3.73$, $A_1=-2.24$, $A_2=1.02$, $A_3=1.02$, $A_4=-1.41$, $A_5=2.03$.
Figure A.5: The fitting coefficients are NLNP: $A_0=-4.59$, $A_1=-1.02$, $A_2=2.63$, $A_3=-0.18$, $A_4=-0.43$, $A_5=0.91$; LLNP: $A_0=-1.61$, $A_1=-0.33$, $A_2=-4.59$, $A_3=2.08$, $A_4=1.73$, $A_5=-0.81$; PNLP: $A_0=-3.72$, $A_1=-2.25$, $A_2=1.02$, $A_3=1.01$, $A_4=-1.41$, $A_5=2.03$.
Bibliography
[1] I. W. Lyo, P. Avouris, Field-Induced Nanometer-Scale to Atomic-Scale Manipulation of Silicon
Surfaces with the Stm. Science 253 173 (1991).
[2] J. Cappello, J. Crissman, M. Dorman, M. Mikolajczak, G. Textor, M. Marquet and F. Ferrari,
Genetic Engineering of Structural Protein Polymers. Biotechnol. Prog. 6, 198 (1990).
[3] M. Haider, Z. Megeed and H. Ghandehari, Genetically engineered polymers: status and
prospects for controlled release. J. Control. Rel. 95, 1 (2004).
[4] R. Langer and D. A. Tirrell Designing materials for biology and medicine. Nature, 428, 487
(2004).
[5] G. A. Silva, C. Czeisler, K. L. Niece, E. Beniash, D. A. Harrington, J. A. Kessler and S. I. Stupp, Selective Differentiation of Neural Progenitor Cells by High-Epitope Density Nanofibers. Science, 303, 1352 (2004).
[6] D.W. Urry, Elastic molecular machines in metabolism and soft-tissue restoration. Trends
Biotechnol. 17, 249 (1999).
[7] D. A. Harrington, E. Y. Cheng, M. O. Guler, L. K. Lee, J. L. Donovan, R. C. Claussen, S. I. Stupp
Branched peptide-amphiphiles as self-assembling coatings for tissue engineering scaffolds. J.
Biom. Mat. Res. A, 78A, 157 (2006).
[8] J. Cappello, H. Ghandehari, Engineered Protein Polymers for Drug Delivery and Biomedical
Applications. Adv. Drug Deliv. Rev. 54, 1053 (2002).
[9] D. Chitkara, A. Shikanov, N. Kumar, A. J. Domb, Biodegradable Injectable In Situ Depot-
Forming Drug Delivery Systems. Macromol. Biosc. 6, 977 (2006).
[10] C. Park, J. Yoon and E. L. Thomas. Enabling nanotechnology with self assembled block copolymer patterns. Polymer 44, 6725 (2003).
[11] A. A. Martens, Silk-Collagen-like Block Copolymers with Charged Blocks, self-assembly into
nanosized ribbons and macroscopic gels. PhD Thesis, Wageningen Universiteit, The Nether-
lands (2008).
[12] M. W. T. Werten, W. H. Wisselink, T. J. J. van den Bosch, E. C. de Bruin and F. A. de Wolf,
Secreted production of a custom-designed, highly hydrophilic gelatin in Pichia pastoris. Protein
Engineering 14, 447 (2001).
[13] M. T. Krejchi, E. D. T. Atkins, A. J. Waddon, M. J. Fournier, T. L. Mason and D. A. Tirrell,
Chemical Sequence Control Of Sheet Assembly In Macromolecular Crystals Of Periodic
Polypeptides. Science 265, 1427 (1994).
[14] M. Schor, B. Ensing and P. G. Bolhuis, A simple coarse-grained model for self-assembling silk-like protein fibers. Soft Matter 5, 2658 (2009). DOI: 10.1039/b902952d
[15] M. W. T. Werten, T. J. van den Bosch, R. D. Wind, H. Mooibroek and F. A. de Wolf High-yield
secretion of recombinant gelatins by Pichia pastoris. Yeast 15, 1087 (1999).
[16] A. A. Martens, G. Portale, M. W. T. Werten, R. J. de Vries, G. Eggink, M. A. C. Stuart and F. A.
de Wolf, Triblock Protein Copolymers Forming Supramolecular Nanotapes and pH-Responsive
Gels. Macromol. 42 1002 (2009).
[17] M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids, Oxford University Press.
[18] D. Frenkel and B. Smit, Understanding Molecular Simulation - From Algorithms to Applications, Academic Press.
[19] D. Bhella, A. Ralph and R. P. Yeo, Conformational Flexibility in Recombinant Measles Virus
Nucleocapsids Visualised by Cryo-negative Stain Electron Microscopy and Real-space Helical
Reconstruction. J. Mol. Biol. 340, 319 (2004).
[20] M. Levitt, A simplified representation of protein conformations for rapid simulation of protein folding. J. Mol. Biol. 104, 59 (1976).
[21] M. M. Tirion, Large amplitude elastic motions in proteins from a single-parameter, atomic analysis. Phys. Rev. Lett. 77, 1905 (1996).
[22] S. Kundu, R. L. Jernigan, Molecular mechanism of domain swapping in proteins: an analysis of slower motions. Biophys. J. 86, 3846 (2004).
[23] Y. Ueda, H. Taketomi and N. Go, Studies on protein folding, unfolding, and fluctuations by computer simulation. II. A. Three-dimensional lattice model of lysozyme. Biopolymers 17, 1531 (1978).
[24] J. A. McCammon, S. H. Northrup, M. Karplus, R. M. Levy. Helix-coil transitions in a simple
polypeptide model. Biopol. 19, 2033 (1980).
[25] S. Brown, N. L. Fawzi and T. Head-Gordon, Coarse-grained sequences for protein folding and design. Proc. Natl. Acad. Sci. USA 100, 10712-10717 (2003).
[26] N. L. Fawzi, E. H. Yap, Y. Okabe, K. L. Kohlstedt, S. P. Brown and T. Head-Gordon, Contrasting Disease and Nondisease Protein Aggregation by Molecular Simulation. Acc. Chem. Res. 41 (8), 1037-1047 (2008).
[27] I. Bahar, R. L. Jernigan, Inter-residue potentials in globular proteins and the dominance of highly specific hydrophilic interactions at close separation. J. Mol. Biol. 266, 195 (1997).
[28] A. V. Smith, C. K. Hall, α-helix formation: discontinuous molecular dynamics on an intermediate-resolution protein model. Proteins 44, 344 (2001).
[29] A. V. Smith, C. K. Hall, Assembly of a tetrameric α-helical bundle: computer simulations on an intermediate-resolution protein model. Proteins 44, 376 (2001).
[30] B. Hess, C. Kutzner, D. van der Spoel and E. Lindahl, GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation. J. Chem. Theory Comput. 4, 435-447 (2008).
[31] http://www.cmm.upenn.edu/resources/indexsoft.html
[32] R. W. Hockney, S. P. Goel, J. Eastwood, Quiet high-resolution computer models of a plasma. J. Comp. Phys. 14, 148 (1974).
[33] R. W. Hockney and J. W. Eastwood, Computer Simulations Using Particles. McGraw Hill, New
York (1981).
[34] S. Auerbach and A. Friedman. Long-term behaviour of numerically computed orbits: Small and
intermediate timestep analysis of one-dimensional systems. J. Comput. Phys. 93(1), 189 (1991).
[35] W. Wang, O. Donini, C. M. Reyes, P. A. Kollman, Biomolecular Simulations: Recent Developments in Force Fields, Simulations of Enzyme Catalysis, Protein-Ligand, Protein-Protein, and Protein-Nucleic Acid Noncovalent Interactions. Annu. Rev. Biophys. Biom. 30, 211 (2001).
[36] W. D. Cornell, P. Cieplak, C. I. Bayly, I. R. Gould, K. M. Merz, D. M. Ferguson, D. C.
Spellmeyer, T. Fox, J. W. Caldwell, P. A. Kollman, A Second Generation Force Field for the
Simulation of Proteins, Nucleic Acids, and Organic Molecules. J. Am. Chem. Soc. 117, 5179
(1995).
[37] A. D. MacKerell Jr., D. Bashford, M. Bellott, R. L. Dunbrack Jr., J. D. Evanseck, M. J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph-McCarthy, L. Kuchnir, K. Kuczera, F. T. K. Lau, C. Mattos, S. Michnick, T. Ngo, D. T. Nguyen, B. Prodhom, W. E. Reiher, B. Roux, M. Schlenkrich, J. C. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiórkiewicz-Kuczera, D. Yin and M. Karplus, All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. J. Phys. Chem. B 102, 3586 (1998).
[38] M. Christen, P. H. Hünenberger, D. Bakowies, R. Baron, R. Bürgi, D. P. Geerke, T. N. Heinz, M. A. Kastenholz, V. Kräutler, C. Oostenbrink, C. Peter, D. Trzesniak, W. F. van Gunsteren, The GROMOS software for biomolecular simulation: GROMOS05. J. Comput. Chem. 26, 1719 (2005).
[39] G. A. Kaminski, R. A. Friesner, J. Tirado-Rives and W. L. Jorgensen, Evaluation and
Reparametrization of the OPLS-AA Force Field for Proteins via Comparison with Accurate
Quantum Chemical Calculations on Peptides. J. Phys. Chem. B 105, 6474 (2001).
[40] R. Koradi, M. Billeter and K. Wüthrich, MOLMOL: A program for display and analysis of
macromolecular structures. J. Mol. Graphics 14, 51 (1996).
[41] V. Humblot, C. Méthivier and C. M. Pradier, Adsorption of L-Lysine on Cu(110): A RAIRS Study
from UHV to the Liquid Phase. Langmuir 22, 3089 (2006).
[42] D. van der Spoel, P. J. van Maaren and H. J. C. Berendsen, A systematic study of water models
for molecular simulation: Derivation of water models optimized for use with a reaction field. J.
Chem. Phys. 108, 10220 (1998).
[43] B. Hess, H. Bekker, H. J. C. Berendsen, J. G. E. M. Fraaije, LINCS: A Linear Constraint Solver
for Molecular Simulations. J. Comp. Chem. 18, 1463 (1997).
[44] T. Darden, D. York, L. Pedersen, Particle mesh Ewald: An N·log(N) method for Ewald sums in
large systems. J. Chem. Phys. 98, 10089 (1993).
[45] U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee, L. G. Pedersen, A smooth particle
mesh Ewald method. J. Chem. Phys. 103, 8577 (1995).
[46] K. Zimmerman, All purpose molecular mechanics simulator and energy minimizer. J. Comp.
Chem. 12, 310 (1991).
[47] M. Parrinello, A. Rahman, Polymorphic transitions in single crystals: A new molecular dynamics
method. J. Appl. Phys. 52, 7182 (1981).
[48] S. Nosé, M. L. Klein, Constant pressure molecular dynamics for molecular systems. Mol. Phys.
50, 1055 (1983).
[49] S. Nosé, A unified formulation of the constant temperature molecular dynamics methods. J.
Chem. Phys. 81, 511 (1984).
[50] W. G. Hoover, Canonical dynamics: Equilibrium phase-space distributions. Phys. Rev. A 31,
1695 (1985).
[51] W. G. Hoover, Constant-pressure equations of motion. Phys. Rev. A, 34, 2499 (1986).
[52] J. Juraszek and P. G. Bolhuis, Sampling the multiple folding mechanisms of Trp-cage in explicit
solvent. Proc. Natl. Acad. Sci. 103, 15859 (2006).
[53] Z. Guo and D. Thirumalai, Kinetics and Thermodynamics of Folding of a de novo Designed Four
Helix Bundle. J. Mol. Biol. 263, 323 (1996).
[54] H. C. Andersen, Rattle: A velocity version of the shake algorithm for molecular dynamics
calculations. J. Comput. Phys. 52, 24 (1983).
[55] A. V. Smith and C. K. Hall, Protein refolding versus aggregation: computer simulations on an
intermediate-resolution protein model. J. Mol. Biol. 312, 187 (2001).
[56] V. Tozzini, Coarse-grained models for proteins. Curr. Opin. Struct. Biol. 15, 144 (2005).
[57] H. M. König and A. F. M. Kilbinger, Learning from Nature: β-Sheet-Mimicking Copolymers Get
Organized. Angew. Chem. Int. Ed. 46, 8334 (2007).
[58] Nomenclature and Symbolism for Amino Acids and Peptides.
IUPAC-IUB Joint Commission on Biochemical Nomenclature. 1983.
http://www.chem.qmul.ac.uk/iupac/AminoAcid/AA1n2.html. Retrieved on 2008-11-17.
[59] I. W. Lyo and P. Avouris. Field-Induced Nanometer- to Atomic-Scale Manipulation of Silicon
Surfaces with the STM. Science 253, 173 (1991).
[60] Galo J. de A. A. Soler-Illia, Clément Sanchez, Bénédicte Lebeau, and Joël Patarin, Chemical
Strategies To Design Textured Materials: from Microporous and Mesoporous Oxides to Nanonetworks
and Hierarchical Structures. Chem. Rev. 102, 4093 (2002).
[61] M. W. T. Werten, W. H. Wisselink, T. J. Jansen-van den Bosch, E. C. de Bruin and F. A. de Wolf,
Secreted production of a custom-designed, highly hydrophilic gelatin in Pichia pastoris. Protein
Engineering 14(6), 447 (2001).
[62] P. J. Flory, Principles of Polymer Chemistry. Cornell University Press, Ithaca, New York (1953).
[63] S. Park, F. Khalili-Araghi, E. Tajkhorshid and K. Schulten, Free energy calculation from steered
molecular dynamics simulations using Jarzynski's equality. J. Chem. Phys. 119, 3559 (2003).
[64] S. Park and K. Schulten, Calculating potentials of mean force from steered molecular dynamics
simulations. J. Chem. Phys. 120, 5946 (2004).