
Improved pKa calculations through flexibility based sampling of a water-dominated interaction scheme

JIM WARWICKER
Department of Biomolecular Sciences, University of Manchester Institute of Science and Technology (UMIST),
Manchester M60 1QD, United Kingdom
(RECEIVED April 2, 2004; FINAL REVISION June 30, 2004; ACCEPTED July 6, 2004)

Abstract
Ionizable groups play critical roles in biological processes. Computation of pKas is complicated by model
approximations and multiple conformations. Calculated and experimental pKas are compared for relatively
inflexible active-site side chains, to develop an empirical model for hydration entropy changes upon charge
burial. The modification is found to be generally small, but large for cysteine, consistent with small molecule
ionization data and with partial charge distributions in ionized and neutral forms. The hydration model
predicts significant entropic contributions for ionizable residue burial, demonstrated for components in the
pyruvate dehydrogenase complex. Conformational relaxation in a pH-titration is estimated with a mean-field
assessment of maximal side chain solvent accessibility. All ionizable residues interact within a low protein
dielectric finite difference (FD) scheme, and more flexible groups also access water-mediated Debye-Hückel
(DH) interactions. The DH method tends to match overall pH-dependent stability, while FD can be more
accurate for active-site groups. Tolerance for side chain rotamer packing is varied, defining access to DH
interactions, and the best fit with experimental pKas obtained. The new (FD/DH) method provides a fast
computational framework for making the distinction between buried and solvent-accessible groups that has
been qualitatively apparent from previous work, and pKa calculations are significantly improved for a mixed
set of ionizable residues. Its effectiveness is also demonstrated with computation of the pH-dependence of
electrostatic energy, recovering favorable contributions to folded state stability and, in relation to structural
genomics, with substantial improvement (reduction of false positives) in active-site identification by
electrostatic strain.
Keywords: protein electrostatics; pKas; ionization entropy; side-chain packing; active-site identification;
structural genomics

Reprint requests to: Jim Warwicker, Department of Biomolecular Sciences, UMIST, P.O. Box 88, Manchester M60 1QD, UK; e-mail: jim.warwicker@umist.ac.uk; fax: +44-(0)161-236-0409.
Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.04785604.
Protein Science (2004), 13:2793–2805. Published by Cold Spring Harbor Laboratory Press. Copyright © 2004 The Protein Society

Ionizable group interactions are important factors in various biological processes (Warshel 1981, 2003; Honig and Nicholls 1995; Warshel and Papazyan 1998; Simonson 2001), and are a focus of attempts to identify functional sites for structural genomics (Elcock 2001; Ondrechen et al. 2001). Grid-based continuum electrostatics methods, such as Finite Difference Poisson-Boltzmann (FDPB; Warwicker and Watson 1982; Klapper et al. 1986; Warwicker 1986) using a low protein relative dielectric (typically εp = 4; Gilson and Honig 1986), have been applied to pKa calculations (Bashford and Karplus 1990). These can be useful when applied to regions with limited solvent accessibility (SA; Demchuk and Wade 1996; Warwicker 1998), but εp = 4 calculations based on a single conformer have been largely unreliable for overall pKa analysis. Generally εp = 20 performs better (Antosiewicz et al. 1994, 1996), presumably accounting to some degree for other factors (Schutz and Warshel 2001) such as conformational variation (Simonson and Perahia 1995), proton/hydrogen-bond network relaxation (Nielsen et al. 1999), or specific internal water binding (Fitch et al. 2002). Indeed, a Debye-Hückel (DH) model with water dielectric also gives reasonable agreement overall for pKas, and for the pH dependence of folding energy when combined with a simple model for ionizable group interactions in the unfolded state (Warwicker 1999). A Tanford-Kirkwood model gives reasonable estimates for carboxylate pKas in ubiquitin (Sundd et al. 2002). As with FDPB/εp = 20, the DH and Tanford-Kirkwood models fail to generate the large ΔpKas often associated with functional groups (Warwicker 1998).

The problem of accounting for large and small ΔpKas in one scheme is being addressed with inclusion of conformational and proton configurational relaxation in εp = 4 calculations. Multiple conformers are generally sampled from molecular dynamics simulations (You and Bashford 1995; Zhou and Vijayakumar 1997; van Vlijmen et al. 1998; Koumanov et al. 2001; Gorfe et al. 2002), with some improvement in match to experiment and a significant increase in computational requirement. A key issue is to address those conformational adjustments of most relevance to a pH titration. Multiconformation continuum electrostatics (MCCE) samples side-chain ionization and conformation (Alexov and Gunner 1997; Georgescu et al. 2002; Alexov 2003), yielding a pKa root-mean-square (RMS) error of 0.83. An approximation in this model is that all conformers are combined to produce a single dielectric boundary. A method that assigns higher or lower electrostatic screening functions (applied to Coulomb potentials) according to the hydrophobicity/hydrophilicity of residue microenvironments performs well, giving an RMS error of 0.5 for a large pKa set (Mehler and Guarnieri 1999).

The current work also pursues two interaction schemes, the DH model that is effective for flexible groups, and FDPB/εp = 4, which can give large ΔpKas of functional interest, with combination in a framework of conformational relaxation. FD (for FDPB/εp = 4) interactions are always sampled for the experimental conformer, and DH interactions are selectively introduced as a mimic for relaxation from this conformer. Two examples illustrate this idea. The first is a buried lysine, neutral at pH 7 due to dehydration (lower residue in Fig. 1). The DH model omits the Born term and underestimates such a ΔpKa, whereas FD with the Born term provides the required destabilization. Because DH would give a higher statistical weight than FD, a combined scheme must exclude DH and reflect the restricted conformation. Secondly (top of Fig. 1), in a relatively flexible salt bridge the Born contributions are balanced by favorable charge–charge interactions. For a pH shift that neutralizes one of the partners, without conformational relaxation FD will record a Born term for the remaining ionized group that is not balanced by charge–charge interactions (other than potential hydrogen bonds). The ionized partner is likely to seek a more solvent accessible conformation, reducing the Born energy penalty. Interactions in this water-dominated situation could be estimated by DH modeling of the salt bridge, as an alternative to FD calculations on a range of generated conformers.

A mean-field algorithm is used to describe side-chain rotamer packing (Koehl and Delarue 1994; Cole and Warwicker 2002), and to define access to DH interactions, through assessment of maximal SA (SAmax) for each ionizable group over rotamer variation. The method, termed FD/DH, reproduces ΔpKas over a wide range, and is compared with other techniques. The contribution of ionizable groups to folded state stability is discussed, with FD/DH delivering overall stabilization in contrast to much FD/εp = 4 single conformer work. The utility of a distinction between flexible and buried groups is considered in the context of active-site finding for structural genomics, and with regard to automation of active-site subset selection for detailed electrostatic analysis (Nielsen and McCammon 2003).

An earlier empirical analysis of hydration entropy and pKa calculations (Warwicker 1997) has been extended with study of relatively buried ionizable groups in the FD/εp = 4 model, and comparison made to small molecule ionization entropies. It is concluded that pKa adjustments due to hydration entropy are relatively small, except in the case of cysteine. Changes in hydration entropy that can be important in binding energetics (Jung et al. 2002) are discussed in the framework of the empirical analysis.
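
The contrast drawn in this introduction between water-dominated DH interactions and low-dielectric calculations can be made concrete with a minimal sketch. It assumes a textbook screened-Coulomb (Debye-Hückel) form for two unit point charges. The water dielectric of 78.4, the ionic strength of 0.15 M, the temperature of 298.15 K, and all function names are illustrative assumptions, not parameters taken from the FD/DH method. The uniform dielectric of 4 is only a crude Coulomb estimate and is not an FD calculation, which also has a dielectric boundary and a Born term.

```python
import math

# Physical constants in SI units.
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K
N_A = 6.02214076e23          # Avogadro constant, 1/mol


def debye_length_angstrom(ionic_strength_molar, eps_r=78.4, temp_k=298.15):
    """Debye screening length in Angstrom for a 1:1 electrolyte (assumed form)."""
    conc = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    kappa_sq = 2.0 * N_A * E_CHARGE ** 2 * conc / (EPS0 * eps_r * K_B * temp_k)
    return 1e10 / math.sqrt(kappa_sq)


def pair_delta_pka(r_angstrom, eps_r=78.4, ionic_strength_molar=0.15,
                   temp_k=298.15):
    """Magnitude of the pKa shift caused by one unit charge at distance r.

    Uses a screened Coulomb energy divided by kT*ln(10). Setting the ionic
    strength to zero switches the screening factor off.
    """
    r_m = r_angstrom * 1e-10
    coulomb = E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * eps_r * r_m)
    if ionic_strength_molar > 0.0:
        debye = debye_length_angstrom(ionic_strength_molar, eps_r, temp_k)
        screening = math.exp(-r_angstrom / debye)
    else:
        screening = 1.0
    return coulomb * screening / (K_B * temp_k * math.log(10.0))


if __name__ == "__main__":
    # Water-like environment (hypothetical 4 Angstrom charge pair).
    print("water, 0.15 M:", round(pair_delta_pka(4.0), 2))
    # Uniform low dielectric, no salt: illustration only, not an FD result.
    print("uniform eps=4:", round(pair_delta_pka(4.0, eps_r=4.0,
                                                 ionic_strength_molar=0.0), 1))
```

With these assumed parameters, a unit charge 4 Å away shifts a pKa by roughly half a unit in water but by over an order of magnitude more in a uniform dielectric of 4, in line with the observation that water-dominated models do not generate large ΔpKas.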

Results and Discussion

Estimation of hydration entropy changes upon ionization and charge burial

A subset of ionizable groups with large ΔpKas and surrounding ionizations that can be reasonably assigned at a pH around the pKa of interest is studied, partitioning the analysis from the wider prediction of pH dependence (Table 1). Figure 2 shows that these groups are relatively buried in terms of subsequent FD/DH analysis (Fig. 3), and estimates of changes in hydration entropy are made with the single conformer FD/εp = 4 model. Polar hydrogen optimization is applied for the hydroxyl groups of serine and threonine. Tyrosine hydroxyls are included where the unionized form is used, which for entropic term modeling is in all calculations other than those of Figure 2C. A previous study (Warwicker 1997) looked at carboxylate and thiolate groups in a range of SA environments, deriving differential hydration numbers upon ionization (Ns) of about 2 (carboxylate) and 6 (thiolate). Current work extends the analysis for groups with large ΔpKas, and includes polar hydrogen optimization.

Figure 1. Schematic diagram of ionizable group relaxation with pH in two different environments. An upper salt bridge can alter hydration upon pH titration, while the lower basic group cannot (e.g., a lysine with a reduced pKa).

Table 1. Proteins and groups used for pKa calculations

PDB ID  FD/DH gps  Ns group  Protein                              Source
4pti    11         -         trypsin inhibitor                    bovine pancreas
3icb    10         -         calbindin                            bovine intestine
1b0d    18         E35       lysozyme                             hen egg-white
1pga    13         -         protein G                            Streptococcus
3rn3    15         -         ribonuclease A                       bovine pancreas
2rn2    22         -         ribonuclease H                       Escherichia coli
1a2p    10         -         barnase                              Bacillus amyloliquefaciens
1ppf    11         -         ovomucoid inhibitor 3rd domain       turkey
1xnb    1          E172      xylanase                             Bacillus circulans
9pap    1          C25       papain                               papaya
1a2l    1          C30       DsbA (reduced)                       Escherichia coli
1ado    1          K229      fructose 1,6-bisphosphate aldolase   rabbit muscle
1axt    1          K93 (H)   immunoglobulin/aldolase              mouse
1gsd    1          Y9        glutathione S-transferase A1-1       human
1nai    1          Y149      UDP-galactose 4-epimerase            Escherichia coli
2trx    -          C32       thioredoxin                          Escherichia coli
1p2p    -          H48       phospholipase A2                     porcine pancreas
4cha    -          H57       α-chymotrypsin                       bovine
1l63    -          H31       lysozyme (C54T, C97A mutant)         T4 phage

FD/DH gps gives the number of ionisable groups used in FD/DH calculation. Ns group specifies those used in Ns derivation. Monomers were used in all cases (selecting the first subunit from crystal coordinates where required). For thioredoxin, a model of the reduced D26N mutant was derived from the oxidised wild-type structure. Details (e.g. inclusion of ligands) are given in the text.

Hydration entropy and pKas: Aspartic and glutamic acids, cysteine, and tyrosine

Bacillus circulans xylanase E172 has an elevated pKa (6.7; Joshi et al. 1997) that, along with E78 (pKa = 4.6), defines the pH optimum for hydrolysis. Calculated electrostatic interactions are shown for E172 in Figure 2A. Ionization assignment for other groups was made with reference to measured pKas (Joshi et al. 1997), and with model compound pKas at neutral pH otherwise. H149 has a measured pKa of <2.3 and H156 of 6.7. The distance between carboxylate (172) and imidazole (156) groups is about 23 Å, and the protonation state of H156 makes no significant difference to the calculations. Modeled hydrogen bonds between E172 and N35, Y80 are shown in Figure 2A. Discrepancy between the sum of calculated ΔpKa contributions and the measured ΔpKa, with a burial fraction (Vf) of 0.57, gives Ns = 0.7. A second calculation, with partially charged forms for ionized and neutral E172 and torsioning of the Y80 hydroxyl to point away from neutral E172, gives Ns = 1.9. Without Ns modification, partial charge analysis gives ΔpKa = 3.7 versus 2.8 for the net charge model and 2.3 experimentally. Both calculations qualitatively capture the pKa shift.

Hen egg white lysozyme is a well-known model for electrostatic calculations, particularly E35 (pKa = 6.1; Kuramitsu and Hamaguchi 1980). Figure 2A shows the restricted environment of E35 and the elements that give Ns = 1.5, again with qualitative agreement to the experiment without Ns modification (Warwicker 1997). If discrepancies in the carboxylate group calculations of Figure 2A are approximated as hydration entropy within the FD framework, then a related single Ns value is relatively small and positive (Table 2; Warwicker 1997).

In contrast, cysteine residues with large ΔpKas require a high value of Ns to match experiment (Table 2; Warwicker 1997). Figure 2B shows the active site around papain C25 and H159, with a pKa of 3.3 for C25 (Noble et al. 2000). Oxygen atoms were removed from oxidized C25 in the crystal structure. Calculations were made with H159 protonated and carboxylates deprotonated. Of the two closest acidic groups, D158 has a pKa around 2.8 (Noble et al. 2000), and the mutation E50A in caricain has only a small effect on the pH dependence of activity (Ikeuchi et al. 1998).

Calculations (Fig. 2B) with a reduced DsbA crystal structure (Guddat et al. 1998) give Ns = 4.2, where H60, E24, and E38 were assigned neutral (Warwicker 1998), and the measured C30 pKa is 3.4 (Grauschopf et al. 1995). A reduced model of the D26N mutant was made from oxidized, wild-type thioredoxin, uncoupling the similar pKas of C32 and D26, with C32 pKa = 7.5 (Chivers et al. 1997) and Ns = 4.9. Calculated Ns of 4.5, 4.2, and 4.9 imply significant discrepancies in FD/εp = 4 calculations of pKas for relatively buried cysteines, without the Ns term.

For tyrosine, the active-site residue Y9 of glutathione S-transferase A1-1 has a low pKa (8.1) in the substrate-free enzyme (Björnestedt et al. 1995). Calculations with a monomer from the dimeric crystal structure (Cameron et al. 1995) give Ns = 1.4 (Fig. 2C). A pKa of 6.1 has been measured for Y149 in the active site of UDP-galactose-4-epimerase with NAD+ bound (Liu et al. 1997). Removing UDP, but not NAD+, from the crystal structure (Thoden et al. 1996) gives Ns = 0.8 (Fig. 2C), with a strong interaction to the positively charged side chain of K153 and a hydrogen bond between the deprotonated Y149 side chain and a ribose hydroxyl group.

Figure 2. Active-site groups, with conformational restriction, used to derive Ns values for hydration entropy modification. The Born and Q–Q (interaction with background and other ionizable group charge) ΔpKa contributions are calculated components, which together with the experimental ΔpKa (Expt), and the fraction of the hydration shell buried in the protein relative to isolated amino acid (Vf), are used to derive Ns (see Materials and Methods). (A) Glutamic acid; (B) Cysteine; (C) Tyrosine; (D) Lysine.

Table 2. Ns values for pKa calculations: derived with the hydration shell model

              Asp, Glu   Cys            Tyr       Lys       His
Ns each site  0.7, 1.5   4.5, 4.2, 4.9  1.4, 0.8  0.2, 2.0  2.0, 0.9, 3.1
Ns average    1.1        4.5            1.1       1.1       2.0

Ns calculation scheme given in Materials and Methods, and details of protein sites in Results and Discussion.

Hydration entropy and pKas: Lysine, histidine, and arginine

Figure 2D shows the environment of K229 in rabbit muscle aldolase (Blom and Sygusch 1997), implicated in Schiff base formation, with pKa probably matching the decline of activity around pH 6.5 (Morris and Tolan 1994). A monomer from the tetramer (with product removed) gives Ns = 0.2. The network of ionizable groups around K229 (Fig. 2D) was assigned unperturbed charges for neutral pH calculations. A catalytic antibody with aldolase activity has an active-site lysine (H chain K93) with a pKa of 5.5 estimated from the pH dependence of enamine formation (Barbas et al. 1997). Figure 2D shows the enclosed environment of K93(H), with W103(H) adjacent and H27(L) distant, but displayed to facilitate the view. NZ atom locations are shown from 1axt and modeled with more solvent exposure, because electron density for K93(H) does not extend beyond CE (Barbas et al. 1997). These give Ns = 2.0 (1axt, quoted in Fig. 2D) and Ns = 0.8 (model), with the low pKa dominated by the Born term.

Histidine 48 of porcine pancreatic phospholipase A2 is catalytic (Thunnissen et al. 1990), with a pKa of 5.5 in the calcium-bound form (Verheij et al. 1980). Calculation gives Ns = 2.0 (Table 2). In chymotrypsin, interactions between partially charged/protonated or partially charged/neutral
forms of H57, and other catalytic residues D102 and S195 have been evaluated, with S195 hydroxyl directed toward neutral H57 and away from protonated H57. Comparison with the measured pKa of 6.8 (Fersht and Renard 1974) gives Ns = 0.9 (Table 2). For T4 lysozyme, the noncatalytic residue H31 (pKa = 9.1) is significantly buried in a salt bridge with D70 (Anderson et al. 1990), giving Ns = 3.1 (Table 2). This Ns-based study of buried side chains with known pKas has excluded histidine residues with complicating factors such as strongly coupled titrations and ligand involvement. Variability in Ns values for histidine (prior to averaging) probably reflects relatively mixed environments and the omission of local relaxations (Edgcomb and Murphy 2002).

There is some evidence for reduced arginine pKas (Lehoux and Mitra 1999; Morillas et al. 1999), but not a clear example coupled to an atomic structure. Arginine side chain and N-terminal groups are assigned (Ns) as the lysine side-chain average, and C-terminal groups follow the average of the carboxylate containing side chains.

Comparison with ionization data for amino acids and analogs

The empirical modification for hydration entropy is generally small in ΔpKa terms compared with the Born and charge–charge contributions, with the exception of cysteine (Fig. 2; Table 2). The sense of the modification is in line with water molecule release upon ionizable group burial being greater (albeit slightly in most cases) for ionized versus unionized forms. Table 3 shows the enthalpies and entropies of ionization for amino acid side chains and molecular analogs (Izatt and Christensen 1976). Hydration entropies make the major contributions to ionization entropies (Alberty 1983), which show a wide variation for acidic and basic side chains in amino acids, partially due to overlapping proton binding equilibria. In contrast, the analog data show similar ionization ΔS within acidic and within basic groups, distinguishing between dissociation from one neutral to two charged species (acidic, larger ΔS), and from one charged to one charged and one neutral species (basic, smaller ΔS). Differencing between average ΔS values for each grouping (−103.5 J/deg/mole for AH ⇒ A− + H+ and −17.0 J/deg/mole for BH+ ⇒ B + H+), cancelling the proton term, gives ΔS = −86.5 J/deg/mole for AH ⇒ A− summed with B ⇒ BH+. Approximating equal ΔS for each of these unionized to ionized form transitions, ΔS = −43.3 J/deg/mole is obtained, equivalent to Ns = 1.7 (300 K) in the hydration shell model and consistent overall with the pKa calculations (other than cysteine). This favorable comparison supports use of the empirical model.

A relatively small hydration entropy modification for pKas is expected where charged and neutral species differ little in water ordering for a first hydration shell, due to the propensity of both forms to accommodate multiple hydrogen bonds. This is consistent with high Ns for cysteine, which is relatively nonpolar in the neutral form. Table 4 compares ΔS values measured for small carboxylic acids and thiols (Irving et al. 1964). Consistently more negative ΔS for thiol group dissociation compared to carboxylic acids qualitatively matches the results of the hydration shell model, supporting a significant role for hydration entropy in cysteine pKa calculation.

Entropic contributions to interactions within the pyruvate dehydrogenase complex

The interaction between dihydrolipoyl dehydrogenase (E3) dimer and the peripheral subunit-binding domain (PSBD) of dihydrolipoyl acyltransferase (E2) in the pyruvate dehydrogenase complex of Bacillus stearothermophilus has been subject to thermodynamic investigation (Jung et al. 2002). The binding (ΔG° = −52.7 kJ/mole) is entropically driven (−TΔS° = −61.9 kJ/mole), and alanine mutations demonstrate that one side chain (R135) makes a major entropic contribution (Table 5; Jung et al. 2002), leading the authors to conclude that water liberation upon charge network formation in the complex is important.

Table 3. Amino acid side chain and analog ionization data

Amino acid/analog  pKa    ΔH (kJ/mole)  ΔS (J/deg/mole)
Aspartic acid      3.87   4.0           −60.6
Glutamic acid      4.27   1.6           −76.5
Cysteine           8.39   36.0          −39.7
Tyrosine           10.05  25.1          −108.7
Arginine           12.48  51.8          −64.8
Lysine             10.53  48.5          −38.9
Histidine          6.00   29.9          13.0
Propanoic acid     4.87   −0.6          −95.3
Phenol             9.98   23.6          −111.6
phenyl-arginine    12.40  50.0          −20.9
Monomethylamine    10.63  54.7          −19.7
Imidazole          6.99   36.7          −10.5

Measurements at 25°C and ionic strength between 0 and 0.15 M for the proton dissociations, AH <=> A− + H+ (acid) and BH+ <=> B + H+ (base) (Izatt and Christensen 1976). Only side-chain ionization pKa shown for amino acids.

Table 4. Comparison of ionization ΔS values for carboxylates and thiolates

Compound   ΔS (X = COOH)  ΔS (X = SH)  ΔΔS (COOH-SH)
CH3CH2X    −95.3          −112.9       17.6
(CH3)2CHX  −107.0         −132.5       25.5
(CH3)3CX   −106.6         −140.0       33.4

Data from Irving et al. (1964), at 25°C, with ΔS units of J/deg/mole.
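
The differencing of analog ionization entropies described for Table 3, and the carboxylate/thiol comparison of Table 4, can be checked with a short calculation. This is a minimal sketch using only the tabulated values; the function names are hypothetical. The text rounds the group averages before differencing, so the unrounded values below differ from the quoted −86.5 and −43.3 J/deg/mole in the first decimal place.

```python
# Ionization entropies (J/deg/mole) for the molecular analogs of Table 3.
ACID_ANALOG_DS = {"propanoic acid": -95.3, "phenol": -111.6}   # AH => A- + H+
BASE_ANALOG_DS = {"phenyl-arginine": -20.9, "monomethylamine": -19.7,
                  "imidazole": -10.5}                           # BH+ => B + H+

# (COOH, SH) ionization entropies (J/deg/mole) from Table 4.
TABLE4_DS = {
    "CH3CH2X": (-95.3, -112.9),
    "(CH3)2CHX": (-107.0, -132.5),
    "(CH3)3CX": (-106.6, -140.0),
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def per_group_ionization_entropy():
    """Return (acid mean, base mean, paired difference, per-group value).

    Subtracting the base mean from the acid mean cancels the proton term,
    leaving AH => A- summed with B => BH+. Halving assumes equal entropy
    change for each unionized-to-ionized transition, as in the text.
    """
    ds_acid = mean(ACID_ANALOG_DS.values())
    ds_base = mean(BASE_ANALOG_DS.values())
    ds_pair = ds_acid - ds_base
    return ds_acid, ds_base, ds_pair, ds_pair / 2.0


def t_delta_s_kj(ds_j_per_deg, temp_k=300.0):
    """Convert an entropy change in J/deg/mole to T*dS in kJ/mole."""
    return temp_k * ds_j_per_deg / 1000.0


def carboxyl_minus_thiol():
    """Table 4 differences, dS(COOH) - dS(SH), rounded to one decimal."""
    return {name: round(cooh - sh, 1) for name, (cooh, sh) in TABLE4_DS.items()}


if __name__ == "__main__":
    ds_acid, ds_base, ds_pair, ds_group = per_group_ionization_entropy()
    print("acid mean, base mean:", round(ds_acid, 1), round(ds_base, 1))
    print("paired, per group:", round(ds_pair, 1), round(ds_group, 1))
    print("T*dS at 300 K (kJ/mole):", round(t_delta_s_kj(ds_group), 1))
    print("COOH - SH:", carboxyl_minus_thiol())
```

The per-group value corresponds to a TΔS of about −13 kJ/mole at 300 K, which is the quantity that the text equates with Ns = 1.7 in the hydration shell model.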


Table 5. Binding energies for PSBD mutants relative to wild type, in complex with E3 dimer, and estimated entropic contributions

        Experiment (kJ/mole)      Calculation (kJ/mole)
Mutant  ΔΔG°   ΔΔH°   Δ(TΔS)°     Δ(TΔS)QWAT  Δ(TΔS)SC  Δ(TΔS)NP  Σ{Δ(TΔS)}
R135A   −11.7  +20.1  +31.8       +20.8       −1.0      −0.3      +19.5
R139A   −10.5  −10.5  0.0         +12.1       −3.5      −6.0      +2.6

Experimental data from Jung et al. (2002), differenced for wild type-mutant values. Calculations differenced for complexation, TΔS = TS(complex) − [TS(E3 dimer) + TS(PSBD)], followed by wild type to mutant differencing giving Δ(TΔS). Charge hydration entropy Δ(TΔS)QWAT derived from fraction of first hydration shell that is buried at interface for ionizable groups, multiplied by the cysteine shell value from pKa fitting of 36 kJ/mole. Side chain entropy Δ(TΔS)SC taken from mean-field calculations (Koehl and Delarue 1994; Cole and Warwicker 2002). Entropy changes associated with nonpolar surface burial Δ(TΔS)NP were approximated with the multiplicative factor 0.1 kJ/mole/Å².
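
The bookkeeping behind Table 5 can be reproduced in a few lines. The sketch below only re-derives tabulated quantities: the summed calculated entropy term, the experimental closure ΔΔG° = ΔΔH° − Δ(TΔS)°, and the change in summed buried hydration-shell fraction implied by dividing Δ(TΔS)QWAT by the 36 kJ/mole full-shell value. The class and method names are hypothetical, and the implied shell fraction is a back-calculation made here for illustration, not a number reported in the table.

```python
from dataclasses import dataclass

FULL_SHELL_KJ = 36.0  # kJ/mole for a full hydration shell (Table 5 footnote)


@dataclass
class MutantEntropy:
    """One row of Table 5; all values in kJ/mole."""
    name: str
    ddg_expt: float      # experimental ddG
    ddh_expt: float      # experimental ddH
    d_tds_expt: float    # experimental d(TdS)
    d_tds_qwat: float    # calculated charge hydration term
    d_tds_sc: float      # calculated side-chain rotamer term
    d_tds_np: float      # calculated nonpolar burial term

    def calc_sum(self) -> float:
        """Sum of the three calculated entropic terms."""
        return self.d_tds_qwat + self.d_tds_sc + self.d_tds_np

    def expt_closure(self) -> float:
        """ddH - d(TdS), which should reproduce the experimental ddG."""
        return self.ddh_expt - self.d_tds_expt

    def implied_shell_fraction(self) -> float:
        """Back-calculated change in summed buried shell fraction."""
        return self.d_tds_qwat / FULL_SHELL_KJ


TABLE5 = [
    MutantEntropy("R135A", -11.7, 20.1, 31.8, 20.8, -1.0, -0.3),
    MutantEntropy("R139A", -10.5, -10.5, 0.0, 12.1, -3.5, -6.0),
]

if __name__ == "__main__":
    for row in TABLE5:
        print(row.name,
              "sum:", round(row.calc_sum(), 1),
              "closure:", round(row.expt_closure(), 1),
              "shell fraction:", round(row.implied_shell_fraction(), 2))
```

For both mutants the summed calculated term has the same sign as the experimental Δ(TΔS)°, and the charge hydration term is the largest calculated component, which is the qualitative agreement the table is meant to show.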

Whereas hydration modeling for pKa calculations involves differences for ionized/unionized forms and for charge burial, hydration modeling of complexation needs to consider primarily charge burial (although pKa changes upon burial could play a secondary role). Making the approximation that unionized cysteine has a weakly bound hydration shell, the pKa modification for this case is used to estimate the change in hydration entropy Δ(TΔS)QWAT for burial of a full ionizable group hydration shell in the E3 dimer:PSBD interface. The fractions of hydration shell occluded upon complexation are summed and multiplied by the 36 kJ/mole of a full shell. Mutant complexes were modeled with arginine side chain reduction to alanine (Fig. 4). Component parts (E3 dimer and PSBD) were modeled without conformational relaxation.

Estimates of changes upon complexation for side chain rotameric entropy and water structure associated with buried nonpolar area are added (Table 5). Side-chain rotamer entropies were derived from a mean-field calculation (Cole and Warwicker 2002). Nonpolar surface burial is converted to a free energy estimate (assumed due to water ordering) by the empirical factor 0.1 kJ/mole/Å². The entropic term from ionizable group burial is the most significant in the wild-type to mutant differences. Summed values are qualitatively consistent with experiment for the two mutants (Table 5), demonstrating potential for the hydration model in binding studies.

Figure 3. Schematic diagram for determination of access to DH interactions. (A) Solvent probing of a side chain gives two SA arcs according to whether probed in the context of the drawn rotamer set (SAconfig) or fixed atoms only (SAfixed-atoms). (B) A different rotamer set gives higher solvent accessibility, such that [SAmax/SAfixed-atoms] ≥ 0.75 and access to DH interactions is allowed. (C) In this case, the rotamer set of B is not possible, SA remains small, and DH interactions are disallowed.

Combination of FDPB and DH models: The FD/DH interaction scheme

The empirical modification of the FD/εp = 4 single conformer pKa calculations for hydration entropy is applied to the FD component of the FD/DH scheme. Figure 3 schematizes side-chain flexibility in different environments, and the derivation of SAmax and access to DH interactions. A VdW clash tolerance (VdWtol) is included in mean-field calculations that determine allowed rotamers for SA analysis. This parameter is varied to approximate conformational relaxation not explicitly included (nonlibrary rotamer packing and main-chain movement), and the best value determined with respect to experimental pKas (Fig. 5B). VdWtol
of around 0.8 Å is typically required to give a packing solution for united atom VdW radii, so that VdWtol > 0.8 Å represents additional flexibility that will lead to greater SAmax and successive entry of more ionizable groups to the DH calculation regime.

Figure 4. PSBD (blue backbone):E3 dimer (green and orange surfaces) interface, showing mutated charge network.

Variation of RMS pKa errors with VdWtol

Averages of fractional SAmax, over ionizable groups in representative proteins 1b0d and 1ado, show that most groups attain close to full SA by VdWtol = 0.8 Å, illustrative of general surface location (Fig. 5A). Specific groups (E35 of 1b0d and K229 of 1ado shown) retain SAmax < 0.75 (Figs. 3, 5A) at VdWtol = 1.4 Å. A fraction of 0.75 describes a pivot point largely separating relatively buried, active-site groups from surface, flexible side chains.

Figure 5B shows that for the 110 groups, excluding most of the large active-site ΔpKas, FD performs poorly and DH well in comparison to the null hypothesis. For the 117 set, the large ΔpKas impair DH performance, but have relatively little impact overall on FD, which remains dominated by inaccuracy within the 110 set. A broad region of optimal fit to experiment is found as VdWtol is varied for FD/DH. Increasing VdWtol adds flexibility that reduces the erroneously large ΔpKas for surface groups in the 110 set (schematically at the top of Fig. 1), while too much flexibility (beyond VdWtol = 1.4 Å) allows buried group relaxation (bottom of Fig. 1) to an extent that is inconsistent with experiment. From the optimal region, VdWtol = 1.4 Å is taken for FD/DH pKa calculations.

Figure 5. VdWtol variation, SAmax, and RMS pKa errors. (A) Representative averages over a protein and active-site groups are shown for SAmax vs. VdWtol. Intersection of the SAmax = 0.75 and VdWtol = 1.4 lines gives FD/DH parameterization. (B) RMS pKa errors are shown for FD/DH calculations for each of the 110 and 117 group sets (with FD only and DH only at each extreme), and horizontal lines ("null") give RMS errors for model compound pKas.

Comparison of pKa calculation methods

Table 6 gives overall RMS pKa errors for various methods. FD/εp = 20 (Antosiewicz et al. 1994) is similar to DH, consistent with a water-like environment (Warwicker 1999). Similarly, FD/εp = 4 with SA derived from VdW radii rather than a probe reentrant surface (Dong and Zhou 2002) has a large degree of internal water, and is comparable overall to DH. FD/DH performs moderately better with hydration entropy modification than without, with the largest difference for the active-site cysteine pKas of papain and DsbA (Table 7). Overall results are improved somewhat with the simple polar hydrogen optimization algorithm, consistent with previous observations (Nielsen et al. 1999). Differences associated with application of either the hydration entropy or polar hydrogen placement schemes are generally smaller than those from FD/DH introduction relative to FD (Table 6). The MCCE method gives an RMS pKa error for the 110 groups of 0.92 (Georgescu et al. 2002), compared with 0.58 for DH and 0.79 for FD/DH, while the single conformer FDPB method used by these authors gives 2.05, close to the current FD value of 2.10.

Table 6. RMS pKa errors compared across calculations

         Null  FD        FD      DH    FD/DH  FD/DH no     FD/DH no  FD
               εp = 20   VdW SA               Δ(TΔS)QWAT   Hmove
110 gps  0.77  0.78      1.06    0.58  0.79   1.02         1.11      2.10
117 gps  1.24  1.25      1.46    1.18  0.86   1.25         1.16      2.06

Sets of 110 and 117 groups explained in Materials and Methods. All calculations include polar hydrogen optimization except for the column labeled "no Hmove". FD/DH schemes include hydration modification for the FD component except for the "no Δ(TΔS)QWAT" column. Of the FD calculations, only the FD column uses this modification. It is not appropriate for either εp = 20 or VdW solvent accessibility schemes, which are closer to DH in terms of water domination. FD/DH calculations were made with VdWtol = 1.4 Å. Null refers to model compound pKas in place of calculated values. RMS pKa error = sqrt{[Σ(pKa(expt) − pKa(calc))^2]/N}, where the sum runs over the N ionizable groups with measured pKas.

Scatter plots for ΔpKas and pKas (Fig. 6) show that FD has a bias toward overestimation of pKa shifts, while for DH an underestimation of ΔpKas is evident. These trends are
corrected to a large extent in the FD/DH model, although several groups remain significantly in error.

Remaining discrepancies with the FD/DH method

Several of the groups used in the Ns analysis have FD ΔpKa > 1, despite derivation of Ns values from experimental pKas (Table 7). Discrepancy arises upon moving to multiple ionizable groups from a fixed charge environment, and from uniform Ns for each class of ionizable group. The DH model gives low RMS pKa errors for some proteins, but also some large values for active-site pKas. FD/DH largely matches the low RMS errors of DH within the 110 set, illustrating DH dominance of statistical weightings (Table 7). Ribonucleases 3rn3 and 2rn2 are the biggest exceptions.

For 3rn3, groups with ΔpKa to experiment >1.5 are H48 and H119. There are alternate locations for H119 (“A” was used), and a sulphate ion that may give overestimation of stabilizing interactions (FD/DH pKa = 8.3 and measured pKa = 6.1). H48 of 3rn3 has an FD/DH pKa of 4.6 and a measured pKa of 6.3. H48 is relatively buried and excluded from the DH scheme. FD interactions are dominated by an ion pair with D14 that varies between ribonuclease A structures (Georgescu et al. 2002).

Calculations for 2rn2 were made without magnesium to match experiment. FD/DH discrepancies with experiment >1.5 pKa units are E48 and H114. For E48 (1.7 calculated and 4.4 measured), DH is incorporated at VdWtol = 1.4 Å but not at VdWtol = 1.2 Å. However, the calculated pKa remains the same, because interactions are dominated by hydrogen bonding to background charges in the FD scheme. It is possible that unfavorable interactions with the D10, D70, and D134 charge cluster surrounding E48 have been underestimated. These groups are DH accessible, so that relatively weak DH interactions dominate. For H114 an FD/DH pKa at VdWtol = 1.4 Å of 1.0 compares with a measured pKa of 5.0. This residue is relatively buried, with an enforced FD-only scheme. The key (FD) interaction is a hydrogen bond to the peptide NH of C63, so that H114 protonation must involve some change in structure.

Table 7. RMS pKa errors compared for DH, FD/DH, and FD methods

PDB    ngps   DH     FD/DH   FD/DH no Δ(TΔS)QWAT   FD
4pti   11     0.29   0.35    0.35                  1.90
3icb   10     0.40   0.37    0.39                  0.97
1b0d   18     0.74   0.47    0.61                  1.49
1pga   13     0.46   0.80    0.47                  1.68
3rn3   15     0.45   0.87    2.02                  2.43
2rn2   22     0.59   1.17    1.20                  1.97
1a2p   10     0.58   0.76    0.59                  3.22
1ppf   11     0.81   0.77    0.74                  2.76
1xnb   1      2.48   2.35    2.34                  0.62
9pap   1      4.85   2.64    4.95                  0.65
1a21   1      4.86   0.71    5.35                  1.14
1ado   1      4.84   0.84    0.33                  1.44
1axt   1      6.04   1.06    1.85                  1.55
1gsd   1      1.59   1.36    1.69                  0.22
1nai   1      3.34   0.23    1.9                   0.64

All FD component calculations are single conformer εp = 4, and use the Δ(TΔS)QWAT modification unless stated. Polar hydrogen optimization is applied in all cases. FD/DH calculations made with VdWtol = 1.4 Å.

Of the seven additional proteins with individual groups, 1xnb/E172 and 9pap/C25 give the worst results for FD/DH (Table 7). For 1xnb, this discrepancy (calculated pKa = 4.4, measured = 6.7) is a failure of the FD/DH scheme, because FD alone gives a large positive ΔpKa for E172 with or without hydration entropy modification. Again, more detailed FD analysis of coupled clusters may improve the FD/DH model, a conclusion supported for xylanase by pKa calculations using subsets of ionizable groups (Nielsen and McCammon 2003).

The FD/DH pKa for C25 in papain (9pap) is 5.9, compared with 3.3 by experiment, and in contrast the FD pKa is a good match. Both C25 and H159 are accessible to the DH scheme at VdWtol = 1.4 Å. Because the FD result corresponds to significant stabilization of C25, it might be expected to figure more strongly in FD/DH calculation at VdWtol = 1.4 Å. Of the four possible samplings of C25 and H159 net charged forms, only one (both FD) will give a
large favorable interaction. Alteration from FD to FD/DH may therefore result from relative weighting toward weak (DH) interactions. Active-site residue Y9 of 1gsd is allowed access to the DH scheme at VdWtol = 1.4 Å (FD/DH pKa = 9.5 and measured 8.1), but not at VdWtol = 1.3 Å (calculated pKa = 8.7). It is tightly coupled to R20, which also has a DH accessibility transition over this VdWtol range.

Ionizable group electrostatic energy contributions

A move away from neutral pH generally destabilizes the folded state (Fink et al. 1994), consistent with a stabilizing network of positive and negative charges (Wada and Nakamura 1981), and with destabilization by charge reversal of surface amino groups (Hollecker and Creighton 1982). Computational methods with FD/εp = 20 or DH interactions accommodate such pH dependence, albeit with adjustment for pH-dependent effects in unfolded states (Schaefer et al. 1997; Warwicker 1999).

However, use of single conformer FD/εp = 4 has given rise to questions of salt-bridge stability (Hendsch and Tidor 1994; Dong and Zhou 2002), as is shown with the mostly unfavorable contributions of ionizable group interactions to stability at pH 7 (FD column of Table 8). In contrast to the DH column, FD calculations suggest that most of these folded proteins would be more stable with global replacement of ionizable groups, at variance with the prior discussion. The FD/DH method recovers favorable contributions to neutral pH stability. In many cases FD/DH matches DH closely, with DH interactions dominating FD overall. The particularly favorable values in Table 8 for 3icb/calbindin with FD and FD/DH calculation result from the strong binding of calcium ions by acidic side chains.

Improved active-site prediction

Figure 7 demonstrates the utility of a method that focuses on large ΔpKas in a background of smaller ΔpKas. Four representative examples are shown, with ionizable groups disallowed from DH interactions (VdWtol = 1.4 Å), further restricted to those with calculated pKas between 3 and 10. This latter feature excludes side chains such as tyrosine or histidine that are largely buried but remain neutral over a pH range around physiological, and buried groups in particularly stable charge networks. In Figure 7, just active-site residues remain, including D31 of 1nai that lies on the opposite end of the NAD binding groove to Y149. The large number of tyrosine side chains that are relatively buried (fractional SAmax < 0.75 at VdWtol = 1.4 Å), but with pKa > 10, show the potential for FD/DH to improve active-site identification methods (Elcock 2001; Ondrechen et al. 2001) that tend to give such groups as false positives. In addition, FD/DH allows detailed investigation of active-site electrostatics, coupled to existing biochemical data or on the basis of structural genomics predictions.

Figure 6. Scatter plots (calculation vs. experiment) for ΔpKas and pKas with FD, DH, and FD/DH calculations (117 group set).

Conclusions

Extension of the hydration entropy modification for pKa calculations (Warwicker 1997) with a set of active-site ΔpKas (Fig. 2; Table 2) gives results that are consistent with small molecule ionization data (Tables 3, 4). For cysteine, the modification can significantly influence calculated
pKas. In general, the empirical modification makes relatively small changes (Tables 6, 7), agreeing with the fruitful application of FD methods to active-site pKas in previous work. For some binding processes, entropy gain upon dehydration makes a significant contribution (Fig. 4; Jung et al. 2002), and the empirical hydration model could prove useful (Table 5).

The effectiveness of FD/DH calculations in combining FD/εp = 4 and DH computation is illustrated in Figure 5B. The simple DH model is relatively good for the mostly small ΔpKas of the 110 groups set (Table 6), with an RMS pKa error of 0.58, matching the performance of more complex methods (Mehler and Guarnieri 1999; Georgescu et al. 2002). However, DH performance is eroded when larger ΔpKas are included (117 groups). The FD/εp = 4 single-conformer method fails to consistently predict ΔpKas in the 110 set. As the system is relaxed in the FD/DH model, with selective access to DH interactions mimicking side-chain and possibly main-chain readjustment on a limited scale, an optimal relaxation is evident before key groups are erroneously solvent-exposed. Large-scale conformational change cannot be reliably predicted, and is not the subject of this work. A VdWtol parameter controls conformational relaxation through mean-field rotamer packing and estimation of SAmax for each ionizable group (Figs. 3, 5A). At VdWtol = 1.4 Å, FD/DH gives an RMS pKa error of 0.86 for the 117 groups, compared with null, DH and FD values of 1.24, 1.18, and 2.06, respectively. The tendencies of FD to overestimate small ΔpKas and DH to underestimate large ΔpKas are clear in Figure 6. Factors that could further improve the FD/DH scheme include modeling of larger scale flexibility and pH-dependent ion binding, and detailed water structure (Koumanov et al. 2002). Tightly coupled clusters, with groups that are borderline for access to the DH scheme at VdWtol = 1.4 Å, may benefit from FD analysis of individual rotamer combinations. This refers to relatively few groups; the majority would be included in FD/DH sampling.

Generally, the FD/DH model is improved by hydration entropy modification and polar hydrogen optimization, but not to the extent of FD/DH improvement over FD (Tables 6, 7). Two areas further demonstrate potential. First (Table 8), FD/DH recovers overall stabilizing contributions for ionizable group interactions at pH 7, in contrast to the destabilization of the FD/εp = 4 single-conformer model. Second, Figure 7 demonstrates active-site identification, which could be used in a structural genomics context or provide subsets of residues to seed active-site–centered electrostatics calculations (Nielsen and McCammon 2003). The computational speed of the FD/DH method, coupled to accuracy over large and small pKa deviations, makes it suitable for large-scale database analysis.

Table 8. Predicted ionizable group array energies at pH 7 (kJ/mole)

PDB    FD       DH       FD/DH
4pti   24.1     −5.8     −5.2
3icb   −368.9   −57.4    −423.3
1b0d   5.3      −26.6    −35.3
1pga   −11.3    −23.5    −43.1
3rn3   26.3     −21.7    −22.0
2rn2   67.4     −29.7    −30.5
1a2p   −21.3    −38.9    −69.4
1ppf   0.1      −8.3     −17.5
1xnb   53.6     −29.1    −22.4
9pap   73.3     −44.2    −24.9
1a21   51.8     −39.4    −42.9
1ado   90.4     −101.5   −92.9
1axt   84.4     −101.2   −106.3
1gsd   135.6    −72.8    −65.3
1nai   81.2     −81.6    −73.4

Calcium ions were included in all 3icb calculations leading to significant stabilization. Ionizable group energy calculations given in Materials and Methods.

Figure 7. Active-site identification using FD/DH. Four proteins (green backbones) used in pKa calculations are shown, with ionizable groups assessed as disallowed from DH interactions in orange, and a subset with calculated pKas between 3 and 10 shown in purple and labeled.

Materials and methods

Coordinates and pKas

Proteins and coordinate sets (Berman et al. 2000) are given in Table 1. A previous study of computational methods (Georgescu et al. 2002) has collated pKa data for the first eight proteins in Table 1, covering 126 ionizable groups, although the authors report that
some of these ionizations may overlap a limit of the measured pH range and/or a protein unfolding transition. Further investigation of the literature cited by Georgescu et al. (2002), with regard to these criteria, led to the exclusion of the following groups in the current study: D54, E73, D93, D101 of 1a2p; Y23, Y35, and K41 of 4pti; D66 of 1b0d; D7, D27, Y31, and C-t of 1ppf; K13 of 1pga; D14 of 3rn3; D102 and D148 of 2rn2. These 16 exclusions reduce 126 groups for the eight proteins studied by Georgescu et al. (2002) to 110, which also excludes the C-t of 1pga.

This set of 110 groups was supplemented by a further seven active-site groups, with large ΔpKas for FD/DH analysis, forming the “117” set. Groups and coordinates used for development of the empirical hydration model (Warwicker 1997), including these seven active-site residues, are listed in Table 1. In addition, coordinate set 1ebd was used to study interactions within the pyruvate dehydrogenase complex.

Electrostatics calculations (FD and DH)

DH (Warwicker 1999) and FD (εp = 4, single conformer) (Warwicker 1998) calculations followed previous work, except for hydration entropy and polar hydrogen optimization modifications (next section). Model compound pKas used and charge assignment for ionizable group atoms were Arg 12.0 (0.5/NH1, 0.5/NH2); Lys 10.4 (1.0/NZ); His 6.3 (0.5/ND1, 0.5/NE2); Asp 4.0 (−0.5/OD1, −0.5/OD2); Glu 4.4 (−0.5/OE1, −0.5/OE2); Cys 8.3 (−1.0/SG, where not disulfide bonded); Tyr 10.2 (−1.0/OH); N-t 7.5 (1.0/N); C-t 3.8 (−0.5/O, −0.5/OXT). Cysteine side chains were excluded from pKa calculations, except for papain C25, DsbA C30 and C33, and thioredoxin C32 and C35. Partial charges were assigned from the GROMOS library (van Gunsteren and Berendsen 1987). Where specified, εp = 20 was used in place of εp = 4 for the FD model. A water relative dielectric of 78.4 was used throughout. An alternative to a solvent-probed (probe radius = 1.4 Å) reentrant surface was tested, a VdW sphere derived surface giving greater water penetration into the protein. All DH and FD computations were made with a linear ion response at 0.15 Molar.

The energy of a protonation microstate relative to the fully unprotonated state for M titratable groups is (modified from Bashford and Karplus 1990):

$$
G_s = \sum_{m=1}^{M} x_m\,(2.303)\,k_B T\,(\mathrm{pH} - \mathrm{p}K_{m,\mathrm{model}})
+ \sum_{m=1}^{M} \lvert x_m + q_m^{0} \rvert \left( \Delta G_{\mathrm{Born}} + \Delta G_{(T\Delta S)\mathrm{QWAT}} + \Delta G_{\mathrm{back}} \right)
+ 0.5 \sum_{m=1}^{M} \; \sum_{n=1,\, n \neq m}^{M} \lvert x_m + q_m^{0} \rvert \, \lvert x_n + q_n^{0} \rvert \, \Delta G_{mn}
$$

where $x_m$, $x_n$ are members of vectors (lengths M) taking the values 0 (unprotonated) and 1 (protonated), $q_m^{0}$, $q_n^{0}$ are the charges of the unprotonated sites m, n, and $\mathrm{p}K_{m,\mathrm{model}}$ is the model compound pKa for site m. Interactions of group m with the protein dielectric and nonionizable (background) charge environments are differenced to calculations for each ionizable group extracted from the protein, but remaining on the same FD grid ($\Delta G_{\mathrm{Born}}$, $\Delta G_{\mathrm{back}}$), while $\Delta G_{mn}$ gives the interaction between ionizable groups. The term $\Delta G_{(T\Delta S)\mathrm{QWAT}}$ arises from the hydration entropy model discussed in the next section. This definition of $G_s$ applies to either FD or DH calculations, but in the DH case the $\Delta G_{\mathrm{Born}}$ and $\Delta G_{(T\Delta S)\mathrm{QWAT}}$ terms are zero. In addition $\Delta G_{(T\Delta S)\mathrm{QWAT}}$ is set to zero for εp = 20 calculations. Average fractional protonation for site m is derived as:

$$
\langle x_m \rangle = \frac{\sum_{\mathrm{states},\,s} x_m \exp(-G_s / k_B T)}{\sum_{\mathrm{states},\,s} \exp(-G_s / k_B T)}
$$

where a full evaluation of the partition function (Z) is possible for small numbers of groups using the reduced sites method (Bashford and Karplus 1990) at pH extremes, or alternatively Monte Carlo sampling of lowest energy states is used (Beroza et al. 1991). The calculated pKa of site m is that pH for which $\langle x_m \rangle = 0.5$.

Electrostatic energy contributions from ionizable groups were calculated by summing increments over pH according to $\partial(\Delta G)/\partial \mathrm{pH} = 2.303\,RT\,\Delta Q$, where ΔG, ΔQ differences are relative to pH titration of the same set of ionizable groups with model compound pKas (Antosiewicz et al. 1994). These sums were extended to an extreme pH at which a full evaluation of ionization states was possible, $G = -RT \ln Z$, again with differencing to a calculation for the same groups titrating with model compound pKas.

Hydration entropy modification and polar hydrogen placement

An empirical estimate of the contribution that water ordering makes to the change in ionization energy upon transfer into protein is written as $\Delta G_{(T\Delta S)\mathrm{QWAT}} = V_f \cdot E_s$ (Warwicker 1997), where Vf is the fractional change in first hydration shell volume, and Es is a free energy associated with water ordering for the complete shell. Vf is calculated from FD grids for a group in the protein relative to that whole amino acid extracted from the protein. It is convenient to consider Es in terms of a notional number of water molecules (Warwicker 1997) in the first hydration shell (Ns), using 25 J/K/mole entropy cost of immobilization of a single water molecule or 7.5 kJ/mole at 300 K. In pKa calculations, Es and Ns correspond to differences between water structure in ionized and unionized forms. Averages of Ns are taken over groups within a class, and individual Ns values derived with charge environments fixed according to expected ionization at physiological pH (see Results and Discussion). Thus, Ns derivation was separated from global pKa computation in the FD/DH method through the use of separate calculations with fixed ionization on all groups but one. The Ns parameter gives the empirically derived difference in hydration numbers between ionized and unionized forms, for a complete hydration shell. It is uniform for a class of ionizable group, but each individual modification is proportional to burial (Vf).

Optimization of polar hydrogen locations can improve pKa calculations (Nielsen et al. 1999; Koumanov et al. 2003). A simple procedure was implemented which optimizes polar hydrogen positions for the OH groups of Ser and Thr side chains (and Tyr if present in its neutral form). Torsions are sampled, with 12° resolution, against coulombic interactions from charges other than those belonging to the set of OH groups being optimized. Ionizable groups are specified in their ionized forms, to focus pKa calculations on the net difference upon titration.

Mean-field packing algorithm and access to the DH interaction scheme

A mean-field algorithm (Koehl and Delarue 1994), adjusted with hard sphere clashes replacing Lennard-Jones interactions (Cole and Warwicker 2002), establishes a set of allowed rotamers from a standard library (Tuffery et al. 1997) as well as those in the experimental structure. The conformational matrix (CM) or weight for rotamer k of side chain i (of N side chains), depends on clashes with fixed atoms and the rotamers of other side chains j:
$$
\mathrm{CM}(i,k) =
\left\{ \begin{array}{ll} 1, & \text{no clash with fixed atoms} \\ 0, & \text{clash with fixed atoms} \end{array} \right\}
\times \prod_{j=1,\, j \neq i}^{N} \; \sum_{l=1}^{K_j}
\left\{ \begin{array}{ll} 1, & \text{no clash with } (j,l) \\ 0, & \text{clash with } (j,l) \end{array} \right\}
\mathrm{CM}(j,l)
$$

Weights for rotamers l = 1 to Kj of side chain j are summed, and the multiplication of these side-chain and fixed-atom weights requires that each be nonzero for a packing solution for rotamer k of residue i. CM values are iterated from an initial assignment of equal weights. These mean-field methods were used previously to assess the entropy associated with side-chain rotamer distributions, but are used here simply to estimate possible rotamers for a given structure and packing tolerance (VdWtol).

Hydrogen atoms are absent in the packing calculations, which use united atom VdW radii, so that a VdWtol of about 0.8 Å is typically required to repack rotamers of the experimental structure, due particularly to side-chain/main-chain clashes. Above this value clashes may partially account for conformational relaxation, such as variation from library side-chain rotamers or small main-chain movements.

An estimate of maximal solvent accessibility, SA (SAmax), is obtained for each ionizable group. For each neighboring side chain, the (allowed) rotamer that provides the most SA for the charged atom of the ionizable group under study is used to mark SA information on a spherical polar grid around that atom. This grid is scanned after complete neighbor analysis for the overall SAmax. Maximal SA estimates for an ionizable group with more than one charged atom in our model are averaged. The likelihood that small conformational adjustment could lead to substantial solvent exposure of each group is assessed with SAmax (Fig. 3). If SAmax/SAfixed-atoms ≥ 0.75, where SAfixed-atoms is calculated in the context of fixed atoms only (i.e., excluding atoms that move in the rotamer library), then that group is allowed to sample DH (water-dominated) interactions.

Combined FD/DH calculations

In the FD/DH combination, each site m in the equation for microstate energy Gs is sampled as protonated or unprotonated in each of FD and DH (if allowed) schemes, with the component energies of Gs assigned appropriately. Thus, a standard single conformer pKa calculation with 2^M states for M groups approaches 4^M states in FD/DH, because most sites are relatively flexible and have access to the DH scheme. Note that for any group m sampled as DH, all mn (and nm) interactions are DH whether group n is sampled as FD or DH, that is, the water-dominated scheme persists for any interaction involving a group sampled in a presumed water-rich environment. Ionizable group array energies and pKas in FD/DH calculations are derived as for FD or DH alone. No attempt is made to assign a density of conformational states, other than DH access or not, so that electrostatic energy determines the relative weightings of FD and DH. A moderate energy (favorable or unfavorable) from DH interaction would outweigh a highly unfavorable FD interaction, while a large and favorable FD term will dominate DH.

Acknowledgments

The European Union is thanked for funding during the course of this work.

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.

References

Alberty, R.A. 1983. Physical chemistry, 6th ed. John Wiley & Sons, New York.
Alexov, E. 2003. Role of the protein side-chain fluctuations on the strength of pair-wise electrostatic interactions: Comparing experimental with computed pKas. Proteins 50: 94–103.
Alexov, E. and Gunner, M.R. 1997. Incorporating protein conformational flexibility into the calculation of pH-dependent protein properties. Biophys. J. 74: 2075–2093.
Anderson, D.E., Becktel, W.J., and Dahlquist, F.W. 1990. pH-induced denaturation of proteins: A single salt bridge contributes 3–5 kcal/mol to the free energy of folding of T4 lysozyme. Biochemistry 29: 2403–2408.
Antosiewicz, J., McCammon, J.A., and Gilson, M.K. 1994. Prediction of pH-dependent properties in proteins. J. Mol. Biol. 238: 415–436.
Antosiewicz, J., McCammon, J.A., and Gilson, M.K. 1996. The determinants of pKas in proteins. Biochemistry 35: 7819–7833.
Barbas III, C.F., Heine, A., Zhong, G., Hoffmann, T., Gramatikova, S., Björnestedt, R., List, B., Anderson, J., Stura, E.A., Wilson, I.A., et al. 1997. Immune versus natural selection: Antibody aldolases with enzymic rates but broader scope. Science 278: 2085–2092.
Bashford, D. and Karplus, M. 1990. pKa’s of ionizable groups in proteins: Atomic detail from a continuum electrostatic model. Biochemistry 29: 10219–10225.
Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., and Bourne, P.E. 2000. The Protein Data Bank. Nucleic Acids Res. 28: 235–242.
Beroza, P., Fredkin, D.R., Okamura, M.Y., and Feher, G. 1991. Protonation of interacting residues in a protein by a Monte Carlo method: Application to lysozyme and the photosynthetic reaction center of Rhodobacter sphaeroides. Proc. Natl. Acad. Sci. 88: 5804–5808.
Björnestedt, R., Stenberg, G., Widersten, M., Board, P.G., Sinning, I., Jones, T.A., and Mannervik, B. 1995. Functional significance of arginine 15 in the active site of human class α glutathione transferase A1–1. J. Mol. Biol. 247: 765–773.
Blom, N. and Sygusch, J. 1997. Product binding and role of the C-terminal region in class I D-fructose 1,6-bisphosphate aldolase. Nat. Struct. Biol. 4: 36–39.
Cameron, A.D., Sinning, I., L’Hermite, G., Olin, B., Board, P.G., Mannervik, B., and Jones, T.A. 1995. Structural analysis of human α-class glutathione transferase A1–1 in the apo-form and in complexes with ethacrynic acid and its glutathione conjugate. Structure 3: 717–727.
Chivers, P.T., Prehoda, K.E., Volkman, B.F., Kim, B.M., Markley, J.L., and Raines, R.T. 1997. Microscopic pKa values of Escherichia coli thioredoxin. Biochemistry 36: 14985–14991.
Cole, C. and Warwicker, J. 2002. Side-chain conformational entropy at protein–protein interfaces. Protein Sci. 11: 2860–2870.
Demchuk, E. and Wade, R.C. 1996. Improving the continuum dielectric approach to calculating pKas of ionizable groups in proteins. J. Phys. Chem. 100: 17373–17387.
Dong, F. and Zhou, H.X. 2002. Electrostatic contributions to T4 lysozyme stability: Solvent-exposed charges versus semi-buried salt bridges. Biophys. J. 83: 1341–1347.
Edgcomb, S.P. and Murphy, K.P. 2002. Variability in the pKa of histidine side-chains correlates with burial within proteins. Proteins 49: 1–6.
Elcock, A.H. 2001. Prediction of functionally important residues based solely on the computed energetics of protein structure. J. Mol. Biol. 312: 885–896.
Fersht, A.R. and Renard, M. 1974. pH-dependence of chymotrypsin catalysis. Appendix: Substrate binding to dimeric α chymotrypsin studied by x-ray diffraction and equilibrium method. Biochemistry 13: 1416–1426.
Fink, A.L., Calciano, L.J., Goto, Y., Kurotsu, T., and Palleros, D.R. 1994. Classification of acid denaturation of proteins: Intermediates and unfolded states. Biochemistry 33: 12505–12511.
Fitch, C.A., Karp, D.A., Lee, K.K., Stites, W.E., Lattman, E.E., and Garcia-Moreno, E.B. 2002. Experimental pKa values of buried residues: Analysis with continuum methods and role of water penetration. Biophys. J. 82: 3289–3304.
Georgescu, R.E., Alexov, E., and Gunner, M.R. 2002. Combining conformational flexibility and continuum electrostatics for calculating pKas in proteins. Biophys. J. 83: 1731–1748.
Gilson, M.K. and Honig, B.H. 1986. The dielectric constant of a folded protein. Biopolymers 25: 2097–2119.
Gorfe, A.A., Ferrara, P., Caflisch, A., Marti, D.N., Bosshard, H.R., and Jelesarov, I. 2002. Calculation of protein ionization equilibria with conformational sampling: pKas of a model leucine zipper, GCN4 and barnase. Proteins 46: 41–60.
Grauschopf, U., Winther, J.R., Korber, P., Zander, T., Dallinger, P., and Bardwell, J.C. 1995. Why is DsbA such an oxidizing catalyst? Cell 83: 947–955.
Guddat, L.W., Bardwell, J.C., and Martin, J.L. 1998. Crystal structures of reduced and oxidized DsbA: Investigation of domain motion and thiolate stabilization. Structure 6: 757–767.
Hendsch, Z.S. and Tidor, B. 1994. Do salt-bridges stabilize proteins? A continuum electrostatic analysis. Protein Sci. 3: 211–226.
Hollecker, M. and Creighton, T.E. 1982. Effect on protein stability of reversing the charge on amino groups. Biochim. Biophys. Acta 701: 395–404.
Honig, B. and Nicholls, A. 1995. Classical electrostatics in biology and chemistry. Science 268: 1144–1149.
Ikeuchi, Y., Katerelos, N.A., and Goodenough, P.W. 1998. The enhancing of a cysteine protease activity at acidic pH by protein engineering, the role of glutamic 50 in the enzyme mechanism of caricain. FEBS Lett. 437: 91–96.
Irving, R.J., Nelander, L., and Wadso, I. 1964. Thermodynamics of the ionization of some thiols in aqueous solution. Acta. Chem. Scand. 18: 769–787.
Izatt, R.M. and Christensen, J.J. 1976. Heats of proton ionization, pK, and related thermodynamic quantities. In Handbook of biochemistry and molecular biology: Physical and chemical data (ed. G.D. Fasman), 3rd ed., Vol. I, pp. 151–269. CRC Press, Cleveland, OH.
Joshi, M.D., Hedberg, H., and McIntosh, L.P. 1997. Complete measurement of the pKa values of the carboxyl and imidazole groups in Bacillus circulans xylanase. Protein Sci. 6: 2667–2670.
Jung, H.-I., Cooper, A., and Perham, R.N. 2002. Identification of key amino acid residues in the assembly of enzymes into the pyruvate dehydrogenase complex of Bacillus stearothermophilus: A kinetic and thermodynamic analysis. Biochemistry 41: 10446–10453.
Klapper, I., Hagstrom, R., Fine, R., Sharp, K., and Honig, B. 1986. Focusing of electric fields in the active site of Cu-Zn superoxide dismutase: Effects of ionic strength and amino-acid modification. Proteins 1: 47–59.
Koehl, P. and Delarue, M. 1994. Application of a self-consistent mean field theory to predict side-chains conformation and estimate their conformational entropy. J. Mol. Biol. 239: 249–275.
Koumanov, A., Karshikoff, A., Friis, E.P., and Borchert, T.V. 2001. Conformational averaging in pK calculations: Improvement and limitations in prediction of ionization properties of proteins. J. Phys. Chem. B 105: 9339–9344.
Koumanov, A., Ruterjans, H., and Karshikoff, A. 2002. Continuum electrostatic analysis of irregular ionization and proton allocation in proteins. Proteins 46: 85–96.
Koumanov, A., Benach, J., Atrian, S., Gonzalez-Duarte, R., Karshikoff, A., and Ladenstein, R. 2003. The catalytic mechanism of Drosophila alcohol dehydrogenase: Evidence for a proton relay modulated by the coupled ionization of the active site Lysine/Tyrosine pair and a NAD+ ribose OH switch. Proteins 51: 289–298.
Kuramitsu, S. and Hamaguchi, K. 1980. Analysis of the acid-base titration curve of hen lysozyme. J. Biochem. 87: 1215–1219.
Lehoux, I.E. and Mitra, B. 1999. (S)-Mandelate dehydrogenase from Pseudomonas putida: Mechanistic studies with alternate substrates and pH and kinetic isotope effects. Biochemistry 38: 5836–5848.
Liu, Y., Thoden, J.B., Kim, J., Berger, E., Gulick, A.M., Ruzicka, F.J., Holden, H.M., and Frey, P.A. 1997. Mechanistic roles of tyrosine 149 and serine 124 in UDP-galactose 4-epimerase from Escherichia coli. Biochemistry 36: 10675–10684.
Mehler, E.L. and Guarnieri, F. 1999. A self-consistent, microenvironment modulated screened coulomb potential approximation to calculate pH-dependent electrostatic effects in proteins. Biophys. J. 75: 3–22.
Morillas, M., Goble, M.L., and Virden, R. 1999. The kinetics of acylation and deacylation of penicillin acylase from Escherichia coli ATCC11105: Evidence for lowered pKa values of groups near the catalytic centre. Biochem. J. 338: 235–239.
Morris, A.J. and Tolan, D.R. 1994. Lysine-146 of rabbit muscle aldolase is essential for cleavage and condensation of the C3–C4 bond of fructose 1,6-bis(phosphate). Biochemistry 33: 12291–12297.
Nielsen, J.E. and McCammon, J.A. 2003. Calculating pKa values in enzyme active sites. Protein Sci. 12: 1894–1901.
Nielsen, J.E., Andersen, K.V., Honig, B., Hooft, R.W.W., Klebe, G., Vriend, G., and Wade, R.C. 1999. Improving macromolecular electrostatics calculations. Protein Eng. 12: 657–662.
Noble, M.A., Gul, S., Verma, C.S., and Brocklehurst, K. 2000. Ionization characteristics and chemical influences of aspartic acid residue 158 of papain and caricain determined by structure-related kinetic and computational techniques: Multiple electrostatic modulators of active-centre chemistry. Biochem. J. 351: 723–733.
Ondrechen, M.J., Clifton, J.G., and Ringe, D. 2001. THEMATICS: A simple computational predictor of enzyme function from structure. Proc. Natl. Acad. Sci. 98: 12473–12478.
Schaefer, M., Sommer, M., and Karplus, M. 1997. pH-dependence of protein stability: Absolute electrostatic free energy differences between conformations. J. Phys. Chem. B 101: 1663–1683.
Schutz, C.N. and Warshel, A. 2001. What are the dielectric “constants” of proteins and how to validate electrostatic models? Proteins 44: 400–417.
Simonson, T. 2001. Macromolecular electrostatics: Continuum models and their growing pains. Curr. Opin. Struct. Biol. 11: 243–252.
Simonson, T. and Perahia, D. 1995. Internal and interfacial dielectric properties of cytochrome c from molecular dynamics in aqueous solution. Proc. Natl. Acad. Sci. 92: 1082–1086.
Sundd, M., Iverson, N., Ibarra-Molero, B., Sanchez-Ruiz, J.M., and Robertson, A.D. 2002. Electrostatic interactions in ubiquitin: Stabilization of carboxylates by lysine amino groups. Biochemistry 41: 7586–7596.
Thoden, J.B., Frey, P.A., and Holden, H.M. 1996. Crystal structures of the oxidized and reduced forms of UDP-galactose 4-epimerase isolated from Escherichia coli. Biochemistry 35: 2557–2566.
Thunnissen, M.M.G.M., Eiso, A.B., Kalk, K.H., Drenth, J., Dijkstra, B.W., Kuipers, O.P., Dijkman, R., de Haas, G.H., and Verheij, H.M. 1990. X-ray structure of phospholipase A2 complexed with a substrate-derived inhibitor. Nature 347: 689–691.
Tuffery, P., Etchebest, C., and Hazout, S. 1997. Prediction of protein side chain conformations: A study on the influence of backbone accuracy on conformation stability in the rotamer space. Protein Eng. 10: 361–372.
van Gunsteren, W.F. and Berendsen, H.J.C. 1987. GROMOS manual. University of Groningen, The Netherlands.
van Vlijmen, H.W., Schaefer, M., and Karplus, M. 1998. Improving the accuracy of protein pKa calculations: Conformational averaging versus the average structure. Proteins 33: 145–158.
Verheij, H.M., Volwerk, J.J., Jansen, E.H.J.M., Puyk, W.C., Dijkstra, B.W., Drenth, J., and de Haas, G.H. 1980. Methylation of histidine 48 in pancreatic phospholipase A2. Role of histidine and calcium ion in the catalytic mechanism. Biochemistry 19: 743–750.
Wada, A. and Nakamura, H. 1981. Nature of the charge distribution in proteins. Nature 293: 757–758.
Warshel, A. 1981. Calculations of enzymatic reactions: Calculations of pKa, proton transfer reactions, and general acid catalysis reactions in enzymes. Biochemistry 20: 3167–3177.
Warshel, A. 2003. Computer simulations of enzyme catalysis: Methods, progress, and insights. Annu. Rev. Biophys. Biomol. Struct. 32: 425–443.
Warshel, A. and Papazyan, A. 1998. Electrostatic effects in macromolecules: Fundamental concepts and practical modelling. Curr. Opin. Struct. Biol. 8: 211–217.
Warwicker, J. 1986. Continuum dielectric modelling of the protein–solvent system, and calculation of the long-range electrostatic field of the enzyme phosphoglycerate mutase. J. Theor. Biol. 121: 199–210.
Warwicker, J. 1997. Improving pKa calculations with consideration of hydration entropy. Protein Eng. 10: 809–814.
Warwicker, J. 1998. Modeling charge interactions and redox properties in DsbA. J. Biol. Chem. 273: 2502–2504.
Warwicker, J. 1999. Simplified methods for pKa and acid pH-dependent stability estimation in proteins: Removing dielectric and counterion boundaries. Protein Sci. 8: 418–425.
Warwicker, J. and Watson, H.C. 1982. Calculation of the electric potential in the active site cleft due to α-helix dipoles. J. Mol. Biol. 157: 671–679.
You, T.J. and Bashford, D. 1995. Conformation and hydrogen ion titration of proteins: A continuum model with conformational flexibility. Biophys. J. 69: 1721–1733.
Zhou, H.X. and Vijayakumar, M. 1997. Modeling of protein conformational fluctuations in pKa predictions. J. Mol. Biol. 267: 1002–1011.