FULL PAPER
WWW.C-CHEM.ORG
Site-Directed Analysis on Protein Hydrophobicity
Song-Ho Chong and Sihyun Ham*
Hydrophobicity of a protein is considered to be one of the major intrinsic factors dictating the protein aggregation propensity. Understanding how protein hydrophobicity is determined is, therefore, of central importance in preventing protein aggregation diseases and in the biotechnological production of human therapeutics. Traditionally, protein hydrophobicity is estimated based on hydrophobicity scales determined for individual free amino acids, assuming that those scales are unaltered when amino acids are embedded in a protein. Here, we investigate how the hydrophobicity of constituent amino acid residues depends on the protein context. To this end, we analyze the hydration free energy (the free energy change on hydration, which quantifies the hydrophobicity) of the wild-type and 21 mutants of amyloid-beta protein associated with Alzheimer’s disease by performing molecular dynamics simulations and integral-equation calculations. From detailed analysis of mutation effects on the protein hydrophobicity, we elucidate how the protein global factor such as the total charge as well as underlying protein conformations influence the hydrophobicity of amino acid residues. Our results provide a unique insight into the protein hydrophobicity for rationalizing and predicting the protein aggregation propensity on mutation, and open a new avenue to design aggregation-resistant proteins as biotherapeutics. © 2014 Wiley Periodicals, Inc.
Introduction
Protein aggregation has been implicated in a number of human diseases.[1,2] Elucidating intrinsic factors of proteins that promote aggregation-prone nature in water has, therefore, been a critical issue to understand and prevent protein aggregation diseases such as Alzheimer’s, Parkinson’s, and type II diabetes.[3–6] It has been generally considered that the protein aggregation is associated with the hydrophobic nature of protein surface that tends to cluster together in aqueous solutions.[7–10] Traditionally, hydrophobicity scales determined for individual free amino acids or side-chain analogs have been used to deduce hydrophobic interaction between side chains in proteins,[11,12] and a change in protein hydrophobicity resulting from mutation has been estimated from the difference in hydrophobicity scales of wild-type and mutant residues.[3] However, it was already recognized decades ago that such “biophysical” hydrophobicity scales, in particular those for charged amino acids, are markedly modified by flanking peptide bonds.[13] Recently, “biological” hydrophobicity scales have been proposed in an attempt to predict transmembrane helices from amino acid sequences.[14] Conspicuous dependences of the hydrophobicity on the position along the sequence as well as on the charge of the flanking residues have been demonstrated,[15] adding a new dimension to the concept of protein hydrophobicity. Furthermore, it has been reported for the entire ensemble of Escherichia coli proteins that the conventional sequence-based hydrophobicity fails to explain why proteins with higher contents of negatively charged residues tend to be less aggregation-prone than those of positively charged ones,[16] indicating that the rationalization of inherent aggregation propensity requires more than sequence-based analyses of the protein.
In this article, we investigate the protein hydrophobicity for the wild-type and 21 mutants of the 42-residue form of amyloid-β (Aβ42) protein, an intrinsically disordered protein whose aggregation is associated with Alzheimer’s disease.[17] This is done by first carrying out all-atom, explicit-water molecular dynamics simulations to sample equilibrium solution structures of these proteins, followed by the integral-equation calculation of the hydration free energy that quantifies the affinity of a whole protein toward water, that is, protein overall hydrophobicity. After demonstrating that the computed protein hydrophobicity is indeed significantly correlated with the experimental aggregation propensity, we perform the site-directed thermodynamic analysis[18,19] to address intrinsic molecular factors determining the protein hydrophobicity. This novel analysis method enables us to resolve macroscopic thermodynamic properties in terms of atomic-level details of protein conformation and hydration structure and has been applied to elucidate the role of hydrophobic and hydrophilic residues in protein misfolding[20] and aggregation[21] processes. We demonstrate how the protein hydrophobicity is affected by the protein global factor such as the total charge and by underlying protein conformations, which are not taken into account in the conventional biophysical hydrophobicity scales. The identification of the factors dictating the protein hydrophobicity makes it possible to rationalize and predict
Journal of Computational Chemistry 2014, 35, 1364–1370
DOI: 10.1002/jcc.23631
S.-H. Chong, S. Ham
Department of Chemistry, Sookmyung Women’s University, Cheongpa-ro 47gil 100, Yongsan-Ku, Seoul 140-742, Korea
E-mail: [email protected]
Contract grant sponsor: Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology; Contract grant numbers: 20120007855, 2012-0003068, and 2012R1A2A01004687
© 2014 Wiley Periodicals, Inc.
WWW.CHEMISTRYVIEWS.COM
Table 1. Change in experimental aggregation propensity and hydration free energy.

Mutant    Mutated sites   ΔQ[a]   Exp.[b]   ΔG_hyd[c]   ΔG_mut    ΔG_ch    ΔG_n-ch
Hydrophobic → Hydrophilic (negative)
DQ        I41D, A42Q      -1      -0.964    -166.8      -169.6    -11.7     14.5
DS        I41D, A42S      -1      -0.913    -197.7      -182.9    -23.9      9.1
HD        I41H, A42D      -1      -0.708    -168.2      -159.9    -10.7      2.4
EL        I41E, A42L      -1      -0.445     -36.9      -141.5     63.5     41.1
Hydrophobic → Hydrophilic (neutral)
HN        I41H, A42N       0      -0.837    -109.4       -29.3    -69.7    -10.3
Flemish   A21G             0      -0.671      -5.5        -4.9     -7.5      6.9
TN        I41T, A42N       0      -0.605     -65.3       -26.6    -35.7     -3.0
TQ        I41T, A42Q       0      -0.590     -36.2       -19.4    -21.6      4.8
LN        I41L, A42N       0      -0.561     -55.1       -14.8    -30.9     -9.4
QY        I41Q, A42Y       0      -0.382     -14.5       -15.9     -5.4      6.7
QL        I41Q, A42L       0      -0.295     -16.7        -8.1      1.0     -9.6
TM        I41T, A42M       0      -0.292     -24.5       -16.0    -10.0      1.5
TI        I41T, A42I       0      -0.075     -23.3         1.0    -27.9      3.6
Hydrophobic → Hydrophilic (positive)
KA        I41K            +1      -0.518      -5.0        -9.5     14.7    -10.2
KL        I41K, A42L      +1      -0.379     -15.1        21.7    -40.6      3.7
RR        I41R, A42R      +2      -0.324     -14.1        43.5    -27.9    -29.8
IR        A42R            +1      -0.034      -1.7        53.0    -58.5      3.8
Hydrophilic (negative) → Hydrophilic (neutral)
Arctic    E22G            +1       0.160     142.0       104.2     36.2      1.6
Iowa      D23N            +1       0.238     112.1       101.7     15.2     -4.8
Dutch     E22Q            +1       0.317     116.1       111.9      3.9      0.3
Hydrophilic (negative) → Hydrophilic (positive)
Italian   E22K            +2       0.328     164.7       106.7     52.7      5.3

[a] Change in the protein total charge on mutation. [b] Change in the experimental aggregation propensity on mutation. [c] Change in the hydration free energy on mutation in units of kcal/mol.
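The Exp. and ΔG_hyd columns of Table 1 can be compared directly. The following is a minimal sketch, not the authors' analysis script, that computes a Pearson correlation coefficient between the two columns as transcribed from the table. The function name `pearson_r` and the tuple layout are illustrative assumptions, and no significance test is attempted.

```python
import math

# (mutant, Exp. column, dG_hyd in kcal/mol), transcribed from Table 1
TABLE1 = [
    ("DQ", -0.964, -166.8), ("DS", -0.913, -197.7), ("HD", -0.708, -168.2),
    ("EL", -0.445, -36.9), ("HN", -0.837, -109.4), ("Flemish", -0.671, -5.5),
    ("TN", -0.605, -65.3), ("TQ", -0.590, -36.2), ("LN", -0.561, -55.1),
    ("QY", -0.382, -14.5), ("QL", -0.295, -16.7), ("TM", -0.292, -24.5),
    ("TI", -0.075, -23.3), ("KA", -0.518, -5.0), ("KL", -0.379, -15.1),
    ("RR", -0.324, -14.1), ("IR", -0.034, -1.7), ("Arctic", 0.160, 142.0),
    ("Iowa", 0.238, 112.1), ("Dutch", 0.317, 116.1), ("Italian", 0.328, 164.7),
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


exp_change = [row[1] for row in TABLE1]
dg_hyd = [row[2] for row in TABLE1]
r = pearson_r(dg_hyd, exp_change)
print(f"Pearson R between dG_hyd and Exp.: {r:.3f}")
```

A positive coefficient here means that mutants with a more positive hydration free energy change tend to show a higher experimental aggregation propensity.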
mutation effects on the protein aggregation propensity, and will also contribute to design aggregation-resistant proteins.
Computational Methods
Systems
We studied the wild-type and 21 mutants of Aβ42 protein listed in Table 1. The wild-type protein has the sequence (1)DAEFRHDSGY EVHHQKLVFF AEDVGSNKGA IIGLMVGGVV IA(42). The mutants include pathogenic familial mutants[22] such as Flemish (A21G), Arctic (E22G), Dutch (E22Q), Italian (E22K), and Iowa (D23N) as well as synthetic mutants[23] in which one or both of the two hydrophobic residues (I41 and A42) at the C-terminus are mutated to hydrophilic residues. They can be classified into five different types (see Table 1) based on the conventional hydrophobicity/hydrophilicity of the mutated residue(s). Relevant to this study are the strong differences in the experimental aggregation propensity of these mutants.[23–25] The change in the aggregation propensity on mutation was quantified by $\log(f_{\mathrm{mut}}/f_{\mathrm{wt}})$ of mutant ($f_{\mathrm{mut}}$) and wild-type Aβ42 ($f_{\mathrm{wt}}$), where $f$ (or $1/f$) refers to a measure of the aggregation tendency (or solubility) available from the literature. We have chosen for $f$ the inverse of the level of soluble protein in vivo for Flemish, Arctic, Dutch, and Italian mutants (taken from Table 1 of Ref. [24]); in vitro thioflavin-T fluorescence intensity for Iowa mutant (Fig. 3B of Ref. [25]); and the inverse of the fluorescence intensity when fused with green fluorescent protein in vivo for synthetic mutants (Fig. 1A of Ref. [23]). The resulting $\log(f_{\mathrm{mut}}/f_{\mathrm{wt}})$ values are presented in Table 1.
Molecular dynamics simulations
For each of the wild-type and 21 mutant proteins, we carried out all-atom, explicit-water molecular dynamics simulations at the temperature T = 300 K and the pressure P = 1 bar using the AMBER11 simulation package.[26] Ten independent production runs of 100 ns length (i.e., an aggregated simulation time of 1 μs) were conducted for each system. We used the ff99SB force field[27] for protein and the TIP4P-Ew model[28] for water. The simulations were performed under neutral pH, where Lys and Arg are positively charged and Glu and Asp are negatively charged. The wild-type Aβ42 is in the −3 charged state, while the protein total charge of mutants varies depending on the mutation (see Table 1). The particle mesh Ewald method[29] was applied for long-range electrostatic interactions, while a 10 Å cutoff was used for short-range nonbonded interactions. Hydrogen atoms were constrained to the equilibrium bond length using the SHAKE algorithm,[30] which enables simulations with a 2 fs time step. Temperature and pressure were controlled by Berendsen’s thermostat and barostat with coupling constants of 1.0 and 2.0 ps, respectively.[31]
As Aβ42 protein in the monomeric state is intrinsically disordered and no atomic structures in aqueous environments are available experimentally, the initial protein structures for the 10 independent production runs were prepared via heating/annealing simulations as described elsewhere.[32] Briefly, high-temperature (600 K) simulations were carried out to generate 10 independent protein conformations, which were then gradually annealed to 300 K. The starting conformation for the heating/annealing simulations of the wild-type Aβ42 was taken from the NMR structure determined in apolar solvents (Protein Data Bank (PDB) ID: 1IYT[33]). The starting structures for the Aβ42 mutants were generated by Swiss PDB Viewer[34] based on the 1IYT structure. We previously demonstrated that the simulated structures so generated for the wild-type Aβ42 are consistent with the experimentally observed NMR J-coupling constants, CD spectra, NMR chemical-shift-index analysis, and nuclear Overhauser effect (NOE) intensity analysis.[32,35]
Hydration free energy and its site-directed analysis
For each simulated system, we took 20,000 protein conformations from the production run of 1 μs (10 × 100 ns) length with a 50 ps time interval. We applied the three-dimensional reference interaction site model (3D-RISM) theory[36–38] to each of the simulated protein conformations for the hydration free energy calculation. The 3D-RISM theory is an integral-equation theory based on statistical mechanics for obtaining the 3D distribution function $g_\gamma(\mathbf{r})$ of the water site $\gamma$ at position $\mathbf{r}$ around a molecular solute such as protein. For a solute–solvent system at infinite dilution, the 3D-RISM equation is given by
$$h_\gamma(\mathbf{r}) = \sum_{\gamma'} c_{\gamma'}(\mathbf{r}) * \left[ w^{vv}_{\gamma'\gamma}(r) + \rho\, h^{vv}_{\gamma'\gamma}(r) \right] \tag{1}$$
Here, $h_\gamma(\mathbf{r})$ and $c_\gamma(\mathbf{r})$ refer to the 3D total and direct correlation functions of the water site $\gamma$, respectively; the asterisk denotes a convolution integral; $w^{vv}_{\gamma'\gamma}(r)$ and $h^{vv}_{\gamma'\gamma}(r)$ are the site–site intramolecular and total correlation functions of water; and $\rho$ represents the average number density of water. This equation is to be supplemented by an approximate closure relation, and in this study, we adopted the one developed by Kovalenko and Hirata[38]
$$h_\gamma(\mathbf{r}) = \begin{cases} \exp[d_\gamma(\mathbf{r})] - 1 & \text{for } d_\gamma(\mathbf{r}) \le 0 \\ d_\gamma(\mathbf{r}) & \text{for } d_\gamma(\mathbf{r}) > 0 \end{cases} \tag{2}$$
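in which $d_\gamma(\mathbf{r}) = -u_\gamma(\mathbf{r})/(k_B T) + h_\gamma(\mathbf{r}) - c_\gamma(\mathbf{r})$ with $k_B$ denoting Boltzmann’s constant. $u_\gamma(\mathbf{r})$ refers to the interaction potential acting on the water site $\gamma$ which is generated by atoms in protein, and is represented by a sum of Lennard-Jones (LJ) and Coulomb electrostatic terms centered on the protein interaction site $\alpha$ of position $\mathbf{r}_\alpha$, $u_\gamma(\mathbf{r}) = \sum_\alpha [u^{\mathrm{LJ}}_{\alpha\gamma}(|\mathbf{r}-\mathbf{r}_\alpha|) + u^{\mathrm{elec}}_{\alpha\gamma}(|\mathbf{r}-\mathbf{r}_\alpha|)]$. Here, $u^{\mathrm{LJ}}_{\alpha\gamma}(r) = 4\epsilon_{\alpha\gamma}[(\sigma_{\alpha\gamma}/r)^{12} - (\sigma_{\alpha\gamma}/r)^{6}]$ and $u^{\mathrm{elec}}_{\alpha\gamma}(r) = q_\alpha q_\gamma / r$ with $\epsilon_{\alpha\gamma}$, $\sigma_{\alpha\gamma}$, $q_\alpha$, and $q_\gamma$ being the LJ parameters and atomic charges. We used the same numerical procedure as described in Ref. [38] to solve eqs. (1) and (2) self-consistently. The water distribution function is then obtained via $g_\gamma(\mathbf{r}) = h_\gamma(\mathbf{r}) + 1$. The hydration free energy $G_{\mathrm{hyd}}$ and its partitioning into contribution $G_\alpha$ from atom $\alpha$ in protein can be calculated from the water distribution function based on the Kirkwood charging formula[18]: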
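$$G_{\mathrm{hyd}} = \sum_\alpha G_\alpha \quad \text{with} \quad G_\alpha = G^{\mathrm{LJ}}_\alpha + G^{\mathrm{elec}}_\alpha \tag{3}$$
in which
$$G^{\mathrm{LJ}}_\alpha = 4\pi\rho \sum_\gamma \int_0^1 d\lambda_1 \int r^2\, dr\, \frac{\partial u^{\mathrm{LJ}}_{\alpha\gamma}(r;\lambda_1)}{\partial \lambda_1}\, g_{\alpha\gamma}(r;\lambda_1,\lambda_2=0) \tag{4}$$
$$G^{\mathrm{elec}}_\alpha = 4\pi\rho \sum_\gamma \int_0^1 d\lambda_2 \int r^2\, dr\, \frac{\partial u^{\mathrm{elec}}_{\alpha\gamma}(r;\lambda_2)}{\partial \lambda_2}\, g_{\alpha\gamma}(r;\lambda_1=1,\lambda_2) \tag{5}$$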
Here, $\lambda_1$ and $\lambda_2$ are the parameters for scaling the LJ parameter ($\lambda_1 \sigma_{\alpha\gamma}$) and the atomic charge ($\lambda_2 q_\alpha$) of the protein, respectively, and the resulting interaction potentials are denoted as $u^{\mathrm{LJ}}_{\alpha\gamma}(r;\lambda_1)$ and $u^{\mathrm{elec}}_{\alpha\gamma}(r;\lambda_2)$. $g_{\alpha\gamma}(r;\lambda_1,\lambda_2)$ refers to the radial distribution function, associated with the 3D distribution function for the parameters $\lambda_1$ and $\lambda_2$ via $g_{\alpha\gamma}(r;\lambda_1,\lambda_2) = (1/4\pi)\int d\hat{\mathbf{r}}\, g_\gamma(\mathbf{r}_\alpha + \mathbf{r};\lambda_1,\lambda_2)$ with $\hat{\mathbf{r}} = \mathbf{r}/r$ and $r = |\mathbf{r}|$. Equations (3)–(5) allow us to obtain the decompositions of $G_{\mathrm{hyd}}$ into contributions from groups of atoms (e.g., residues) as well as into individual components of potential energy terms (e.g., electrostatic term). The main limitation of the 3D-RISM theory lies in the use of an approximate closure relation. In this work, we are primarily interested in the change in the hydration free energy on mutation. As demonstrated before,[18,21] the hydration free energy change is dominated by the electrostatic component. We also showed in our previous work that the electrostatic contribution as computed by the 3D-RISM theory is reasonably accurate (see Supporting Information of Ref. [39]). It is, therefore, expected that our results to be presented below do not significantly suffer from the limitation of the integral-equation theory.
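As an illustration of the closure relation in eq. (2), the following minimal sketch evaluates its right-hand side at a single grid point. It is not the solver of Ref. [38]: the function names `d_gamma` and `kh_closure`, the scalar inputs, and the choice of expressing the Boltzmann constant in kcal/(mol K) are assumptions made here for illustration.

```python
import math

# Boltzmann constant in molar units, kcal/(mol K); unit choice is an assumption
K_B = 0.0019872


def d_gamma(u, h, c, temperature=300.0):
    """d = -u/(k_B T) + h - c at one grid point (u in kcal/mol)."""
    return -u / (K_B * temperature) + h - c


def kh_closure(d):
    """Right-hand side of eq. (2): exp(d) - 1 for d <= 0, and d for d > 0."""
    return math.expm1(d) if d <= 0.0 else d


# Strongly repulsive potential: d is very negative, so h -> -1 (g = h + 1 -> 0)
h_core = kh_closure(d_gamma(u=100.0, h=0.0, c=0.0))
print(f"h inside the repulsive core: {h_core:.6f}")
```

The two branches meet continuously at d = 0, and the linear branch for d > 0 is what distinguishes this closure from a purely exponential form.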
Results
Protein hydrophobicity and aggregation propensity
For the wild-type and 21 mutants of Aβ42 protein listed in Table 1, we carried out extensive explicit-water molecular dynamics simulations to sample their equilibrium solution structures. The hydration free energy $G_{\mathrm{hyd}}$ was then computed by applying the integral-equation theory to the simulated protein conformations, solving the integral equations separately for each geometry, and the resulting average $G_{\mathrm{hyd}}$ is identified as the protein overall hydrophobicity: a larger (i.e., more positive) $G_{\mathrm{hyd}}$ value is associated with higher protein hydrophobicity, and vice versa. The change in the protein hydrophobicity on mutation is quantified by the difference in the hydration free energy, $\Delta G_{\mathrm{hyd}} = G_{\mathrm{hyd}}(\mathrm{mut}) - G_{\mathrm{hyd}}(\mathrm{wt})$, of the mutant (mut) and wild-type (wt) proteins. As demonstrated in Figure 1, we find a statistically significant correlation between the computed hydration free energy change $\Delta G_{\mathrm{hyd}}$ and the experimental aggregation propensity change on mutation. A consistent result has been obtained also for acylphosphatase[39,40] and the N-terminal domain of the E. coli protein HypF[39]: an experimentally higher (lower) aggregation propensity was found to be associated with increased (decreased) hydration free energy on mutation. These results indicate that the protein hydrophobicity quantified by the hydration free energy serves as a measure of protein aggregation propensity in aqueous environments.
Figure 1. Correlation between experimental protein aggregation propensity and protein hydrophobicity quantified by hydration free energy. Change in the experimental aggregation propensity on mutation, $\log(f_{\mathrm{mut}}/f_{\mathrm{wt}})$, is plotted versus the difference in the hydration free energy, $\Delta G_{\mathrm{hyd}} = G_{\mathrm{hyd}}(\mathrm{mut}) - G_{\mathrm{hyd}}(\mathrm{wt})$, of the mutant (mut) and wild-type (wt) Aβ42. Pearson correlation coefficient (R) and statistical significance (P value) are also displayed.
“Residual hydrophobicity” in protein context
To elucidate molecular factors determining the protein hydrophobicity, we resolved the hydration free energy change on mutation ($\Delta G_{\mathrm{hyd}}$) into residue contributions ($\Delta G_i$’s), $\Delta G_{\mathrm{hyd}} = \sum_i \Delta G_i$ (Figs. 2 and 3 for representative and Figs. S1–S4 in Supporting Information for all the mutants studied). We note that $G_i$ is the hydration free energy of the ith residue in the presence of the other residues in a protein,[18] and in general takes a value different from the one for a free amino acid. In fact, $\Delta G_i = G_i(\mathrm{mut}) - G_i(\mathrm{wt})$ would be zero except for the mutated residues if $G_i$ for free amino acids were used, and the presence of substantial changes in $\Delta G_i$ for residues other than the mutated sites indicates that $G_i$ depends on the protein context. $\Delta G_i$ thus describes a change in such “residual hydrophobicity” on mutation. $\Delta G_i$’s shall be grouped into contributions from the mutated residues ($\Delta G_{\mathrm{mut}}$), charged residues ($\Delta G_{\mathrm{ch}}$), and non-charged remainder ($\Delta G_{\mathrm{n\text{-}ch}}$) so that
$$\Delta G_{\mathrm{hyd}} = \Delta G_{\mathrm{mut}} + \Delta G_{\mathrm{ch}} + \Delta G_{\mathrm{n\text{-}ch}} \tag{6}$$
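The partitioning in eq. (6) can be checked numerically against the entries of Table 1. The following is a minimal sketch, assuming the values as transcribed below (kcal/mol) and a tolerance of 0.15 kcal/mol, chosen here only to absorb the one-decimal rounding of the tabulated numbers; the variable names are illustrative.

```python
# (mutant, dG_hyd, dG_mut, dG_ch, dG_n-ch) in kcal/mol, transcribed from Table 1
ROWS = [
    ("DQ", -166.8, -169.6, -11.7, 14.5), ("DS", -197.7, -182.9, -23.9, 9.1),
    ("HD", -168.2, -159.9, -10.7, 2.4), ("EL", -36.9, -141.5, 63.5, 41.1),
    ("HN", -109.4, -29.3, -69.7, -10.3), ("Flemish", -5.5, -4.9, -7.5, 6.9),
    ("TN", -65.3, -26.6, -35.7, -3.0), ("TQ", -36.2, -19.4, -21.6, 4.8),
    ("LN", -55.1, -14.8, -30.9, -9.4), ("QY", -14.5, -15.9, -5.4, 6.7),
    ("QL", -16.7, -8.1, 1.0, -9.6), ("TM", -24.5, -16.0, -10.0, 1.5),
    ("TI", -23.3, 1.0, -27.9, 3.6), ("KA", -5.0, -9.5, 14.7, -10.2),
    ("KL", -15.1, 21.7, -40.6, 3.7), ("RR", -14.1, 43.5, -27.9, -29.8),
    ("IR", -1.7, 53.0, -58.5, 3.8), ("Arctic", 142.0, 104.2, 36.2, 1.6),
    ("Iowa", 112.1, 101.7, 15.2, -4.8), ("Dutch", 116.1, 111.9, 3.9, 0.3),
    ("Italian", 164.7, 106.7, 52.7, 5.3),
]

TOLERANCE = 0.15  # kcal/mol; assumed allowance for rounding to one decimal

# Residual of eq. (6) for every mutant: dG_hyd - (dG_mut + dG_ch + dG_n-ch)
residuals = {
    name: dg_hyd - (dg_mut + dg_ch + dg_nch)
    for name, dg_hyd, dg_mut, dg_ch, dg_nch in ROWS
}
max_residual = max(abs(value) for value in residuals.values())
print(f"largest deviation from eq. (6): {max_residual:.2f} kcal/mol")
```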
The results of this partitioning are summarized in Table 1. We find from Table 1 that the mutated sites ($\Delta G_{\mathrm{mut}}$) provide in most cases a dominant contribution to the protein hydrophobicity change ($\Delta G_{\mathrm{hyd}}$). $\Delta G_{\mathrm{mut}}$ for most types of mutations can be qualitatively understood in terms of the conventional hydropathy scales: the decrease in the hydrophobicity or the increase in the hydrophilicity of the mutated sites is associated with a negative change in $\Delta G_{\mathrm{mut}}$ indicating more favorable hydration, and vice versa. For example, negative changes in $\Delta G_{\mathrm{mut}}$ of DQ (I41D, A42Q), EL (I41E, A42L), HN (I41H, A42N), and TN (I41T, A42N) mutants (Fig. 2) are caused by the substitution of hydrophobic side chains with hydrophilic ones. Quite a large decrease in $\Delta G_{\mathrm{mut}}$ at residue 41 of DQ (I41D, A42Q) and EL (I41E, A42L) mutants (Fig. 2a) reflects the fact that charged residues are much more favorably hydrated than neutral residues. Conversely, the mutation of a charged residue to a neutral one significantly increases $\Delta G_{\mathrm{mut}}$ as shown in Supporting Information Figure S4 for Arctic (E22G), Iowa (D23N), and Dutch (E22Q) mutants.
Figure 2. Each-residue contribution to the hydration free energy change on mutation. Representative mutants (indicated in each panel) in which hydrophobic residues are mutated to negatively charged residues (a) and to neutral hydrophilic residues (b) are displayed. Contributions from the mutated sites ($\Delta G_{\mathrm{mut}}$) are colored in green, those ($\Delta G_{\mathrm{ch}}$) from positively charged residues in blue, those ($\Delta G_{\mathrm{ch}}$) from negatively charged residues in red, and those ($\Delta G_{\mathrm{n\text{-}ch}}$) from non-charged remainder in black.
Total-charge effect on residual hydrophobicity
On the other hand, $\Delta G_{\mathrm{mut}}$ for the types of mutations listed in Table 1 that involve positively charged residues cannot be rationalized solely by the side-chain characteristics. According to the conventional hydropathy scales, the hydrophilicity of positively (Lys and Arg under neutral pH) and negatively (Asp and Glu) charged residues is comparable.[12] However, the decrease in $\Delta G_{\mathrm{mut}}$ at residue 41 of KL (I41K, A42L) and RR (I41R, A42R) mutants (Figs. 3a and 3b), in which a hydrophobic residue (I41) is mutated to a positively charged residue (K or R), is remarkably smaller than what would be expected from the hydrophobicity of free amino acids (Fig. 3d). In addition, a significant increase in $\Delta G_{\mathrm{mut}}$ is observed in Italian (E22K) mutant (Fig. 3c) in which a negatively charged residue (E) is mutated to a positively charged residue (K). Based on the conventional hydropathy scales, $\Delta G_{\mathrm{mut}}$ of this mutant would be insignificant as the hydrophilicity of E22 and K22 is comparable (Fig. 3d). In fact, the sequence-based model predicts only a tiny change in the aggregation propensity of Italian (E22K) mutant of Aβ42 protein (see Fig. 2b of Ref. [4]).
Such “nonstandard” behavior of mutations involving positively charged residues can be understood in terms of the long-distance hydration structure that reflects the protein total charge.[39] Due to the long-range nature of the electrostatic interaction, the long-distance hydration structure of a charged residue is also affected by nearby charged residues: the equilibrium water distribution is determined in leading order by the net charge produced by those residues. The negative net charge (−3 for the wild type) of Aβ42 protein yields such an equilibrium orientational distribution of surrounding water molecules in which water hydrogen is directed toward the protein. This results in unfavorable electrostatic interaction between positively charged residues and water molecules, which explains why the hydration free energy for positively charged residues in Aβ42 takes a much more positive value than the one for free amino acids as is observed in Figure 3. This demonstrates the impact of protein net charge on the residual hydrophobicity of charged residues.
Figure 3. Each-residue contribution to the hydration free energy change on mutation. Representative mutants (indicated in each panel) in which hydrophobic residues are mutated to positively charged residues (a, b) and a negatively charged residue is mutated to a positively charged residue (c) are displayed. Contributions from the mutated sites ($\Delta G_{\mathrm{mut}}$) are colored in green, those ($\Delta G_{\mathrm{ch}}$) from positively charged residues in blue, those ($\Delta G_{\mathrm{ch}}$) from negatively charged residues in red, and those ($\Delta G_{\mathrm{n\text{-}ch}}$) from non-charged remainder in black. In (b), the positive change in $\Delta G_{\mathrm{mut}}$ at residue 42 of RR (I41R, A42R) mutant reflects the neutralization of the C-terminal end possessing the COO⁻ group. (d) Results in the panels a–c are redrawn, but with the expected results (dashed vertical bars) at the residues 41 and 22 based on $G_i$ for free amino acids.[39]
Structural effects on residual hydrophobicity
So far, we focused on the contribution from the mutated sites ($\Delta G_{\mathrm{mut}}$) to the protein hydrophobicity, but the contributions from the other residues ($\Delta G_{\mathrm{ch}}$ and $\Delta G_{\mathrm{n\text{-}ch}}$) also largely influence the protein hydrophobicity (Table 1). As mentioned above, non-zero $\Delta G_{\mathrm{ch}}$ and $\Delta G_{\mathrm{n\text{-}ch}}$ reflect the very fact that these residues are embedded in the protein context. We show below that a full rationalization of the non-zero variations in $\Delta G_{\mathrm{ch}}$ and $\Delta G_{\mathrm{n\text{-}ch}}$ requires taking into account protein conformational changes on mutation, in addition to the protein total-charge effect considered above.
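The total-charge changes ΔQ listed in Table 1 follow directly from the sequence once a protonation convention is fixed. The following is a minimal sketch, assuming the convention stated in the Methods (Lys and Arg carry +1, Asp and Glu carry −1, all other residues including His are neutral) and assuming that the charges of the N- and C-termini cancel; the helper names are hypothetical.

```python
WT_SEQUENCE = "DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVVIA"  # wild-type, 42 residues

# Side-chain charges at neutral pH under the assumed convention
SIDE_CHAIN_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def net_charge(sequence):
    """Sum of side-chain charges; termini are assumed to cancel."""
    return sum(SIDE_CHAIN_CHARGE.get(residue, 0) for residue in sequence)


def apply_mutations(sequence, mutations):
    """Apply point mutations written like 'I41D' (1-based residue numbers)."""
    residues = list(sequence)
    for mutation in mutations:
        old, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
        if residues[position - 1] != old:
            raise ValueError(f"{mutation}: residue {position} is not {old}")
        residues[position - 1] = new
    return "".join(residues)


def delta_q(mutations):
    """Change in total charge on mutation, Q(mut) - Q(wt)."""
    mutant = apply_mutations(WT_SEQUENCE, mutations)
    return net_charge(mutant) - net_charge(WT_SEQUENCE)


print("Q(wt) =", net_charge(WT_SEQUENCE))
for name, sites in [("DQ", ["I41D", "A42Q"]), ("RR", ["I41R", "A42R"]),
                    ("Italian", ["E22K"])]:
    print(name, "dQ =", delta_q(sites))
```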
We observe that $\Delta G_{\mathrm{ch}}$ values are subject to systematic variations arising from the change in the protein total charge ($\Delta Q$) on mutation (listed in Table 1): when the protein total charge is decreased ($\Delta Q < 0$) on mutation, positively (negatively) charged residues always provide positive (negative) contributions to $\Delta G_{\mathrm{ch}}$ (see, e.g., Fig. 2a), and just the opposite trend is observed (see Figs. 3a–3c) when the protein net charge is increased on mutation ($\Delta Q > 0$). Such contributions to $\Delta G_{\mathrm{ch}}$ shall, therefore, be modeled as being proportional to $\Delta Q$. The change in the protein structure, in particular, the formation of salt-bridges or hydrogen bonds involving charged side chains that causes dehydration also significantly affects the hydration free energy.[18,40] We modeled those protein structural effects in $\Delta G_{\mathrm{ch}}$ in terms of the solvent accessible surface area of charged side chains ($\mathrm{SASA}^{\mathrm{side}}_{\mathrm{ch}}$):
$$\Delta G_{\mathrm{ch}} = A\, \frac{\Delta Q}{|Q(\mathrm{wt})|} - B\, \frac{\Delta \mathrm{SASA}^{\mathrm{side}}_{\mathrm{ch}}}{\mathrm{SASA}^{\mathrm{side}}_{\mathrm{ch}}(\mathrm{wt})} \tag{7}$$
Here, the changes in the protein total charge $\Delta Q = Q(\mathrm{mut}) - Q(\mathrm{wt})$ and in the solvent accessible surface area $\Delta \mathrm{SASA}^{\mathrm{side}}_{\mathrm{ch}} = \mathrm{SASA}^{\mathrm{side}}_{\mathrm{ch}}(\mathrm{mut}) - \mathrm{SASA}^{\mathrm{side}}_{\mathrm{ch}}(\mathrm{wt})$ on mutation are normalized by the values of the wild-type protein, so that both terms are expressed through dimensionless ratios and the coefficients A and B carry units of energy. The minus sign in the second term comes from the fact that the dehydration (negative change in $\Delta \mathrm{SASA}^{\mathrm{side}}_{\mathrm{ch}}$) is associated with the increase in hydrophobicity (positive change in $\Delta G_{\mathrm{ch}}$). We find that $\Delta G_{\mathrm{ch}}$ can in fact be rationalized by the terms proportional to $\Delta Q$ and to the change in $\mathrm{SASA}^{\mathrm{side}}_{\mathrm{ch}}$ (Fig. 4a), with the latter providing a larger contribution (i.e., A < B). This shows the relevance of protein structural effects in determining the residual hydrophobicity, and hence, the protein hydrophobicity.
In mutations that do not change the protein total charge ($\Delta Q = 0$), the non-charged residue contributions ($\Delta G_{\mathrm{n\text{-}ch}}$) become relatively important (Table 1). As demonstrated in Figure 4b, we observe a substantial correlation between $\Delta G_{\mathrm{n\text{-}ch}}$ and the change in the solvent accessible surface area of main-chain oxygen and nitrogen atoms ($\mathrm{SASA}^{\mathrm{main}}_{\mathrm{n\text{-}ch}}$):
$$\Delta G_{\mathrm{n\text{-}ch}} = -C\, \frac{\Delta \mathrm{SASA}^{\mathrm{main}}_{\mathrm{n\text{-}ch}}}{\mathrm{SASA}^{\mathrm{main}}_{\mathrm{n\text{-}ch}}(\mathrm{wt})} \tag{8}$$
We note here that $\mathrm{SASA}^{\mathrm{main}}_{\mathrm{n\text{-}ch}}$ is sensitive to the main-chain hydrogen bonding, that is, to the contents of secondary structures in a protein. While Aβ42 protein is intrinsically disordered in aqueous environments, it is not a homogeneous statistical random coil polymer and instead exhibits a certain amount of residual secondary structures, whose contents vary on mutation.[32] The result shown in Figure 4b indicates that the protein hydrophobicity is also affected by the protein secondary structure.
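Because the coefficients in eq. (7) enter linearly, they can be obtained from an ordinary least-squares fit without intercept. The sketch below is an illustration under stated assumptions rather than the fitting procedure actually used: the function `fit_two_term_model` is hypothetical, and the inputs are synthetic numbers generated from arbitrarily chosen coefficients, not data from this study.

```python
def fit_two_term_model(dq, dsasa, dg_ch):
    """Least-squares A and B for dG_ch = A*dq - B*dsasa (no intercept).

    dq and dsasa are the dimensionless ratios appearing in eq. (7).
    """
    # Regress on x1 = dq and x2 = -dsasa so both coefficients enter with '+'
    x1 = list(dq)
    x2 = [-value for value in dsasa]
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * y for a, y in zip(x1, dg_ch))
    s2y = sum(b * y for b, y in zip(x2, dg_ch))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("predictors are collinear; fit is not unique")
    # Solve the 2x2 normal equations by Cramer's rule
    coeff_a = (s1y * s22 - s12 * s2y) / det
    coeff_b = (s11 * s2y - s12 * s1y) / det
    return coeff_a, coeff_b


# Synthetic example: dq = dQ/|Q(wt)| with |Q(wt)| = 3; SASA ratios are made up
A_TRUE, B_TRUE = 30.0, 120.0  # arbitrary illustrative values, kcal/mol
dq = [-1 / 3, 0.0, 0.0, 1 / 3, 2 / 3]
dsasa = [0.10, -0.05, 0.20, -0.15, 0.05]
dg_ch = [A_TRUE * q - B_TRUE * s for q, s in zip(dq, dsasa)]

fit_a, fit_b = fit_two_term_model(dq, dsasa, dg_ch)
print(f"A = {fit_a:.1f}, B = {fit_b:.1f}")
```

The one-parameter model of eq. (8) reduces to the same construction with a single predictor.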
Discussion
Hydrophobic effect is considered as one of the principal driving factors for protein folding and biological organization.[41–43] In a broad sense, the protein aggregation can also be conceived to result from “phobia” for water as aggregating proteins prefer being beside each other to being fully surrounded by water. However, protein surface normally comprises both hydrophobic and hydrophilic patches, and it remains a fundamental challenge to elucidate how the hydrophobicity is determined and involved in actual biological processes.[44–50] In fact, on average approximately 70% of protein–protein interfaces were found to be composed of hydrophilic residues, including approximately 37% of charged residue contributions,[51] implying more relevance of hydrophilic residues in protein–protein interactions. The overall protein hydrophobicity quantified by the hydration free energy represents the affinity of a whole protein toward water that is influenced by both hydrophobic and hydrophilic constituent residues. It, therefore, applies generically to proteins whose surface exhibits a wide range of chemical heterogeneity and has also been shown to be a crucial factor whose comprehension will contribute to understanding the protein aggregation propensity.[39,40]
We demonstrate here through the site-directed analysis of the protein hydrophobicity that the residual hydrophobicity (the hydrophobicity of an amino acid residue embedded in the protein context) deviates markedly from the conventional hydrophobicity scale of a corresponding free amino acid. In particular, the hydrophobicity of charged residues on protein surface is shown to be strikingly different from the conventional biophysical one for free charged amino acids.
Figure 4. Rationalization of the charged-residue ($\Delta G_{\mathrm{ch}}$) and non-charged-residue ($\Delta G_{\mathrm{n\text{-}ch}}$) contributions. a) Modeling of $\Delta G_{\mathrm{ch}}$ by $A\,\Delta q - B\,\Delta \mathrm{sasa}^{\mathrm{side}}_{\mathrm{ch}}$. Here $\Delta q$ denotes the dimensionless change in the protein total charge on mutation, $\Delta q = [Q(\mathrm{mut}) - Q(\mathrm{wt})]/Q(\mathrm{wt})$, and $\Delta \mathrm{sasa}^{\mathrm{side}}_{\mathrm{ch}}$ refers to the corresponding dimensionless change in the solvent accessible surface area (SASA) of side-chain oxygen and nitrogen atoms in the charged residues. b) Modeling of $\Delta G_{\mathrm{n\text{-}ch}}$ by $-C\,\Delta \mathrm{sasa}^{\mathrm{main}}_{\mathrm{n\text{-}ch}}$, where $\Delta \mathrm{sasa}^{\mathrm{main}}_{\mathrm{n\text{-}ch}}$ represents the dimensionless change in the SASA of main-chain oxygen and nitrogen atoms in the non-charged residues. The parameters A, B, and C in units of kcal/mol determined from the least-squares fits are shown. Pearson correlation coefficient (R), P value, and the slope of the fitted curve are also displayed.
Furthermore, contrasting behavior of positively and negatively charged residues is observed, which can be explained in terms of the long-distance hydration structure reflecting the net charge of a protein. This can rationalize the asymmetrical role of positively and negatively charged residues in dictating the protein aggregation nature in the E. coli proteins[16] since the net charge of those proteins is mostly negative under physiological conditions. The protein hydrophobicity is also affected by the underlying protein conformation. In particular, the salt-bridge formation between charged residues on protein surface effectively neutralizes and causes the dehydration of those salt-bridged residues. This significantly weakens the residue–water interaction at a short-distance regime and acts to increase the hydration free energy of salt-bridged residues on protein surface. Such protein conformational changes and associated hydration structural effects are not taken into account in the sequence-based models for predicting the protein aggregation propensity.[3–6] For example, we find that the charged side chain of Arg41 in RR (I41R, A42R) mutant forms salt-bridges in approximately 33% of the simulated conformations (Supporting Information Fig. S5). This significantly increases the protein hydrophobicity of RR mutant, and explains the experimental observation in Ref. [23] on why RR mutant exhibits a much higher aggregation propensity than what would be expected from the increased biophysical hydrophilicity at the C-terminal end. The structural effects on protein hydrophobicity also have implications on the interplay between protein hydration and protein–protein interactions during the aggregation process.
We have recently demonstrated that the water-mediated attraction, which can be quantified by the hydration free energy, is primarily responsible for two aggregating proteins to approach each other from large separations to a contact distance.[21] After two monomers start to make atomic contacts, direct protein–protein interactions come into play, leading to the formation of intermonomer salt-bridges, hydrogen bonds, and van der Waals contacts. Such structural changes necessarily involve dehydration of the interface region (i.e., a decrease in the solvent accessible surface area), causing an increase in the hydrophobicity of the dimer formed [see eqs. (7) and (8)]. The increased hydrophobicity of the dimer will in turn act as the driving force to attract other proteins in the subsequent oligomerization processes.[52]
Conclusions
Understanding molecular determinants of protein hydrophobicity is of central importance in rationalizing and predicting the protein aggregation propensity. Here, we investigate how the hydrophobicity of amino acid residues (residual hydrophobicity) depends on the protein context in which they are embedded, which is not taken into consideration in the conventional hydrophobicity scales. We find that the residual hydrophobicity is significantly influenced by the protein global factor such as the total charge as well as by underlying protein conformations, and the resulting protein hydrophobicity cannot be comprehensively addressed by the amino acid sequence alone. Our site-directed analysis method that simultaneously deals with the protein three-dimensional structure as well as its hydration thermodynamics enables the prediction of mutation effects on the protein aggregation propensity, and will find a wide range of applications in protein sciences and biotherapeutics.
Keywords: protein aggregation; amyloid beta protein; solvation free energy; molecular dynamics simulation; integral-equation theory
How to cite this article: S.-H. Chong, S. Ham, J. Comput. Chem. 2014, 35, 1364–1370. DOI: 10.1002/jcc.23631
Additional Supporting Information may be found in the online version of this article.
[1] C. A. Ross, M. A. Poirier, Nat. Med. 2004, 10, S10.
[2] F. Chiti, C. M. Dobson, Annu. Rev. Biochem. 2006, 75, 333.
[3] F. Chiti, M. Stefani, N. Taddei, G. Ramponi, C. M. Dobson, Nature 2003, 424, 805.
[4] A.-M. Fernandez-Escamilla, F. Rousseau, J. Schymkowitz, L. Serrano, Nat. Biotechnol. 2004, 22, 1302.
[5] G. G. Tartaglia, M. Vendruscolo, Chem. Soc. Rev. 2008, 37, 1395.
[6] M. Belli, M. Ramazzotti, F. Chiti, EMBO Rep. 2011, 12, 657.
[7] D. Chandler, Nature 2005, 437, 640.
[8] M. G. Krone, L. Hua, P. Soto, R. Zhou, B. J. Berne, J. E. Shea, J. Am. Chem. Soc. 2008, 130, 11066.
[9] B. J. Berne, J. D. Weeks, R. Zhou, Annu. Rev. Phys. Chem. 2009, 60, 85.
[10] D. Thirumalai, G. Reddy, J. E. Straub, Acc. Chem. Res. 2012, 45, 83.
[11] Y. Nozaki, C. Tanford, J. Biol. Chem. 1971, 246, 2211.
[12] J. Kyte, R. F. Doolittle, J. Mol. Biol. 1982, 157, 105.
[13] M. A. Roseman, J. Mol. Biol. 1988, 200, 513.
[14] T. Hessa, H. Kim, K. Bihlmaier, C. Lundin, J. Boekel, H. Andersson, I. Nilsson, S. H. White, G. von Heijne, Nature 2005, 433, 377.
[15] T. Hessa, N. M. Meindl-Beinker, A. Bernsel, H. Kim, Y. Sato, M. Lerch-Bader, I. Nilsson, S. H. White, G. von Heijne, Nature 2007, 450, 1026.
[16] T. Niwa, B.-W. Ying, K. Saito, W. Jin, S. Takada, T. Ueda, H. Taguchi, Proc. Natl. Acad. Sci. USA 2009, 106, 4201.
[17] J. A. Hardy, G. A. Higgins, Science 1992, 256, 184.
[18] S.-H. Chong, S. Ham, J. Chem. Phys. 2011, 135, 034506.
[19] S.-H. Chong, S. Ham, Chem. Phys. Lett. 2012, 535, 152.
[20] S.-H. Chong, M. Park, S. Ham, J. Chem. Theory Comput. 2012, 8, 724.
[21] S.-H. Chong, S. Ham, Proc. Natl. Acad. Sci. USA 2012, 109, 7376.
[22] D. J. Selkoe, M. B. Podlisny, Annu. Rev. Genomics Hum. Genet. 2002, 3, 67.
[23] W. Kim, M. H. Hecht, J. Biol. Chem. 2005, 280, 35069.
[24] C. Nilsberth, A. Westlind-Danielsson, C. B. Eckman, M. M. Condron, K. Axelman, C. Forsell, C. Stenh, J. Luthman, D. B. Teplow, S. G. Younkin, J. Näslund, L. Lannfelt, Nat. Neurosci. 2001, 4, 887.
[25] K. Murakami, K. Irie, A. Morimoto, H. Ohigashi, M. Shindo, M. Nagao, T. Shimizu, T. Shirasawa, J. Biol. Chem. 2003, 278, 46179.
[26] D. A. Case, T. A. Darden, T. E. Cheatham III, C. L. Simmerling, J. Wang, R. E. Duke, R. Luo, K. M. Merz, D. A. Pearlman, M. Crowley, R. C. Walker, W. Zhang, B. Wang, S. Hayik, A. Roitberg, G. Seabra, K. F. Wong, F. Paesani, X. Wu, S. Brozell, V. Tsui, H. Gohlke, L. Yang, C. Tan, J. Mongan, V. Hornak, G. Cui, P. Beroza, D. H. Mathews, C. Schafmeister, W. S. Ross, P. A. Kollman, AMBER 11; University of California: San Francisco, 2010.
[27] V. Hornak, R. Abel, A. Okur, B. Strockbine, A. Roitberg, C. Simmerling, Proteins 2006, 65, 712.
[28] H. W. Horn, W. C. Swope, J. W. Pitera, J. D. Madura, T. J. Dick, G. L. Hura, T. Head-Gordon, J. Chem. Phys. 2004, 120, 9665.
[29] T. Darden, D. York, L. Pedersen, J. Chem. Phys. 1993, 98, 10089.
[30] J.-P. Ryckaert, G. Ciccotti, H. J. C. Berendsen, J. Comput. Phys. 1977, 23, 327.
[31] H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren, A. DiNola, J. R. Haak, J. Chem. Phys. 1984, 81, 3684.
[32] S.-H. Chong, J. Yim, S. Ham, Mol. BioSyst. 2013, 9, 997.
[33] O. Crescenzi, S. Tomaselli, R. Guerrini, S. Salvadori, A. M. D’Ursi, P. A. Temussi, D. Picone, Eur. J. Biochem. 2002, 269, 5642.
[34] N. Guex, M. C. Peitsch, Electrophoresis 1997, 18, 2714.
[35] S.-H. Chong, S. Ham, Comput. Theor. Chem. 2013, 1017, 194.
[36] D. Beglov, B. Roux, J. Chem. Phys. 1996, 104, 8678.
[37] D. Beglov, B. Roux, J. Phys. Chem. B 1997, 101, 7821.
[38] T. Imai, Y. Harano, M. Kinoshita, A. Kovalenko, F. Hirata, J. Chem. Phys. 2006, 125, 024911.
[39] S.-H. Chong, S. Ham, Angew. Chem. Int. Ed. 2014, 53, 3961.
[40] S.-H. Chong, C. Lee, G. Kang, M. Park, S. Ham, J. Am. Chem. Soc. 2011, 133, 7075.
[41] W. Kauzmann, Adv. Protein Chem. 1959, 14, 1.
[42] C. Tanford, Science 1978, 200, 1012.
[43] K. A. Dill, Biochemistry 1990, 29, 7133.
[44] N. Giovambattista, C. F. Lopez, P. J. Rossky, P. G. Debenedetti, Proc. Natl. Acad. Sci. USA 2008, 105, 2274.
[45] S. Granick, S. C. Bae, Science 2008, 322, 1477.
[46] G. Reddy, J. E. Straub, D. Thirumalai, Proc. Natl. Acad. Sci. USA 2010, 107, 21459.
[47] P. Ball, Nature 2011, 478, 467.
[48] A. J. Patel, S. Garde, J. Phys. Chem. B 2014, 118, 1564.
[49] S. Amrhein, S. A. Oelmeier, F. Dismer, J. Hubbuch, J. Phys. Chem. B 2014, 118, 1707.
[50] L. H. Kapcha, P. J. Rossky, J. Mol. Biol. 2014, 426, 484.
[51] S. Ansari, V. Helms, Proteins 2005, 61, 344.
[52] S.-H. Chong, S. Ham, Phys. Chem. Chem. Phys. 2012, 14, 1573.
Received: 13 March 2014 Revised: 14 April 2014 Accepted: 21 April 2014 Published online on 10 May 2014