Supporting Information
Steered molecular dynamics simulations for studying protein-ligand interaction in cyclin-dependent kinase 5 Jagdish Suresh Patel, Anna Berteotti, Simone Ronsisvalle, Walter Rocchia, Andrea Cavalli
Contents Steered Molecular Dynamics (SMD) details
SI2
Potential of Mean Force Calculation in SMD
SI2
Calculation of average force profiles
SI4
Details about the choice of the pulling velocities, spring constant, pulling length
SI4
Ligand solvation as a function of SMD simulation time
SI8
Calculation of total number of hydrogen bonds (H-bonds)
SI9
Stability of the binding poses during the MD simulation
SI13
References
SI15
SI1
Steered Molecular Dynamics (SMD) details In SMD simulations, a time-dependent external force is applied to the ligand to facilitate its unbinding from the protein, which usually cannot be achieved by standard MD simulation. In particular, in SMD the transition between two states, here the bound and unbound ones, is achieved by adding to the standard Hamiltonian a harmonic time-dependent potential U(r,t) acting on a descriptor s(r) (e.g., the protein-ligand distance), which holds the following time dependency:
, 0 (1) where s0 is the value of the descriptor in the initial state, t is the time, and k is a constant representing the strength of the applied force. After a predetermined amount of time, the harmonic constraint will be centered in its final position: 1 . (2) Therefore, at constant pulling velocity, if the spring constant k is large enough (stiff-spring regime), it is reasonable to assume that, at the final time t1, the system has approximately reached the point s1. During this transition, the value of the exerted force F is calculated using the following equation: λ . (3) The external work ∆W performed on the system was calculated by integrating the power along the entire transition time:
∆ . (4)
Potential of Mean Force Calculation in SMD Jarzynski1,2 developed an equality connecting the free energy difference between two states with a suitable average of the out-of-equilibrium work profiles obtained in SMD simulations. Indeed, he provided the connection between equilibrium free energy differences and the work done through many realizations of non-equilibrium processes:
!
〈 #$,→& 〉 , ( SI2
)*
(5)
where T is the temperature of the system and kB the Boltzmann constant. Here, the ensemble average is taken over a number of realizations of non-equilibrium work done on the system during pulling simulations. The Jarzynski equality thus provides an estimate for Δ, for a set of N work values given by
#$,→& Δ,-./0 ln 3 ∑4 8 . (6) 67
4
In practice, its direct application is hampered by the limited number of collectable trajectories along with the complex nature of the biological systems, which often result in a standard deviation of the work (9: ) several times higher than kBT.3 The Jarzynski equality requires a weighted average of the work values with an exponential set of weights. The net effect of this averaging is that only trajectories with significantly low work, corresponding to the left tail of Gaussian work distribution, give a relevant contribution to the average.3 Moreover, SMD pulling trajectories often sample the region around the peak rather than the tails of the Gaussian work distribution. In the recent years, this statistical uncertainty has been tackled in various studies giving rise to more effective applications of the Jarzynski’s equality.4,5 By assuming a Gaussian distribution for the work values, eq. 5 can be simplified using the second order cumulant expansion.4 This approach (eq. 7) has been widely applied4-7 for reconstructing the free energy profile of bio molecular processes. This was here used to compare the free energy profiles of all the ligands in this study.
Δ, 〈〉 (9: , 9: 〈 〉 〈〉 . (7) The above equation gives the free energy differences (∆A) between two states from the mean work 〈〉 over all the trajectories and subtracting the variance 9: contribution. The idea is that highly irreversible work trajectories are likely to have higher 9: so that, in theory, one can obtain the same ∆A value using above equation with different amounts of dissipated work (i.e., at different pulling velocities).
SI3
Calculation of average force profiles The average force profiles along the time shown in the Figure 1 and 2 for each ligand were calculated in the following way:
〈〉 ∑4 67 6 . (8) 4
Details about the choice of the pulling velocities The choice of the pulling velocities was based on a compromise between accuracy and speed, which is a classical tradeoff in drug design. In particular, four different pulling velocities have been tested: 0.005 Å/ps, 0.006 Å/ps, 0.007 Å/ps and 0.025 Å/ps, keeping all the other parameters identical. The ligand is outside the protein and fully solvated after 5 ns, 4 ns, 3 ns, and 1 ns, respectively. A comparison between the two regimes is reported in Figure S1, where the average force profiles are reported for compounds 1-6 for the four velocities. The slow regime using a velocity of 0.007 Å/ps was chosen because the correlation between the average force profile and the biological activity is better as compared to the fast regime (0.025 Å/ps) and it needs reasonable computer time. However, given the fact that our protocol was not able to discriminate different chemical compounds, see the force profile for the Series II, we were aware that a possible explanation for this failure could be related to the SMD pulling velocities, which should be very low to allow a proper reconstruction of accurate PMF profiles, which can then be quantitatively correlated with experimental data. This can be seen in the Figure S1 where the Force profiles as a function of simulation time are reported also for lower velocities. The main atomistic features of the unbinding process are conserved through the all 4 velocities regime, in fact the forces’ profiles are very similar in all the 4 velocities regimes, see the position of the peaks. The ability of SMD to discriminate between molecules with very similar IC50 values is increased with lower SMD pulling velocities. At lower velocities, see for example 0.006 and 0.005 Å/ps the decay of first force peak is very well distinguishable for molecules 1, 2, 3, 4, while in case of fast velocity regime is not possible to distinguish between molecules 1, 2, 3, 4. It is possible to envision that the use of very SI4
low velocity will improve the ability of SMD, but in the compromise between simulation time and results, higher velocities regime can still be useful for drug design purpose. However, as mentioned, discrimination between binders and non-binders can still be achieved using high pulling velocities, which require simulation resources in line with the drug discovery requirements of speed and costeffectiveness.
Figure S1. Average force profile as a function of time for compounds 1-6 in the four regimes: 0.025 Å/ps (A), 0.007 Å/ps (B), 0.006 Å/ps (C) and 0.005 Å/ps (D).
Details about the choice of the spring constant The spring constant was chosen hard enough in order to steer the system to the target point (see Figure S2). From the plot in fact it can be seen that the distance closely follows the constraint center of the harmonic constraining potential in line with ref.4,8 .
SI5
Figure S2. Line in black displays the behavior of the system along the SMD simulation of compound 2, using a spring constant of 100 kcal/(Å2 ·mol). Red line represents the center of the moving harmonic constraining potential.
We also checked that the chosen force constant together with the adopted velocity regime was not able to disrupt the protein structure. In Figure S3, the RMSD of the Cα of the binding pocket as a function of the SMD simulation time is plotted for compound 2. The RMSD is calculated on the Cα atoms belonging to the following residues: Ile10, Gly11, Glu12, Gly13, Phe80, Glu81, Phe82, Cys83, Asp84, Gln85, Asp86, Ala143 and Asp144. As it can be seen, there are no dramatic changes in the conformation of the binding pocket.
SI6
Figure S3. Cα atoms RMSD of the binding pocket during the SMD simulation run for compound 2.
Details about the choice of the pulling length The choice of the pulling length was based on the ligand solvation. Basically, from the solvation plot reported in Figure S2 it is reasonable to establish the total pulling length to 20 Å. At a pulling distance of 25 Å the ligand is completely out of the binding site and fully solvated.
SI7
Figure S4. The solvation of compound 1 as a function of SMD pulling distance. At a pulling distance of 25 Å the ligand is fully solvated and the number of water molecules around the ligand after this point remains constant over time.
Details about the pulling distance and the cylindrical harmonic restraint The pulling distance was defined between the center of mass of selected atoms of the ligand (see Table S1) and the center of mass of the following atoms of the protein: the Cα atoms of Val30, Ala31, Leu78, the nitrogen backbone atoms of Lys33, Met185, the oxygen backbone atoms of Lys33, Leu78 and Asp184, and the carbon atom belonging to the carbonyl backbone group of Leu78 and Leu142. As explained in the Methods section of the paper, we used a cylindrical harmonic restraint that hinders the interaction of the ligand with the surface of the protein while the former is out of the binding site. This restraint is implemented as the distance of a point to the axis of the cylinder. The point is the center of mass of selected atoms of the ligand while the axis connects the center of mass of two atom groups of the protein. One of these two groups is the same used to define the pulling distance for the ligand to be pulled out of the binding pocket. The second group is the center of mass of the following residues: the Cα atoms of Ile10, Ala31, Gln85, Asp86, Leu87, Leu133, Asn135, Ile191, the nitrogen backbone atoms of Val30 and Asp86, the carbon atom belonging to the carbonyl backbone group of Lys9, Phe19, Val30, Asp84 and the oxygen atom of the carbonyl backbone group of Phe19.
Ligand solvation as a function of SMD simulation time Here below we report the ligand 1 solvation as a function of SMD simulation time as representative case for all the 9 ligands. The ligand solvation has been calculated by means of ProDy9 tool through all the 50 SMD trajectories and the reported values are average values. The solvation is calculated as the number of water molecules around the ligand within a cut-off of 4 Å. SI8
Figure S5. The solvation of compound 1 as a function of SMD simulation time. After 2 ns the ligand is completed solvated and the number of water molecules around the ligand after this time remains constant.
Calculation of total number of hydrogen bonds (H-bonds) The total number of H-bonds shown in Figures 5 and 6 in the Results section are calculated as the sum of contacts between all the H-bond acceptor (A) and donor (D) atoms of the ligand and all the H-bond acceptor and donor atoms of the binding pocket, using the following equation: ; ∑6∈> ∑