Modeling the sequence-dependent diffusion ... - UT Mathematics

Report 2 Downloads 90 Views
THE JOURNAL OF CHEMICAL PHYSICS 129, 165105 共2008兲

Modeling the sequence-dependent diffusion coefficients of short DNA molecules O. Gonzalez1,a兲 and J. Li2 1

Department of Mathematics, University of Texas, Austin, Texas 78712, USA Graduate Program in Computational and Applied Mathematics, University of Texas, Austin, Texas 78712, USA

2

共Received 4 June 2008; accepted 8 September 2008; published online 28 October 2008兲 A boundary element model for the computation of sequence-dependent hydrodynamic properties of short DNA molecules is introduced. The hydrated surface is modeled as a curved tube of uniform radius with ends capped by hemispheres, and the axis of the tube is a general space curve whose length and curvature are determined locally by the sequence using a rigid basepair model of double-helical DNA with parameters based on x-ray crystallography. Diffusion coefficients for families of random and periodic DNA sequences are computed and compared with theories for straight tubes and experimental data. Our results indicate that sequence-dependent curvature can have a measurable impact on both the translational and rotational diffusion coefficients, even for relatively short fragments of lengths less than about 150 basepairs, and that previous estimates of the hydrated radius of DNA are likely to be underestimates. Moreover, our results suggest a possible method for refining the rigid basepair model parameters for DNA in solution as well as the hydrated radius. © 2008 American Institute of Physics. 关DOI: 10.1063/1.2992080兴 I. INTRODUCTION

There are many experimental methods for probing the structure of macromolecules in solution. Examples range from more modern techniques such as light scattering,1 fluorescence polarization,2 and electric dichroism and birefringence3 to more classic techniques such as centrifugation4–6 and electrophoresis,7 including newer variants such as membrane-confined8 and capillary9 electrophoresis, which employ a solvent rather than a gel support medium. When a macromolecule is relatively rigid and its surrounding medium can be modeled as a Newtonian fluid, there exist well-defined theories connecting experimental observables to the overall size and shape of the macromolecule. Hydrodynamic properties such as the translational and rotational diffusion coefficients of the macromolecule at infinite dilution play a central role in these theories. Indeed, experimental measurements often lead to direct estimates of these diffusion coefficients, which are then connected to the size and shape of a macromolecule through an appropriate hydrodynamic model. Recent advances in experimental, modeling, and computational techniques now allow the structure of macromolecules in solution to be studied with more resolution than ever before. Methods for solution, combined with methods for crystalline states such as x-ray diffraction, together provide a powerful means of exploring macromolecular structure in various applications to proteins10–13 and DNA.14–17 A hydrodynamic model of a macromolecule involves two main modeling assumptions. The first assumption is that the macromolecule can be represented by an effective hydrated surface which represents not only the macromolecule a兲

Electronic mail: [email protected].

0021-9606/2008/129共16兲/165105/12/$23.00

itself but also any tightly bound solvent molecules. In classic treatments, this surface has traditionally been modeled as a simple shape such as an ellipsoid or a straight circular cylinder.18 The second assumption is that the solvent medium exterior to the hydrated surface can be modeled as a Newtonian fluid. For the description of relatively slow diffusive translational and rotational motions of a relatively large macromolecule, the steady Stokes equations with no-slip boundary conditions have had great success and have become standard.19,20 Except for the simplest of geometries such as spheres and ellipsoids, the Stokes equations are, in general, too complex to solve exactly and, hence, approximations must be made. Two common classes of approximations are boundary element methods,12,21,22 which provide convergent approximations to the Stokes equations and the no-slip boundary condition, and bead-type methods,13,23 which provide only heuristic approximations. In this article, we introduce a boundary element model for the stable and accurate computation of sequencedependent hydrodynamic properties of short DNA fragments. We model the hydrated surface as a curved cylinder or tube of uniform radius with ends capped by hemispheres. The axis of the tube is a general space curve whose length and curvature are determined locally by the sequence. This axis is constructed using a rigid basepair model of double-helical DNA in which the relative displacement and orientation between adjacent basepairs 共dimer step兲 is described by a set of six parameters depending on the dimer composition.24–26 For the case of B-DNA, parameters for all possible dimer compositions have been estimated from crystallographic data.27 Using these parameters, we construct an axis and a corresponding tubular surface which twists and bends in space according to the sequence. This rigid surface is expected to

129, 165105-1

© 2008 American Institute of Physics

Author complimentary copy. Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp

165105-2

O. Gonzalez and J. Li

provide a reasonable approximation for DNA fragments with lengths from about 20 to 150 basepairs. At longer lengths, the rigidity assumption becomes more of an issue since the persistence length of DNA is about 150 basepairs.28 At shorter lengths, we expect an atomistic-type surface which captures local fine-scale features to be more appropriate.16 Studies on straight tube models have shown that, for lengths greater than about 20 basepairs, the details of the capped ends have a negligible effect on global hydrodynamic properties such as the translational and rotational diffusion coefficients, provided the end-to-end length of the axial curve is the same.29–31 Thus a tube model with ends capped by hemispheres is expected to produce nearly identical results as a model with ends capped by disks provided the length of the axial curve is matched. We prefer to work with hemispherical caps since they lead to a better conditioned hydrodynamic problem because sharp edges are avoided. The basic hydrodynamic problem is to solve the steady Stokes equations with no-slip boundary conditions in the three-dimensional domain exterior to the hydrated surface. To this end, we employ a new stable, accurate, and singularity-free boundary integral formulation described elsewhere.32 The formulation is a Fredholm equation of the second kind and employs a mixed combination of the classic double- and single-layer potentials defined on parallel surfaces. In contrast to formulations of the first kind based on the single-layer potential alone,15,21,22 the mixed formulation is not subject to ill conditioning in the limit of fine discretizations, and also avoids issues associated with null vectors, which pose a difficulty independent of the conditioning problem.33 By virtue of being singularity-free, the mixed formulation can be discretized using standard high-order Gauss quadrature rules. Moreover, because parametrizations of the tubular surface geometry are available, all integrals can be formulated on the exact curved surface without a further approximation by flat elements. Thus curvature effects in both the circumferential and axial directions can be accurately resolved without the need for fine meshes or heuristic area correction factors.21 Numerical experiments on a variety of tubular surfaces with relatively large curvatures indicate that global hydrodynamic quantities such as resistance matrices and diffusion coefficients can be computed with a typical accuracy of about 0.1% using moderate meshes. Compared with typical experimental errors in these quantities, the proposed numerical method can be viewed as delivering nearly exact results. Other formulations of the second kind are possible,12,20,34,35 but these typically require the use of product integration rules36,37 or specialized coordinate transformations and projections38 to accurately approximate weakly singular integrals, and some also require that one or more distinguished points within the hydrated surface be identified for the placement of point singularities.12,34 For purposes of validation, we use our boundary element method to accurately compute diffusion coefficients for a straight tube model of DNA and compare our results with various empirical formulas17,31 and experimental data.39–45 A classic model for the hydrated surface of an arbitrary DNA sequence of length up to about 150 basepairs is that of a straight circular tube. The length of the tube is determined by

J. Chem. Phys. 129, 165105 共2008兲

the rise per basepair, whose average value is about 3.4 Å,46 and the radius of the tube is a prescribed constant, whose value is estimated to be between 10 Å46 and 13 Å.17 The first estimate is based on x-ray diffraction data, whereas the second is based on experimental data in solution and empirical formulas for the diffusion coefficients. Using our boundary element method, we compute the translational and rotational diffusion coefficients for a straight tube model as a function of length 共number of basepairs兲 for different radius values. For the translational diffusion coefficient we observe a good agreement between the boundary element results and the empirical formulas, but for the rotational diffusion coefficient we observe more significant differences. Our boundary element results confirm the previous observations that a hydrated radius of about 10– 13 Å is consistent with the experimental data on the translational diffusion coefficient,17 whereas a radius of 13– 17 Å is consistent with the experimental data on the rotational diffusion coefficient. We surmise that this difference may largely be due to the limited validity of the straight tube assumption, which is a critical assumption in various experimental techniques for measuring rotational diffusion.1,18 To illustrate the effects of the sequence, we compute the translational and rotational diffusion coefficients as a function of length for families of random and periodic DNA sequences using the rigid basepair model with crystallographic data.27 Our results show that while the data for the sequencedependent curved tube model exhibit the same general trends as that for the straight tube model, there is a noticeable systematic offset due to curvature effects. For each value of the radius, the straight model uniformly underestimates both the translational and rotational diffusion coefficients over the entire range of lengths compared to the sequence-dependent curved model. This result suggests that sequence-dependent curvature can have a measurable impact on the diffusion coefficients, even for relatively short fragments of lengths less than about 150 basepairs, and that previous estimates of the hydrated radius are likely to be underestimates. The variability of the sequence-dependent data also suggests a possible method for refining rigid basepair model parameters for DNA in solution. In particular, from experimental data on known sequences of known length, the hydrodynamic model introduced here could be used to numerically fit the relative displacement and rotation parameters for all the different possible dimer steps, as well as the hydrated radius. Indeed, the possibility of refining these structural parameters has been a major motivation for this work. The presentation is structured as follows. In Sec. II we outline a rigid basepair model of DNA, translate crystallographic data27 to the parametrization used here, and describe our method for constructing a tubular surface model for a given DNA sequence. In Sec. III we briefly outline the hydrodynamic equations and numerical boundary element procedure for computing the resistance and mobility matrices and diffusion coefficients for a given surface. In Sec. IV we present numerical results for both the straight and the sequence-dependent curved tube models and compare them with previous results and experimental data. In Sec. V we end with some concluding remarks.

Author complimentary copy. Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp

165105-3

J. Chem. Phys. 129, 165105 共2008兲

Sequence-dependent diffusion

FIG. 1. Illustration of conventions used in the rigid basepair model of DNA. 共a兲 Indexing scheme for basepairs 共X , ¯X兲a 共a = 1 , . . . , n兲. 共b兲 Reference point q and frame 兵di其 for an arbitrary basepair 共X , ¯X兲. 共c兲 Relative rotation and displacement between adjacent basepairs 共X , ¯X兲a and 共X , ¯X兲a+1 with associ¯ a其. ated middle frame 兵d i

II. MODEL

Here we outline a rigid basepair model for describing the three-dimensional sequence-dependent structure of a DNA molecule. We introduce the quantities necessary to define the configuration of a basepair, define coordinates for describing the relative rotations and displacements between adjacent basepairs, translate crystallographic data27 to the parametrization used here, and describe our method for constructing a tubular surface model for a given DNA sequence. A. Basepairs, configurations

We consider right-handed double-helical DNA in which bases T, A, C, and G are attached to two oriented antiparallel backbone strands and form only the standard Watson–Crick pairs 共A,T兲 and 共C,G兲. Choosing one backbone strand as a reference, a DNA molecule consisting of n basepairs is identified with a sequence of bases X1X2 . . . Xn, listed in the 5⬘ – 3⬘ direction along the strand, where Xa 苸 兵T , A , C , G其. The basepairs associated with this sequence are denoted by 共X , ¯X兲1 , 共X , ¯X兲2 , . . . , 共X , ¯X兲n, where ¯X is defined as the Watson–Crick complement of X in the sense that ¯A = T and ¯ = G. The notation 共X , ¯X兲 for a basepair indicates that base C a X is attached to the reference strand while ¯X is attached to the opposite strand as illustrated in Fig. 1共a兲. We adopt a standard model of DNA24–26 in which each basepair is modeled as a flat rigid object. The configuration of an arbitrary basepair 共X , ¯X兲a is specified by giving the location of a reference point qa 苸 R3 fixed in the basepair and the orientation of a right-handed orthonormal frame 兵dai 其, dai 苸 R3 共i = 1 , 2 , 3兲, attached to the basepair. The reference point and the frame vectors are defined according to the Tsukuba convention.26 The vector da1 points in the direction of the major groove along the perpendicular bisector of the C1⬘-C1⬘ axis of an ideal basepair, whereas da2 points in the direction of the reference strand and is parallel to the C1⬘-C1⬘ axis. As a result, da3 = da1 ⫻ da2 is perpendicular to the plane of 共X , ¯X兲a and normally points in the direction of 共X , ¯X兲a+1. The reference point qa is located at the intersection of the perpendicular bisector of the C1⬘-C1⬘ axis with the axis defined by the pyrimidine C6 and the purine C8 atoms as illustrated in Fig. 1共b兲. There are four possible basepairs 共X , ¯X兲a corresponding to the choice Xa 苸 兵T , A , C , G其. In a rigid basepair model, the positions of the nonhydrogen atoms in each of these base-

pairs with respect to qa and 兵dai 其 are considered to be constant. As a result, once the reference point and frame of a basepair are specified, so too are the positions of all of its nonhydrogen atoms. Estimated values for these positions for basepairs in their ideal forms have been tabulated by Olson et al.26 Thus the configuration of a DNA molecule consisting of n basepairs is completely defined by the reference points qa and frames 兵dai 其 共a = 1 , . . . , n兲. B. Rotation, displacement coordinates

In the present model, the three-dimensional shape of a DNA molecule is determined by the relative rotation and displacement between adjacent basepairs as illustrated in Fig. 1共c兲. The relative rotation and displacement between 共X , ¯X兲a and 共X , ¯X兲a+1 can be described in the general form 3

3

da+1 j

=兺

⌳aijdai ,

q

a+1

= q + 兺 ␩ai ¯dai ,

共1兲

a

i=1

i=1

where ⌳a 苸 R3⫻3 is a rotation matrix which describes the a a 3 orientation of frame 兵da+1 i 其 with respect to 兵di 其, ␩ 苸 R is a a+1 with coordinate vector which describes the position of q ¯ a其 is a right-handed orthonormal frame respect to qa, and 兵d i ¯a between 兵dai 其 and 兵da+1 i 其. The frame 兵di 其 is often referred to as a middle frame and will be defined below. From Eq. 共1兲 we deduce that the entries in ⌳a and ␩a are given by ⌳aij = dai · da+1 j ,

␩ai = ¯dai · 共qa+1 − qa兲.

共2兲

A rotation matrix ⌳a can be parametrized by a coordinate vector ␪a 苸 R3 in a variety of ways. In this work, we parametrize rotation matrices using the Cayley 共also referred to as Euler–Rodrigues or Gibbs兲 formula47 ⌳a = cay关␪a兴 ª I +





4 1 关␪a ⫻兴 + 关␪a ⫻兴2 , 2 4 + 兩 ␪ a兩 2

共3兲

where I is the identity matrix and 关␪a⫻兴 denotes the skewsymmetric matrix 关␪a ⫻兴 =



0

− ␪a3

␪a3

0



␪a2

␪a1

␪a2



− ␪a1 . 0

共4兲

The Cayley formula can be explicitly inverted as

␪a = cay−1关⌳a兴 ª

2 vec关⌳a − 共⌳a兲T兴, tr关⌳a兴 + 1

共5兲

where tr关⌳a兴 and 共⌳a兲T denote the trace and the transpose of ⌳a and, for an arbitrary skew-symmetric matrix A, we define vec关A兴 = 共A32 , A13 , A21兲. Equations 共3兲 and 共5兲 provide a oneto-one correspondence between rotation matrices ⌳a and coordinates ␪a, provided that tr关⌳a兴 ⫽ −1. Matrices for which tr关⌳a兴 = −1 can be shown to correspond to a rotation through ␲-radians 共180°兲, which are unlikely to occur between adjacent basepairs in our application. The Cayley parametrization of a rotation matrix has a straightforward geometrical interpretation. The matrix ⌳a in Eq. 共3兲 corresponds to a right-handed rotation about a unit vector ␯a through an angle ␾a 苸 关0 , ␲兲, where

Author complimentary copy. Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp

165105-4

J. Chem. Phys. 129, 165105 共2008兲

O. Gonzalez and J. Li

TABLE I. Estimates of tilt-roll-twist in the Cayley coordinates.

3

1 ␯ = a 兺 ␪ai dai , 兩␪ 兩 i=1

共6a兲

a

冉 冊

兩 ␪ a兩 ␾a = 2 arctan . 2

共6b兲

From Eq. 共6a兲 we deduce that a simple rotation about the frame vector da1 is obtained when ␪a = 共␪a1 , 0 , 0兲, where the angle of rotation is determined by Eq. 共6b兲. Similar conclusions can be drawn for simple rotations about the other frame vectors. In general, a rotation about a given unit vector ma 3 = 兺i=1 ␮ai dai through a given angle ␺a 苸 关0 , ␲兲 is obtained when

冉 冊

␪ai = 2 tan

␺a a ␮i . 2

共7兲

¯ a其 兵d i

The middle frame used to describe relative displacements can now be defined. Let ⌳a be the relative rotation a matrix for frame 兵da+1 i 其 with respect to 兵di 其. Then the coora a a dinates ␪ , axis ␯ , and angle ␾ associated with this rotation are as given in Eqs. 共5兲, 共6a兲, and 共6b兲. The middle frame is here defined by a relative rotation about the same axis ␯a, but through an angle of ␾a / 2. Using Eq. 共7兲 we obtain 3

¯da = 兺 cay 关¯␪a兴da, ij j i i=1

冉 冊

a a ¯␪a = 2 tan ␾ ␪i . i 4 兩 ␪ a兩

共8兲

Thus the relative rotation and displacement between an arbitrary pair of adjacent basepairs 共X , ¯X兲a and 共X , ¯X兲a+1 are completely described by the coordinates ␪a and ␩a. The definitions above satisfy all the qualitative guidelines set forth in the Cambridge convention,24 including the symmetry conditions associated with a change in reference strand. Accordingly, we refer to ␪a as tilt-roll-twist coordinates and ␩a as shift-slide-rise coordinates. For a given configuration 兵qa , dai 其 of basepair 共X , ¯X兲a, the above definitions establish a one-toone correspondence between 兵␪a , ␩a其 and the configuration a ¯ 兵qa+1 , da+1 i 其 of basepair 共X , X兲a+1. Notice that ␪ are not conventional angular coordinates as employed by many authors. Rather, they are abstract coordinates defined via the parametrization in Eq. 共3兲. These abstract coordinates can be put into correspondence with conventional angular ones, and are nearly identical in the case of small rotations when the angular ones are measured in radians.

A

T

G

C

A

C

G

0.00 1.98 5.23 2.56 1.28 6.32 0.18 1.27 5.64 3.09 8.17 5.72

−2.56 1.28 6.32 0.00 6.09 6.85 −2.76 3.49 6.56 0.92 8.66 6.76

−0.18 1.27 5.64 2.76 3.49 6.56 0.00 0.55 6.04 0.18 6.55 5.91

−3.09 8.17 5.72 −0.92 8.66 6.76 −0.18 6.55 5.91 0.00 9.92 6.52

␪1 ⫻ 102 ␪2 ⫻ 102 ␪3 ⫻ 101 ␪1 ⫻ 102 ␪2 ⫻ 102 ␪3 ⫻ 101 ␪1 ⫻ 102 ␪2 ⫻ 102 ␪3 ⫻ 101 ␪1 ⫻ 102 ␪2 ⫻ 102 ␪3 ⫻ 101

Xa 共rows兲 and Xa+1 共columns兲. Notice that only 10 of the 16 entries in each table are independent, as required by symmetry considerations based on the Watson–Crick pairing rules. In particular, each entry below the main diagonal in each table is a simple multiple of the corresponding entry above. Notice that the parameters ␪a in Table I are dimensionless, whereas the parameters ␩a in Table II have dimensions of length. In this work, we use Tables I and II to define the threedimensional structure of a DNA molecule. Given a molecule with sequence X1X2 . . . Xn, there are n − 1 dimer steps XaXa+1. To each such step we associate a unique coordinate set 兵␪a , ␩a其 from the tables. For a given configuration 兵qa , dai 其 of basepair 共X , ¯X兲a, the configuration 兵qa+1 , da+1 i 其 of basepair 共X , ¯X兲a+1 is determined by 兵␪a , ␩a其. Thus the threedimensional shape of a molecule is built up recursively beginning from an arbitrary choice of configuration for the first basepair. The three-dimensional structure constructed from the data in Tables I and II is to be interpreted as an approximation of the relaxed ground-state shape of a molecule. Sufficiently short molecules are expected to remain close to this shape under appropriate experimental conditions. At long lengths, the shape of a DNA molecule is expected to be highly variable due to its intrinsic flexibility. D. Hydrated surface

We model the hydrated surface of a DNA molecule as a curved cylinder or tube of uniform radius with ends capped TABLE II. Estimates of shift-slide-rise 共Å兲 in the Cayley midframe.

C. Crystallographic estimates

In the simplest model of sequence-dependent DNA structure, the coordinate set 兵␪a , ␩a其 is completely determined by the composition of basepairs 共X , ¯X兲a and 共X , ¯X兲a+1. Thus there are 16 different coordinate sets corresponding to the 16 possible basepair combinations defined by Xa , Xa+1 苸 兵T , A , C , G其. Crystal structures of B-form DNA were analyzed by Olson et al.27 In that work, estimates for a set of relative rotation and displacement coordinates were reported for all of the 16 possible combinations of adjacent basepairs. These estimates were translated into the coordinates defined here and are summarized in Tables I and II as a function of

T

A

T

G

C

T

A

C

G

0.00 −5.95 3.31 0.35 −0.82 3.27 −1.35 −5.82 3.36 −0.90 −2.68 3.34

−0.35 −0.82 3.27 0.00 0.31 3.42 −2.85 0.79 3.37 0.94 5.04 3.33

1.35 −5.82 3.36 2.85 0.79 3.37 0.00 −3.81 3.40 −0.56 −2.37 3.42

0.90 −2.68 3.34 −0.94 5.04 3.33 0.56 −2.37 3.42 0.00 3.80 3.39

␩1 ⫻ 101 ␩2 ⫻ 101 ␩3 ⫻ 100 ␩1 ⫻ 101 ␩2 ⫻ 101 ␩3 ⫻ 100 ␩1 ⫻ 101 ␩2 ⫻ 101 ␩3 ⫻ 100 ␩1 ⫻ 101 ␩2 ⫻ 101 ␩3 ⫻ 100

Author complimentary copy. Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp

165105-5

J. Chem. Phys. 129, 165105 共2008兲

Sequence-dependent diffusion

by hemispheres. The axis of this tube is a space curve defined by the basepair reference points. For a given radius r and sequence S = X1X2 . . . Xn, we denote the tubular surface by ⌫共r , S兲 and the axial curve by ␥共S兲. We model ␥共S兲 as a biarc curve,48 which is a space curve formed by a concatenation of circular arcs. A main motivation for this choice is that it leads to a simple construction of the surface ⌫共r , S兲. In particular, ⌫共r , S兲 is constructed by superimposing a toroidal segment of radius r on each arc of ␥共S兲. In the simplest approach, the curve ␥共S兲 is defined by interpolating the basepair reference points qa 共a = 1 , . . . , n兲. However, to maintain smoothness of the surface ⌫共r , S兲, the maximum curvature of the curve ␥共S兲 must be bounded by 1 / r, otherwise the surface ⌫共r , S兲 will locally self-intersect and develop kinks. While such irregularities in ⌫共r , S兲 would likely have a negligible effect on global hydrodynamic quantities, they can significantly increase the computational effort required to solve the hydrodynamic equations. Thus the problem of defining ␥共S兲 can be described as that of finding a biarc curve which best fits the reference points qa 共a = 1 , . . . , n兲 subject to the restriction ␬␥共S兲 ⱕ 1 / r, where ␬␥共S兲 denotes the maximum curvature of ␥共S兲. Guided by the above characterization we devised an iterative procedure for determining the axial curve ␥共S兲. Given a control parameter 0 ⬍ ␨ ⬍ 1, the procedure begins with an initial set of nodes qak = qa, constructed as described in Sec. II, and an interpolating curve ␥k共S兲, defined as the unique biarc interpolant of qak ,48 with tangent vectors computed from a cubic spline. If ␬␥k共S兲 ⱕ ␨ / r, the procedure is terminated and ␥共S兲 = ␥k共S兲. Otherwise, segments of high curvature on ␥k共S兲 are identified, and the indices of all interior high curvature nodes are collected into a set Jk. To smoothen the curve we perform the local Gaussian convolution a qk+1 =



1 a−1 4 qk qak ,

+

1 a 2 qk

+

1 a+1 4 qk ,

a 苸 Jk a 苸 Jk .



a The nodes qk+1 are then interpolated as before to obtain ␥k+1共S兲. The procedure is repeated until the curvature condition is satisfied. By design, the procedure smooths local finescale kinks of the axial curve while preserving large-scale features, where the scale is given by r. In practice, we used the parameter value ␨ = 1 / 2 and found that for realistic values of r 共for example, r = 13 Å兲 the procedure is terminated for most sequences after only one or two iterations. The resulting smoothened axial curves were found to differ from the unsmoothened curves by less than 0.1% in the l2-norm. The result of this procedure is a biarc curve ␥共S兲 which closely follows the path of the basepair reference points qa 共see Fig. 2兲. The curve begins at q1, ends at qn, and satisfies the curvature condition ␬␥共S兲 ⱕ 1 / r. The surface ⌫共r , S兲 is defined by superimposing a toroidal segment of radius r on each circular arc of the biarc curve ␥共S兲. We close the ends of ⌫共r , S兲 using hemispherical caps of radius r. In the simplest approach, the hemispheres would be centered at the points q1 and qn, which would cause the surface to extend beyond the first and last basepairs. To eliminate this effect, we translate the centers of the hemispheres along ␥共S兲 by a distance r from each end. Thus the

(a)

(c)

(b)

(d)

FIG. 2. Illustration of curved and straight tube models for the 40 basepair sequence 共CTTTTAAAGAG兲3CTTTTAA with radius r = 13 Å. 共a兲 Reference points qa. 共b兲 Reference points qa and smoothened axial curve ␥共S兲. 共c兲 Curved tube model ⌫共r , S兲. 共d兲 Straight tube model ⌫str共r , S兲.

points q1 and qn are located, approximately, at the apexes of the hemispheres. The resulting surface ⌫共r , S兲 can be shown to be continuously differentiable because ␥共S兲 is continuously differentiable and satisfies the curvature condition, and because the ends of ⌫共r , S兲 are capped by hemispheres. The overall procedure is illustrated in Fig. 2. For reference, the figure also shows a straight tube model ⌫str共r , S兲. The straight tube is constructed in a similar way from equally spaced reference points placed along a line with spacing 3.4 Å, which corresponds to a commonly accepted value for the average rise per basepair.46 This value is in close agreement with the average value of rise obtained from Table II, which is 3.36 Å. III. THEORY

Here we briefly outline the standard theory for modeling the hydrodynamics of a rigid molecule in an incompressible Newtonian fluid at infinite dilution. We state the basic flow equations, define various associated flow quantities, and introduce the resistance and mobility matrices and diffusion coefficients for a rigid body of arbitrary shape. We discuss

Author complimentary copy. Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp

165105-6

J. Chem. Phys. 129, 165105 共2008兲

O. Gonzalez and J. Li

various properties of the diffusion coefficients relevant to our study and then briefly outline our numerical procedure, which is described in detail elsewhere.

ant force F and torque T, about the reference point c, are given by F=



A. Flow equations

We consider the slow diffusive motion of a relatively large rigid macromolecular body in an incompressible Newtonian fluid of absolute viscosity ␮ ⬎ 0. In a body-fixed frame, we denote the hydrated surface by ⌫, the interior body domain by B, and the exterior fluid domain by Be. Given a velocity field v : ⌫ → R3, we seek to determine the hydrodynamic loads exerted by the fluid on the body. According to standard theory,19,20 these loads are determined by the fluid velocity field u : Be → R3 and pressure field p : Be → R which satisfy the classic Stokes equations

␮ui,jj − p,i = 0,

x 苸 Be ,

共9a兲

ui,i = 0,

x 苸 Be ,

共9b兲

u i = v i,

x 苸 ⌫,

共9c兲

ui,p → 0,

兩x兩 → ⬁.

共9d兲

Equation 共9a兲 is the local balance law of linear momentum for the fluid and Eq. 共9b兲 is the local incompressibility constraint. Equation 共9c兲 is the no-slip boundary condition which states that the fluid and the body velocities coincide at each point of the hydrated surface. 共For the case of small macromolecules, a different type of boundary condition may be more appropriate.21,49兲 The limits in Eq. 共9d兲 are boundary conditions that are consistent with the fluid being at rest at infinity. Unless mentioned otherwise, all vector quantities are referred to a body-fixed frame and indices take values from 1 to 3. Here and throughout we will use the usual conventions that a pair of repeated indices implies summation, and that indices appearing after a comma denote partial derivatives. B. Body velocities, hydrodynamic loads

When the body B is rigid, the boundary velocity field v in Eq. 共9c兲 takes the general form v = V + ⍀ ⫻ 共x − c兲,

共10兲

where ⫻ denotes the cross product. Here V is the linear velocity of a given reference point c, and ⍀ is the angular velocity of the body-fixed frame. Under mild assumptions on the surface ⌫,33 the system in Eqs. 共9a兲–共9d兲 has a unique solution 共u , p兲 for given 共V , ⍀兲. The fluid stress field ␴ : Be → R3⫻3 associated with this solution is defined by

␴ij = − p␦ij + ␮共ui,j + u j,i兲,

共11兲

where ␦ij is the Kronecker delta symbol, and the traction field f : ⌫ → R3 exerted by the fluid on the body surface 共force per unit area兲 is defined by f = ␴␯ ,



共12兲

where ␯ is the outward unit normal on ⌫. The hydrodynamic loads of the fluid on the body are found by integrating this traction field over the body surface. In particular, the result-

f共x兲dAx,

T=





共x − c兲 ⫻ f共x兲dAx ,

共13兲

where dAx denotes an infinitesimal area element at x 苸 ⌫. The resultant loads 共F , T兲 are purely hydrodynamic in the sense that they vanish when the body velocities 共V , ⍀兲 vanish. Resultant loads due to hydrostatic effects, that is, buoyancy loads caused by a body force field, may be accounted for separately. C. Resistance, mobility matrices

Linearity of Eqs. 共9兲–共13兲 implies the existence of matrices La 苸 R3⫻3 共a = 1 , . . . , 4兲 such that F = − L1V − L3⍀,

T = − L2V − L4⍀.

共14兲

These matrices are called the hydrodynamic resistance matrices associated with the body B. They are intrinsic properties of the shape of B and are proportional to the fluid viscosity ␮. With the exception of L1, they depend on the choice of reference point c appearing in Eqs. 共10兲 and 共13兲.19,20,50 Self-adjointness of the differential operator associated with the system in Eqs. 共9a兲–共9d兲 implies that the overall resistance matrix L=

冉 冊 L1 L3 L2 L4

共15兲

is symmetric.19,20,51 Furthermore, an energy dissipation inequality associated with the system in Eqs. 共9a兲–共9d兲 implies that L is also positive-definite. Throughout our developments we denote the inverse of L by M so that M=



冊冉 冊

M1 M3 L1 L3 = M2 M4 L2 L4

−1

.

共16兲

The block entries M a 苸 R3⫻3 are called the mobility matrices associated with B and are related to the resistance matrices La 苸 R3⫻3 through the expressions −1 −1 −1 M 1 = L−1 1 + L 1 L 3S L 2L 1 ,

M 2 = − S−1L2L−1 1 ,

−1 M 3 = − L−1 1 L 3S ,

M 4 = S−1 ,

共17兲

where S = L4 − L2L−1 1 L3. Since L is symmetric positivedefinite, we find that M is also symmetric positive-definite. In particular, the block entries L1, L4, M 1, and M 4 are all symmetric positive-definite. It can be shown that, with the exception of M 4, the mobility matrices also depend on the choice of the reference point c.50 In view of Eqs. 共14兲 and 共16兲, we have V = − M 1F − M 3T,

⍀ = − M 2F − M 4T.

共18兲

The resistance matrix L and the mobility matrix M provide a fundamental relation between velocities and loads for a rigid body immersed in a Newtonian fluid. For a body subject to prescribed velocities 共V , ⍀兲, the resultant hydrodynamic loads 共F , T兲 predicted by the solution of the system in Eqs. 共9a兲–共9d兲 are unique and given by 共F , T兲 = −L共V , ⍀兲.

Author complimentary copy. Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp

165105-7

J. Chem. Phys. 129, 165105 共2008兲

Sequence-dependent diffusion

Conversely, for a body subject to prescribed hydrodynamic loads 共F , T兲, the resulting velocities 共V , ⍀兲 must also be unique and given by 共V , ⍀兲 = −M共F , T兲. The matrices L and M satisfy a simple transformation law under a change in reference point. Let c and c⬘ be two arbitrary reference points with associated matrices 共L , M兲 and 共L⬘ , M ⬘兲, and let r = c⬘ − c be the vector from c to c⬘. Then50 L⬘ = QTLQ,

M ⬘ = Q−1MQ−T ,

共19兲

where Q and its inverse Q−1 are matrices given in block form by, employing the notation from Sec. II B, Q=



I 关r⫻兴 0

I



,

Q−1 =



I − 关r⫻兴 0

I



.

共20兲

D. Diffusion, mobility coefficients

As derived by Brenner,52,53 the translational and rotational diffusion coefficients for a rigid body immersed in a Newtonian fluid are Dt = kT␽t,

Dr = kT␽r ,

共21兲

where k is the Boltzmann constant, T is the absolute temperature of the fluid, and ␽t and ␽r are mobility coefficients defined by

␽t = 31 tr共M 1兲,

␽r = 31 tr共M 4兲.

共22兲

The diffusion coefficients Dt and Dr quantify the meansquare linear and angular displacements of the body per unit time in free Brownian motion. In contrast, the mobility coefficients ␽t and ␽r quantify the mean linear and angular velocities of the body per unit load under the assumption of hydrodynamic force and torque balance. The relations in Eq. 共21兲 can be viewed as generalizations to a rigid body of arbitrary shape of the classic Stokes–Einstein relations for bodies of spherical or ellipsoidal shape.18,54,55 The diffusion coefficients Dt and Dr play central roles in the interpretation of various different types of experimental data. The orientational averages implicit in the above definitions account for the rotational aspect of the Brownian motion of a macromolecule in solution. In experiments in which a macromolecule has no preferred orientation, these orientationally averaged coefficients are expected to provide accurate descriptions of translational and rotational diffusive motions. 共Although only one such mode of motion may be measurable in any given experiment.兲 On the other hand, in experiments in which a macromolecule has a preferred orientation, the above coefficients would not be expected to provide accurate descriptions unless M 1 and M 4 are nearly isotropic, as in the case of a nearly spherical macromolecule. When M 1 and M 4 are far from isotropic, the translational and rotational diffusive motions of a macromolecule with a preferred orientation are better described by individual components of M 1 and M 4 which correspond to the directions in which diffusion is measured. For the analysis of translational diffusion data we use the orientationally averaged coefficient Dt defined in Eq. 共21兲.

However, due to the nature of the corresponding experimental data, for the analysis of rotational diffusion we use a transversely averaged coefficient defined by Dr⬜ = kT␽r⬜,

␽r⬜ = 21 tr⬜共M 4兲,

共23兲

where tr⬜共M 4兲 is the sum of the smallest two eigenvalues of M 4. This coefficient characterizes diffusive rotational motion which occurs transverse to the axis of a nearly cylindrical body. In particular, this average explicitly excludes the largest eigenvalue of M 4, which characterizes diffusive rotational motion parallel to the axis. The definition in Eq. 共23兲 can be viewed as a natural generalization to the case of curved cylinders of the transverse rotational diffusion coefficient of straight cylinders.12,17 In this work, we follow standard convention and refer all diffusion coefficients to the center of diffusion.12,50 This point is the unique point at which the mobility matrices M 2 and M 3 are equal and, consequently, symmetric because the overall matrix M is symmetric. It also has the interesting property that, among all points in space, it is the unique point at which the coefficients ␽t and ␽r achieve minimum values.12,50 共This is trivially true for ␽r since it is independent of the reference point.兲 The center of diffusion can also be described as that point at which the translational and rotational diffusive motions are most independent. For symmetric bodies such as spheres, ellipsoids, and straight cylinders, the center of diffusion is coincident with the center of volume. However, for bodies of arbitrary shape, these two points are generally distinct. The use of the center of diffusion as the reference point guarantees that the predicted diffusion coefficients for a body are not overstated. The center of diffusion of a body can be determined provided the mobility matrix is known at some given reference point. In particular, if c is a given reference point with associated mobility matrix M, then the center of diffusion c⬘ is given by, employing the notation from Sec. II B, c⬘ = c + 共tr关M 4兴I − M 4兲−1 vec共M 3 − M T3 兲.

共24兲

This expression is equivalent to those in previous works,12,50 but may be more convenient because it avoids the principal axes frame employed there. Once c⬘ is known, the associated mobility matrix M ⬘ can be found using Eq. 共19兲, and the associated diffusion coefficients Dt⬘ and Dr⬘⬜ can be obtained from relations analogous to Eqs. 共21兲 and 共23兲. The resulting expressions take the form Dr⬘⬜ = Dr⬜ , Dt⬘ = Dt −

kT vec共M 3 − M T3 兲 · 共tr关M 4兴I − M 4兲−1 vec共M 3 − M T3 兲. 3 共25兲

E. Numerical procedure

The basic hydrodynamic problem is to solve the steady Stokes system in Eqs. 共9a兲–共9d兲 in the three-dimensional domain exterior to the hydrated surface ⌫共r , S兲 of a given DNA molecule. To this end, we employ a boundary element method described in detail elsewhere.32 Choosing an arbi-

Author complimentary copy. Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp

165105-8

J. Chem. Phys. 129, 165105 共2008兲

O. Gonzalez and J. Li

trary body-fixed frame for the surface ⌫共r , S兲, we first use a numerical integration procedure to determine its center of volume, which we take as our initial reference point c. For a given linear velocity V and angular velocity ⍀, the boundary element method yields an approximate resultant hydrodynamic force F and torque T. From the relation 共F , T兲 = −L共V , ⍀兲, we deduce that the columns of −L are the resultant loads corresponding to unit velocities, that is, −L=

冉冏冉 冊冏 F T

V=共1,0,0兲 ⍀=共0,0,0兲

¯

冏冉 冊冏 F T

V=共0,0,0兲 ⍀=共0,0,1兲



.

IV. RESULTS

Here we use our boundary element method to numerically compute translational and rotational diffusion coefficients for the straight and sequence-dependent curved tube models introduced in Sec. II. For convenience, all results are presented in terms of the dimensionless coefficients ⬜ 3 ¯ ⬘⬜ = ␮ᐉ Dr⬘ , D r kT

(d)

共26兲

Thus the determination of L for ⌫共r , S兲 requires six independent computations. From L we compute the mobility matrix M = L−1 by matrix inversion, and then compute the center of diffusion c⬘, the associated mobility matrix M ⬘, and the diffusion coefficients Dt⬘ and Dr⬘⬜ as described above. The boundary element method we employ is defined by four parameters: the size h of curved quadrilateral elements used to represent the surface ⌫共r , S兲, the number q2 of Gauss quadrature points used in each element, and two parameters ␭ 苸 共0 , 1兲 and ␾ 苸 共0 , ␾⌫共r,S兲兲 which control the conditioning of the method. Here ␾⌫共r,S兲 is a purely geometrical parameter which in the present case is equal to r. Previous studies32 indicate that accurate results on tubular geometries can be obtained with moderate values of h using q = 1, ␭ = 1 / 2, and ␾ / ␾⌫共r,S兲 = 1 / 2. In our numerical experiments below, we choose an h value that is small enough so that the relative errors in computed hydrodynamic quantities are below 0.1% uniformly over the entire range of lengths considered. Thus numerical errors can be considered negligible.

¯ ⬘ = ␮ᐉDt⬘ , D t kT

(a)

共27兲

where ᐉ is a characteristic length scale equal to 34 Å. We compute diffusion coefficients for straight tubes and compare our results with various empirical formulas and experimental data. We then compute diffusion coefficients for curved tubes corresponding to random and periodic DNA sequences and compare these results with those for straight tubes. A. Straight tube model

Figure 3 illustrates convergence results for our numerical method as a function of the element size parameter h and the number of basepairs n for a straight tube of fixed radius. The parameter h is proportional to the average size of a quadrilateral element in a given mesh. The results summarized in this figure were used to select and justify the numerical parameters 共q , ␭ , ␾ / ␾⌫兲 and the element size h that were used in all subsequent calculations.

(b)

(e)

(c)

FIG. 3. Convergence results for straight tubes of fixed radius. 共a兲 Tube with r = 10 Å and n = 20 discretized with elements of reference size hⴱ. 关共b兲 and ¯ ⬘ and D ¯ ⬘⬜ vs 1 / h for tube shown in 共a兲. The scale on the horizontal 共c兲兴 D t r axis is such that 1 / hⴱ = 18. Upward-pointing triangles and circles denote results for q = 1 and q = 2 with 共␭ , ␾ / ␾⌫兲 = 共1 / 2 , 1 / 4兲. Downward-pointing triangles and squares denote results for q = 1 and q = 2 with 共␭ , ␾ / ␾⌫兲 ¯ ⬘ 共solid兲 and D ¯ ⬘⬜ 共dashed兲 vs n for tubes with r = 10 Å. = 共1 / 2 , 1 / 2兲. 共d兲 D t r Triangles denote results for elements of size hⴱ. Circles denote results for elements of size hⴱⴱ ⬟ hⴱ / 2. Both results were obtained with q = 1 and ¯ ⬘/D ¯ ⬘ 共solid兲 and ⌬D ¯ ⬘⬜ / D ¯ ⬘⬜ 共dashed兲 vs n, 共␭ , ␾ / ␾⌫兲 = 共1 / 2 , 1 / 2兲. 共e兲 ⌬D t t r r ¯ where ⌬D⬘ is the difference between results with element sizes hⴱ and hⴱⴱ.

Plot 共a兲 in Fig. 3 illustrates a straight tube model with r = 10 Å, n = 20, and a moderate element size hⴱ, which we take as a reference. Plots 共b兲 and 共c兲 show convergence results for the translational and rotational diffusion coefficients for the tube shown in 共a兲. Results are given for two different q ⫻ q Gauss quadrature rules: q = 1 and q = 2. Moreover, results are given for two different pairs of conditioning parameters: 共␭ , ␾ / ␾⌫兲 = 共1 / 2 , 1 / 4兲 and 共␭ , ␾ / ␾⌫兲 = 共1 / 2 , 1 / 2兲. For reference, the scale on the horizontal axis is such that 1 / hⴱ = 18. The data illustrate that, for the different choices of quadrature rule and conditioning parameters, the diffusion coefficients converge to limiting values on moderately refined meshes. The convergence characteristics are generally better for q = 2 than for q = 1, and are generally better for 共␭ , ␾ / ␾⌫兲 = 共1 / 2 , 1 / 2兲 than for 共␭ , ␾ / ␾⌫兲 = 共1 / 2 , 1 / 4兲. Because larger values of q require increased computational effort, we select the parameters q = 1 and 共␭ , ␾ / ␾⌫兲 = 共1 / 2 , 1 / 2兲 for all subsequent calculations. ¯ ⬘ and Plot 共d兲 of Fig. 3 shows the diffusion coefficients D t ⬜ ¯ ⬘ as a function of the number of base pairs n for a straight D r tube with radius r = 10 Å. Results are given for two element sizes: hⴱ and hⴱⴱ 共hⴱⴱ is approximately half of hⴱ兲. The data show that, for the entire range of n considered, reducing the element size by a factor of about 2 causes no significant ¯ ⬘⬜. Plot 共e兲 shows the relative differ¯ ⬘ and D changes in D r t ¯ ⬘⬜ obtained with element sizes h ¯ ⬘ and D ences in each of D ⴱ r t and hⴱⴱ as a function of n. The data show that the relative

Author complimentary copy. Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp

165105-9

Sequence-dependent diffusion

J. Chem. Phys. 129, 165105 共2008兲

FIG. 5. Comparison of boundary element computations and experimental data on the translational diffusion coefficient for DNA sequences of various lengths. The solid curves denote boundary element results from the straight tube model with different radii: r = 10, 11, . . . , 15 Å ordered from upper to lower curve. The different open symbols denote different sets of experimental data: downward triangles 共Ref. 39兲, diamonds 共Ref. 40兲, circles 共Ref. 43兲, upward triangles 共Ref. 44兲, and squares 共Ref. 45兲. FIG. 4. Comparison of boundary element computations with empirical formulas for the diffusion coefficients of a straight tube. 关共a兲 and 共b兲兴 Compari¯ ⬘ and D ¯ ⬘⬜. Solid curves correspond to boundary element compusons for D t r tations, dashed curves correspond to the empirical formulas derived by Tirado et al. 共Ref. 17兲, and dotted curves correspond to the empirical formulas derived by Yoshizaki and Yamakawa 共Ref. 31兲. In each plot, the upper group of curves correspond to results for a tube of radius r = 10 Å, and the lower group corresponds to results for r = 15 Å.

difference in each coefficient is well under 0.1% uniformly in n. Based on these results, we expect the element size hⴱ to be sufficiently small for tubes with r ⱖ 10 Å for the range of n we consider. Notice that tubes with larger values of r would have lower circumferential curvature, which would be even more adequately resolved by elements of size hⴱ. While the above results pertain to straight tubes, similar convergence results and accuracies hold for curved tubes with moderate axial curvatures.32 Figure 4 presents a comparison between our boundary element computations and two sets of empirical formulas17,31 for the diffusion coefficients of a straight circular tube. ¯ ⬘ and n3D ¯ ⬘⬜ versus n for Shown in the figure are plots of nD t r two values of the radius: r = 10 Å and r = 15 Å. We choose ¯ ⬘ and n3D ¯ ⬘⬜ rather than D ¯ ⬘ and D ¯ ⬘⬜ since they to plot nD t r t r enhance the sensitivity and allow results for different radius values to be more easily distinguished. The empirical formulas derived by Tirado et al.17 are based on a curve fit of numerical results obtained from a bead-type model in which a cylinder is represented by a stack of rings of beads. In that model, the cylinders are left hollow and open, with no cap or plate at either end. The empirical formulas derived by Yoshizaki and Yamakawa31 are based on a combination of numerical results and slender-body theory.56,57 In that work, interpolation formulas were derived which were consistent with slender-body theory in the limit of long lengths and with numerical boundary element results in the limit of short lengths 共the boundary element data were limited to tubes with length to diameter ratios L / d ⬍ 5, or equivalently, n ⬍ 38 basepairs on our scale兲.

The data in Fig. 4 show that, for each of the two values of r, our boundary element computations are in general ¯ ⬘, but that more agreement with the empirical formulas for D t significant differences are visible between all three ap¯ ⬘⬜. By design, the empirical formulas derived proaches for D r by Yoshizaki and Yamakawa31 closely follow our boundary element results at shorter lengths, but show some slight discrepancy at longer lengths. This discrepancy is likely due to the approximations inherent in slender-body theory, and would presumably decrease in the limit of long length where slender-body theory is applicable. In contrast, the empirical formulas derived by Tirado et al.17 are in better agreement with our results at longer lengths than at shorter lengths. Because of the fundamental differences between the beadtype and Stokes models, it is ultimately difficult to understand the differences observed in the figure. Nevertheless, the results suggest that the bead-type model provides a reasonable approximation of the more precise Stokes model at longer lengths where end effects are less important. Figure 5 shows a comparison between our boundary element computations and experimental data on the translational diffusion coefficient for DNA sequences of various lengths. The curves in the figure are the results of our boundary element computations using a straight tube model with six different values of the radius: r = 10, 11, . . . , 15 Å. The symbols in the figure correspond to different sets of experimental data obtained by different techniques: velocity sedimentation,39,43 dynamic light scattering,40,44 and capillary electrophoresis.45 Differences due to experimental conditions are expected to be minimal since all the data were normalized to standard conditions.18 The observed disagreement between the electrophoresis data and those from other techniques is at present not well understood.45 With the exception of this electrophoresis data, nearly all the experimental values fall within the region covered by the computed curves. The experimental data in Fig. 5 for sequences of lengths

Author complimentary copy. Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp

165105-10

O. Gonzalez and J. Li

FIG. 6. Comparison of boundary element computations and experimental data on the rotational diffusion coefficient for DNA sequences of various lengths. The solid curves denote boundary element results from the straight tube model with different radii: r = 12, 13, . . . , 18 Å ordered from upper to lower curve. The different open symbols denote different sets of experimental data: diamonds 共Ref. 40兲, downward triangles 共Ref. 41兲, and circles 共Ref. 42兲.

greater than 50 basepairs can be seen to be consistent with two distinct radius ranges. The light scattering data44 are closely consistent with r = 10– 11 Å and the sedimentation data43 are consistent with r = 12– 13 Å. This difference could be a characteristic of the different experimental techniques, or simply the result of experimental error. In this respect, it is important to note that the reported sequence lengths are themselves subject to error. Alternatively, the difference might be due to sequence effects, for example, sequencedependent axial curvature, which are ignored in the interpretation of the experimental measurements. The data for sequences of about 50 basepairs or less show more scatter relative to the computed curves, although they are still generally consistent. In this short length range, we expect that any sequence-dependent curvature effect would play a lesser role and that local irregularities in the actual hydrated surface would become increasingly important. As a result, a straight tube model of uniform radius with ends capped by hemispheres may be overly simplistic. Indeed, for extremely short sequences, we expect that an atomistic-type surface model is likely more appropriate. Figure 6 shows a comparison between our boundary element computations and experimental data on the rotational diffusion coefficient for DNA sequences of various lengths. The curves in the figure are the results of our boundary element computations using a straight tube model with seven different values of the radius: r = 12, 13, . . . , 18 Å. The symbols in the figure correspond to different sets of experimental data obtained by different techniques: dynamic light scattering40 and transient electric birefringence.41,42 As before, the data are normalized to standard conditions.18 Notice that nearly all the experimental values fall within the region covered by the computed curves, but here the computed curves correspond to larger values of r. Thus, in contrast to the translational data, the bulk of the rotational data is consistent with r = 13– 17 Å. This difference can likely be attributed to sequence-dependent axial curvature, which is ignored

J. Chem. Phys. 129, 165105 共2008兲

FIG. 7. Sequence-dependent curved tube models of different DNA sequences. 共a兲 Four different sequences of n = 40 basepairs. 共b兲 Four different sequences of n = 120 basepairs.

in the interpretation of the experimental measurements, and experimental errors, which are expected to be higher since rotational diffusion coefficients are more difficult to measure. However, the systematic nature of the difference toward larger values of r suggests that a systematic effect such as curvature is more likely than experimental error. B. Sequence-dependent curved tube model

Figure 7 shows curved tube models for various different DNA sequences. Shown are models for four different 40 basepair sequences and four different 120 basepair sequences, all with a tube radius of r = 13 Å. These tube models were constructed as described in Sec. II using the crystallographic data in Tables I and II. The figure illustrates the possible variability in overall axial curvature due to sequence effects. To study the effects of sequence on the diffusion coefficients as a function of n, we considered sample sets of the form ⌺n = 兵S1n , . . . , Sm n 其, where m is a fixed number and each Snj is a sequence of length n. For different values of n the sample ⌺n was constructed by a culling procedure. We first ⬘ constructed a set ⌺n⬘ = 兵S1n , . . . , Sm n 其, m⬘ Ⰷ m, consisting of an arbitrary mixture of completely random sequences, periodic sequences with random repeating units, and periodic sequences containing A-tracts of varying lengths. The radius of gyration R of each Snj 苸 ⌺n⬘ was computed and the minimum and maximum values Rmin and Rmax were recorded. The set ⌺n was then constructed by selecting appropriate elements of ⌺n⬘ to obtain an approximately uniform distribution in R between Rmin and Rmax. In our numerical experiments, we considered lengths 15ⱕ n ⱕ 120 共in increments of 5兲, and for each value of n we constructed sample sets of size m = 20. The four sequences shown in Fig. 7共a兲 are typical elements from the sample set ⌺40 and the four sequences shown in Fig. 7共b兲 are typical elements from the sample set ⌺120. Figure 8 shows the translational diffusion coefficient of each sequence in the sample set ⌺n for each value of n com-

Author complimentary copy. Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp

165105-11

Sequence-dependent diffusion

FIG. 8. Comparison of numerical results from sequence-dependent curved tube model and straight tube model for translational diffusion coefficient. Cross and plus symbols denote results from the sequence-dependent curved model with different radii: 共crosses兲 r = 10 Å and 共pluses兲 r = 15 Å. The solid curves denote results from the straight tube model with different radii: r = 10, 11, . . . , 15 Å ordered from upper to lower curve. The open symbols denote experimental data as in Fig. 5.

puted using the curved tube model. Results are shown for two values of the radius: r = 10 Å and r = 15 Å. For comparison, the six curves and the experimental data from Fig. 5 are included for the length range considered here. It can be seen that for r = 10 Å and for r = 15 Å the diffusion coefficients obtained from the straight tube model are uniformly below those obtained from the sequence-dependent curved tube model. Similar results are expected for each of the intermediate radius values. Thus, for given values of n and r, the data show that the straight tube model underestimates the ¯ ⬘ relative to the curved model. Equivadiffusion coefficient D t ¯ ⬘, the data show that the lently, for given values of n and D t straight tube model underestimates the radius r relative to the curved model. This latter observation is relevant to previous studies in which the hydrated radius of DNA has been esti¯ ⬘. In particular, it suggests mated from given data on n and D t that previous estimates of the hydrated radius based on the straight tube model are likely to be underestimates. Figure 9 shows the rotational diffusion coefficient of each sequence in the sample set ⌺n for each value of n computed using the curved tube model. Results are shown for two values of the radius: r = 12 Å and r = 18 Å. For comparison, the seven curves and the experimental data from Fig. 6 are included as before for the length range considered here. Just as for the translational diffusion coefficient, it can be seen that for r = 12 Å and for r = 18 Å the rotational diffusion coefficients obtained from the straight tube model are uniformly below those obtained from the sequencedependent curved tube model. Similar results are expected for each of the intermediate radius values. For given values of n and r, the data show that the straight tube model under¯ ⬘⬜ relative to the curved estimates the diffusion coefficient D r ¯ ⬘⬜, the data model. Equivalently, for given values of n and D r show that the straight tube model underestimates the radius r relative to the curved model. The difference between the

J. Chem. Phys. 129, 165105 共2008兲

FIG. 9. Comparison of numerical results from sequence-dependent curved tube model and straight tube model for rotational diffusion coefficient. Cross and plus symbols denote results from the sequence-dependent curved model with different radii: 共crosses兲 r = 12 Å, 共pluses兲 r = 18 Å. For clarity, the crosses have been shifted horizontally one unit to the left. The solid curves denote results from the straight tube model with different radii: r = 12, 13, . . . , 18 Å ordered from upper to lower curve. The open symbols denote experimental data as in Fig. 6.

straight and the curved models is significantly more pronounced for the rotational diffusion coefficient than for the translational diffusion coefficient.

V. CONCLUDING REMARKS

The data in Figs. 8 and 9 may have an elegant mathematical explanation. Indeed, the data suggest that among all curved tubes of fixed length and radius the translational and rotational diffusion coefficients are minimal for a straight tube. Moreover, that for straight tubes of fixed length, both diffusion coefficients increase as the tube radius decreases. While such results seem plausible and even intuitive, we are not aware of any general results along these lines. We remark that such results would not only be of intrinsic mathematical interest, but would also be useful in the analysis and interpretation of hydrodynamic data on tubelike structures. Here such results would lend further support to our statement that previous estimates of the hydrated radius based on a straight tube model are likely to be underestimates. The variability of the sequence-dependent data in Figs. 8 and 9 suggests a possible method for refining rigid basepair model parameters for DNA in solution. In particular, from experimental data on known sequences of known length, the sequence-dependent model introduced here could be used to numerically fit the relative displacement and rotation parameters for all the different possible dimer steps, as well as the hydrated radius. The computational effort required in such an endeavor would be large, but would be well within the reach of modern techniques. It is an open question whether the diffusion coefficients are sufficiently sensitive and whether current experimental techniques are sufficiently accurate to make a meaningful fit possible. However, through a proper selection of sequences to maximize sensitivity, such a fit may ultimately be feasible. We remark that, at the present time, we are not aware of any experimental results on the

Author complimentary copy. Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp

165105-12

sequence-dependent variability of the diffusion coefficients of DNA fragments at the length scales considered here. Our computational model for the diffusion coefficients of DNA fragments is based on a fundamental assumption of rigidity and neglects thermally induced bending fluctuations. For fragments of about 150 basepairs, we expect that such fluctuations induce a local axial curvature of the order 10−3 rad/ Å 共root mean square兲. For comparison, we note that the straight tube model has zero curvature by definition, whereas the curved tube model yields a sequence-averaged curvature of the order of 10−2 rad/ Å. Previous studies58 related to the straight tube model indicate that these fluctuations have a negligible effect on the translational diffusion coefficient at this length scale. In view of the curvature estimates above, we expect any such effects to also be negligible and even smaller for the curved model. The effects of fluctuations on the rotational diffusion coefficient have been less well studied. Based on the results in Figs. 8 and 9, we expect the effects to be larger compared to the translational diffusion coefficient, but we are not aware of any quantitative results along these lines.

ACKNOWLEDGMENTS

The authors gratefully acknowledge the generous support of the National Science Foundation DMS-0706951. 1

J. Chem. Phys. 129, 165105 共2008兲

O. Gonzalez and J. Li

B. Berne and R. Pecora, Dynamic Light Scattering: With Applications to Chemistry, Biology and Physics 共Wiley, New York, 1976兲. 2 B. Valeur, Molecular Fluorescence: Principles and Applications 共WileyVCH, Weinheim, 2002兲. 3 E. Fredericq and C. Houssier, Electric Dichroism and Electric Birefringence 共Clarendon, Oxford, 1973兲. 4 H. Fujita, Foundations of Ultracentrifugal Analysis 共Wiley, New York, 1975兲. 5 H. Schachman, Ultracentrifugation in Biochemistry 共Academic, New York, 1959兲. 6 Modern Analytical Ultracentrifugation, edited by T. Schuster and T. Laue 共Birkhauser, Boston, 1994兲. 7 D. Hawcroft, Electrophoresis 共Oxford University Press, Oxford, 1997兲. 8 T. Laue, T. Ridgeway, J. Wooll, H. Shepard, T. Moody, T. Wilson, J. Chaires, and D. Stevenson, J. Pharm. Sci. 85, 1331 共1996兲. 9 D. Baker, Capillary Electrophoresis 共Wiley, New York, 1995兲. 10 S. Allison and V. Tran, Biophys. J. 68, 2261 共1995兲. 11 S. Aragon and D. Hahn, Biophys. J. 91, 1591 共2006兲. 12 D. Brune and S. Kim, Proc. Natl. Acad. Sci. U.S.A. 90, 3835 共1993兲. 13 J. Garcia de la Torre, M. Huertas, and B. Carrasco, Biophys. J. 78, 719 共2000兲. 14 S. Allison, C. Chen, and D. Stigter, Biophys. J. 81, 2558 共2001兲. 15 S. Allison and S. Mazur, Biopolymers 46, 359 共1998兲. 16 S. Mazur, C. Chen, and S. Allison, J. Phys. Chem. B 105, 1100 共2001兲.

17

M. Tirado, C. Martinez, and J. Garcia de la Torre, J. Chem. Phys. 81, 2047 共1984兲. 18 C. Cantor and P. Schimmel, Biophysical Chemistry 共Freeman, San Francisco, 1980兲, Pt. II. 19 J. Happel and H. Brenner, Low Reynolds Number Hydrodynamics: With Special Applications to Particulate Media 共Kluwer, Boston, 1983兲. 20 S. Kim and S. Karrila, Microhydrodynamics: Principles and Selected Applications 共Butterworth-Heinemann, Boston, 1991兲. 21 S. Aragon, J. Comput. Chem. 25, 1191 共2004兲. 22 G. Youngren and A. Acrivos, J. Fluid Mech. 69, 377 共1975兲. 23 J. Garcia de la Torre and V. Bloomfield, Q. Rev. Biophys. 14, 81 共1981兲. 24 R. E. Dickerson, M. Bansal, C. R. Calladine, S. Diekmann, W. N. Hunter, O. Kennard, R. Lavery, H. C. M. Nelson, W. K. Olson, W. Saenger, Z. Shakked, H. Sklenar, D. M. Soumpasis, C.-S. Tung, E. von Kitzing, A. H.-J. Wang, and V. B. Zhurkin, J. Mol. Biol. 205, 787 共1989兲. 25 M. El Hassan and C. Calladine, J. Mol. Biol. 251, 648 共1995兲. 26 W. K. Olson, M. Bansal, S. K. Burley, R. E. Dickerson, M. Gerstein, S. C. Harvey, U. Heinemann, X. J. Lu, S. Neidle, Z. Shakked, H. Sklenar, M. Suzuki, C. S. Tung, E. Westhof, C. Wolberger, and H. M. Berman, J. Mol. Biol. 313, 229 共2001兲. 27 W. Olson, A. Gorin, X. Lu, L. Hock, and V. Zhurkin, Proc. Natl. Acad. Sci. U.S.A. 95, 11163 共1998兲. 28 J. Schellman and S. Harvey, Biophys. Chem. 55, 95 共1995兲. 29 M. Tirado and J. Garcia de la Torre, J. Chem. Phys. 71, 2581 共1979兲. 30 M. Tirado and J. Garcia de la Torre, J. Chem. Phys. 73, 1986 共1980兲. 31 T. Yoshizaki and H. Yamakawa, J. Chem. Phys. 72, 57 共1980兲. 32 O. Gonzalez, “On stable, complete and singularity-free boundary integral formulations of exterior Stokes flow,” SIAM J. Appl. Math. 共to be published兲. 33 O. Ladyzhenskaya, The Mathematical Theory of Viscous Incompressible Flow 共Gordon and Breach, New York, 1963兲. 34 H. Power and L. Wrobel, Boundary Integral Methods in Fluid Mechanics 共Computational Mechanics, Southampton, 1995兲. 35 C. Pozrikidis, Boundary Integral and Singularity Methods for Linearized Viscous Flow 共Cambridge University Press, Cambridge, England, 1992兲. 36 K. Atkinson, The Numerical Solution of Integral Equations of the Second Kind 共Cambridge University Press, Cambridge, England, 1997兲. 37 R. Kress, Linear Integral Equations 共Springer-Verlag, Berlin, 1989兲. 38 L. Ying, G. Biros, and D. Zorin, J. Comput. Phys. 219, 247 共2006兲. 39 G. Bonifacio, T. Brown, G. Conn, and A. Lane, Biophys. J. 73, 1532 共1997兲. 40 W. Eimer and R. Pecora, J. Chem. Phys. 94, 2324 共1991兲. 41 J. Elias and D. Eden, Macromolecules 14, 410 共1981兲. 42 P. Hagerman, Biopolymers 20, 1503 共1981兲. 43 R. Kovacic and K. van Holde, Biochemistry 16, 1490 共1977兲. 44 M. Mandelkern, J. Alias, D. Eden, and D. Crothers, J. Mol. Biol. 152, 153 共1981兲. 45 N. Stellwagen, S. Magnusdottir, C. Gelfi, and P. Righetti, Biopolymers 58, 390 共2001兲. 46 R. Sinden, DNA Structure and Function 共Academic, San Diego, 1994兲. 47 P. Hughes, Spacecraft Attitude Dynamics 共Wiley, Boston, 1983兲. 48 T. Sharrock, The Mathematics of Surfaces II 共Oxford University Press, New York, 1987兲, pp. 395–411. 49 S. Allison, Macromolecules 32, 5304 共1999兲. 50 S. Harvey and J. Garcia de la Torre, Macromolecules 13, 960 共1980兲. 51 G. Galdi, Handbook of Mathematical Fluid Mechanics 共Elsevier, Amsterdam, 2002兲, Vol. 1, pp. 653–792. 52 H. Brenner, J. Colloid Sci. 20, 104 共1965兲. 53 H. Brenner, J. Colloid Interface Sci. 23, 407 共1967兲. 54 A. Einstein, Investigations on the Theory of the Brownian Movement 共Dover, New York, 1956兲. 55 S. Koenig, Biopolymers 14, 2421 共1975兲. 56 R. Johnson, J. Fluid Mech. 99, 411 共1980兲. 57 J. Keller and S. Rubinow, J. Fluid Mech. 75, 705 共1976兲. 58 H. Yamakawa and M. Fujii, Macromolecules 6, 407 共1973兲.

Author complimentary copy. Redistribution subject to AIP license or copyright, see http://jcp.aip.org/jcp/copyright.jsp