Divergence-Based Framework for Diffusion Tensor Clustering, Interpolation, and Regularization

Torsten Rohlfing¹, Edith V. Sullivan², and Adolf Pfefferbaum¹,²

¹ Neuroscience Program, SRI International, Menlo Park, CA, USA
[email protected], [email protected]
² Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
[email protected]

Abstract. This paper introduces a novel framework for diffusion tensor combination, which can be used for tensor averaging, clustering, interpolation, and regularization. The framework is based on the physical interpretation of the tensors as the covariance matrices of Gaussian probability distributions. The symmetric Kullback-Leibler divergence provides a natural distance measure on these distributions, which leads to a closed-form expression for the distance between any two diffusion tensors, as well as for the weighted average of an arbitrary number of tensors. We illustrate the application of our technique in four different scenarios: (a) to combine tensor data from multiple subjects and generate population atlases from ten young and ten older subjects, (b) to perform k-means clustering and generate a compact Gaussian mixture of multiple tensors, (c) to interpolate between tensors, and (d) to regularize (i.e., smooth) noisy tensor data. For boundary-preserving regularization, we also propose a non-linear two-stage smoothing algorithm that can be considered remotely similar to a median filter.
1 Introduction

Diffusion tensor magnetic resonance imaging (DT-MRI) is an increasingly popular method for non-invasive mapping of in vivo brain connectivity [1]. The most common applications of DT-MRI currently are voxel-based comparisons [2] and fiber tracking [3]. In voxel-based studies, statistics are performed on scalar fields derived from the tensor data, e.g., fractional anisotropy (FA) maps. Fiber tracking, on the other hand, uses the complete tensor information to extract geometrical models that represent the course of white matter fiber tracts through the brain. Methods are also being developed that perform statistics on tensors directly (e.g., Xu et al. [4]).

To apply many common image processing tasks (e.g., smoothing, interpolation, averaging) to tensor images, multiple tensors must be combined in a meaningful way. For averaging, these tensors originate from corresponding positions in multiple images, whereas for interpolation and smoothing they originate from a local neighborhood of a given location in a single image. It is possible simply to add the tensor matrices element by element [5], but this ignores the fact that the manifold of diffusion tensors does not define a Euclidean vector space. More appropriate tensor combination schemes have therefore been proposed, such as the geodesic approach by Fletcher & Joshi [6] or the algebraic framework by Batchelor et al. [7]. Both are motivated by the abstract mathematical properties of the diffusion tensors, e.g., their positive definiteness.

N. Karssemeijer and B. Lelieveldt (Eds.): IPMI 2007, LNCS 4584, pp. 507–518, 2007.
© Springer-Verlag Berlin Heidelberg 2007
We introduce herein a new framework for the combination of multiple diffusion tensors that is motivated by the physical interpretation of the tensors. These are the covariance matrices of Gaussian probability distributions of local Brownian motion of hydrogen atoms [1]. Unlike [6] and [7], our combination method also has the advantage of providing a closed-form (i.e., non-iterative) solution for the weighted average of an arbitrary number of diffusion tensors. It also provides a closed-form expression for a distance measure between any two tensors. Compared with our method, the frameworks put forth by Fletcher & Joshi [6] and Batchelor et al. [7] require iterative procedures.

As a proof of concept, this paper presents initial results of applying our new framework to four different scenarios. First, we use it to construct inter-subject population atlases by averaging several diffusion tensor images (DTI). Such atlases are potentially useful, for example, in cross-sectional studies of disease effects and longitudinal studies of aging and brain disease. Second, we use tensor average and distance to construct a k-means clustering algorithm [8] and generate Gaussian mixture representations of multiple tensor fields. Third, we illustrate the application of our framework for regularization (i.e., smoothing) of noisy tensor fields. Fourth, we illustrate the interpolation between different tensors that is useful for tensor image reformatting.
2 Methods

In the original (and still most commonly used) diffusion imaging method [1], the diffusion tensor represents the covariance matrix of a zero-mean multivariate Gaussian probability distribution. For general Gaussian distributions, Myrvoll [9] developed a divergence-based clustering framework. We describe below the application of this framework to Gaussian diffusion distributions. The mathematical details are substantially simplified compared with [9], because a priori all distributions must have zero means. Thus, Myrvoll's iterative joint estimation of centroid mean and covariance matrix is reduced to a closed-form expression for the covariance matrix.

Divergence-Based Tensor Distance. For two arbitrary probability distributions f and g over a domain Ω (here: Ω = R³), the symmetric Kullback-Leibler (SKL) divergence is a distance measure (but not a metric) that is defined as

    d(f, g) = \int_\Omega f \log\frac{f}{g} + \int_\Omega g \log\frac{g}{f}.    (1)

When f and g are zero-mean multivariate Gaussians with covariance matrices Σ_f and Σ_g, then Eq. (1) can be evaluated in closed form as

    d(f, g) = \frac{1}{2}\,\mathrm{trace}\left( \Sigma_f \Sigma_g^{-1} + \Sigma_g \Sigma_f^{-1} - 2I \right).    (2)
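As an illustration, the closed-form distance of Eq. (2) takes only a few lines of NumPy. This is our own sketch, not the authors' implementation; the function name is hypothetical, and symmetric positive-definite (SPD) inputs are assumed:

```python
import numpy as np

def skl_distance(sigma_f, sigma_g):
    """Symmetric Kullback-Leibler divergence, Eq. (2), between two
    zero-mean Gaussians given by their covariance (tensor) matrices."""
    n = sigma_f.shape[0]
    inv_f = np.linalg.inv(sigma_f)
    inv_g = np.linalg.inv(sigma_g)
    return 0.5 * np.trace(sigma_f @ inv_g + sigma_g @ inv_f - 2.0 * np.eye(n))
```

The distance is zero for identical tensors, symmetric in its arguments, and grows as the tensor shapes or orientations diverge; as noted above, it does not satisfy the triangle inequality.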
Simple Tensor Average. For M zero-mean Gaussians with covariance matrices Σ_m, m = 1, …, M, their "centroid" is the zero-mean Gaussian that minimizes the total distance (2) from all M input Gaussians. This centroid therefore follows the common definition of the mean (or average) of the input Gaussians. Its covariance matrix Σ_c can be computed as the solution of the simplified matrix Riccati equation

    \sum_{m=1}^{M} \Sigma_m - \Sigma_c \left( \sum_{m=1}^{M} \Sigma_m^{-1} \right) \Sigma_c = 0.    (3)

We determine Σ_c by computing the eigensystem of the 6×6 block matrix

    \mathbf{M} = \begin{pmatrix} 0 & \sum_{m=1}^{M} \Sigma_m \\ \sum_{m=1}^{M} \Sigma_m^{-1} & 0 \end{pmatrix}.    (4)

To determine the solution of Eq. (3) from the block matrix M, let v_1 through v_3 be those three of its eigenvectors for which the real part of the corresponding complex eigenvalue is positive. Furthermore, for i = 1, 2, 3 let u_i denote the upper halves (first three elements) of v_i and w_i the lower halves of v_i. Then the 3×3 matrix product

    \Sigma_c = \left( u_1\; u_2\; u_3 \right) \left( w_1\; w_2\; w_3 \right)^{-1}    (5)
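The construction of Eqs. (3)–(5) can be sketched in a few lines of NumPy. This is our own illustration under the stated assumptions (SPD inputs; the function name is hypothetical), not the authors' code:

```python
import numpy as np

def skl_centroid(tensors):
    """Closed-form SKL centroid of SPD matrices, Eqs. (3)-(5): build the
    block matrix M of Eq. (4), keep the eigenvectors whose eigenvalues
    have positive real part, and form Sigma_c = U W^{-1} per Eq. (5)."""
    n = tensors[0].shape[0]
    a = sum(tensors)                              # upper-right block: sum of Sigma_m
    b = sum(np.linalg.inv(s) for s in tensors)    # lower-left block: sum of Sigma_m^{-1}
    m = np.block([[np.zeros((n, n)), a],
                  [b, np.zeros((n, n))]])
    vals, vecs = np.linalg.eig(m)
    keep = vals.real > 0                          # n of the 2n eigenvalues are positive
    u = vecs[:n, keep].real                       # upper halves of the kept eigenvectors
    w = vecs[n:, keep].real                       # lower halves
    sigma_c = u @ np.linalg.inv(w)                # Eq. (5)
    return 0.5 * (sigma_c + sigma_c.T)            # symmetrize against round-off
```

For commuting (e.g., co-axial diagonal) tensors this reduces to an elementwise geometric-mean-like solution of Eq. (3), Σ_c = ((Σ Σ_m)(Σ Σ_m^{-1})^{-1})^{1/2}.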
is a positive semi-definite solution of Eq. (3). Details of the derivation and the relevant proofs of existence are provided by Myrvoll [9] and the references therein. A shorter summary is also given in [10].

k-Means Clustering. Tensor average and distance as introduced above are sufficient to implement a k-means clustering algorithm [8] for diffusion tensors. The topic of clustering is an active field of research in its own right and well beyond the scope of this paper. For now, we use an ad hoc implementation of a simple iterative clustering algorithm. All input tensors are first combined to form an initial centroid. They are then grouped by their distances to that centroid, where the number of groups is the user-defined number of output clusters. This initial clustering is then iteratively refined by alternatingly updating cluster centroids and tensor-to-cluster assignments.

Weighted Tensor Average and Regularization. Using a real-valued vector w ∈ R^M of weights, where \sum_{m=1}^{M} w_m = 1, the weighted average \Sigma_c^{(w)} = w_1 \Sigma_1 \oplus \cdots \oplus w_M \Sigma_M can be computed by analogously solving the eigensystem equation of the modified matrix

    \mathbf{M}_w = \begin{pmatrix} 0 & \sum_{m=1}^{M} w_m \Sigma_m \\ \sum_{m=1}^{M} w_m \Sigma_m^{-1} & 0 \end{pmatrix}.    (6)
The weighted tensor average allows for the divergence-based regularization of tensor images by using weights of a filter kernel (e.g., Gaussian) in the weighted average.
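As an illustration of kernel-based regularization via the weighted average of Eq. (6), the sketch below (our own Python/NumPy code, not from the paper; function names are hypothetical) smooths a one-dimensional line of tensors with a Gaussian kernel, renormalizing the weights at the field boundary so they always sum to one:

```python
import numpy as np

def weighted_skl_centroid(tensors, weights):
    """Weighted SKL centroid via the modified block matrix of Eq. (6)."""
    n = tensors[0].shape[0]
    a = sum(w * s for w, s in zip(weights, tensors))
    b = sum(w * np.linalg.inv(s) for w, s in zip(weights, tensors))
    vals, vecs = np.linalg.eig(np.block([[np.zeros((n, n)), a],
                                         [b, np.zeros((n, n))]]))
    keep = vals.real > 0
    c = vecs[:n, keep].real @ np.linalg.inv(vecs[n:, keep].real)
    return 0.5 * (c + c.T)

def smooth_tensor_field(field, fwhm=2.0):
    """Divergence-based regularization of a 1-D line of tensors with a
    Gaussian kernel (illustration only; the paper smooths 3-D fields)."""
    sigma = fwhm / 2.355                       # convert FWHM to standard deviation
    radius = int(np.ceil(2 * sigma))
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    out = []
    for i in range(len(field)):
        idx = [i + o for o in offsets if 0 <= i + o < len(field)]
        w = np.array([kernel[j] for j, o in enumerate(offsets)
                      if 0 <= i + o < len(field)])
        w /= w.sum()                           # kernel weights must sum to one
        out.append(weighted_skl_centroid([field[j] for j in idx], w))
    return out
```

A constant tensor field is a fixed point of this filter, since the weighted centroid of identical tensors is that tensor itself.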
Fig. 1. Ellipsoid glyph visualization of average tensor fields (coronal slice through lateral ventricles, corpus callosum, and basal ganglia). The FA map of each tensor field is shown in the image background. (a) Average of 10 young subjects. (b) Average of 10 older subjects.
The appropriateness of the definition in Eq. (6) follows from three observations: (a) Eq. (5) is invariant under multiplication of M with a positive real number. (b) Rational weights w_m ∈ Q+ can be replaced with integers by multiplying each of them with the product of all denominators; the resulting matrix then represents an inflated set of input tensors, which are replicated in exact proportion to their relative weights. (c) The generalization to irrational weights w_m ∈ R+ \ Q follows from the continuity of the mapping M → Σ_c (the eigenvalues of a matrix are the roots of a polynomial, i.e., a C^∞ function).

Tensor Interpolation. Interpolation between two tensors, Σ_1 and Σ_2, can also be expressed as a special case of the weighted tensor average: the linear interpolation between two tensors is computed as the weighted average Σ_{12}(w) = (1 − w)Σ_1 ⊕ wΣ_2 for w ∈ [0, 1]. Higher-order interpolation kernels can be applied accordingly.
3 Results

In this section, we demonstrate the application of our framework for tensor averaging, clustering, regularization, and interpolation. First, tensor averaging is applied to tensor data from young and elderly healthy individuals. Second, we illustrate the potential benefits of k-means clustering for diffusion tensors on a set of synthetic tensor fields. Third, tensor regularization is illustrated using synthetic noisy tensor data. Finally, tensor interpolation is illustrated between multiple synthetic tensors.

3.1 Tensor Averaging

DTI data were acquired on a 3 T GE scanner with the following parameters: FOV = 24 cm; 2.5 mm isotropic voxel size; TR = 11,000 ms; TE = 97.6 ms; slice thickness
= 2.5 mm; inter-slice gap = 0 mm; slices = 62; six non-collinear gradient orientations (+x +y, +y +z, +x +z, −x +y, −y +z, +x −z, each repeated with opposite gradient polarity); gradient strength 1.5 Gauss/cm with 32 ms duration and 34 ms separation, resulting in a b-value of 860 s/mm².

We acquired data on two groups of healthy subjects: ten young subjects (mean age = 28.6 years, range = 22–37 years, 17.2 years of education; five men, five women) and ten older subjects (mean age = 72.2 years, range = 65–79 years, 16.3 years of education; five men, five women). The older subjects were part of an ongoing study on aging [11].

One subject from the young group (a 23.8-year-old woman) was randomly picked as the reference subject. From the b = 0 image of this subject, nonrigid coordinate transformations were computed to the b = 0 image of each of the remaining subjects. In a multi-resolution strategy, the nonrigid transformation had an initial control point spacing of 40 mm, which was refined three times to a final resolution of 5 mm. The benefits of higher spatial resolutions of the deformation appear marginal, given the limited resolution and geometric fidelity of the diffusion-weighted images (DWI). Through the respective transformations, all seven images (b = 0, and b1 through b6) from each subject were warped into the coordinate system of the reference subject. The method used for tensor reconstruction is described in detail in Appendix A.1, and for image registration in Appendix A.2.

The divergence-based averages of the tensor images from all ten young and all ten older subjects are shown using ellipsoid glyphs (Fig. 1(a,b)). The average of the older subjects shows greater diffusivity (i.e., increased tensor volume) and decreased anisotropy (i.e., more spherical tensors) than that of the young group. This is consistent with published studies on the effects of aging on FA [12].

3.2 k-Means Clustering

A synthetic example of k-means clustering and Gaussian superposition of tensor fields is shown in Fig. 2.
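The ad hoc k-means procedure described in Section 2 can be sketched as follows. This is our own Python/NumPy illustration (names are hypothetical, not the authors' code); it uses the centroid and distance of Eqs. (2)–(5) and assumes, for brevity, that no cluster becomes empty:

```python
import numpy as np

def _centroid(tensors):
    """SKL centroid of SPD matrices via the block matrix of Eq. (4)."""
    n = tensors[0].shape[0]
    a, b = sum(tensors), sum(np.linalg.inv(s) for s in tensors)
    vals, vecs = np.linalg.eig(np.block([[np.zeros((n, n)), a],
                                         [b, np.zeros((n, n))]]))
    keep = vals.real > 0
    c = vecs[:n, keep].real @ np.linalg.inv(vecs[n:, keep].real)
    return 0.5 * (c + c.T)

def _dist(sf, sg):
    """SKL tensor distance, Eq. (2)."""
    return 0.5 * np.trace(sf @ np.linalg.inv(sg) + sg @ np.linalg.inv(sf)
                          - 2.0 * np.eye(sf.shape[0]))

def kmeans_tensors(tensors, k=2, iters=20):
    """k-means on diffusion tensors: seed clusters by distance to the
    global centroid, then alternate assignment / centroid updates.
    Assumes no cluster goes empty (ad hoc, as in the paper)."""
    global_c = _centroid(tensors)
    order = np.argsort([_dist(t, global_c) for t in tensors])
    labels = np.empty(len(tensors), dtype=int)
    for rank, idx in enumerate(order):      # initial grouping by distance
        labels[idx] = rank * k // len(tensors)
    for _ in range(iters):
        cents = [_centroid([t for t, l in zip(tensors, labels) if l == j])
                 for j in range(k)]
        new = np.array([np.argmin([_dist(t, c) for c in cents])
                        for t in tensors])
        if np.array_equal(new, labels):
            break
        labels = new
    return labels, cents
```

With two well-separated tensor populations, the refinement loop typically converges after one or two assignment updates.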
Ten synthetic noisy input tensor fields were generated, five with a horizontal simulated "fiber" and the other five with a vertical fiber. The clustering algorithm was set to generate two output clusters and was generally successful at distinguishing between the horizontal and vertical tensors (Fig. 2(c),(d)). The superposition of the centroids of both clusters (Fig. 2(e)) thus represents a mixture in which both fiber directions are preserved. This is a substantial improvement over the simple tensor average (Fig. 2(f)), in which tensors degenerate to isotropic distributions. The assignment of tensors that are (nearly) equidistant from both centroids is essentially arbitrary, so the horizontal, vertical, and spherical tensors in such regions are split somewhat randomly between the two clusters.

The single spherical tensor in the crossing region in Fig. 2(e) illustrates a weakness of the k-means clustering algorithm (but not of the divergence-based tensor combination framework), especially for even numbers of inputs: it is simply not always possible for the clustering algorithm to determine how many modes there truly are in the input data, independent of the type of input data and distance measure used.

3.3 Tensor Field Regularization

In Fig. 3, a synthetic noisy tensor field is regularized using our framework. The noise-free tensor field (Fig. 3(a)) was generated by tensor reconstruction from simulated DWI
Fig. 2. Synthetic example for k-means clustering with ten inputs and two output clusters. (a) One out of five input tensor fields with a horizontal "fiber." (b) One out of five input fields with a vertical "fiber." (c) Tensor field showing centroids of cluster #1. (d) Tensor field showing centroids of cluster #2. (e) Superposition of cluster centroid tensor fields (c) and (d). (f) Simple tensor average of the input images.
Fig. 3. Divergence-based tensor field regularization (synthetic example). (a) Noise-free tensor field. (b) Noisy tensor field. (c) Noisy tensor field after smoothing. (d) Noisy tensor field after two-stage smoothing with distance percentile threshold. All four images show the same central 8×8 pixels from a three-dimensional 20×20×10 volume.
(6 gradient directions). The upper half of the field contains tensors oriented mostly along the y direction; the lower half contains tensors oriented along the x direction. The total size of the simulated tensor field was 20×20×10 pixels, which provided a sufficient field of view for the smoothing kernel, but only the central 8×8 pixels are shown here. The noisy tensor data (Fig. 3(b)) were generated by tensor reconstruction from the same images after adding Gaussian noise to each of them.

The visualization in Fig. 3(c) shows the noisy tensor field after regularization with a Gaussian kernel (2 pixels full width at half maximum). The noise was substantially reduced, but tensors along the boundary between the two homogeneous regions (upper and lower half) are mixed and combined into less anisotropic tensors.

The visualization in Fig. 3(d) shows the noisy tensor field after regularization with a two-stage algorithm for thresholded smoothing. In the first pass, the tensors in a local neighborhood are combined using the same kernel used in Fig. 3(c). Then, for each of the input tensors, its distance to the first-pass combined tensor is computed according to Eq. (2). The 50% of tensors that are closest to the first-pass combined tensor are then combined again with their previous kernel weights, and the result is the final
Fig. 4. Barycentric interpolation between anisotropic (bottom left), oblate (bottom right), and isotropic tensors (top). (a) Divergence-based interpolation. (b) Arithmetic interpolation. (c) Differences between the tensor contours in (a) and (b).
Fig. 5. Barycentric interpolation between anisotropic tensors oriented along the x (bottom left), y (bottom right), and z direction (top). (a) Divergence-based interpolation. (b) Arithmetic interpolation. (c) Differences between the tensor contours in (a) and (b).
regularized output tensor. This procedure resembles a median filter in that both improve boundary preservation while performing effective regularization. The 50% distance threshold can be adjusted to trade better regularization (higher threshold) against better boundary preservation (lower threshold). The optimum value may depend on the relative sizes of the filter kernel and the image structures.

3.4 Tensor Interpolation

Interpolation between tensors is illustrated in two examples. In Fig. 4, we interpolate between an anisotropic (bottom left), an oblate (bottom right), and a spherical tensor (top). In Fig. 5, we interpolate between three anisotropic tensors, each of which is oriented along a different coordinate axis (x, y, and z). Each interpolation example is represented over an equilateral triangle. The three sides of each triangle represent linear interpolation between the two tensors on the adjacent corners. Tensors inside the triangles are the barycentric combination of the tensors at all three corners. In each example, we compare the interpolation using our framework with arithmetic interpolation, i.e., the element-wise interpolation between the tensor matrices.
Fig. 6. Reoriented tensor reconstruction illustrated using a numerical phantom. (a) Tracked fibers along straight tracts originating from 16×16 region on the left, shown in red. Fiber count = 256. (b) Tracked fibers along deformed tract originating from 16×16 region on the left, shown in red. Fiber count = 254. Fibers were tracked using the DTI module of Slicer [17].
Visual inspection of the difference images of the tensor outlines (Figs. 4(c) and 5(c)) indicates that our framework is biased in favor of more anisotropic tensors. Because anisotropic tensors are more direction-specific than isotropic ones, the former can be considered to carry more useful information.
4 Discussion

This paper has introduced a novel combination framework for diffusion tensors based on a divergence distance measure between Gaussian distributions. We have demonstrated its application to tensor averaging, interpolation, regularization, and clustering. To the best of our knowledge, this paper is also the first to propose retrospective k-means clustering of multiple diffusion tensor images and to demonstrate this technique as an effective means of producing a compact, information-preserving representation of tensor superpositions.

The basic tensor combination approach proposed herein competes with other methods, such as the geodesic approach [6] or the algebraic framework [7]. Compared with these methods, ours has the advantage of providing closed-form solutions for the distance between any two tensors and for the weighted average of an arbitrary number of diffusion tensors, at least to the extent that solving Eq. (3), which involves computing the eigensystem of a non-symmetric 6×6 matrix, is considered a closed-form operation.

We note that our derivation of a combination framework for diffusion tensors is based entirely on selecting an appropriate distance measure between two tensors, in our case the symmetric Kullback-Leibler divergence, Eq. (1). This is analogous to both Fletcher & Joshi and Batchelor et al., albeit with a different distance measure as the foundation of our framework. It is not clear which published tensor combination method performs better, and the answer probably depends on the application. However, our distribution-based model is physically well motivated and possesses convenient computational properties. It is also conceptually more consistent than other methods with representing complex combinations of tensors as probabilistic superpositions, e.g., by means of Gaussian mixtures.
Such mixtures have been used as inputs to a Monte-Carlo fiber tracking algorithm [13]. Herein, we have generated such a mixture in a synthetic example in Section 3.2. Our next goal is to apply the k-means methodology used herein to obtain compact probabilistic population representations of actual multi-subject DTI data. This could be considered a generalization and extension of a method by Parker & Alexander [14], who represented crossings of two fibers in a single subject. Compared with simply averaging tensors across subjects, k-means clustering almost entirely preserves their directional specificity, thus minimizing degeneration of tensors in the combined tensor images. As recently shown by Behrens et al. [15], probabilistic fiber tracking in such Gaussian mixture fields has the potential to improve the detection of minor tracts in areas of fiber crossings. Our next step will be to generate the Gaussian superpositions from multiple subjects rather than a single one.
Acknowledgments

This work was supported by the National Institute on Alcohol Abuse and Alcoholism, Grants AA05965 and AA12388, and the National Institute on Aging, Grant AG17919.
References

1. Basser, P.J., Mattiello, J., LeBihan, D.: MR diffusion tensor spectroscopy and imaging. Biophys. J. 66(1), 259–267 (1994)
2. Shimony, J.S., McKinstry, R.C., Akbudak, E., et al.: Quantitative diffusion-tensor anisotropy brain MR imaging: Normative human data and anatomic analysis. Radiology 212(3), 770–784 (1999)
3. Mori, S., Crain, B.J., Chacko, V.P., et al.: Three-dimensional tracking of axonal projections in the brain by magnetic resonance imaging. Ann. Neurol. 45(2), 265–269 (1999)
4. Xu, D., Mori, S., Shen, D., et al.: Spatial normalization of diffusion tensor fields. Magn. Reson. Med. 50(1), 175–182 (2003)
5. Jones, D.K., Griffin, L.D., Alexander, D.C., et al.: Spatial normalization and averaging of diffusion tensor MRI data sets. NeuroImage 17(2), 592–617 (2002)
6. Fletcher, P.T., Joshi, S.: Principal geodesic analysis on symmetric spaces: Statistics of diffusion tensors. In: Sonka, M., Kakadiaris, I.A., Kybic, J. (eds.) Computer Vision and Mathematical Methods in Medical and Biomedical Image Analysis. LNCS, vol. 3117, pp. 87–98. Springer, Heidelberg (2004)
7. Batchelor, P.G., Moakher, M., Atkinson, D., et al.: A rigorous framework for diffusion tensor calculus. Magn. Reson. Med. 53(1), 221–225 (2005)
8. MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Proceedings, Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–296. University of California Press (1967)
9. Myrvoll, T.A.: Adaptation of Hidden Markov Models using Maximum a Posteriori Linear Regression with Hierarchical Priors. PhD thesis, Norwegian University of Science and Technology, Trondheim (2002)
10. Myrvoll, T.A., Soong, F.K.: Optimal clustering of multivariate normal distributions using divergence and its application to HMM adaptation. In: Proceedings (ICASSP '03), 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. I, pp. 552–555. IEEE Press, New York (2003)
11. Sullivan, E.V., Adalsteinsson, E., Pfefferbaum, A.: Selective age-related degradation of anterior callosal fiber bundles quantified in vivo with fiber tracking. Cereb. Cortex 16(7), 1030–1039 (2005)
12. Salat, D., Tuch, D., Greve, D., et al.: Age-related alterations in white matter microstructure measured by diffusion tensor imaging. Neurobiol. Aging 26(8), 1215–1227 (2005)
13. Behrens, T.E.J., Woolrich, M.W., Jenkinson, M., et al.: Characterization and propagation of uncertainty in diffusion-weighted MR imaging. Magn. Reson. Med. 50(5), 1077–1088 (2003)
14. Parker, G.J., Alexander, D.C.: Probabilistic Monte Carlo based mapping of cerebral connections utilising whole-brain crossing fibre information. In: Taylor, C.J., Noble, J.A. (eds.) IPMI 2003. LNCS, vol. 2732, pp. 684–695. Springer, Heidelberg (2003)
15. Behrens, T., Johansen-Berg, H., Jbabdi, S., et al.: Probabilistic diffusion tractography with multiple fibre orientations: What can we gain? NeuroImage 34(1), 144–155 (2007)
16. Alexander, D.C., Pierpaoli, C., Basser, P.J., et al.: Spatial transformations of diffusion tensor magnetic resonance images. IEEE Trans. Med. Imag. 20(11), 1131–1139 (2001)
17. 3D Slicer, available online: http://www.slicer.org
18. Rueckert, D., Sonoda, L.I., Hayes, C., et al.: Nonrigid registration using free-form deformations: Application to breast MR images. IEEE Trans. Med. Imag. 18(8), 712–721 (1999)
19. Studholme, C., Hill, D.L.G., Hawkes, D.J.: An overlap invariant entropy measure of 3D medical image alignment. Pattern Recognit. 32(1), 71–86 (1999)
20. Rohlfing, T., Maurer, C.R., Jr., Bluemke, D.A., et al.: Volume-preserving nonrigid registration of MR breast images using free-form deformation with an incompressibility constraint. IEEE Trans. Med. Imag. 22(6), 730–741 (2003)
A Appendix

A.1 Nonrigidly Reoriented Tensor Reconstruction

Let T : R³ → R³ be the coordinate transformation from the reference subject image to the image from one of the other subjects. If g_1 through g_K are the gradient directions used for acquisition of the DWI for the other subject, then at each pixel x in the reference image the diffusion tensor of the reformatted image is reconstructed based on the directions

    J_T^{-1}(x)\, g_1, \;\ldots,\; J_T^{-1}(x)\, g_K,    (7)

where J_T^{-1} is the inverse of the Jacobian matrix of T at pixel x. If T is a rigid transformation, this results in a rotation of the gradient directions according to the global rotation between reference and subject anatomy. For nonrigid transformations, this rotation is locally different for each pixel, and there may be shear. By enforcing a positive Jacobian determinant throughout image space, it is guaranteed that the Jacobian matrix is non-singular, which in turn guarantees that the gradient directions remain non-collinear. This method does not require eliminating shear components of the transformation [16]. Nonetheless, we found that penalizing shear during the registration and applying only the orthogonal part of the Jacobian helped avoid numerical problems associated with the angular resolution of the directional sampling when using the minimal number of six gradient directions. Unlike approaches that re-orient the tensors, our method is directly applicable to non-Gaussian models of diffusion. It may also be better suited for high-angular-resolution data, because it does not approximate shear by a rotation.
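A minimal sketch of tensor reconstruction with reoriented gradients per Eq. (7), assuming the standard log-linear least-squares fit of the DT-MRI signal model S_k = S_0 exp(−b g_kᵀ D g_k) [1]. This is our own illustration (names hypothetical), not the authors' implementation:

```python
import numpy as np

def fit_tensor(s0, dwi, bval, gradients, jacobian=np.eye(3)):
    """Least-squares diffusion tensor fit with locally reoriented
    gradients, Eq. (7). Signal model: S_k = S0 * exp(-b g_k^T D g_k)."""
    g = [np.linalg.inv(jacobian) @ gk for gk in gradients]   # Eq. (7)
    # Design matrix rows expose the six unique tensor components.
    a = np.array([[gi[0]**2, gi[1]**2, gi[2]**2,
                   2*gi[0]*gi[1], 2*gi[0]*gi[2], 2*gi[1]*gi[2]] for gi in g])
    y = -np.log(np.asarray(dwi) / s0) / bval                 # observed g^T D g
    d = np.linalg.lstsq(a, y, rcond=None)[0]
    return np.array([[d[0], d[3], d[4]],
                     [d[3], d[1], d[5]],
                     [d[4], d[5], d[2]]])
```

With the six paired gradient orientations used in Section 3.1 and an identity Jacobian, the design matrix is square and invertible, so noise-free signals are fit exactly.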
The effectiveness of the tensor reconstruction method with locally reoriented gradients was tested using a numerical phantom (Fig. 6). DWIs were simulated as they would result from imaging a horizontal fiber tract with a square 16×16 pixel cross-section, using six simulated diffusion gradients. Tensors were reconstructed from these DWI, and fibers were tracked from one end of the tract. The DWI were then deformed using a B-spline deformation [18], with the center of the tract displaced by approximately 8 pixels (the control points in the plane cutting the tract in half were all moved by 10 pixels). Tensors were again reconstructed using locally reoriented gradients, and fibers were tracked from one end of the tract. As Fig. 6 clearly demonstrates, the tracked fibers follow the deformed tract and appear to remain parallel. Out of 256 original fibers in the undeformed tract, only two were lost in the deformed tract due to interpolation effects at the tract boundary.

A.2 Shear-Constrained Nonrigid Inter-subject Image Registration

To register images from different subjects, we applied the B-spline algorithm by Rueckert et al. [18] to the b = 0 images. The image similarity measure was normalized mutual information, E_NMI, as defined by Studholme et al. [19]. To prevent folding of the deformation field, we added a folding prevention term. To minimize shear components of the nonrigid transformations, which could impact the angular sampling of the reoriented diffusion gradient directions (see Appendix A.1), we also incorporated a local shear penalty constraint. The total registration cost function thus is

    E_{total} = E_{NMI} + w_{folding} E_{folding} + w_{shear} E_{shear}.    (8)
The folding constraint E_folding is a weakly weighted version of the volume-preservation constraint introduced by Rohlfing et al. [20], which enforces a positive Jacobian determinant at each pixel in the image. The local shear constraint E_shear is computed via a QR decomposition of the local Jacobian J,

    J = \begin{pmatrix} q_{xx} & q_{xy} & q_{xz} \\ q_{yx} & q_{yy} & q_{yz} \\ q_{zx} & q_{zy} & q_{zz} \end{pmatrix} \begin{pmatrix} r_{xx} & r_{xy} & r_{xz} \\ 0 & r_{yy} & r_{yz} \\ 0 & 0 & r_{zz} \end{pmatrix},    (9)

where the first matrix of the decomposition is orthogonal, and the second contains scales and shears. The shearing constraint term is then defined using the coefficients of the second matrix as

    E_{shear} = \sum_{x \in \Omega} w_{shear}(x) \left( r_{xy}^2 + r_{xz}^2 + r_{yz}^2 \right),    (10)
where w_shear(x) is a local rigidity constraint weight. We used here the FA map of the reference subject's tensor image, multiplied by 10⁻², i.e.,

    w_{shear}(x) = 10^{-2}\, \mathrm{FA}_{ref}(x).    (11)
Anisotropic areas (e.g., white matter) are thus warped with relatively little shear, while the more lenient shear constraint in regions with isotropic diffusion keeps the warping flexible enough to recover the complex inter-subject coordinate correspondences.
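The local shear penalty of Eqs. (9)–(11) can be sketched as follows (our own illustration; `numpy.linalg.qr` provides the orthogonal and upper-triangular factors, and the squared off-diagonal coefficients are invariant to the decomposition's sign conventions):

```python
import numpy as np

def shear_penalty(jacobian, w_shear):
    """Local shear penalty, Eqs. (9)-(10): QR-decompose the Jacobian and
    penalize the off-diagonal (shear) coefficients of the upper-triangular
    factor, scaled by the local weight w_shear (Eq. (11): 1e-2 * FA)."""
    q, r = np.linalg.qr(jacobian)
    return w_shear * (r[0, 1]**2 + r[0, 2]**2 + r[1, 2]**2)
```

A pure rotation incurs no penalty, since its upper-triangular factor is (up to signs) the identity, while any shear component appears directly in the off-diagonal coefficients.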