iTREE: FAST AND ACCURATE IMAGE REGISTRATION BASED ON THE COMBINATIVE AND INCREMENTAL TREE

Hongjun Jia (1), Guorong Wu (1), Qian Wang (1,2), Minjeong Kim (1), and Dinggang Shen (1)
(1) Dept. of Radiology and BRIC, (2) Dept. of Computer Science, University of North Carolina at Chapel Hill

ABSTRACT
In this paper, a novel tree-based registration framework is proposed for achieving fast and accurate registration by providing a more appropriate initial deformation field for the image under registration. Specifically, in the training stage, all real training images and a selected portion of simulated images are organized into a combinative tree with the template as the root, and each training image is then registered to the template with guidance from the intermediate images on its path to the template. In the testing stage, a given new image is first attached as a child node of its most similar image on the tree, and the deformation field of that image is used to initialize the registration. In this way, the residual deformation of the new image to the template can be estimated quickly and robustly. To register a set of new images, we attach them to the tree one by one, allowing similar test images to help each other during the registration. Importantly, after all new images have been registered, the resulting tree better represents the population distribution and thus allows better and faster registration of future images. The method has been evaluated on real brain MR image datasets, showing that it achieves better accuracy in less time than both the statistical model based registration method and the tree-based registration method.

Index Terms: Image registration, intermediate template, statistical model, combinative tree, incremental tree

1. INTRODUCTION

Fast and accurate non-rigid image registration is one of the most important goals in medical image analysis [1]. A large number of methods have been developed for pairwise registration [2, 3], where the moving subject is warped towards the fixed template with a deformation estimated by maximizing a similarity measure between the deformed subject and the template.
However, the registration between images with large shape variations is still a challenging problem. One possible cause of this difficulty is the lack of an appropriate initial estimate of the spatial transformation. A good initialization helps the registration start near the global optimum and search along the correct direction, thus significantly reducing the risk of being trapped in local minima and also lowering the computing time.
978-1-4244-4128-0/11/$25.00 ©2011 IEEE
Recently, several intermediate template (IT) guided registration methods have been demonstrated to be effective in the registration of brain structural images [4-6], diffusion tensor images [7], 2D shapes [8], and 3D cortical surfaces [9], achieving faster and more accurate registration, especially when the inter-subject variation is large. In these methods, other images are involved in the pairwise registration between the subject and the template (as intermediate templates), in order to decompose a potentially large deformation into several mild ones, each of which can be estimated with high reliability.

Based on how the intermediate templates are formulated, these IT-guided registration methods can be classified into two categories: intermediate template generation (ITG) [5, 6] and intermediate template selection (ITS) [4, 7-9] based methods. ITG methods try to construct, from the template, a simulated image that is more similar to the moving subject by using statistical deformation models learned from the training dataset. In [5], principal component analysis (PCA) is used to learn the variations of the training deformation fields and to generate a set of new intermediate templates with the corresponding simulated deformation fields. In [6], support vector regression (SVR) is applied to correlate the statistics of image appearances with their deformation coefficients, so that an initial deformation field can be predicted for a new subject to speed up the registration. On the other hand, ITS methods aim to select the intermediate templates from real images without a training step, and more than one intermediate template may be involved. A minimum spanning tree is built in [7, 8] to connect all test images to the template, and the registration of each subject is performed by warping the subject towards the intermediate images on its path to the template one by one.
It is worth noting that the difference between neighboring images on the tree is generally small, and thus the deformation between them can be estimated more accurately.

In this paper, we present a novel image registration method that combines the ideas of ITG and ITS to achieve fast and accurate image registration. The representative ability of ITG methods may be limited by their single-template based IT prediction, although an unlimited number of intermediate templates can be generated. On the other hand, a real image dataset is usually not large enough for ITS methods to provide a smooth registration path for each image. To address the limitations and keep the advantages of these
ISBI 2011
two methods, we propose, in this paper, to build a combinative tree with both real and simulated images, thus incorporating sufficient population information and possibly providing a better initial deformation field for the subsequent refinement of the registration of a new subject. In addition, the tree can grow incrementally as more and more new subjects are registered to the template.

There are three major advantages of the proposed method. First, the registration accuracy between training samples and the template can be well controlled by the tree-based registration. In particular, since simulated images have ground-truth deformation fields to the template, such shortcuts to the template on the combinative tree can reduce the training error. Second, the extended training set with both simulated and real images can better represent the population information than either simulated or real images alone. Third, the incremental tree can utilize the additional information provided by the similarity among test images to better guide the registration of new test images. Specifically, the registration result of one test image could become a better initialization for other test images than any original training sample on the tree. Experimental results on real images demonstrate that our method achieves better registration performance in less time than both the statistical model based method and the tree-based method.

2. METHODS

A flow chart of the proposed image registration framework is shown in Fig. 1. In the training stage, we build a combinative tree based on an extended set of training samples with both simulated and real images, and save the tree together with all estimated deformation fields. Each new image to be registered is first attached to the tree and then registered towards the template, starting from a well-estimated initial deformation field. Each step in Fig. 1 is described in detail in the following sections.

2.1. Generation and selection of intermediate templates

In the training stage, the first step is to generate a set of simulated templates based on a statistical model, in order to extend the training set. To construct the statistical model of deformation fields, we need to register all training images to the template to obtain sample deformation fields, if the template is already pre-defined. Otherwise, we have to select one image from the dataset as the template. To minimize the bias in registration, it is important that the template represents the whole population well. We adopt the same method as in [4, 10] to select the geometric median subject as the template, based on an undirected graph with edges weighted by pairwise distances between images. Specifically, given a template $I_0$, all other images $\{I_1, \ldots, I_{N-1}\}$ are registered to the template to get a set of training deformation fields $\{G_1, \ldots, G_{N-1}\}$, where $N$ is the number of training images. A statistical model, e.g., the PCA based model in [5] or the perturbation model in [6], can then be adopted to learn the variations of deformation fields and
Fig. 1. The combinative and incremental tree (iTree) based image registration framework.
further generate simulated intermediate templates. In [5, 6], to ensure a dense sampling of the space of deformation fields, thousands of images are usually generated, which could make simulated images dominate the extended training set and significantly bias it towards the appearance of the fixed template. Moreover, many of these simulated images are quite similar to each other due to the dense sampling of deformation fields, and some simulated images are distributed closely around the real images in the training set, which may result in redundancy. Based on these observations, we employ a simulated image selection step to remove dominance and redundancy in the extended training set, as detailed below.

For each simulated image $\hat{I}_i \in \{\hat{I}_1, \ldots, \hat{I}_W\}$ generated by warping the template with the densely sampled deformation fields, its shortest distance to the real images is defined as

$$d_i = \min_{j=0,\ldots,N-1} \mathrm{dist}(\hat{I}_i, I_j), \quad i = 1, \ldots, W,$$

where $\mathrm{dist}(\hat{I}_i, I_j)$ is a distance metric between two images, and $W$ is the number of simulated images before selection. After the shortest distances have been calculated for all simulated images, only those with a distance larger than the threshold ($d_t$) are kept and combined with the real images to form the extended training set. For example, a total of $M$ simulated images are selected and denoted as $\{I_N, \ldots, I_{N+M-1}\}$, along with their deformation fields $\{G_N, \ldots, G_{N+M-1}\}$. Usually, the threshold $d_t$ is set to a value that selects a number of simulated images comparable to the number of real images. Here, we set $M = 2N$. It is worth noting that some selected images could be close to each other. We run a sequential check to remove this redundancy, i.e., for $I_t$ ($t \ge N$), any image $I_s$ ($s > t$) with $\mathrm{dist}(I_s, I_t) < d_t$ will be discarded.

2.2. Construction of the combinative tree

After generating and selecting simulated intermediate templates, an extended training set is formed by combining both
real and simulated images, which can better represent the population information than either real or simulated images alone. Then, we build a tree to organize all of them based on their pairwise dissimilarity. In this paper, we build a minimum spanning tree (MST) as in [7, 8]. The template image is selected as the root, and all other images are connected to the template by their respective paths on the tree.

In Fig. 2, the organizing structures of three different registration frameworks are illustrated. The tree structure in the statistical model based method [5] is flat, and each path is represented by a known deformation field from the simulated intermediate template to the actual template (the red dashed arrows in Fig. 2a). In comparison, the tree-based methods [4, 7, 8] use only test images to build the tree. There is one and only one path from each node to the root, and each edge along the path is represented by a deformation field to be estimated (the purple dotted arrows in Fig. 2b). In our organizing structure of the extended training set, in contrast, there exist both simulated (ground-truth) deformation fields and those estimated during the training stage (the blue solid arrows in Fig. 2c), which results in a graph, or a combinative tree, with possibly more than one route from a node to the root because of the existence of shortcuts.

In the training stage, we register all real images to the template by following their respective paths on the graph. It is worth noting that, because of the shortcuts, i.e., the ground-truth deformation fields indicated by the red dashed curves in Fig. 2c, we only need to register each real training image along its path until it arrives at the nearest simulated image, and then jump to the final template by taking the shortcut. In this way, the registration accuracy on the original training dataset can also be improved, which has been confirmed in our experiments.
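The selection of simulated images (Section 2.1) and the tree construction with shortcuts described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`select_simulated`, `build_tree`, `registration_path`), the use of Prim's algorithm for the MST, and the generic `dist` callable are all assumptions made for illustration, and images are treated as opaque objects compared only through `dist`.

```python
import numpy as np


def select_simulated(real, simulated, dist, d_t):
    """Keep simulated images that are farther than d_t from every real image
    and not closer than d_t to any simulated image kept earlier.
    (Hypothetical helper; the sequential check follows Section 2.1.)"""
    kept = []
    for img in simulated:
        # Shortest distance from this simulated image to the real images.
        d_real = min(dist(img, r) for r in real)
        if d_real <= d_t:
            continue  # too close to a real image, treated as redundant
        # Sequential redundancy check against already kept simulated images.
        if all(dist(img, k) >= d_t for k in kept):
            kept.append(img)
    return kept


def build_tree(D, root=0):
    """Minimum spanning tree via Prim's algorithm on a symmetric distance
    matrix D. Returns parent[i] for every node, with parent[root] == -1."""
    n = D.shape[0]
    parent = np.full(n, -1)
    best = D[root].astype(float).copy()  # cheapest known edge into the tree
    nearest = np.full(n, root)           # tree node that realizes that edge
    in_tree = np.zeros(n, dtype=bool)
    in_tree[root] = True
    for _ in range(n - 1):
        j = int(np.argmin(np.where(in_tree, np.inf, best)))
        parent[j] = nearest[j]
        in_tree[j] = True
        closer = (~in_tree) & (D[j] < best)
        best[closer] = D[j][closer]
        nearest[closer] = j
    return parent


def registration_path(parent, node, is_simulated, root=0):
    """Follow the tree towards the root, but jump directly to the root as soon
    as a simulated node is reached (its deformation field is known)."""
    path = [node]
    while path[-1] != root:
        if is_simulated[path[-1]]:
            path.append(root)  # shortcut through the ground-truth field
            break
        path.append(int(parent[path[-1]]))
    return path
```

In this sketch a real training image would be registered edge by edge along `registration_path(...)`, so only the edges before the first simulated node require an estimated deformation field.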
All deformation fields are stored, together with the tree structure, for the subsequent registration of new images.

2.3. Registration for new images

With the combinative tree and the stored deformation fields, the registration of a new image to the template can be implemented as follows. Given an image to be registered, we first find its most similar training image on the tree and use the corresponding deformation field as the initialization, which is then refined to obtain the registration result. As the selected training image is generally very similar to the new image, this provides a good initialization that significantly reduces the risk of being trapped in local minima and lets the registration converge in fewer iterations.

To further improve the performance when registering multiple images, we take the similarity among the test images into consideration. Specifically, all test images are first sorted in ascending order of their distances to the current tree, where the distance of a test image to the tree is the minimum of its distances to the images on the tree. Then, the test image with the shortest distance to the tree is registered first. After the registration, we attach it to the tree as a child
Fig. 2. The organizing structure of (a) statistical model based method, (b) tree-based method, and (c) the proposed iTree method.
node of its best match, and the tree is updated accordingly (Fig. 2c). It is worth noting that the tree grows during the registration. This incremental tree can improve the registration accuracy, as the best match for a test image could be any real or simulated training image, or even another test image, as indicated within the green circle in Fig. 2c. In the extreme case, if test images are very similar to each other, their respective best matches are no longer training samples. In our method, such best matches in the test set can be found and used to initialize the registration to the template, which is one of the most important advantages of our method.

2.4. Discussion

As more and more new images are registered to the template, the tree becomes larger and larger, which may cause efficiency problems in future registrations. To efficiently find the best match of a new image on the tree, a simple distance metric on down-sampled images, or a fast retrieval solution [11], can be adopted. Another potential issue is that the original template may no longer represent the augmented population well after many new images have been included, so the template needs to be updated periodically and the training step repeated with the updated template. It is worth noting that the registration framework proposed here is general, and different intermediate template generation or pairwise registration techniques can be used.

3. EXPERIMENTAL RESULTS

To evaluate the performance of the proposed iTree framework, two sets of experiments on real brain images are conducted, comparing it with the direct pairwise registration method as well as two other methods, i.e., the statistical model based method [5] and the tree-based method [8]. The distance metric is defined as the intensity difference between images. In this paper, we use PCA to learn the statistical model and generate simulated images, and diffeomorphic demons [12] is used for the pairwise registration.
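The incremental attachment of test images described in Section 2.3 can be sketched as follows. This is an illustrative sketch only: the function name `attach_incrementally` and the generic `dist` callable are hypothetical, the actual registration step is left as a comment, and re-evaluating the distances to the tree after every attachment is an assumption about how the sorting interacts with the growing tree.

```python
def attach_incrementally(tree_nodes, test_images, dist):
    """Repeatedly pick the test image closest to the current tree, record its
    best match, and add it to the tree so later test images can match it.

    Returns a list of (test_index, best_match_index) pairs in processing
    order. Best-match indices refer to the growing node list: indices below
    len(tree_nodes) are training nodes, later ones are attached test images.
    """
    nodes = list(tree_nodes)
    remaining = list(range(len(test_images)))
    order = []
    while remaining:
        scored = []
        for t in remaining:
            d = [dist(test_images[t], n) for n in nodes]
            b = min(range(len(nodes)), key=d.__getitem__)  # best match on tree
            scored.append((d[b], t, b))
        _, t, b = min(scored)  # test image with the shortest distance to tree
        order.append((t, b))
        # At this point test_images[t] would be registered to the template,
        # initialized with the stored deformation field of nodes[b].
        nodes.append(test_images[t])
        remaining.remove(t)
    return order
```

With scalar stand-ins for images and an absolute-difference distance, a test image far from all training nodes ends up matched to a previously attached test image, which is the situation the green circle in Fig. 2c refers to.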
3.1. ADNI data

We apply the proposed method to the ADNI dataset [13] with 100 images selected from normal control and MCI subjects, where 50 subjects are randomly selected as training data and the remaining 50 are used as test images. With the PCA model [5], we select the top 4 eigenvectors and 4 samples on each of them, so a total of $4^4 = 256$ simulated images are generated, of which only 100 are selected to build the combinative tree. The registration results are evaluated by the average tissue overlap rate (with the Jaccard rate defined as $|U \cap V| / |U \cup V|$) on gray matter (GM), white matter (WM), ventricle (VN) and cerebrospinal fluid (CSF), and by the average entropy of the registered tissue segmentation images.

First, the registration accuracy on the training set is improved by the proposed method. With the combinative tree based training step, the average overlap rate is 70.67%, which is higher than that of the pairwise registration (68.46%). The measurements after the registration of all test images are listed in Table 1. It can be seen that the proposed method achieves the best performance among all compared methods. It is worth noting that the running time of our method is only 60 minutes for the registration of 50 images, which is significantly lower than that of the pairwise registration (158 mins) and the tree-based method (142 mins), and also lower than that of the statistical model based method (85 mins).

Table 1. The average overlap rate and entropy of the registered images by different methods.

                          Overlap rate (%)
Method                    GM      WM      VN      CSF     Entropy
Original data             42.36   54.97   61.89   38.40   0.788
Pairwise registration     49.06   65.05   79.69   53.12   0.617
Statistical model [5]     50.62   68.21   81.35   59.05   0.568
Tree based [8]            50.58   68.05   81.62   58.84   0.570
Proposed iTree            52.68   69.27   82.74   61.61   0.547
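For reference, the Jaccard rate used for the overlap measurements in Table 1 can be computed from two label images as in the following sketch. The function names and the convention that two empty masks count as full overlap are assumptions for illustration, not details taken from the paper.

```python
import numpy as np


def jaccard(u, v):
    """Jaccard rate |U ∩ V| / |U ∪ V| of two binary masks."""
    u = np.asarray(u, dtype=bool)
    v = np.asarray(v, dtype=bool)
    union = np.logical_or(u, v).sum()
    if union == 0:
        return 1.0  # assumption: two empty masks are treated as identical
    return float(np.logical_and(u, v).sum() / union)


def tissue_overlap(seg_a, seg_b, labels):
    """Per-tissue Jaccard rate between two segmentation label images."""
    seg_a = np.asarray(seg_a)
    seg_b = np.asarray(seg_b)
    return {lab: jaccard(seg_a == lab, seg_b == lab) for lab in labels}
```

Averaging `tissue_overlap` over all pairs of registered segmentations (or against the template segmentation) would give one overlap figure per tissue class; which pairing the reported numbers use is not specified here.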
3.2. LONI LPBA40 dataset

There are 40 brain images in the LONI LPBA40 dataset [14]. We select 20 images for training and the others for testing. The overlap rates on all 54 ROIs (described in detail in [14]) for three registration methods are plotted in Fig. 3. The proposed method achieves the best overall registration accuracy, as well as the best accuracy on most ROIs. The running time for registering all 20 test images with our method is about 47 minutes, which is lower than that of the statistical model based method (52 mins) and the tree-based method (80 mins).

4. CONCLUSIONS

A new image registration framework based on the combinative and incremental tree is presented to achieve fast and accurate registration. Both simulated and real images are combined to build a tree to improve the training performance. In the application stage, each new image is attached as a child node of its best match on the tree, and the corresponding deformation field is used to initialize the registration for further refinement. Experimental results validate the efficacy of our method in comparison with other methods.
Fig. 3. The overlap rates of 54 ROIs on LONI LPBA40 dataset after registration by three different registration methods.
5. ACKNOWLEDGMENTS

This work was supported in part by NIH grants EB006733, EB008760, EB008374, MH088520, and EB009634.

6. REFERENCES

[1] W.R. Crum, T. Hartkens, and D.L.G. Hill, "Non-rigid Image Registration: Theory and Practice," British Journal of Radiology, vol. 77, pp. S140-153, 2004.
[2] D. Shen and C. Davatzikos, "HAMMER: Hierarchical Attribute Matching Mechanism for Elastic Registration," IEEE Trans. on Medical Imaging, vol. 21, pp. 1421-39, 2002.
[3] A. Klein, et al., "Evaluation of 14 Nonlinear Deformation Algorithms Applied to Human Brain MRI Registration," NeuroImage, vol. 46, pp. 786-802, 2009.
[4] J. Hamm, D.H. Ye, R. Verma, and C. Davatzikos, "GRAM: A Framework for Geodesic Registration on Anatomical Manifolds," Medical Image Analysis, vol. 14, pp. 633-642, 2010.
[5] S. Tang, Y. Fan, and D. Shen, "RABBIT: Rapid Alignment of Brains by Building Intermediate Templates," NeuroImage, vol. 47, pp. 1277-87, 2009.
[6] M. Kim, G. Wu, P.-T. Yap, and D. Shen, "A Generalized Learning Based Framework for Fast Brain Image Registration," in MICCAI 2010, Beijing, China, 2010.
[7] H. Jia, P.-T. Yap, G. Wu, Q. Wang, and D. Shen, "Intermediate Templates Guided Groupwise Registration of Diffusion Tensor Images," NeuroImage, accepted, 2010.
[8] B.C. Munsell, A. Temlyakov, and S. Wang, "Fast Multiple Shape Correspondence by Pre-Organizing Shape Instances," in CVPR 2009, pp. 840-847, Miami, FL, 2009.
[9] P. Dalal, F. Shi, D. Shen, and S. Wang, "Multiple Cortical Surface Correspondence Using Pairwise Shape Similarity," in MICCAI 2010, Beijing, China, 2010.
[10] H. Jia, G. Wu, Q. Wang, and D. Shen, "ABSORB: Atlas Building by Self-Organized Registration and Bundling," NeuroImage, vol. 51, pp. 1057-70, 2010.
[11] C.E. Jacobs, A. Finkelstein, and D.H. Salesin, "Fast Multiresolution Image Querying," in SIGGRAPH, 1995.
[12] T. Vercauteren, X. Pennec, A. Perchant, and N. Ayache, "Diffeomorphic Demons: Efficient Nonparametric Image Registration," NeuroImage, vol. 45, pp. S61-72, 2009.
[13] ADNI, "http://www.loni.ucla.edu/ADNI/," 2004.
[14] D.W. Shattuck, et al., "Construction of a 3D Probabilistic Atlas of Human Cortical Structures," NeuroImage, vol. 39, pp. 1064-80, 2008.