GRADIENT FIELD DISTRIBUTIONS FOR THE REGISTRATION OF IMAGES*

Joshua Gluckman
Department of Computer Science, Polytechnic University, Brooklyn, NY
[email protected]

ABSTRACT

This paper introduces a new method to register images that are rotated and translated with respect to each other. The method works by transforming each image to a gradient distribution space. This space represents the likelihood of finding a particular gradient in the image and is invariant to translation. Once transformed, the rotation between the images is efficiently found using correlation. Unlike Fourier-based methods, phase information is retained in the gradient distribution space, so a larger class of images can be accurately registered. The method is computationally efficient and does not require non-linear optimization or iterative methods. Furthermore, large rotations and translations can easily be handled.

1. INTRODUCTION

Image registration is the problem of determining a geometric transformation that aligns a pair of images, a fundamental task in both image processing and computer vision. A considerable amount of attention has been given to this problem over the years and a multitude of techniques have been developed. These methods differ in the type and severity of transformations they can handle (translation, rotation, affine, etc.) and in the amount of computational resources they require. No one technique is suitable for all situations, and the practitioner typically chooses the best method for the problem at hand. This paper introduces a new technique that is able to handle large rotations and translations in a computationally efficient manner.

The method works by first computing the gradient field of each image. Next, a two-dimensional histogram representing the gradient field distribution is constructed for each image. Because histograms discard the positional information of the gradients, they are invariant to translation. Thus, the rotation between the images can be found by matching the gradient field histograms of the two images. Matching is performed by correlation after mapping the histograms to polar coordinates, where the rotation is reduced to a one-dimensional shift.

*This work was supported by an NSF ITR award, no. IIS-0219078.
The gradient field distribution is computed using derivative of Gaussian filters. The distribution changes with the scale (variance) of the filter; however, at each scale the gradient distributions are related by the same rotation. Thus, to obtain robustness, matching is performed between the gradient field histograms computed at several different scales. Once the rotation is found, the translation can be recovered in a straightforward manner using phase correlation [1]. The key to the proposed technique is that despite large translations between the images the rotation can still be accurately found using the gradient field distributions.

The computational cost of this method depends on computing the gradient field histograms and performing the correlation between them. Because derivative of Gaussian filters can be implemented efficiently using recursive IIR filtering [2][3] and correlation is efficiently performed in the frequency domain, this method of registration is both simple to implement and fast to compute.

2. PREVIOUS WORK

One of the first successful methods for image registration was the phase correlation technique [1], which uses only the phase information of the Fourier transform. This method works remarkably well for images degraded by noise, blur, and illumination variations due to the robustness of the phase to these effects. Furthermore, only a partial overlap between the images is needed. Unfortunately, phase correlation can only be applied to translated images. At a large computational cost, rotation can be found by sampling the space of rotations and applying phase correlation to each sample [4]. A more efficient frequency domain method using the Fourier-Mellin descriptor was proposed in [5][6]. The Fourier-Mellin descriptor is the log-polar mapping of the power spectrum, which is invariant to translation and reduces rotation and scale to a shift. Thus, correlation of the Fourier-Mellin descriptor can be used to recover rotation and scale. These methods are efficient but only work well for certain classes of images, those with dominant features in the power spectrum. In addition, as the percentage of overlap between the images diminishes, the power spectra
will be significantly altered and will no longer correlate. Recently, a log-polar mapping has also been applied in the spatial domain [7]. This approach is able to register images with significant rotation and scale changes. However, the log-polar mapping must be applied about the center of rotation.

Another approach to registration is to optimize over the parameter space of transformations while minimizing the difference between the two images [8][9]. The advantage of these methods is that they can achieve sub-pixel accuracy and handle a larger class of transformations. The difficulty lies in the computational cost and the lack of convergence for large rotations and translations. Hence, Fourier-based methods are typically used to obtain an initial starting point before applying optimization.

A third class of techniques is feature based. These methods detect and match a set of local features. Using the matched points, the geometric transformation can be found through least-squares estimation. For this to work, local feature matching must be invariant to rotation and scale. This problem has been addressed in [10]. Because a single mismatched point can cause a gross error in the estimated transformation, robust estimation methods such as random sampling and M-estimators must be employed. Again, these methods work for a large class of transformations, but due to the need for robust estimation they are computationally costly.

The method proposed in this paper, based on gradient field distributions, is computationally similar to the Fourier-based methods [5][6]. However, as will be shown, it is able to handle a larger class of images and less overlap between the images.

3. GRADIENT FIELD DISTRIBUTIONS

Let f(x) be a continuous function defined over the image plane x = (x, y). The gradient field ∇f is defined by the partial derivatives of f with respect to the image plane,

    ∇f(x) = [ ∂f/∂x , ∂f/∂y ]^T.    (1)

If two functions f1 and f2 are related by a rotation R and translation t of the image plane, their gradient fields are related by the same transformation, that is, ∇f1(x) = ∇f2(Rx + t). To remove the dependency on translation, the gradient field can be mapped to a probability density function. Density corresponds to the area in the image plane over which the gradient field takes on a particular value. One way to define a density function over a continuous vector field is to use a Gaussian weighting function, for example,

    p(u, v; β) = c ∫ exp( −‖∇f − [u, v]^T‖² / (2β²) ) dx,    (2)
where c is a normalization constant such that the density sums to unity, β is the size of the weighting function (analogous to the bin width of a histogram), and the integration is performed over the domain of the image plane. It is straightforward to show that if ∇f1(x) = ∇f2(Rx + t), the gradient field distribution functions are related by p1(u; β) = p2(Ru; β), where u = [u, v]^T. This property can be used to recover the rotation between a pair of images independent of the translation. After converting the density functions to polar coordinates (r, φ), the gradient densities are related by

    p1(r, φ) = p2(r, φ + θ)    (3)
and the rotation angle θ can be efficiently found using correlation.

Rather than computing a single density function for each image, we can compute a family of gradient fields over scale space using derivative of Gaussian filters. The gradient field at scale (variance) σ² is given by

    ∇f(x, σ) = [ (∂G(x; σ)/∂x) ∗ f(x) , (∂G(x; σ)/∂y) ∗ f(x) ]^T,
where G(x; σ) is a Gaussian function and ∗ denotes convolution. For each scale σ² and for a given bin size β, the distribution functions p1(u; β, σ) and p2(u; β, σ) are related by the same rotation. The Gaussian filter plays an important role. For large variances noise is greatly attenuated and the gradients can be accurately computed. Furthermore, as the scale changes, so does the information contained in the gradient distribution. Thus, to increase robustness, density functions can be compared at several different scales.

In Figure 1, we took an image and a 90° rotated copy of the image and computed the gradient distribution at two different scales, σ = 8 and σ = 16. Both distributions contain significant information for matching the rotated images. Compare this to the power spectrum, which, due to the lack of a dominant feature, does not contain enough information to find the rotation. While the power spectrum discards the phase information, the gradient distributions are highly dependent on the phase, and it is well known that for natural images the phase contains important information for image matching.

Gradient distributions also behave well when only part of the images overlap. Suppose the image f1 is composed of two regions g1 and a, and f2 is composed of g2 and b, where g1 and g2 are the overlapping but rotated portions. If the proportion of overlap is α, then p1 = αg1 + (1 − α)a and p2 = αg2 + (1 − α)b, where (with a slight abuse of notation) g1, g2, a, and b denote the gradient distributions of the corresponding regions. Assuming the regions a and b are not correlated, then p1 ⋆ p2 ≈ α²(g1 ⋆ g2) + (1 − α)²n, where ⋆ denotes correlation and n is noise. Thus we can expect the gradient distributions to correlate as the amount of overlap decreases, until the correlation peak gets lost in the clutter noise due to a and b.
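To make the construction concrete, the following is a minimal sketch (not the author's implementation) of how the multi-scale gradient fields and their polar-coordinate histograms might be computed. It assumes Python with NumPy/SciPy; the helper names `gradient_field` and `gradient_histogram` and all parameter defaults are illustrative, and `scipy.ndimage.gaussian_filter` (a truncated FIR Gaussian) stands in for the recursive IIR filters [2][3] used in the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gradient_field(image, sigma):
    """Gradient field at scale sigma, computed with derivative-of-Gaussian filters."""
    image = np.asarray(image, dtype=float)
    # order=(0, 1) differentiates along x (columns); order=(1, 0) along y (rows).
    fx = gaussian_filter(image, sigma, order=(0, 1))
    fy = gaussian_filter(image, sigma, order=(1, 0))
    return fx, fy

def gradient_histogram(image, sigma, r_bins=100, phi_bins=360, r_max=None):
    """Discrete stand-in for the density p(u, v; beta), built in polar coordinates.
    Gradients are scaled by sigma so one bin size serves every scale."""
    fx, fy = gradient_field(image, sigma)
    r = sigma * np.hypot(fx, fy)       # gradient magnitude, scaled by sigma
    phi = np.arctan2(fy, fx)           # gradient orientation in [-pi, pi]
    if r_max is None:
        r_max = r.max()
    hist, _, _ = np.histogram2d(
        r.ravel(), phi.ravel(),
        bins=(r_bins, phi_bins),
        range=((0.0, float(r_max)), (-np.pi, np.pi)))
    return hist / hist.sum()           # normalize so the density sums to one
```

Note that the FIR filtering here costs more as σ grows, unlike the recursive implementation cited in the paper, and that the two images being compared should share the same r range at each scale so that corresponding bins line up.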
Fig. 1. Gradient field distributions and power spectra for a pair of rotated images. (a) An example image and a 90° rotated version. (b) The power spectrum of each image. (c) The gradient distribution at scale σ = 8 and (d) at σ = 16.

4. IMAGE REGISTRATION

In order to register digital images, a discrete implementation is needed. The gradient field can be computed using discrete convolution. To compute the density functions, a two-dimensional histogram is constructed in polar coordinates. This is done by mapping the gradient field to polar coordinates and then binning the data. In order to use the same bin size at each scale, the gradient fields must be scaled by σ. For each scale the histograms are cross-correlated. To estimate the rotation between the images, we find the maximum of the correlation function with the largest signal-to-noise ratio (SNR). We use the same definition of SNR as given in [6]:

    SNR = |Cmax − M| / D,    (4)

where Cmax, M, and D are the maximum, mean, and standard deviation of the correlation function. Next, we summarize the basic processing steps to register a pair of images, f1 and f2.
1. For each scale σk = 2^k compute σk∇f1(x; σk) and σk∇f2(x; σk).
2. Map each gradient field to polar coordinates.
3. For each scale construct the histograms h1,k(r, φ) and h2,k(r, φ).
4. For each scale compute the cross-correlation Ck(φ) = h1,k ⋆ h2,k.
5. The angle of rotation is given by the maximum of the Ck with the largest SNR.
6. De-rotate one of the images and apply phase correlation to find the translation.

The number of scales depends on the size of the images but is typically between 4 and 6. In order to handle large values of σ, the convolution can be performed using recursive IIR filtering [2][3], for which the run time is independent of σ. Hence, the computational complexity is determined by the fast Fourier transform, which is used for the cross-correlation in step 4 and the phase correlation in step 6. Thus, the complexity is O(n log n), where n is either the number of pixels or the number of bins in the histogram, whichever is larger.
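Continuing the sketch above (and reusing its hypothetical `gradient_field` and `gradient_histogram` helpers), the six steps might be wired together roughly as follows. The angular cross-correlation is done with an FFT along the φ axis, and step 6 uses a textbook phase-correlation formulation rather than the paper's specific implementation; the sign convention of the recovered angle and the handling of wrap-around shifts are left as details.

```python
import numpy as np

def rotation_estimate(im1, im2, scales=(2, 4, 8, 16), r_bins=100, phi_bins=360):
    """Steps 1-5: pick the rotation from the correlation curve with the largest SNR."""
    best_snr, best_angle = -np.inf, 0.0
    for sigma in scales:
        # Use a common r range at this scale so corresponding histogram bins line up.
        f1x, f1y = gradient_field(im1, sigma)
        f2x, f2y = gradient_field(im2, sigma)
        r_max = sigma * max(np.hypot(f1x, f1y).max(), np.hypot(f2x, f2y).max())
        h1 = gradient_histogram(im1, sigma, r_bins, phi_bins, r_max)
        h2 = gradient_histogram(im2, sigma, r_bins, phi_bins, r_max)
        # Step 4: circular cross-correlation along the phi axis, summed over r.
        H1 = np.fft.fft(h1, axis=1)
        H2 = np.fft.fft(h2, axis=1)
        corr = np.fft.ifft(H1 * np.conj(H2), axis=1).real.sum(axis=0)
        # Step 5: SNR of the correlation peak, as in Eq. (4).
        snr = abs(corr.max() - corr.mean()) / corr.std()
        if snr > best_snr:
            best_snr = snr
            # Exact sign of the angle depends on the correlation convention chosen.
            best_angle = 360.0 * np.argmax(corr) / phi_bins
    return best_angle

def phase_correlation(im1, im2):
    """Step 6: translation between two already de-rotated images, as in [1]."""
    F1, F2 = np.fft.fft2(im1), np.fft.fft2(im2)
    cross = F1 * np.conj(F2)
    cross /= np.abs(cross) + 1e-12          # keep only the phase
    peak = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(peak), peak.shape)
    return dy, dx
```

For instance, one could de-rotate the second image with `scipy.ndimage.rotate` using the angle returned by `rotation_estimate` and then call `phase_correlation` on the result; peak locations beyond half the image size correspond to negative shifts.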
5. RESULTS

We performed the following experiment to determine the ability of gradient distributions to deal with large translations and rotations. A reference image of size 360 × 500 was taken and rotated 45°. Gaussian noise with a standard deviation of 1 was added to each image. A sequence of 250 × 250 pixel images was cropped from both the reference and the rotated image. The sequence was chosen such that pairs of cropped images had a decreasing amount of overlap between them. The percentages of overlap were approximately 85%, 65%, 35%, 25%, and 15%. We attempted to register each pair of cropped images using the technique proposed in this paper. The number of bins used in the histogram was 100 for r and 360 for φ. The histograms were computed over four scales, σ = 2, 4, 8, and 16. The recursive filtering was done using the Delft image processing library, which is publicly available from www.ph.tn.tudelft.nl/DIPlib. The correct rotation and translation were found for each pair except the last, which had only 15% overlap. Figure 2(a) shows the correlation curve for the pair of images with 85% overlap. The curve shown is the one with the maximum SNR, which occurred at σ = 8. For the images with 25% overlap, the best correlation curve was found at σ = 4 and is shown in Figure 2(b). Figure 3 shows the registered images with the overlap outlined.

Fig. 2. Example test images and correlation curves (correlation plotted against rotation angle). (a) A pair of 45° rotated images with 85% overlap. (b) Rotated images with 25% overlap.

Fig. 3. Registered images with only 25% overlap.

6. CONCLUSIONS

A new technique for image registration has been proposed. The method is well suited for images that have significant rotation and translation between them. The method is both computationally fast and simple to implement. We have not discussed the effects of scale and intensity changes. Both cause a change in the magnitude of the gradient field. Intensity changes could be handled by pre-processing (whitening, equalization, using log-intensities, etc.). A scale change could be dealt with by using a log-polar rather than a polar mapping and comparing gradient distributions with different σ. This is the subject of future work.

7. REFERENCES
[1] C.D. Kuglin and D.C. Hines, “The phase correlation image alignment method,” in Proc. Int. Conf. on Cybernetics and Society, 1975, pp. 163–165.
[2] R. Deriche, “Recursively implementing the Gaussian and its derivatives,” Tech. Rep. 1893, INRIA, April 1993.
[3] L.J. van Vliet, I.T. Young, and P.W. Verbeek, “Recursive Gaussian derivative filters,” in Proc. of ICPR, 1998.
[4] E. De Castro and C. Morandi, “Registration of translated and rotated images using finite Fourier transforms,” IEEE Trans. on PAMI, no. 3, pp. 700–703, Sept. 1987.
[5] B.S. Reddy and B.N. Chatterji, “An FFT-based technique for translation, rotation, and scale-invariant image registration,” IEEE Trans. on Image Processing, vol. 5, no. 8, pp. 1266–1270, 1996.
[6] Q. Chen, M. Defrise, and F. Deconinck, “Symmetric phase only matched filtering of Fourier-Mellin transforms for image registration and recognition,” IEEE Trans. on PAMI, vol. 16, no. 12, pp. 1156–1168, 1994.
[7] G. Wolberg and S. Zokai, “Robust image registration using log-polar transform,” in Proc. of IEEE ICIP, 2000.
[8] J.R. Bergen, P. Anandan, K.J. Hanna, and R. Hingorani, “Hierarchical model-based motion estimation,” in Proc. of ECCV, 1992, pp. 237–252.
[9] P. Thevenaz, U.E. Ruttimann, and M. Unser, “A pyramid approach to subpixel registration based on intensity,” IEEE Trans. on Image Processing, vol. 7, no. 1, pp. 27–41, 1998.
[10] Y. Dufournaud, C. Schmid, and R. Horaud, “Matching images with different resolutions,” in Proc. of IEEE CVPR, 2000.