THE SELECTION OF NATURAL SCALES IN 2D IMAGES USING ADAPTIVE GABOR FILTERING.

J. FDEZ-VALDIVIA, J.A. GARCIA and J. MARTINEZ-BAENA
Departamento de Ciencias de la Computación e I.A. E.T.S. de Ingeniería Informática. Universidad de Granada. 18071 Granada. Spain.

Xose R. FDEZ-VIDAL
Departamento de Física Aplicada. Facultad de Física. Universidad de Santiago de Compostela. 15706 Santiago de Compostela. Spain.

Corresponding author: J. Fdez-Valdivia. Departamento de Ciencias de la Computación e I.A. E.T.S. de Ingeniería Informática. Universidad de Granada. 18071 Granada. Spain. e-mail: [email protected] fax: +34.58.243317 URL: http://decsai.ugr.es/~jfv/

Number of manuscript pages: 25
Number of figures: 6


Abstract

This paper analyzes how the natural scales of the shapes in 2D images can be extracted. Spatial information is analyzed by multiple units sensitive to both spatial and spatial-frequency variables. Scale estimates of the relevant shapes are constructed only from strongly responding detectors. The meaningful structures in the response of a detector (computed through 2-D Gabor filtering) are, at their natural level of resolution, relatively sharp and have well-defined boundaries. A natural scale is therefore defined as a level $\sigma$ producing a local maximum of a function that returns the relative sharpness of the detector response filtered over a range of scales. In a second stage, to improve this first crude estimate of the local scale, the criterion is rewritten to select scales directly at the locations of significant features of each activated detector.

Index terms: Natural scale selection, data-driven multichannel scheme, activated sensors, complex 2D Gabor filters.

1 INTRODUCTION

In the real world, surfaces usually have a hierarchical organization that consists of a small number of levels. For example, at the finest level a tree is composed of leaves with a very complex structure of veins; at the next level, each leaf is replaced by a single region; and at the highest level, there is a single blob corresponding to the treetop. There is a natural range of resolutions corresponding to each of these levels of description, and at each level the regions have well-defined boundaries. This naturally leads to a situation where visual data may represent distinct objects for different purposes: a tree may give rise to several different psychological objects, depending on whether it is to be avoided, climbed or felled. As shown in [1], once the image intensity change models arising from distinct physical processes have been located and scale information about such change models has been extracted, later-stage processing tasks may be simplified.

We address the problem of automatically extracting the significant scales of shapes in 2D images. The basic assumptions of the proposed formulation agree with the spatial-frequency channel models that have been quite successful for the detection of visual patterns [2]: (i) spatial information should be analyzed by multiple detectors, each unit being sensitive to a certain range of 2D spatial frequencies; (ii) the model should base its responses only on those detectors which are sensitive to relevant shapes; and (iii) shapes should be described at their best scales.

A crucial point is that the model incorporates simple solutions for the implementation of these assumptions: (i) The multichannel scheme and its activated detectors are both derived from the image. (ii) The responses of the activated units (hereafter called sensors) are given by complex Gabor filters. The 2D Gabor functions give a good fit to the behavior of the receptive fields of simple cells in the mammalian primary visual cortex [3-6]. Also, these functions analyze the image simultaneously in the space and frequency domains, so we can use them to fit both frequency and spatial parameters. (iii) The relative sharpness of the sensor response across scales is measured by a Gabor-filter-based criterion, and the natural scales are defined as those at which sharpness reaches local maxima. If the sensor response isolates different sizes of detail at different locations, this criterion can be reformulated for extracting natural scales at different locations.

The layout of this paper is as follows: Sections 2 and 3 describe the proposed multichannel scheme and how the activated sensors from this scheme are derived. In Section 4, the global scale selection method is motivated and introduced. In Section 5, the method is reformulated for extracting local scales. A comprehensive analysis of the derived criteria is presented using a set of experiments and discussion in Section 6. Finally, the main conclusions of this work are summarized in Section 7.

2 A DATA-DRIVEN MULTISENSOR SCHEME

The selection of an appropriate set of responding units is a central issue in multichannel approaches. In general, multichannel approaches produce the desired scheme on the 2-D spatial-frequency plane by superimposing a fixed number of spatial-frequency channels on a number of orientation bands. However, such a priori definitions of the decomposition rule have two problems: (1) they may not actually reflect the underlying structures of the images under analysis, and (2) their lack of adaptability may well bias any subsequent processing.

Here, a multichannel design is achieved by splitting the 2D spatial-frequency plane of an image into a number of orientation and spatial-frequency bands as follows: (i) To produce the orientation bands, we take both theoretical and biological results into account. Following the efficiency of the model human image code given by a number of authors [7,8], as well as the biological evidence that the median orientation bandwidth of visual cortex cells is about 40 deg [9], four orientation bands, namely $C_0$, $C_1$, $C_2$, and $C_3$, are selected with orientations set to 22.5°, 67.5°, 112.5°, and 157.5°, respectively, which corresponds to an orientation bandwidth of 45 deg. Due to conjugate symmetry, the sensor design is carried out on only half of the 2-D frequency plane. (ii) Then, to look for clumps of energy in the spectrum, these orientation bands are independently partitioned into a number of channels of (radial) spatial frequency.

Such a definition has several advantages. In particular, it allows sensors mainly associated with noise, which contain no useful information for image processing, to be separated from sensors whose excitation levels exceed that of unwanted detail and noise.

To derive the sensors, each of the four orientation bands $C_i$, $i = 0, 1, 2, 3$, with orientation set to $i \times 45 + 22.5$ deg, needs to be partitioned into a number of spatial-frequency bands. This is performed by using an index for the orientation band $C_i$, denoted $\Lambda_{C_i}(\rho_{sup})$, that indicates the relative amount of spectrum folded back into a 2-D spectral domain (as given by the spatial-frequency channel $(0, \rho_{sup})$ on the orientation band $C_i$) after the sampling rate reduction to $\rho_{sup}$:

$$\Lambda_{C_i}(\rho_{sup}) = \frac{\iint_{C_i^{\rho_{sup}}} |R_{\sigma=1}(\rho, \theta)| \, d\rho \, d\theta}{\iint_{C_i} |R_{\sigma=1}(\rho, \theta)| \, d\rho \, d\theta} \qquad (1)$$

where $|R_{\sigma=1}(\rho, \theta)|$ denotes the Fourier transform magnitude of the input image smoothed at scale $\sigma = 1$; the double integral $\iint_{C_i^{\rho_{sup}}}$ is over all coordinates $(\rho, \theta)$ of the 2-D spectral sector corresponding to the spatial-frequency channel $(0, \rho_{sup})$ on the orientation band $C_i$; and the double integral $\iint_{C_i}$ is over all polar coordinates of the entire band $C_i$.

In equation (1), the image is smoothed at scale level $\sigma = 1$ before its transform magnitude is calculated, in order to suppress the very high frequencies caused by quantization noise (one cannot expect things to be geometrically correct at the limiting lower scale boundary, where there is a lot of spurious detail).

Once the orientation band $C_i$ under consideration is fixed, the index $\Lambda_{C_i}(\rho_{sup})$ defined in equation (1) may be seen as a function of the radial frequency $\rho_{sup}$. This function is monotonically increasing in $\rho_{sup}$. It undergoes a significant rise at locations $\rho_{sup}$ where the corresponding 2D spectral domain adds some important frequency components to the spectral content of the orientation band $C_i$. Such a rise will be more or less abrupt, depending on the relative importance of the frequency components that are added.

In order to detect the locations $\rho_{sup}$ where $\Lambda_{C_i}$ undergoes a change in its rate of increase, a technique based on the extrema of the second derivative (with respect to $\rho_{sup}$), denoted $\Lambda''_{C_i}$, is used. For each orientation band $C_i$, $i = 0, \ldots, 3$, the positions of the extrema of $\Lambda''_{C_i}$, denoted $\rho_0 < \rho_1 < \rho_2 < \ldots < \rho_n$, correspond to the locations of change in the rate of increase of $\Lambda_{C_i}$ as $\rho_{sup}$ increases. They may be used to produce a natural splitting of each orientation band $C_i$, $i = 0, \ldots, 3$, into a number of spatial-frequency channels $(\rho_{j-1}, \rho_j)$, which are alike in the relative importance of the frequency components they isolate; for further details see [10].
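The band-splitting procedure of this section can be sketched in code. The following is only a minimal illustration under stated assumptions, not the authors' implementation: it uses NumPy, replaces the double integrals of equation (1) by sums over DFT samples, measures frequency in cycles per pixel, leaves out the zero-frequency sample, and approximates the second derivative by finite differences. The function names and the parameter `n_radii` are hypothetical.

```python
import numpy as np

def smoothed_magnitude(image, sigma=1.0):
    """Fourier magnitude of the image after Gaussian smoothing at scale sigma."""
    rows, cols = image.shape
    fy = np.fft.fftfreq(rows)[:, None]          # cycles per pixel
    fx = np.fft.fftfreq(cols)[None, :]
    # The transform of a spatial Gaussian (std sigma) is a Gaussian in frequency.
    transfer = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    return np.abs(np.fft.fft2(image) * transfer), fx, fy

def cumulative_index(image, band, n_radii=64):
    """Discrete version of equation (1) for orientation band `band` (0..3)."""
    mag, fx, fy = smoothed_magnitude(image)
    rho = np.hypot(fx, fy)
    # Fold angles into [0, 180): by conjugate symmetry both halves carry the same magnitude.
    theta = np.degrees(np.arctan2(fy, fx)) % 180.0
    # Assumption: the zero-frequency sample (rho == 0) is excluded from every band.
    in_band = (theta >= 45.0 * band) & (theta < 45.0 * (band + 1)) & (rho > 0)
    radii = np.linspace(0.0, rho[in_band].max(), n_radii + 1)[1:]
    total = mag[in_band].sum()
    index = np.array([mag[in_band & (rho <= r)].sum() for r in radii]) / total
    return radii, index

def channel_boundaries(radii, index):
    """Radii at local extrema of the finite-difference second derivative of the index."""
    d2 = np.gradient(np.gradient(index, radii), radii)
    inner = d2[1:-1]
    is_max = (inner > d2[:-2]) & (inner > d2[2:])
    is_min = (inner < d2[:-2]) & (inner < d2[2:])
    return radii[1:-1][is_max | is_min]
```

Calling `channel_boundaries(*cumulative_index(image, band=0))` returns candidate radii for splitting band $C_0$ into channels. On noisy spectra the finite-difference second derivative has many small extrema, so a practical version would probably smooth the index or threshold the extrema first.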

3 THE ACTIVATED SENSORS IN THE MULTISENSOR SCHEME

To identify the activated sensors in the multisensor scheme, sensors can be classified by exploiting the statistical regularities of their perceived responses. For this purpose:

(i) Each sensor is described by a sensor measure (feature) that can successfully discriminate between activated and non-activated sensors. Here, we propose one feature derived from the summation of the normalized 2-D power spectrum over the sensor. Of course, other measures intended to capture relevant characteristics of the sensor spectra are conceivable, such as the location, size and orientation of peaks, and the entropy of the normalized spectrum in the sensors [11,12]. To evaluate these frequency-domain features according to their ability to discriminate activated sensors, a method of successive selection and deletion based on the Wilks criterion may be used [13]. We have found that the summation of the normalized 2-D power spectrum over the sensor provides an effective feature for discriminating the set of sensors on training sets. Normalization is a non-linear operation, in which the 2-D power spectrum over the sensor is divided by the power spectrum over the entire frequency domain. The effect of normalization is that the response of each sensor is rescaled with respect to the pooled activity of all the sensors. The measure for sensor $Ch_i$ is given by

$$w(Ch_i) = \frac{\iint_{Ch_i} |R_{\sigma=1}(\rho, \theta)|^2 \, d\rho \, d\theta}{\iint |R_{\sigma=1}(\rho, \theta)|^2 \, d\rho \, d\theta} \qquad (2)$$

where $|R_{\sigma=1}(\rho, \theta)|^2$ denotes the power spectrum of the image smoothed at the scale $\sigma = 1$; the double integral $\iint_{Ch_i}$ is over all polar coordinates of the 2D spectral sector corresponding to the sensor $Ch_i$ under investigation; and the double integral $\iint$ is over the entire frequency domain. According to this model, the sensor's selectivity is attributed to summation (the linear stage), while its nonlinear behavior is attributed to division (the normalization stage). Hence, a given sensor might be suppressed by the other sensors, including those with perpendicular orientation tunings.

(ii) The subset of activated sensors is computed by the square-error clustering method implemented by the k-means algorithm [14]. There are only two clusters: the first is the cluster of activated sensors, which together explain a very significant portion of the power spectrum, and the second is the cluster of non-activated sensors. The use of only two clusters is based on the properties of image coding via the 2-D Fourier transform, which produces a non-uniform response distribution. The model will base its responses only on those activated units sensitive to relevant shapes. In the clustering algorithm, the two seed points are chosen as the highest and smallest values of the sensor features $w(Ch_i)$. The initial partition is formed by assigning each pattern to the closest seed point. The centroids of the resulting clusters are the initial cluster centers. The cluster centers are updated by recomputing the centroids of all patterns having the same cluster label at the end of the pass.

To overcome the problem of convergence of the k-means algorithm to local minima, the partitional algorithm is run with several initial partitions. In practice, this k-means-type algorithm converges very rapidly. In any case, Selim and Ismail [15] rigorously prove convergence of the k-means algorithm. In Figures 1 and 2, the multisensor scheme and the activated sensors are shown for two different input images.
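The sensor measure of equation (2) and the two-cluster step can be sketched as follows. This is a minimal illustration, assuming NumPy, a sampled power spectrum, and a boolean mask per sensor; it uses a single seeding (highest and smallest feature values) rather than the several initial partitions mentioned above. The names `sensor_measure` and `two_means_activated` are hypothetical.

```python
import numpy as np

def sensor_measure(power, sensor_mask):
    """Discrete version of equation (2): power inside the sensor over total power."""
    return power[sensor_mask].sum() / power.sum()

def two_means_activated(features, max_iter=100):
    """1-D k-means with k = 2, seeded at the highest and smallest feature values.

    Returns a boolean array that is True for sensors in the high-valued
    (activated) cluster.
    """
    w = np.asarray(features, dtype=float)
    centers = np.array([w.max(), w.min()])      # cluster 0 = activated, 1 = non-activated
    labels = np.zeros(w.size, dtype=int)
    for _ in range(max_iter):
        # Assign each pattern to the closest cluster center.
        labels = np.argmin(np.abs(w[:, None] - centers[None, :]), axis=1)
        # Recompute the centroids of the patterns sharing a label.
        new_centers = centers.copy()
        for k in range(2):
            if np.any(labels == k):
                new_centers[k] = w[labels == k].mean()
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels == 0
```

For example, the features `[0.30, 0.25, 0.02, 0.01, 0.03, 0.28]` split into an activated group (the first, second and last sensors) and a non-activated group (the other three).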

4 THE GLOBAL NATURAL SCALES OF THE IMAGE

When no prior knowledge is available about the natural scales in a given analysis, scale estimates can be computed as the spatial scales of the shapes isolated in the responses of the activated sensors (hereafter, sensor scales). There are several ways to compute the sensor responses: (i) the DFT of the input image is multiplied by an ideal filter, such that all frequencies inside the sensor are passed with no attenuation and all frequencies outside are completely attenuated, and the inverse DFT is then computed; or (ii) the sensor responses are computed through 2-D Gabor filtering. Because Gabor filters can be used to fit both frequency and spatial parameters, we use them for computing the sensor responses.

The location of the sensor center determines the central frequency $\rho_{Ch_i}$ and the orientation $\theta_{Ch_i}$ of the shapes contained in the sensor response. A crude estimate of the sensor scale $\sigma$ can be computed from the sensor bandwidth. To improve this initial estimate, a Gabor-filter-based autofocusing criterion is defined in the following (the term "autofocusing" is intended to mean the selection of an appropriate scale of resolution).

The underlying motivation of this approach comes from the notion that, at each significant level of resolution for the sensor $Ch_i$, the meaningful structures at that level should be relatively sharp and have well-defined boundaries. A `focus' function might be used for measuring the sharpness of an enhanced image obtained through filtering with a 2-D Gabor filter at the scale level $\sigma$, spatial frequency $\rho_{Ch_i}$, and orientation $\theta_{Ch_i}$. This function would return a value indicating the relative sharpness of the filtered image across scales. As a result, a natural scale for a shape in the sensor response can be defined as a level at which the focus function reaches a local maximum.

A heuristic principle for scale selection in each activated sensor is proposed, based on one physical observable: the relative sharpness of the filtered image at successive smoothing levels, as a function of scale. Since we are dealing with a physical entity, a function relating physical observables must be independent of the choice of dimensional units if the physical relation is to be expressed in a unit-free form [16]. We therefore consider normalized coordinates, $x/\sigma$ and $y/\sigma$, in order to guarantee scale invariance. Given the image $r(x, y)$, let the filtered image, denoted $J_{Ch_i}(x, y; \sigma)$, be defined as:

$$J_{Ch_i}(x, y; \sigma) = r(x, y) * g(x, y; \sigma, \rho_{Ch_i}, \theta_{Ch_i}) \qquad (3)$$

where $*$ denotes the convolution operator, with $g(x, y; \sigma, \rho_{Ch_i}, \theta_{Ch_i})$ being a complex 2D Gabor function which, in complex notation, represents the modulation product of a Gaussian envelope of arbitrary scale $\sigma$ with a sine wave of arbitrary frequency $\omega_0 = (\omega_{x0}, \omega_{y0})$ such that:

$$\frac{\omega_{y0}}{\omega_{x0}} = \tan \theta_{Ch_i} \quad \text{and} \quad \sqrt{\omega_{x0}^2 + \omega_{y0}^2} = \rho_{Ch_i}$$

The impulse response of a 2D Gabor signal can be represented by the equation

$$g(x, y; \sigma, \rho_{Ch_i}, \theta_{Ch_i}) = \exp\left\{ -\frac{x^2 + y^2}{2\sigma^2} \right\} \exp\left\{ j (\omega_{x0} x + \omega_{y0} y) \right\} \qquad (4)$$

A significant scale for describing the shape isolated in the sensor response should produce a filtered image having a higher intensity variation than one filtered at another scale. This suggests that the amplitude variance of the corresponding filtered image might be a suitable candidate as the criterion function for selecting scale estimates. The resulting criterion is given by:

$$F_i(\sigma) = \sum_x \sum_y \left( |J_{Ch_i}(x, y; \sigma)| - \mu \right)^2 \qquad (5)$$

with

$$\mu = \frac{1}{K^2} \sum_x \sum_y |J_{Ch_i}(x, y; \sigma)|$$

and with $|J_{Ch_i}(x, y; \sigma)|$ denoting the amplitude of the filtered image. In summary, in the absence of further information and given an activated sensor $Ch_i$, a scale level $\sigma$ is significant for the description of the spatial structures contained in the sensor response if the focus function $F_i(\sigma)$ reaches a local maximum at $\sigma$.
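The global criterion can be sketched in code as follows. This is a minimal illustration under assumptions that are not in the text: NumPy, circular (FFT-based) convolution in place of the convolution of equation (3), a central frequency `rho` given in radians per pixel, and a user-chosen grid of scales `sigmas`. The Gabor function follows equation (4) as written, without amplitude normalization, and all function names are hypothetical.

```python
import numpy as np

def gabor_response(image, sigma, rho, theta_deg):
    """Equation (3) by circular convolution with the Gabor function of equation (4)."""
    rows, cols = image.shape
    # Integer pixel coordinates wrapped around the origin: 0, 1, ..., -2, -1.
    y = np.fft.fftfreq(rows, d=1.0 / rows)[:, None]
    x = np.fft.fftfreq(cols, d=1.0 / cols)[None, :]
    wx = rho * np.cos(np.radians(theta_deg))
    wy = rho * np.sin(np.radians(theta_deg))
    g = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2)) * np.exp(1j * (wx * x + wy * y))
    return np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(g))

def focus_function(image, sigmas, rho, theta_deg):
    """Equation (5): summed squared deviation of the response amplitude, per scale."""
    values = []
    for sigma in sigmas:
        amplitude = np.abs(gabor_response(image, sigma, rho, theta_deg))
        values.append(np.sum((amplitude - amplitude.mean()) ** 2))
    return np.array(values)

def local_maxima(values):
    """Indices of interior samples strictly larger than both neighbours."""
    v = np.asarray(values)
    return np.flatnonzero((v[1:-1] > v[:-2]) & (v[1:-1] > v[2:])) + 1
```

With these pieces, the candidate global scales of a sensor are `sigmas[local_maxima(focus_function(image, sigmas, rho, theta_deg))]`. Only interior samples of the scale grid can be reported, so the grid should extend beyond the range of scales of interest.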

5 THE LOCAL NATURAL SCALES OF THE IMAGE

The previous approach can be used to extract global scales for representing shapes. However, if the image contains structures of different sizes at distinct spatial locations, this approach will not work. Even in a real-world image that contains regular structures of similar size, structures that closely resemble each other without being entirely identical will yield slight variations in their local natural scales. In this case, the scales should be calculated locally rather than globally.

A simple approach would consist of finding the scales for each image region, isolating a specific intensity change model [1]. In this process, the intensity image is first segmented into a number of homogeneous regions. Second, to find the spatial scales of the regions, we might choose the scales maximizing the difference between the number of zero-crossings at successive scale levels. The problem with this approach is that coarse-scale statistics concerning large structures cannot be effectively calculated from small regions.

To overcome these problems, we develop a local scale detection technique, again based on a representation of the image that involves both spatial and spatial-frequency variables in its description. First, the spatial locations worth considering for the selection of local scales are defined as reasonable candidates for locations where the visual system perceives something of interest. To this end, we use the locations of features contained in the local energy maps of the activated sensors [17,18]. The question is how they can be obtained for a given image, and we deal with it as follows. For each activated sensor $Ch_i$, let $J_{Ch_i}(x, y)$ be the image filtered by a complex 2-D Gabor filter, as given in equation (3):

$$J_{Ch_i}(x, y) = r(x, y) * g(x, y; \sigma_{Ch_i}, \rho_{Ch_i}, \theta_{Ch_i}) \qquad (6)$$

with the filter parameters set to the global scale of $Ch_i$, $\sigma_{Ch_i}$, its central spatial frequency $\rho_{Ch_i}$, and its central orientation $\theta_{Ch_i}$. Then the local energy map from the viewpoint of the sensor $Ch_i$, denoted $Lem_{Ch_i}(x, y)$, is defined by

$$Lem_{Ch_i}(x, y) = |J_{Ch_i}(x, y)|^2 \qquad (7)$$

To model the known properties of the human visual system as closely as possible, the local energy map is computed separately for each activated sensor using the respective complex Gabor filter $g(x, y; \sigma_{Ch_i}, \rho_{Ch_i}, \theta_{Ch_i})$, at the sensor parameters $\sigma_{Ch_i}$, $\rho_{Ch_i}$ and $\theta_{Ch_i}$ (if several significant global scales have been obtained for $Ch_i$, the $Lem_{Ch_i}$ function is calculated separately for each of them). In other words, the original image $r(x, y)$ is filtered by a quadrature pair of filters having the same amplitude spectra. This pair is defined using the sine (odd-symmetric) and cosine (even-symmetric) versions of the same complex filter $g(x, y; \sigma_{Ch_i}, \rho_{Ch_i}, \theta_{Ch_i})$:

$$J^{even}_{Ch_i}(x, y) = r(x, y) * g^{even}(x, y; \sigma_{Ch_i}, \rho_{Ch_i}, \theta_{Ch_i}) \qquad (8)$$

and

$$J^{odd}_{Ch_i}(x, y) = r(x, y) * g^{odd}(x, y; \sigma_{Ch_i}, \rho_{Ch_i}, \theta_{Ch_i}) \qquad (9)$$

where $*$ denotes the convolution operator, with $g^{even}$ and $g^{odd}$ being the even-symmetric and odd-symmetric Gabor filters. To compute the $Lem_{Ch_i}$ map, the outputs of the two filters that make up the pair are squared and summed:

$$Lem_{Ch_i}(x, y) = [J^{even}_{Ch_i}(x, y)]^2 + [J^{odd}_{Ch_i}(x, y)]^2 \qquad (10)$$
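For a real-valued image, equations (7) and (10) give the same map, because the complex response is $J_{Ch_i} = J^{even}_{Ch_i} + j J^{odd}_{Ch_i}$ with both parts real. The sketch below illustrates this numerically. It is only an illustration under stated assumptions: NumPy, circular (FFT-based) convolution, the unnormalized Gabor function of equation (4), and a central frequency `rho` in radians per pixel. The function names are hypothetical.

```python
import numpy as np

def gabor_kernel(shape, sigma, rho, theta_deg):
    """Complex Gabor function of equation (4), sampled on wrapped pixel coordinates."""
    rows, cols = shape
    y = np.fft.fftfreq(rows, d=1.0 / rows)[:, None]
    x = np.fft.fftfreq(cols, d=1.0 / cols)[None, :]
    wx = rho * np.cos(np.radians(theta_deg))
    wy = rho * np.sin(np.radians(theta_deg))
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.exp(1j * (wx * x + wy * y))

def circular_convolve(image, kernel):
    """Circular convolution through the DFT (an assumption about boundary handling)."""
    return np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(kernel))

def local_energy_map(image, sigma, rho, theta_deg):
    """Equation (10): squared outputs of the even and odd filters, summed."""
    g = gabor_kernel(image.shape, sigma, rho, theta_deg)
    j_even = circular_convolve(image, g.real).real   # cosine (even-symmetric) filter
    j_odd = circular_convolve(image, g.imag).real    # sine (odd-symmetric) filter
    return j_even ** 2 + j_odd ** 2
```

Comparing `local_energy_map(image, ...)` with `np.abs(circular_convolve(image, gabor_kernel(...))) ** 2` on any real image gives agreement up to rounding error, which is the equivalence between equations (7) and (10).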

Consequently, the local energy map $Lem_{Ch_i}$ provides an image representation in the space spanned by the two functions $J^{even}_{Ch_i}$ and $J^{odd}_{Ch_i}$. Hence, the detection of peaks on the $Lem_{Ch_i}$ map can be used as a detector of the locations of significant features from the viewpoint of $Ch_i$.

Finally, to calculate the local natural scales at the locations $(x^*, y^*)$ of local maxima in the $Lem_{Ch_i}$ map, the global criterion for scale selection is reformulated as follows: the natural scale $\sigma^*$ of the spatial structure with frequency components on the activated sensor $Ch_i$ at location $(x^*, y^*)$ should produce a filtered image at $\sigma^*$ with higher local intensity variation over the neighborhood $W(x^*, y^*)$ of $(x^*, y^*)$ than images filtered at any other scale. This suggests that the local amplitude variance of the image filtered over scales might be a suitable candidate as a criterion function for selecting local scales. The local variance is defined as

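$$F_i(\sigma; x^*, y^*) = \sum_{(x, y) \in W(x^*, y^*)} \left( |J_{Ch_i}(x, y; \sigma)| - \mu \right)^2 \qquad (11)$$

where

$$\mu = \frac{1}{\mathrm{Card}[W(x^*, y^*)]} \sum_{(x, y) \in W(x^*, y^*)} |J_{Ch_i}(x, y; \sigma)|$$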

and $J_{Ch_i}(x, y; \sigma)$ is as given in equation (3). The neighborhood $W(x^*, y^*)$ is defined as the set of pixels contained in a disk of radius $r$ centered at $(x^*, y^*)$, with $r = d[(x_m, y_m), (x^*, y^*)]$, where $(x_m, y_m)$ is the local minimum of the energy map $Lem_{Ch_i}$ nearest to $(x^*, y^*)$, and $d[\cdot, \cdot]$ is the Euclidean distance. Since the nearest local minimum to $(x^*, y^*)$ on the local energy map marks the beginning of another potential structure, this choice of neighborhood avoids interference from such a structure when the local intensity variation is calculated.

To sum up, in the absence of further information and given an activated sensor $Ch_i$, a scale level $\sigma^*$ is significant for representing the spatial structure at location $(x^*, y^*)$ with frequency components on the sensor if the function $F_i(\sigma; x^*, y^*)$ reaches a local maximum at $\sigma^*$.
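The local criterion can be sketched in code as follows. This is a minimal illustration with several simplifying assumptions that are not part of the method as described: NumPy, a precomputed stack `amplitudes` holding the response amplitude at each scale of a grid `sigmas`, local minima of the energy map detected with a 4-neighbour test (the experiments below mark extrema along one orientation instead), and the largest value of equation (11) on the scale grid taken as the selected scale. All names are hypothetical.

```python
import numpy as np

def local_minima(energy):
    """Boolean map of pixels strictly below their four neighbours (borders excluded)."""
    e = np.pad(energy, 1, mode="edge")
    c = e[1:-1, 1:-1]
    return (c < e[:-2, 1:-1]) & (c < e[2:, 1:-1]) & (c < e[1:-1, :-2]) & (c < e[1:-1, 2:])

def disk_mask(shape, center, radius):
    """Pixels within Euclidean distance `radius` of `center` = (row, col)."""
    yy, xx = np.ogrid[:shape[0], :shape[1]]
    return (yy - center[0]) ** 2 + (xx - center[1]) ** 2 <= radius ** 2

def local_focus(amplitudes, center, radius):
    """Equation (11) for every scale; `amplitudes` has shape (n_scales, rows, cols)."""
    mask = disk_mask(amplitudes.shape[1:], center, radius)
    values = amplitudes[:, mask]                       # (n_scales, n_pixels_in_disk)
    mean = values.mean(axis=1, keepdims=True)
    return np.sum((values - mean) ** 2, axis=1)

def local_scale(amplitudes, sigmas, energy, center):
    """Scale with the largest local focus value at an energy-map maximum `center`."""
    ys, xs = np.nonzero(local_minima(energy))
    if ys.size == 0:
        raise ValueError("energy map has no interior local minimum")
    # Radius of the neighbourhood: distance to the nearest local minimum.
    radius = np.hypot(ys - center[0], xs - center[1]).min()
    focus = local_focus(amplitudes, center, radius)
    return sigmas[int(np.argmax(focus))], focus
```

Here `center` would be one of the local maxima of the energy map, and the function would be called once per maximum. Taking the largest value on the grid is a shortcut; a version closer to the criterion would report every interior local maximum of `focus` over the scales.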

6 EXPERIMENTAL RESULTS

To examine the performance of the proposed approach, we used two example images with 256 gray levels: the first image, shown in Fig. 1(a), is a natural scene (128 × 128) containing a target; the second is a biomedical image (256 × 256), shown in Fig. 2(a). Both the software implementing the scale selection and the test data are available at the URL http://decsai.ugr.es/diata/software.html, or by anonymous ftp from decsai.ugr.es, in the tar files pub/diata/software/slocalimages.tar.gz and pub/diata/software/slocal-source.tar.gz. We first describe the results of the global scale selection on these images. Next, the results of the local scale selection are illustrated on the same images.

In Fig. 1, the target image and its multisensor scheme are shown in (a) and (b), respectively. Only the activated sensors are presented in (c). The image in (a) may be compared with the optimal reconstruction in (d), which is obtained by linear combination of the activated sensor responses (see [10]). For each activated sensor $Ch_i$, its `focus' function across scales, plotted in (e), measures the sharpness of an enhanced image obtained through filtering with the corresponding 2-D Gabor filter at the scale level $\sigma$, spatial frequency $\rho_{Ch_i}$, and orientation $\theta_{Ch_i}$.

The results of the global scale selection for a biomedical image are fully described in Fig. 2. Only the activated sensors of the multisensor scheme are presented in (c). The optimal reconstruction in (d) is obtained by using the activated sensor responses. It may be compared with the original image given in (a). In this example, the global scale selection is illustrated in (e), where the activated sensors and their focus functions are plotted.

Next, Figs. 3-6 illustrate the performance of the local scale selection process on the same images. To compute the local energy map of each activated sensor $Ch_i$, the input image is first convolved with the 2D Gabor filter $g(x, y; \sigma_{Ch_i}, \rho_{Ch_i}, \theta_{Ch_i})$, with $\sigma_{Ch_i}$, $\rho_{Ch_i}$ and $\theta_{Ch_i}$ being the global scale, central frequency and orientation of the activated sensor under analysis. Then, the local energy map $Lem_{Ch_i}$ is obtained as the squared amplitude of the filtered image. Figs. 3(a.2), 3(b.2) and 3(c.2) illustrate the energy maps for three different activated sensors of the biomedical image, which are shown in 3(a.1), 3(b.1) and 3(c.1). Next, to derive the spatial positions at which features are located, the local maxima of $Lem_{Ch_i}$ along the orientation $\theta_{Ch_i} + 90°$ are marked (note that the spectra have to be rotated 90° so that vertical orientation on the spectrum corresponds to horizontal features in the spatial-domain image). Three examples of local maxima maps are shown in Figs. 3(a.3), 3(b.3) and 3(c.3). They correspond to the energy maps in (a.2), (b.2) and (c.2), respectively. Finally, to produce local scales at the locations $(x^*, y^*)$ of local maxima, the scale at which the resulting band-pass filtered image has the highest local intensity variation over the neighborhood of $(x^*, y^*)$ is computed. For the sensors in Fig. 3, the neighborhoods of the local maxima points, shown in Figs. 3(a.5), 3(b.5) and 3(c.5), are extracted using the local minima maps along the orientation $\theta_{Ch_i} + 90°$ (given in Figs. 3(a.4), 3(b.4) and 3(c.4), respectively), as described in the previous section.

Figs. 4, 5 and 6 illustrate how, for each activated sensor of the target image in Fig. 1(a), the local scale changes across the interesting points of the sensor response. Each activated sensor and its energy map are shown in (a) and (b). A blend of the energy map and the corresponding local maxima map is given in (c). The neighborhoods of the local maxima locations are shown in (d). These figures also give 3-D views of the computed local scales. They can be used to compare the global scale (shown as the black cones on the 3-D view sidelines) with the local scales, so that we may appreciate the magnitude of the error made by estimating a particular scale as the global scale instead of the local one.

In summary, these results provide a representation of the image content according to orientation, spatial frequency, scale and visual attention. Note that only relevant information is taken into account to explain the image: although the target image shows a complex scene (it contains a mixture of various types of relevant features, unwanted detail and noise), this representation tends to remove noise and unwanted detail drastically, as shown in Figs. 4, 5 and 6.

7 CONCLUSION

To extract estimates of natural scales in 2D images in the absence of further information, three key points arise: (i) the scales of interest should be defined as the natural scales of the shapes isolated in the responses of activated units, with the activated units being sensitive to relevant shapes; (ii) the scale selection should be accomplished using a representation of the image which involves both spatial and spatial-frequency variables in its description, and by means of an objective criterion that measures the sharpness of this representation across scales; (iii) if the image isolates different sizes of detail at different locations, this objective criterion should be reformulated in a local fashion, and the spatial locations worth considering for the calculation of local scales may be those of the local maxima of the energy map of each strongly responding unit.

Acknowledgments. We wish to thank Lex Toet for the target image in Fig. 1(a).

John G. Daugman, Eric Bardinet, and some referees read the manuscript and suggested several good ways to improve it, which we managed to implement only in part. This research was sponsored by the Spanish Board for Science and Technology (CICYT) under grant TIC97-1150.

References

[1] J.A. García, J. Fdez-Valdivia, A. Garrido. "A scale-vector approach for edge detection." Pattern Recognition Letters, Vol. 16, pp. 637-646, (1995).
[2] Graham, Norma. Visual Pattern Analyzers, Oxford Psychology Series, No. 16, Oxford University Press, (1989).
[3] Marcelja S., "Mathematical description of the response of the simple cortical cells", J. Opt. Soc. Am., Vol. 70, pp. 1297-1300, (1980).
[4] Daugman J.G., "Spatial visual channels in the Fourier plane", Vision Research, Vol. 24, pp. 891-910, (1984).
[5] Daugman J.G., "Uncertainty relation for resolution in space, spatial frequency and orientation optimized by two-dimensional visual cortical filters", J. Opt. Soc. Am. A, Vol. 2, pp. 1160-1169, (1985).
[6] Jones J.P., Palmer L.A., "An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex", J. Neurophysiol., Vol. 58, pp. 1233-1258, (1987).
[7] Watson, A.B. "Efficiency of a model human image code." J. Opt. Soc. Am. A, Vol. 4, pp. 2401-2417, (1987).
[8] Navarro, R., Tabernero, A., and Cristobal, G. "Image representation with Gabor wavelets and its applications." Advances in imaging and electron physics, Peter W. Hawkes, editor, Academic Press Inc. Orlando, FL, (1995).
[9] De Valois, R.L., Yund E.W., and Hepler, N. "The orientation and direction selectivity of cells in macaque visual cortex." Vision Research, Vol. 22, pp. 531-544, (1982).
[10] Martinez-Baena, J., Fdez-Valdivia, J., Garcia, J.A., and Fdez-Vidal, X.R. "A new image distortion measure based on a data-driven multisensor organization." Pattern Recognition, To appear (1998).
[11] D'Astous, F., Textural Feature Extraction in the Spatial Frequency Domain. Ph.D. thesis, Dept. of Systems Design Engineering, University of Waterloo, Ontario, Canada, (1983).
[12] Jernigan, M.E., and F. D'Astous, "Entropy-based texture analysis in the spatial frequency domain". IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 6, pp. 237-243, (1984).
[13] Rao, C.R., Linear Statistical Inference and Its Applications. Wiley, New York, (1973).
[14] Jain, A.K., and Dubes, R.C. Algorithms for clustering data. Prentice-Hall, Englewood Cliffs, NJ, (1988).
[15] Selim, S.Z., and Ismail, M.A. "K-means-type algorithms: a generalized convergence theorem and characterization of local optimality." IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 6, pp. 81-87, (1984).
[16] Florack, L.M.J., ter Haar Romeny, B.M., Koenderink, J.J., and Viergever, M.A. "Scale and the differential structure of images." Image Vision Comp., Vol. 10, pp. 376-388, (1991).
[17] Morrone, M.C. and D. C. Burr. "Feature detection in human vision: A phase-dependent energy model." Proc. R. Soc. Lond. B, Vol. 235, pp. 221-245, (1988).
[18] Fdez-Vidal, X.R., Garcia, J.A., Fdez-Valdivia, J. and Garrido, A. "Using models of feature perception in distortion measure guidance", Pattern Recognition Letters, To appear (1998).

Figure 1.- (A) The target image. (B) The multisensor scheme. (C) The subset of activated sensors from the multisensor scheme. (D) The optimal reconstruction by linear combination of the activated sensors' responses. (E) The `focus' functions used for extracting the global scale of each activated sensor.

Figure 2.- (A) The biomedical image. (B) Its multisensor scheme. (C) The subset of activated sensors. (D) The optimal reconstruction. (E) The `focus' functions used for extracting the global scale of each activated sensor.

Figure 3.- (A.1), (B.1) and (C.1) Three different activated sensors of the biomedical image. (A.2), (B.2) and (C.2) The corresponding local energy maps. (A.3), (B.3) and (C.3) The local maxima maps. (A.4), (B.4) and (C.4) The local minima maps. (A.5), (B.5) and (C.5) The regions of influence.

Figure 4.- (A) The activated sensor under analysis. (B) Its energy map. (C) A blend of the local maxima and energy maps. (D) The neighborhoods of local maxima points. (E) The 3-D view of the local scales obtained at locations of local maxima on the energy map (with the sensor's global scale shown as black cones on the sidelines).

Figure 5.- (A) The activated sensor under analysis. (B) Its energy map. (C) A blend of the local maxima and energy maps. (D) The neighborhoods of local maxima points. (E) The 3-D view of the local scales obtained at locations of local maxima on the energy map (with the sensor's global scale shown as black cones on the sidelines).

Figure 6.- (A) The activated sensor under analysis. (B) Its energy map. (C) A blend of the local maxima and energy maps. (D) The neighborhoods of local maxima points. (E) The 3-D view of the local scales obtained at locations of local maxima on the energy map (with the sensor's global scale shown as black cones on the sidelines).

[Figure 1: panels (A)-(D) are images; panel (E) contains three focus-function plots annotated "Global scale = 1.4", "Global scale = 4.3" and "Global scale = 3.3", with horizontal-axis ticks from 0 to 20.]

[Figure 2: panels (A)-(D) are images; panel (E) contains twelve focus-function plots annotated with global scales 4.8, 6.8, 1.7, 2.8, 8.3, 6.8, 8.3, 3.3, 2.8, 2, 2.8 and 4.3 (in order of appearance), with horizontal-axis ticks from 0 to 20.]

[Figure 3: panels (A.1)-(C.1), (A.2)-(C.2), (A.3)-(C.3), (A.4)-(C.4) and (A.5)-(C.5).]

[Figure 4: panels (A)-(E); legend in (E): "Magnitude of Global Scale (1.4)", "Magnitude of Local Scale".]

[Figure 5: panels (A)-(E); legend in (E): "Magnitude of Global Scale (4.3)", "Magnitude of Local Scale".]

[Figure 6: panels (A)-(E); legend in (E): "Magnitude of Global Scale (3.3)", "Magnitude of Local Scale".]