A MULTI-CHANNEL BASED APPROACH FOR EXTRACTING SIGNIFICANT SCALES ON GRAY-LEVEL IMAGES

M. García-Silvente, J. Fdez-Valdivia and J. A. García
Departamento de Ciencias de la Computación e I.A., E.T.S. de Ingeniería Informática, Universidad de Granada, 18071 Granada, Spain.
Email: fmgs,jfv,[email protected] URL: http://pirata.ugr.es/vision/ Fax: +34.58.243317

Abstract

This paper presents the construction of a novel representation of gray-level shape called the scale-spectrum space, which makes explicit both the spatial frequency channels of specific importance (in terms of the spectrum information they isolate) and the significant scale levels from the viewpoint of these spectrum bands. In a scale-space representation, where the gray-level shape generally comprises multiple structures at different levels of scale, it is often not possible to obtain an image in which all the structures are described at their best scale levels: if one structure is well enhanced, the others appear blurred. At best, some form of compromise among the structures at different scale levels may be sought. To overcome this problem, we present an efficient multi-channel scheme which may be employed to automatically describe each gray-level structure at its most suitable level of smoothing. The ability to decompose a complex problem (that of where to look as well as how to concentrate on certain features in the input data) into simpler subproblems is a major motivation for using the proposed scheme. This representation allows data-driven detection of those spectrum bands, and of the evolution of scale levels from the viewpoint of such domains, and it is not an effect of externally chosen criteria or tuning parameters. The result is a multichannel organization that is selectively sensitive to spatial frequency and size, biologically inspired by the behavior of visual cortex neurones as well as retinal cells.

Keywords and phrases: Gray-level shape representation, spatial frequency channel, scale selection, spectrum indexes, edge detection.

This work has been supported by the Spanish Board for Science and Technology (DGICYT) under grant PB94-0751.

1 Introduction

Numerous physiological measurements (Young[24]) support the theory that the receptive field profiles found in mammalian front-end visual systems may be modelled by Gaussian filters of various widths or their partial derivatives (Koenderink[15]). The human visual system is capable of zooming in on the right range of scale: for instance, it accomplishes the task of locating the heart on a cardioscintigram by blurring the image, while it decreases the scale in order to analyze the heart's finer structure (Hay and Chesters[10]). In addition, the theory of a substratum of spatial frequency channels has emerged from a wide variety of observations, always with the idea lurking in the background that a process resembling a coarse Fourier analysis might subserve our internal representation of the spatial visual world (Kabrisky[13]; Blakemore and Campbell[2]; Ginsburg[9]; Maffei and Fiorentini[18]; Pollen and Taylor[21]). Early support for the spatial-frequency channels concept arose from psychophysical as well as neurophysiological experiments. On the one hand, spatial frequency specificity was revealed by masking and adaptation tests (Blakemore and Campbell[2]); in addition, Campbell and Robson[4] pointed out that detection and discrimination of one-dimensional periodic luminance patterns may be predicted from the contrast thresholds of the individual Fourier components in their waveforms. On the other hand, many important papers (Campbell et al.[3]; Maffei and Fiorentini[18]; Ikeda and Wright[11]) have demonstrated that neurophysiological recordings from single cells in the visual cortex of mammals show bandpass spatial frequency tuning properties similar to those observed psychophysically.
As noted above, in many applications the image structures of interest occur at a variety of spatial scales. In order to detect them, multiscale analysis was proposed by Rosenfeld and Thurston[22] and Witkin[23], and further developed by Koenderink[14], Babaud et al.[1], Yuille and Poggio[25] and Lindeberg[16], among others. By applying operators of different neighborhood size to an image, multiscale analysis recovers information at a range of scales. But surfaces in nature usually have a hierarchical organization that consists of a small number of levels (Marr[19]). For instance, at the finest level, a tree is composed of leaves with a very complex structure of veins. At the next level, each leaf is replaced by a single region, and at the highest level there is a single blob corresponding to the treetop. There is a natural range of resolutions corresponding to each of these levels of description. Naturally, this leads to a situation in which one thing may represent several different objects at different times, when its image is processed for different reasons: a tree may give rise to several different psychological objects, depending on whether it is to be avoided, climbed or felled, for example. The problem is that, without prior knowledge about the desired features, explicitly selecting significant scales for images is difficult, as is combining all the information from multiple scales. Consequently, any visual system handling real-world image data requires a basic tool for scale selection capable of extracting only the significant scales at which different structures occur, without knowing either which kind of structure is being analyzed or where such a structure is located. An ideal tool should be computationally efficient and should represent the image detail without requiring user-set parameters or prior knowledge about the nature of the image. This work is intended to provide such a tool.
Here, the behavior of the image spectrum over scales is analyzed from the viewpoint of different spectrum domains of interest. To this end, a novel representation of gray-level structures is proposed: the scale-spectrum space, which makes explicit the meaningful amount of deviation in spectrum information between images smoothed at successive scales across a range of separate scale levels, from the viewpoint of the different spectrum parcellations of interest (in order to analyze the given data). The scale-spectrum space may easily be computed using two spectrum indexes: the first, the antialiasing index, indicates the relative amount of spectrum folded back into each spectrum domain; the second, the filtering index, measures the amount of deviation in each spectrum parcellation of interest between two smoothed images (or regions) at successive scales across a discrete range.

The layout of this paper is as follows. In Section 2, the concepts needed for constructing the proposed representation are introduced, and the two spectrum indexes are described. In Section 3, the formulation is given for extracting the spatial frequency channels used in analyzing the data. In Section 4, the scale-spectrum space is presented, together with how it may be used to determine only the significant scales for representing the gray-level shape. In Section 5, the proposed representation is analyzed through a set of experiments. Finally, the main conclusions of the paper are summarized.

2 The role of spectrum indexes in constructing the scale-spectrum space

Given a gray-level image (or a region in a segmented image)

$$ f(x, y), \qquad 0 \le x, y \le K \equiv 2^n - 1 \qquad (1) $$

in order to construct the scale-spectrum space for $f(x, y)$, which allows only the significant scales for gray-level structure to be extracted, we use two spectral indexes: the antialiasing index and the filtering index.

2.1 The antialiasing index

Let the Fourier transform of the given input (image or region) $f(x, y)$ smoothed with a Gaussian at scale $\sigma$ be

$$ F_\sigma(\omega_1, \omega_2), \qquad -\Omega < \omega_1, \omega_2 < \Omega \qquad (2) $$

where $\Omega = 2^{n-1}$. Since $f(x, y)$ is real, its Fourier transform has conjugate symmetry in $\omega_1, \omega_2$.

Given a spatial frequency channel $(0, \omega_{sup})$, with $\omega_{sup}$ in the discrete range $(0, \Omega)$, the antialiasing index, denoted $\alpha(\omega_{sup})$, indicates the relative amount of spectrum folded back into $(0, \omega_{sup})$, i.e., into the corresponding 2-D spectrum domain, denoted $\mathcal{D}_{sup}$, whose highest radial frequency is $\omega_{sup}$, after the sampling rate reduction from $\Omega$ to $\omega_{sup}$:

$$ \alpha(\omega_{sup}) = \frac{\iint_{\mathcal{D}_{sup}} |F_{\sigma=1}(\omega_1, \omega_2)| \, d\omega_2 \, d\omega_1}{\iint_{\mathcal{D}} |F_{\sigma=1}(\omega_1, \omega_2)| \, d\omega_2 \, d\omega_1} \qquad (3) $$

where $|F_{\sigma=1}(\omega_1, \omega_2)|$ denotes the Fourier transform magnitude of the given input smoothed at scale $\sigma = 1$, $\mathcal{D}_{sup}$ indicates the 2-D spectrum domain with the highest radial frequency given by $\omega_{sup}$, and $\mathcal{D}$ denotes the entire 2-D frequency spectrum. The smoothing at scale $\sigma = 1$ is applied in order to suppress very high frequencies arising from quantization noise: one cannot expect things to be geometrically correct at the limiting lower scale boundary, where there is a lot of spurious detail.

In order to construct the scale-spectrum space, the deviation in spectrum content between two smoothed versions of the input at successive scales, across a separate range of smoothing levels, must be calculated from the viewpoint of a spatial frequency domain of particular importance $(\omega_{inf}, \omega_{sup})$. But what is a spatial frequency domain of specific importance? Since the spectrum information of the data is generally located at a certain number of frequencies, there will be domains $(\omega_{inf}, \omega_{sup})$ in $(0, \Omega)$ made up of consecutive spatial frequency components that closely resemble one another, without being identical, in their importance for describing the data's spectrum content. Hence, in order to simplify later-stage processing, the entire domain $(0, \Omega)$ may be segmented into a number of sequences $(\omega_{inf}, \omega_{sup})$ such that spatial frequencies from different domains are not alike with respect to their relevance for explaining the data's spectrum information. As a matter of fact, when $\omega$ decreases from $\Omega$ to 0, different frequency components of the data are removed, and so there are values of $\omega$ marking a change in the rate of spectrum attenuation. Such values of $\omega$ may be used to derive a natural partitioning of $(0, \Omega)$ into a number of domains which are alike in the importance of the respective spatial frequency components isolated. Of course, the problem is how to identify a value of $\omega$ that marks a change in the rate of spectrum attenuation as $\omega$ decreases. To answer this question, the antialiasing index is used to analyze how the spectrum distortion due to aliasing evolves over the different spectrum domains.
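The antialiasing index of equation (3) can be sketched numerically as the share of the smoothed spectrum magnitude that lies inside a given radial frequency. The following is a minimal NumPy sketch, not the authors' implementation. The function names are hypothetical, and several choices are assumptions made for illustration: the Gaussian smoothing is applied in the frequency domain, the 2-D spectrum domain is taken to be a disc of radius `w_sup` (in cycles per image), and the DC bin is excluded so that the index tends to 0 for small `w_sup`.

```python
import numpy as np


def smoothed_magnitude(image, sigma):
    """Return |F_sigma|: DFT magnitude of `image` after Gaussian smoothing.

    The smoothing is done in the frequency domain by multiplying with the
    transfer function of a Gaussian of standard deviation `sigma` (pixels).
    """
    rows, cols = image.shape
    w1 = 2.0 * np.pi * np.fft.fftfreq(rows)[:, None]  # radians per sample
    w2 = 2.0 * np.pi * np.fft.fftfreq(cols)[None, :]
    gaussian = np.exp(-0.5 * sigma**2 * (w1**2 + w2**2))
    return np.abs(np.fft.fft2(image) * gaussian)


def radial_frequency(shape):
    """Radial frequency of every DFT bin, in cycles per image."""
    rows, cols = shape
    k1 = np.fft.fftfreq(rows) * rows
    k2 = np.fft.fftfreq(cols) * cols
    return np.hypot(k1[:, None], k2[None, :])


def antialiasing_index(image, w_sup, sigma=1.0):
    """Share of the smoothed spectrum magnitude with radial frequency <= w_sup.

    Assumption: the DC bin is left out of both sums, and the domain is a
    disc, so the value 1 is reached once the disc covers the corner bins.
    """
    magnitude = smoothed_magnitude(image, sigma)
    radius = radial_frequency(image.shape)
    non_dc = radius > 0
    inside = non_dc & (radius <= w_sup)
    return float(magnitude[inside].sum() / magnitude[non_dc].sum())


# Usage: sample the index at integer radial frequencies of a test image.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
curve = [antialiasing_index(image, w) for w in range(1, 17)]
```

Sampling the index on a grid of radial frequencies, as in the usage lines, gives the discrete curve whose changes in growth rate are used to delimit the channels.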

2.2 The filtering index

A heuristic principle for scale selection in each spatial frequency channel is proposed based on one physical observable, namely the distortion between the spectrum information of the input at successive smoothing levels as a function of scale. Because we are dealing with a physical entity, a function relating physical observables must be independent of the choice of dimensional units if a physical relation is to be expressed in a unit-free form (Florack et al.[7]). Henceforth, an alternative formulation of the multiscale representation is used by considering normalized coordinates, $x/\sigma$ and $y/\sigma$, the motivation for introducing such dimensionless coordinates being scale invariance (Lindeberg and Gårding[17]).

Given a spatial frequency channel $(\omega_{inf}, \omega_{sup})$, with both $\omega_{inf}$ and $\omega_{sup}$ in the range $(0, \Omega)$, the filtering index, denoted $\phi(\omega_{inf}, \omega_{sup}, \sigma)$, indicates the amount of distortion between the Fourier transform magnitude of the normalized input at scale $\sigma$ and that of the normalized input smoothed at the following scale $\sigma + \Delta\sigma$, from the viewpoint of such a domain of particular interest:

$$ \phi(\omega_{inf}, \omega_{sup}, \sigma) = \iint_{\mathcal{D}_{(\omega_{inf}, \omega_{sup})}} \left( |F_\sigma(\omega_1, \omega_2)| - |F_{\sigma+\Delta\sigma}(\omega_1, \omega_2)| \right)^2 d\omega_2 \, d\omega_1 \qquad (4) $$

where $|F_\sigma(\omega_1, \omega_2)|$ denotes the Fourier transform magnitude of the normalized input smoothed at scale $\sigma$, and the integration domain $\mathcal{D}_{(\omega_{inf}, \omega_{sup})}$ corresponds to the vertical and horizontal frequencies in the 2-D spectrum domain associated with the spatial frequency channel under investigation.

The filtering index $\phi$ can be used to measure the amount of deviation between the spectrum content of the normalized input at consecutive scales, over a range of scale levels. When such a dissimilarity measure is plotted against the scale level, a sharp growth in the deviation plot indicates that relatively important spectrum information is being removed. Furthermore, by changing the spatial frequency channel, i.e., the corresponding spectrum domain, the resulting measurement is adapted to structures of other sizes. The value of the significant scale $\sigma$ at which to analyze information concerning a spectrum domain of particular importance corresponds to the level at which the value of the filtering index is meaningfully larger than those at the neighboring scale levels. This rule for determining the significant scales is objective, in the sense that it assumes no threshold, and therefore it leads to an objective selection of $\sigma$. The rule considered here also makes the problem of selecting scale levels well defined.
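The filtering index of equation (4), and the rule of keeping the scales where it exceeds its neighbors, can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions rather than the paper's implementation: the function names and the `step` parameter (the increment between successive scales) are hypothetical, the band is taken as a ring of radial frequencies in cycles per image, and the normalization to dimensionless coordinates described above is left out for brevity.

```python
import numpy as np


def smoothed_magnitude(image, sigma):
    """DFT magnitude of `image` after Gaussian smoothing at scale `sigma`."""
    rows, cols = image.shape
    w1 = 2.0 * np.pi * np.fft.fftfreq(rows)[:, None]  # radians per sample
    w2 = 2.0 * np.pi * np.fft.fftfreq(cols)[None, :]
    gaussian = np.exp(-0.5 * sigma**2 * (w1**2 + w2**2))
    return np.abs(np.fft.fft2(image) * gaussian)


def filtering_index(image, w_inf, w_sup, sigma, step):
    """Sum over the band (w_inf, w_sup] of (|F_sigma| - |F_{sigma+step}|)^2.

    Assumption: the band is a ring of radial frequencies in cycles per image.
    """
    rows, cols = image.shape
    k1 = np.fft.fftfreq(rows) * rows
    k2 = np.fft.fftfreq(cols) * cols
    radius = np.hypot(k1[:, None], k2[None, :])
    band = (radius > w_inf) & (radius <= w_sup)
    difference = smoothed_magnitude(image, sigma) - smoothed_magnitude(
        image, sigma + step
    )
    return float(np.sum(difference[band] ** 2))


def significant_scales(image, w_inf, w_sup, scales, step):
    """Scales whose filtering index is larger than at both neighboring scales."""
    values = [filtering_index(image, w_inf, w_sup, s, step) for s in scales]
    return [
        float(scales[k])
        for k in range(1, len(values) - 1)
        if values[k] > values[k - 1] and values[k] > values[k + 1]
    ]
```

Only interior scales are tested, so the first and last entries of `scales` can never be reported; a wider scale range is needed if a peak is suspected at either end.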

3 Extracting spatial frequency channels of interest for the given input

The antialiasing index $\alpha(\omega_{sup})$ defined in equation (3) may be considered as a function of the radial frequency parameter $\omega_{sup}$, with $\omega_{sup}$ determining the highest radial frequency for the domain under investigation:

$$ \alpha : (0, \Omega) \longrightarrow (0, 1), \qquad \omega_{sup} \longmapsto \alpha(\omega_{sup}) \qquad (5) $$

where $\alpha(\omega_{sup}) \uparrow 1$ if $\omega_{sup} \uparrow \Omega$, and $\alpha(\omega_{sup}) \downarrow 0$ if $\omega_{sup} \downarrow 0$. The function $\alpha(\omega_{sup})$ is monotonically increasing in $\omega_{sup}$. It undergoes a significant rise at locations $\omega_{sup}$ where the corresponding 2-D spectrum domain incorporates important frequency components of the spectrum information of the given input. Such a rise will be more or less abrupt, depending on the relative importance (for explaining the data spectrum) of the frequency components being added. In order to detect the locations $\omega_{sup}$ at which $\alpha$ undergoes a change in its rate of increase, a technique based on the extrema of the second derivative, $\alpha''$, is used. The extrema of the second derivative, both maxima and minima, correspond to positions marking a change in the rate of increase of $\alpha$, and so they divide the entire channel $(0, \Omega)$ into a number of sequences which are alike in the importance of the respective frequency components isolated. Of course, the crucial question is how to compute the second derivative robustly. One possibility is to use the technique described in (Mokhtarian and Mackworth[20]) or a simpler approximation (Jain[12]).

3.1 The spectrum domains of interest

In any case, once the function $\alpha''(\omega)$ has been calculated, the locations producing a change in the rate of increase of $\alpha$ may be detected as the positions of the extrema of the second derivative $\alpha''$. Hence, we are now ready to define the concept of a spatial frequency channel of interest by using this information from the antialiasing index.

Definition 1. Let the positions of the second-derivative extrema for the filtered antialiasing function $\alpha(\omega)$ be

$$ \omega_1 < \omega_2 < \cdots < \omega_n $$

which correspond to the locations of the changes in the rate of increase of $\alpha$. A principle is proposed stating that the spatial frequency channel $(\omega_{i-1}, \omega_i)$, for each $i$, determines a spectrum domain of particular importance concerning the given input $f(x, y)$, with $\omega_{i-1}, \omega_i$ denoting the radial extrema frequencies for the domain, so that when $\omega_i$ decreases, meaningful frequency components for studying the spectrum information of the input are removed. The value $\omega_0$ corresponds to the first location $\omega$ at which $\alpha(\omega)$ undergoes a meaningful rise in its value.
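The channel extraction described above (smooth the sampled antialiasing curve, take its second derivative, and use the extrema as channel boundaries) might be sketched as below. This is an assumption-laden illustration, not the technique of Mokhtarian and Mackworth[20] or Jain[12]: a plain Gaussian pre-smoothing followed by a central second difference is swapped in, the function names are hypothetical, and the `rel_tol` guard that discards numerically negligible extrema is an added convenience that the paper does not describe. The returned positions are array indices, which equal radial frequencies only if the curve is sampled at integer frequencies starting from 0.

```python
import numpy as np


def second_derivative(values, smooth_sigma=1.0):
    """Gaussian-smooth a sampled curve, then take its central second difference."""
    values = np.asarray(values, dtype=float)
    half = int(np.ceil(3.0 * smooth_sigma))
    taps = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (taps / smooth_sigma) ** 2)
    kernel /= kernel.sum()
    # Edge padding keeps the output the same length as the input.
    padded = np.pad(values, half, mode="edge")
    smoothed = np.convolve(padded, kernel, mode="valid")
    d2 = np.zeros_like(smoothed)
    d2[1:-1] = smoothed[2:] - 2.0 * smoothed[1:-1] + smoothed[:-2]
    return d2


def extrema_positions(d2, rel_tol=0.05):
    """Indices of local maxima and minima of d2 that are not negligible.

    Assumption: extrema smaller than rel_tol * max|d2| are treated as
    numerical noise and skipped; the two samples at each end are ignored.
    """
    floor = rel_tol * np.max(np.abs(d2))
    positions = []
    for k in range(2, len(d2) - 2):
        is_max = d2[k] > d2[k - 1] and d2[k] > d2[k + 1]
        is_min = d2[k] < d2[k - 1] and d2[k] < d2[k + 1]
        if (is_max or is_min) and abs(d2[k]) >= floor:
            positions.append(k)
    return positions


def channels_from_extrema(w0, positions):
    """Consecutive pairs (w_{i-1}, w_i) built from w0 and the extrema positions."""
    bounds = [w0] + list(positions)
    return list(zip(bounds[:-1], bounds[1:]))
```

The starting frequency `w0` is passed in explicitly because the text defines it separately, as the first location where the antialiasing function rises meaningfully.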

4 Detecting significant scales for the given input

Once spectrum domains of specific importance have been derived, the filtering index $\phi$, defined for each spectrum domain of particular importance, is the tool for analyzing the input $f(x, y)$:

$$ \phi(i, \sigma) = \phi(\omega_{i-1}, \omega_i, \sigma) = \iint_{\mathcal{D}_{(\omega_{i-1}, \omega_i)}} \left( |F_\sigma(\omega_1, \omega_2)| - |F_{\sigma+\Delta\sigma}(\omega_1, \omega_2)| \right)^2 d\omega_1 \, d\omega_2 $$

with $\omega_{i-1}$ and $\omega_i$ determining the radial extrema frequencies corresponding to the spectrum band under investigation.

The task of selecting only the significant scales for processing the given input, with no prior information about it, is intractable as a purely mathematical question. Hence, a systematic method must be developed for generating initial hypotheses (which may not be optimal in a strict sense). In order to deal with the problem of selecting the significant scales at which the given input must be described, the proposed approach introduces a principle about what an interesting scale level is: the scales of interest are those at which the amount of distortion in a spectrum band of specific importance assumes a local maximum over scales. For each channel, the locations of the local maxima of the filtering index $\phi(i, \sigma)$ are extracted over a range of separate scale levels, with $i$ fixed and indicating the channel under investigation, while $\sigma$ increases by a constant $\Delta\sigma$ from one level to the next, starting at $\sigma = 1$. Once $i$ is fixed, a location $\sigma$ producing a local maximum of the filtering function over scales determines a scale at which there is a significant distortion between the spectrum content of the data smoothed at scale $\sigma$ and that of the data at the next scale $\sigma + \Delta\sigma$, from the viewpoint of $(\omega_{i-1}, \omega_i)$. This may be understood as follows: the structure at scale $\sigma$ associated with frequency components in the channel given by the radial extrema frequencies $\omega_{i-1}, \omega_i$ is removed from the data at the next scale $\sigma + \Delta\sigma$. That is, the scale $\sigma$ is a significant scale for gray-level shape with frequency components in $(\omega_{i-1}, \omega_i)$.

Principle 1. In the absence of further information, given a spatial frequency channel $(\omega_{i-1}, \omega_i)$ under investigation, a scale level $\sigma$ is significant for representing structure with frequency components within the corresponding 2-D spectrum band if the location $\sigma$ produces a local maximum of the filtering index $\phi(i, \sigma)$ over scales.

So, the complex problem of extracting the scale levels at which generic gray-level shape is analyzed has been reduced to the simpler one of detecting the scales of interest at which structure with very specific spectrum information (high, medium or low frequency content) would be described.

4.1 The scale-spectrum space

As a result of the derived formulation, a general framework is obtained for analyzing, over scales, the structures corresponding to the different parcellations of the spectrum domain: the scale-spectrum space. The scale-spectrum space of the given input $f(x, y)$ is taken to mean a parameterized family in which the significant scales for analyzing the distinct spectrum parcellations of interest are represented. On this basis, the scale-spectrum space is defined as a 2-D space parameterized by both the channel under investigation and the scale value. The gray-level structure is now a function of channel and scale, and the scale-spectrum space represents a sequence of spectrum parcellations concerning qualitative structure, together with the important scales at which the structure within each parcellation would be described. Note that such a representation avoids both the redundancy and the error that arise when information of different significance is analyzed simultaneously across scales.

Definition 2. A point on the scale-spectrum space is a vector $(i, \sigma)$, with $i$, $1 \le i \le n$, indicating the spatial frequency channel $(\omega_{i-1}, \omega_i)$ under investigation, while $\sigma$ denotes the scale level.

The representation of the local-maxima positions of the function $\phi(i, \sigma)$ over scales for each spectrum parcellation, given by the radial frequency extrema $\omega_{i-1}, \omega_i$, produces the scale-spectrum space for the given input. Through such a representation, we can finally identify a significant scale for describing the gray-level shape as one producing a local maximum of $\phi(i, \sigma)$ from the viewpoint of a channel $(\omega_{i-1}, \omega_i)$.

Definition 3. A marked point on the scale-spectrum space is a point $(i, \sigma)$ at which the filtering index $\phi(i, \sigma)$ assumes a local maximum across a range of separate scales, from the viewpoint of the channel $(\omega_{i-1}, \omega_i)$.

All the points in the scale-spectrum space are characterized according to whether or not they are marked. A straightforward way to detect significant scale levels for shape in the given input is to determine the scales $\sigma$ having an associated marked point $(i, \sigma)$ in the scale-spectrum space. Note that, as well as knowing whether or not a scale $\sigma$ is significant for describing shape, we have also derived that the scale $\sigma$ is only important for later-stage processing in which detail concerning the channel $(\omega_{i-1}, \omega_i)$ is considered if that scale determines a marked point $(i, \sigma)$ in the scale-spectrum space.

Principle 2. In the absence of further information, a scale level $\sigma$ is significant for later-stage processing in which the structure concerning the channel $(\omega_{i-1}, \omega_i)$ is studied if there is a marked point $(i, \sigma)$ in the scale-spectrum space for the given input $f(x, y)$.

A visual system incorporating this heuristic principle may be able to handle real-world image data (about which usually very little or no prior information is available) with automatic scale selection, while being basically free from tuning parameters.
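Definitions 2 and 3 can be illustrated with a small sketch that takes a precomputed table of filtering index values (one row per channel, one column per scale) and returns the marked points. The function names and the table layout are assumptions made for this example; how the table is filled is left open.

```python
import numpy as np


def marked_points(phi, scales):
    """Return the marked points (i, sigma) of a filtering index table.

    phi[i - 1][k] is the filtering index of channel i at scale scales[k].
    A point is marked when its value exceeds both neighbors along the scale
    axis; the first and last scales have only one neighbor and are skipped.
    """
    phi = np.asarray(phi, dtype=float)
    points = []
    for i, row in enumerate(phi, start=1):
        for k in range(1, row.size - 1):
            if row[k] > row[k - 1] and row[k] > row[k + 1]:
                points.append((i, float(scales[k])))
    return points


def significant_scales_by_channel(points):
    """Group marked points into a {channel: [scales]} table."""
    table = {}
    for channel, sigma in points:
        table.setdefault(channel, []).append(sigma)
    return table


# Usage with a made-up 3-channel table sampled at four scales.
example_phi = [[1, 3, 2, 1], [0, 1, 2, 1], [5, 4, 3, 2]]
example_points = marked_points(example_phi, [1.0, 2.0, 3.0, 4.0])
```

In this made-up table the third channel decreases monotonically, so it contributes no marked point and would be reported as having no significant scale in the sampled range.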

5 Experimental results and discussion

Any useful gray-level shape representation method in computational vision should make an accurate and reliable analysis of an object possible; hence a number of criteria should be satisfied by such a representation scheme: (1) invariance, in the sense that the representation of the gray-level shape should be invariant under shape-preserving transformations; (2) uniqueness, meaning that two gray-level shapes which contain different basic structures have distinct representations; (3) stability, meaning that the representations must be stable with respect to significant uniform and non-uniform noise on the shapes they represent; (4) efficiency, that is, the computational complexity of the representation scheme should be a low-order polynomial in time and space. Of course, these criteria for judging the suitability of a shape representation can conflict, and trade-offs need to be made: the suitability of a shape representation depends on what functions it should fulfill, or what properties it should have. Therefore, in order to judge the suitability of the proposed representation, we state the functions that it fulfills. Firstly, it produces a multiple-channel organization capable of overcoming the unwanted-detail problem at source, where each channel contributes to the output image only when its excitation level exceeds that of the noise or unwanted detail, so that only those frequencies which are related to significant structures for the subsequent processing are enhanced, whereas those carrying unwanted detail are blurred and so eliminated. Secondly, it can be used to obtain a description of the different intensity change models in the given image by applying a smoothing operator with the proper scale to the image.

5.1 A multichannel scheme for eliminating noise and unwanted detail at source

We now propose a representational model for simulating the behavior of those units (visual cortex neurones and retinal cells) selectively sensitive to spatial frequency and size, organized to compensate for earlier attenuation and so to achieve a better initial estimate of the image for subsequent processing. To accomplish this objective, we propose a system capable of acting between two extreme modes of visual processing, the diffuse (low-resolution) and the focused (high-resolution) modes. The system exploits a powerful advantage of this organization by performing the image correction using multiple channels acting in parallel, whose outputs are finally combined to produce the result. Since a channel only contributes to the output image when its excitation level exceeds that of the noise, the present system compensates for degradations introduced into the picture by the imaging process while simultaneously avoiding the enhancement of unwanted detail. The essential idea of this approach is quite simple: embed the original image $f(x, y)$ in a family of derived images $J_i(x, y; \sigma_i)$ corresponding to channels whose excitation levels exceed that of the unwanted detail. Assuming that the excitation level of the noise and unwanted detail is $\sigma = 1$ (this assumption may be changed depending on the particular application), a spatial-frequency channel has a meaningful excitation level if the corresponding significant scale $\sigma_i$ is greater than 1. Let $J_i(x, y; \sigma_i)$ be obtained by convolving the original image with both a Gaussian kernel $G(x, y; \sigma_i)$ of scale $\sigma_i$ and a band-pass filter $h_{Ch_i}(x, y)$ associated with channel $Ch_i$. The images $J_i(x, y; \sigma_i)$ are finally combined by addition in order to produce the final result. Because of the disregarded channels (those carrying the unwanted detail), the output is, in general, a low-contrast image. To overcome this problem, a typical contrast stretching transformation may be applied to the image.

Figure 1(A) shows a biomedical image containing a tissue section. To produce Fig. 1(B), zero-mean normally distributed noise with standard deviation $\sigma = 20$ was added to Fig. 1(A). Figs. 1(C) and 1(D) show the multichannel filtered images for the images in Figs. 1(A) and 1(B), respectively. Fig. 1(E) (respectively, 1(F)) displays the gray-level differences between the images 1(A) and 1(C) (resp. 1(B) and 1(D)). Figures 2(A) and 2(B) show a textured image and a digital mammogram, respectively. Their multichannel filtered versions are presented in Figs. 2(C) and 2(D), respectively. Figs. 2(E) and 2(F) display the gray-level differences between the original image and its reconstructed version in each case.
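The multichannel scheme of this section (band-pass each retained channel, smooth it at its significant scale, add the results, then stretch the contrast) can be sketched in the frequency domain as follows. This is a minimal sketch under stated assumptions, not the authors' filter bank: the function names are hypothetical, the band-pass filter of each channel is taken to be an ideal ring in radial frequency (cycles per image), each channel is given a single significant scale, and the DC bin belongs to no channel.

```python
import numpy as np


def multichannel_filter(image, channels, scales, noise_scale=1.0):
    """Sum of band-passed, Gaussian-smoothed copies of `image`.

    channels : list of (w_inf, w_sup) radial bands, in cycles per image
    scales   : one significant scale per channel; a channel is discarded
               when its scale does not exceed `noise_scale`
    """
    rows, cols = image.shape
    f1 = np.fft.fftfreq(rows)[:, None]  # cycles per sample
    f2 = np.fft.fftfreq(cols)[None, :]
    radius = np.hypot(f1 * rows, f2 * cols)            # cycles per image
    w_squared = (2.0 * np.pi) ** 2 * (f1**2 + f2**2)   # (radians/sample)^2
    spectrum = np.fft.fft2(image)
    combined = np.zeros_like(spectrum)
    for (w_inf, w_sup), sigma in zip(channels, scales):
        if sigma <= noise_scale:
            continue  # excitation level does not exceed that of the noise
        band = (radius > w_inf) & (radius <= w_sup)  # ideal band-pass (assumed)
        gaussian = np.exp(-0.5 * sigma**2 * w_squared)
        combined += spectrum * band * gaussian
    # The band masks are symmetric, so the imaginary part is rounding noise.
    return np.real(np.fft.ifft2(combined))


def contrast_stretch(image, low=0.0, high=255.0):
    """Linearly map the range of `image` onto [low, high]."""
    minimum, maximum = image.min(), image.max()
    if maximum == minimum:
        return np.full(image.shape, low, dtype=float)
    return low + (image - minimum) * (high - low) / (maximum - minimum)
```

Because the DC bin is excluded in this sketch, the combined image has zero mean, which is one more reason the final contrast stretch is needed before display.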

5.2 Edge detection

Most edge detection techniques apply a smoothing process before extracting intensity changes; hence an operator with a small scale parameter should be applied to fine intensity change structure, while an operator with a large scale parameter should be used to capture coarse intensity change structure. In fact, the scale parameters are critical because they control the behavior of edges (García, Fdez-Valdivia, and Garrido[8]). By applying operators of different neighborhood sizes to an image, multiscale analysis obtains information at a range of scales. Therefore, once the most significant scales $\sigma_i$, $i = 1, \ldots, L$, have been obtained using the multichannel approach, the edge curves may be extracted from an image using the Canny edge detector at each scale parameter $\sigma_i$, $i = 1, \ldots, L$. Figs. 3 and 4 show the edge maps obtained by combining the outputs from the different significant scales when this process is applied to several biomedical images. The thickness of the edge curves is proportional to the amount of smoothing.
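The idea of combining edge maps computed at each significant scale can be illustrated with a simplified stand-in for the Canny detector. The sketch below only thresholds the Gaussian-smoothed gradient magnitude at each scale and takes the union of the results; it performs no non-maximum suppression or hysteresis, so it is not the Canny detector used in the paper. The function names and the `fraction` threshold are assumptions made for this example.

```python
import numpy as np


def gradient_magnitude(image, sigma):
    """Magnitude of the Gaussian-smoothed image gradient, computed via the FFT.

    The image is treated as periodic, so opposite borders are neighbors.
    """
    rows, cols = image.shape
    w1 = 2.0 * np.pi * np.fft.fftfreq(rows)[:, None]  # radians per sample
    w2 = 2.0 * np.pi * np.fft.fftfreq(cols)[None, :]
    smoothed = np.fft.fft2(image) * np.exp(-0.5 * sigma**2 * (w1**2 + w2**2))
    d_rows = np.real(np.fft.ifft2(1j * w1 * smoothed))
    d_cols = np.real(np.fft.ifft2(1j * w2 * smoothed))
    return np.hypot(d_rows, d_cols)


def multiscale_edge_map(image, scales, fraction=0.5):
    """Union over scales of the pixels with strong smoothed gradient.

    Assumption: a pixel counts as an edge at a scale when its gradient
    magnitude reaches `fraction` of the maximum at that scale.
    """
    edges = np.zeros(image.shape, dtype=bool)
    for sigma in scales:
        magnitude = gradient_magnitude(image, sigma)
        peak = magnitude.max()
        if peak > 0:
            edges |= magnitude >= fraction * peak
    return edges
```

With this thresholding, larger scales mark wider bands around an intensity change, which loosely mirrors the observation that edge thickness grows with the amount of smoothing.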

6 Conclusion

The paper introduced a novel representation technique for gray-level shape: the scale-spectrum space. The construction of such a representation by using two spectrum indexes was also described. The proposed scheme was based on the following heuristic principle: in the absence of further information, a scale level $\sigma$ is significant for representing structure with frequency components in a spectrum band given by the radial frequency extrema $\omega_{i-1}, \omega_i$ if the location $\sigma$ produces a local maximum of the filtering index $\phi(i, \sigma)$ over scales, with $\phi(i, \sigma)$ measuring the amount of deviation between the spectrum information of the input smoothed at a scale $\sigma$ and that of the input at the next scale $\sigma + \Delta\sigma$ over a range of scale levels, from the viewpoint of channel $(\omega_{i-1}, \omega_i)$. The proposed representation comes close to satisfying the criteria considered necessary for any general-purpose gray-level shape representation scheme:

(1) It allows data-driven detection of spectrum domains of specific importance and of the evolution of scale levels from the viewpoint of such domains, and it is not an effect of externally chosen criteria or tuning parameters. Hence, in the absence of further information, the scale-spectrum space may be used as a guide for subsequent processing that requires knowledge about the scales at which gray-level structure with particular spectrum components (high, medium or low frequency content) occurs.

(2) It avoids both the redundancy and the error that arise when information of different significance is analyzed simultaneously over scales.

(3) It is capable of dealing with variable levels of noise on the data. Experimental results show that the shape may be described with a high level of accuracy while eliminating noise and unwanted detail.

(4) It satisfies a criterion of efficiency, i.e., its computational complexity is a very low-order polynomial in time and space.

(5) The proposed method for representing gray-level shape does not require user-set parameters or prior knowledge about the nature of the images.

References

[1] J. Babaud, A.P. Witkin, M. Baudin, R.O. Duda. Uniqueness of the Gaussian kernel for scale-space filtering. IEEE PAMI 8, 26-32 (1986).
[2] C. Blakemore and F.W. Campbell. On the existence of neurones in the human visual system selectively sensitive to the orientation and size of retinal images. J. Physiol., 203, 237-260, (1969).
[3] F.W. Campbell, G.F. Cooper, and C. Enroth-Cugell. The spatial selectivity of the visual cells of the cat. J. Physiol., 203, 223-235, (1969).
[4] F.W. Campbell and J.G. Robson. Application of Fourier analysis to the visibility of gratings. J. Physiol., 197, 551-566, (1968).
[5] H. Crane. A theoretical analysis of the visual accommodation system in humans. NASA Ames Research Center NAS 2-2760 (1966).
[6] F. Farrokhnia. Multi-channel filtering techniques for texture segmentation and surface quality inspection. PhD thesis, Michigan State University, (1990).
[7] L.M.J. Florack, B.M. ter Haar Romeny, J.J. Koenderink, M.A. Viergever. Scale and the differential structure of images. Image Vision Comp., 10(6), 376-388, (1991).
[8] J.A. García, J. Fdez-Valdivia, A. Garrido. A scale-vector approach for edge detection. Pattern Recognition Letters 16, 637-646 (1995).
[9] A.P. Ginsburg. Specifying relevant spatial information for image evaluation and display design: an explanation of how we see certain objects. Proc. Soc. Inf. Disp. 21, 219-227 (1980).
[10] G.A. Hay, M.S. Chesters. A model of visual threshold detection. J. Theor. Biol. 67, 221-240 (1977).
[11] H. Ikeda and M.J. Wright. Spatial and temporal properties of sustained and transient cortical neurones in Area 17 of the cat. Expl. Brain Research, 22, 385-398, (1975).
[12] A.K. Jain. Fundamentals of Digital Image Processing. Prentice-Hall, Englewood Cliffs, NJ (1989).
[13] M. Kabrisky. A Proposed Model for Visual Information Processing in the Human Brain. Illinois University Press, Urbana, (1966).
[14] J.J. Koenderink. The structure of images. Biological Cybernetics 50, 363-370 (1984).
[15] J.J. Koenderink and A.J. van Doorn. Receptive field families. Biological Cybernetics 63, 291-298 (1990).
[16] T. Lindeberg. Detecting salient blob-like image structures and their scales with a scale-space primal sketch: A method for focus-of-attention. IJCV, 11, 283-318 (1993).
[17] T. Lindeberg and J. Gårding. Shape from Texture from a Multi-Scale Perspective. Proc. 4th International Conference on Computer Vision, 683-691, (1993).
[18] L. Maffei and A. Fiorentini. The visual cortex as a spatial frequency analyser. Vision Research, 13, 1255-1267, (1973).
[19] D. Marr. Vision. San Francisco, CA: Freeman, (1982).
[20] F. Mokhtarian, A. Mackworth. Scale-based description and recognition of planar curves and two-dimensional shapes. IEEE PAMI 8, 34-43 (1986).
[21] D.A. Pollen and J.H. Taylor. The striate cortex and the spatial analysis of visual space. In The Neurosciences, Third Study Program. MIT Press, Cambridge, 239-247, (1974).
[22] A. Rosenfeld, M. Thurston. Edge and curve detection for visual scene analysis. IEEE Trans. Comput. 20, 559-562 (1971).
[23] A.P. Witkin. Scale-Space Filtering. Proc. 8th Int. Joint Conf. on Artificial Intelligence, Karlsruhe, West Germany, 1019-1022 (1983).
[24] R.A. Young. The Gaussian derivative model for spatial vision: I. Retinal mechanisms. Spatial Vision 2(4), 273-293, (1987).
[25] A.L. Yuille, T.A. Poggio. Scaling theorems for zero-crossings. IEEE PAMI 8, 15-25 (1986).


Summary

(a) What is the original contribution of this work?

In a scale-space representation, where gray-level shape generally comprises multiple structures at different levels of scale, it is often not possible to obtain an image in which all the structures are described at their best scale levels: if one structure is well enhanced, the others appear blurred. At best, some form of compromise among the structures at different scale levels may be sought. To overcome this problem, we present an efficient multi-channel based scheme which may be employed to automatically describe each gray-level structure at its most appropriate level of smoothing. The proposed multichannel organization is selectively sensitive to spatial frequency and size, and is biologically inspired by the behavior of visual cortex neurones as well as retinal cells. This representation allows data-driven detection of the spatial frequency channels into which the entire domain is to be decomposed, and of the evolution of scale levels from the viewpoint of such spectrum bands, and it is not an effect of externally chosen criteria or tuning parameters.

(b) What is the most closely related work by others and how does this work differ?

The most closely related work is that of Lindeberg[16], in which the problem of scale selection for gray-level structure is dealt with. But that work is single-scale, in the sense that a single scale is derived even though structures of different sizes may be present in the image. Hence, it does not obtain an image in which all the structures are described at their best scale levels. In contrast, our work is intended to derive a gray-level image representation inspired by the behavior of those units (visual cortex neurones and retinal cells) selectively sensitive to spatial frequency and size. Hence, the proposed scheme provides a framework that accomplishes a reasonable decomposition of the spectrum domain into spatial frequency channels for the input image, and derives only the significant scales corresponding to all the relevant structures of different sizes in each channel.

(c) Which track or tracks and which topics within a track are most appropriate for the submission?

The track for the submission is Pattern Recognition and Signal Analysis. The topics are both Multiresolution Methods and Feature Detection.

Figure 1: Panels A to F.

Figure 2: Panels A to F.

Figure 3: Panels A to F.

Figure 4: Panels A to F.