Analyzing Images Containing Multiple Sparse Patterns with ... - IJCAI

Analyzing Images Containing Multiple Sparse Patterns with Neural Networks

Rangachari Anand, Kishan Mehrotra, Chilukuri K. Mohan and Sanjay Ranka
School of Computer and Information Science, Syracuse University, Syracuse, NY 13244-4100

Abstract

We have addressed the problem of analyzing images containing multiple sparse overlapped patterns. This problem arises naturally when analyzing the composition of organic macromolecules using data gathered from their NMR spectra. Using a neural network approach, we have obtained excellent results in using NMR data to analyze the presence of various amino acids in protein molecules. We have achieved high correct classification percentages (about 87%) for images containing as many as five substantially distorted overlapping patterns.

1 Introduction

Currently known image analysis methods are not very effective when applied to images containing multiple overlapped sparse patterns. Such patterns consist of a small number of features dispersed widely in the image. The features are usually small in size: possibly no larger than a single pixel. Such a classification problem is encountered when analyzing images obtained by certain types of Nuclear Magnetic Resonance (NMR) spectroscopy. Neural networks offer potentially promising techniques for such problems, but few successful results have been reported in the literature on the application of neural networks to such complex image analysis tasks.

One possible approach is to use Strong and Whitehead's physiological model [10], which describes how humans can sequentially focus on each pattern contained in a complex image. Their model is a discrete-event simulation of activities within human neurons. Due to the complexity of human neurons, this model has only been tested with small input images. The selective-attention neural network of Fukushima presents another approach for classifying overlapped patterns [3]. The main problem in applying Fukushima's approach to large images is the huge size of the required network: as many as 41000 cells are needed for classifying patterns in a 19 x 19 image. Since practical applications require processing considerably larger (256 x 256) images, the computational requirements of Fukushima's model are too high.


Learning and Knowledge Acquisition

We have developed a modular analyzer for the problem of analyzing images containing multiple sparse patterns. Each module detects the presence of patterns that belong to one class in the input image. Each module has two stages. The first stage is a feature detector based on clustering [5]. For each class of patterns, cluster analysis is used to identify those regions of the input image where features of the patterns belonging to that class are most likely to be found. The second stage of each module is a backpropagation-trained feed-forward neural network [9] that performs the tasks of thresholding and classification. With this approach, we have been able to obtain very high correct classification performance (87%) on 256 x 256 images with noisy test data.

In the next section, we discuss the problem of analyzing multiple sparse patterns, describe some details of the NMR analysis problem, and discuss previous work on this topic. In section 3, we describe details of our system. Experiments and results are presented in section 4. Section 5 contains concluding remarks.

2 The problem

The images we analyze may contain many different 'patterns'. Each pattern consists of several 'features'. A feature may be a group of neighboring pixels, or perhaps just a single pixel. The locations of pixels may vary within a range determined by the feature. Hence the pattern-matching process has to allow for variability of pixel locations. Figure 1 shows three images, each containing one pattern (of the same class) which consists of three features. Each feature consists of a single pixel (indicated by a '+' symbol), which must occur somewhere within a known region (delineated by dashed ellipses in the figure).

Figure 1: Three sparse patterns which belong to the same class.

Figure 2: Overlap of feature regions (of one class, and of different classes).

Figure 3: A class of patterns with two pattern templates.

In the applications that we are interested in, feature regions for different classes do overlap, as shown in Figure 2. Consequently, a feature may lie within feature regions of several classes. Such a feature partially constrains the classification, although it does not permit us to decide unambiguously whether a particular class of patterns is present in the image. As noted by Rumelhart and McClelland [9], such problems are ideal candidates for neural network solutions.

A particular instance of this problem arises in the classification of NMR spectra. NMR spectroscopy is a powerful method for the determination of the three-dimensional structure of complex organic macromolecules such as proteins [11]. Proteins are long chains of smaller molecules called amino acids. Approximately 18 different types of amino acids are commonly found in proteins. The first step in analyzing the structure of a protein is to determine its constituent amino acids. One type of NMR spectroscopy used for this purpose is called Correlational Spectroscopy ('COSY'). The COSY spectrum of a protein is the result of the combination of the spectra of its constituent amino acids. The task of determining the constituent amino acids of a protein is therefore equivalent to the task of analyzing an image containing multiple sparse patterns. The training set for our analyzer consists of a number of sample spectra for each type of amino acid. These spectra were generated from information about the distributions of peaks for each type of amino acid, tabulated in [4].

2.1 Definitions

Image representation: An input image is a two-dimensional array of non-negative integers called 'intensities'. We will represent an image by a set P = {P1, P2, ..., PN} of triples, where each triple Pi = (Pi,x, Pi,y, Pi,z) for 1 ≤ i ≤ N. The first two components of each triple identify the location of a non-zero element in the input image, while the third (Pi,z) represents the intensity of that element. Chemists refer to each such triple as a peak.

An image P may contain several patterns (disjoint subsets of P). Each pattern is a collection of peaks associated with a certain amino acid class. The number of patterns contained in an input image is not known a priori. Hence each image contains an unknown number of peaks (N).

Pattern description: In some cases, the same class may be identified by one of many different images. For instance, an amino acid c may give rise to t(c,1) or t(c,2), which are two different configurations. Therefore, we define a set Tc of pattern-templates for each class c, where each pattern-template t(c,j) characterizes one configuration. Each pattern-template is a set of feature-templates. A feature-template F(c,j,k) contains a complete specification for a feature which could occur in a pattern belonging to class c. Feature-templates determine which features (peaks) are present in an input image: each feature-template specifies the center r(c,j,k) of a feature region, together with a parameter λ(c,j,k) that is used to define how far the feature region extends around the center. As described in section 3, we obtain the values of r by cluster analysis and implicitly compute the values of λ when a neural network is trained.

2.2 Classification procedure

In this section, we describe our procedure for analyzing images with multiple patterns from C classes.

Matching a feature-template: This is the first step in pattern recognition. We must determine whether a peak in the input image matches a feature-template. We say that a peak Pi 'matches' a feature-template F(c,j,k) if its intensity Pi,z is sufficiently large relative to g(d), where g, the 'error function', is chosen to increase with the distance d between the peak and the center r(c,j,k). Thus peaks with high intensity values (Pi,z) match a feature-template even when they are positioned far from the location of the feature-template. This is the reason for depicting feature-templates with varying grey levels in Figure 3.

Matching a pattern-template: We say that an input image P 'matches' a pattern-template t(c,j) if for each feature-template F(c,j,k) in t(c,j) there exists a unique peak Pi such that Pi matches F(c,j,k).

Classification: If an input image P matches a pattern-template t(c,j), a pattern of class c is defined to be


Figure 4: Overview of the classification process (feature-regions computed from the training set are compared against the input image to be analyzed).

Figure 5: Block diagram of part of the modular analyzer.

present in the input image. The overall analysis task is to determine all the classes whose features are present in the input image; hence the above procedure is repeated for every class.

An overview of the classification process is depicted in Figure 4. In the example shown, the feature-regions for two classes are known to the analyzer. Using this knowledge, the analyzer can determine whether a pattern of either class is present in the input image.
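The matching procedure of section 2.2 can be sketched in a few lines. This is a deliberately simplified illustration: it replaces the intensity-dependent error function g with a plain distance threshold, and the peak coordinates and template centers below are hypothetical.

```python
import math

def matches_feature(peak, center, radius):
    """Simplified feature-template match: the peak must lie within
    `radius` of the template's center. (In the paper, the error
    function g also lets high-intensity peaks match from farther away.)"""
    (x, y, _z), (cx, cy) = peak, center
    return math.hypot(x - cx, y - cy) <= radius

def matches_pattern(image, template, radius=6.0):
    """An image matches a pattern-template if every feature-template
    is matched by a distinct peak (greedy one-to-one assignment)."""
    used = set()
    for center in template:            # one center per feature-template
        hit = next((i for i, p in enumerate(image)
                    if i not in used and matches_feature(p, center, radius)),
                   None)
        if hit is None:
            return False
        used.add(hit)
    return True

# Peaks are (x, y, intensity) triples, as defined in section 2.1.
image = [(10, 12, 80), (40, 41, 55), (70, 20, 90)]
template = [(11, 11), (39, 42), (71, 19)]   # hypothetical feature centers
print(matches_pattern(image, template))     # every feature finds a peak
```

The greedy assignment is a simplification: a full implementation would search for a one-to-one assignment rather than taking the first unused peak.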

3 System description

In this section, we describe a modular analyzer for the analysis of images containing multiple sparse patterns. An important aspect of our system is the use of a clustering algorithm to train feature detectors for each class. Our use of the term 'feature detection' to identify spatial features in the patterns is to be distinguished from statistical parlance, where 'feature detection' may refer to the process of choosing the best set of variables to characterize a set of items [1].

3.1 Overview of the analyzer

A block diagram of the analyzer is shown in Figure 5. Each module detects the presence of one class of patterns. The modules work in parallel on an input image (presented as a list of peaks). Each module consists of two stages. The first stage, called a clustering filter, transforms the image into a 'feature vector'. The second stage is a perceptron-like feed-forward neural network. The clustering filter computes the values of the matching functions, while thresholding is done by the neural network.


Figure 6: Use of clustering to find the expected locations of features.

3.2 Clustering

In machine vision systems, clustering is often used for image segmentation [8]. In our system, clustering is used to find the expected locations of features. We illustrate this with an example. Let the training set for some class c consist of the three images in Figure 1. Each image contains one pattern, which is known to belong to class c. These images have been superimposed in Figure 6. Clearly, the features occur in three clusters. The center of each cluster is the expected location of a feature.

The procedure may be summarized thus: for each class c, create a set Rc containing the locations of all features in an image created by superimposing all the training set images for class c. By applying a clustering algorithm to Rc, we determine the expected location of each feature, i.e., the cluster center. We have investigated two clustering algorithms: the K-means clustering algorithm [5] and the LVQ (Learning Vector Quantizer) [6]. We have found that the LVQ performs better for our problem. A benchmarking study by Kohonen et al. [7] also found that the LVQ produced better results than K-means clustering on several classification tasks.

Figure 7: Clustering filter.
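The feature-location step can be sketched as follows. This is a minimal K-means illustration (the paper ultimately prefers the LVQ); the point set and the choice k = 3 mirror the three clusters of Figure 6 but are otherwise made up.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal K-means, one of the two clustering options discussed in
    section 3.2, applied to the superimposed feature locations Rc."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assign each feature location to its nearest current center.
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda j: (p[0] - centers[j][0]) ** 2 +
                                            (p[1] - centers[j][1]) ** 2)
            groups[j].append(p)
        # Move each center to the centroid of its group.
        centers = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
            if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers

# Rc: feature locations from three superimposed training images (cf. Figure 6).
Rc = [(10, 10), (11, 12), (9, 11),      # cluster 1
      (40, 40), (41, 39), (39, 41),     # cluster 2
      (70, 20), (71, 21), (69, 19)]     # cluster 3
centers = kmeans(Rc, k=3)
```

Each returned center is the expected location of one feature for this class; in the full system these become the positions of the feature detectors.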

3.3 The clustering filter

The role of each clustering filter (shown in Figure 7) is to extract relevant information from the input image for one class of patterns. A clustering filter consists of a number of feature detectors. A feature detector is a processing unit activated by the presence of a feature (peak) in its receptive field, a specific region of the input image. The output of a feature detector is a real value which depends on the number of peaks present within its receptive field, the intensities of those peaks, and their distances from the center of the receptive field. The output of a clustering filter is the 'feature vector', each element of which is the (real-valued) output from one feature detector in the filter.

In a filter for class c, the receptive fields of the feature detectors should coincide with the feature regions of class c. For simplicity, we use feature detectors with fixed size receptive fields in our system. Consequently, if a feature region is larger than the receptive field, several feature detectors are required to cover it. We use the LVQ learning procedure to determine the position of each feature detector.

The output from a feature detector located at coordinate r, when presented with an image P = {P1, P2, ..., PN}, is a sum of contributions from the peaks, each weighted by the kernel function of its distance from r. The kernel function chosen is parameterized by a constant r between 0.1 and 0.5. Although all peaks in the image are fed to the feature detector, it will actually respond only to those peaks which are very close to its center. This is because, for the values of r that we have chosen, the reciprocal of the kernel function g(x) drops to almost zero when x > 6. Therefore the feature detector only responds to peaks which lie within a radius of 6 pixels around its center.

As we noted previously, there are cases where two peaks sometimes occur very close together. We do not need to make any special provision for this situation: we use only one set of feature detectors to cover the combined feature region. The output from these feature detectors will be higher, but this is easily handled by the neural network.

3.4 The neural network

Although a clustering filter is trained to respond most strongly to patterns of a particular class, it is possible (due to overlap of feature-regions) that some of the detectors of one class may be activated when patterns of another class are presented. We use a feed-forward neural network to determine implicitly the appropriate thresholds for each pattern detector. This neural network is trained after the clustering filters of the first stage have been trained and set up. For each class c, the neural network (of the corresponding module) must be taught to discriminate between feature vectors obtained from images containing a pattern of class c and feature vectors produced from images which do not contain patterns of class c.

We use backpropagation [9] to train the network. Backpropagation is a supervised learning algorithm in which a set of patterns to be learnt are repeatedly presented to the network together with their target output patterns. At the outset of a training session, the weights and biases are randomly initialized. For each pattern presented, the error backpropagation rule defines a correction to the weights and thresholds to minimize the square sum of the differences between the target and the actual outputs. The learning process is repeated until the average difference between the target and the actual output falls below an operator-specified level.
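The clustering-filter computation of section 3.3 can be sketched as below. The paper's exact kernel is not reproduced here; an exponential decay in distance is assumed, which shares the stated property that a peak's contribution vanishes a few pixels from the detector center.

```python
import math

def detector_output(peaks, center, tau=0.5):
    """Output of one feature detector at `center`: each peak contributes
    its intensity, attenuated by its distance from the center.
    (Assumed exponential kernel; the paper's kernel differs in form.)"""
    cx, cy = center
    return sum(z * math.exp(-tau * math.hypot(x - cx, y - cy))
               for x, y, z in peaks)

def feature_vector(peaks, centers, tau=0.5):
    """A clustering filter: one real-valued output per feature detector."""
    return [detector_output(peaks, c, tau) for c in centers]

# Hypothetical peaks and detector positions.
peaks = [(10, 10, 100), (40, 40, 80)]
fv = feature_vector(peaks, [(10, 10), (40, 40), (70, 70)])
# The first two detectors respond strongly; the third sees only distant peaks.
```

The resulting feature vector is what the second-stage neural network receives; thresholding is left entirely to that network.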

4 Results

In this section, we describe our experiments in training and testing the sparse image recognition system, and report the results obtained.

4.1 System parameters

To substantiate our approach, seven modules were trained for the NMR protein analysis problem. Each module can detect patterns corresponding to one amino acid. The final output from each module is a yes/no answer about whether the respective class (amino acid) is judged to be present. From among 18 possible amino acids, we trained modules for the seven amino acids whose spectra appeared to be the most complex, with more peaks than the others. But in training as well as testing the modules, we used data which included peaks from the other 11 amino acids as well, and obtained good results in analyzing the presence of the seven amino acids for which modules were trained. This shows that an incremental approach is possible: upon building modules for the other 11 amino acids, we expect that our results will continue to hold.

Table 1a lists the parameters of each module. The names of the amino acids along with their one-letter codes are listed in the first column. The second column is the number of feature detectors in the first stage of each module. The third column shows the number of hidden layer nodes in the neural network which comprises the second stage of each module. These were approximately the smallest number of nodes required for convergence


of the network training procedure, obtained by experimenting with various values.

Table 1a: Module Parameters.

The training set consists of a total of 90 single-class images, with 5 for each of the 18 amino acids. The equation indicating how weights are changed using the error back-propagation procedure [9] is:

    Δw(t) = -η(∂E/∂w) + αΔw(t-1)

In each module, we used a value of η = 0.1 for the learning rate parameter, and a value of α = 0.05 for the momentum coefficient. The target mean squared error (to terminate the network training procedure) was set to be 0.01. During training, the target output for the networks was set to be 0.1 when the required answer was 'no' and 0.9 when the required answer was 'yes'. Weights in the network were updated only at the end of each 'epoch' (one sequence of presentations of all training inputs). Table 1b shows the number of epochs needed to train the LVQs and the feedforward (backpropagation) neural networks, for each module.
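A minimal sketch of this epoch-wise training procedure, using the stated settings (η = 0.1, α = 0.05, targets 0.1/0.9, stopping at mean squared error 0.01). The network size and the toy feature vectors are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train(X, T, hidden=4, eta=0.1, alpha=0.05, target_mse=0.01,
          max_epochs=20000):
    """Epoch-wise backpropagation with momentum for one module's
    second-stage network (one hidden layer, sigmoid units)."""
    n_in, n_out = X.shape[1], T.shape[1]
    W1 = rng.uniform(-0.5, 0.5, (n_in, hidden)); b1 = np.zeros(hidden)
    W2 = rng.uniform(-0.5, 0.5, (hidden, n_out)); b2 = np.zeros(n_out)
    dW1 = np.zeros_like(W1); dW2 = np.zeros_like(W2)
    for epoch in range(max_epochs):
        H = sigmoid(X @ W1 + b1)              # forward pass
        Y = sigmoid(H @ W2 + b2)
        err = Y - T
        if np.mean(err ** 2) < target_mse:    # operator-specified stop level
            break
        g2 = err * Y * (1 - Y)                # output-layer deltas
        g1 = (g2 @ W2.T) * H * (1 - H)        # hidden-layer deltas
        dW2 = -eta * H.T @ g2 + alpha * dW2   # weight change + momentum
        dW1 = -eta * X.T @ g1 + alpha * dW1
        W2 += dW2; b2 += -eta * g2.sum(0)     # one update per epoch
        W1 += dW1; b1 += -eta * g1.sum(0)
    return W1, b1, W2, b2

# Toy feature vectors: class present -> target 0.9, absent -> 0.1.
X = np.array([[1.0, 0.9, 0.0], [0.9, 1.0, 0.1],
              [0.0, 0.1, 1.0], [0.1, 0.0, 0.9]])
T = np.array([[0.9], [0.9], [0.1], [0.1]])
W1, b1, W2, b2 = train(X, T)
```

As in the paper, weights change only at the end of each epoch, so the momentum term smooths successive batch updates rather than per-pattern ones.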

To illustrate the testing method, consider a composite image created by superimposing two images with patterns that belong to classes c and d. Correct classification implies that this image should be classified 'NO' by modules a, e, i, f and v, and 'YES' by modules c and d. We measure the percentages of correct classification, testing the modules in this manner on various composite images.

In the first set of experiments, we generated 1000 examples of each case: composite images containing 2, 3, 4 and 5 patterns respectively. In each set, the composite images were created by superimposing a randomly chosen set of images (each of a different class) drawn from the training set. The percentages of these images correctly classified by each module under different conditions are reported in table 2, for r = 0.5 and r = 0.1.

From table 2, it is clear that error rates increase with the number of images (patterns) in the input image. This is because the receptive fields of different classes of patterns overlap. Hence patterns of one class may partially activate feature detectors for other classes. As the number of patterns in the input image increases, it becomes increasingly likely that a feature detector may respond to artifacts arising from a fortuitous combination of patterns belonging to other classes. This problem is further aggravated by increasing the size of the receptive fields, as shown in table 2.

Table 2: Percentages correctly classified, when test images were random combinations of t r a i n i n g set images.

Table 1b: Number of epochs required for training.

To investigate the effect of varying the receptive field sizes of detectors, we trained two versions of each module with different values for r, the constant in the kernel function. When r = 0.5, detectors have small receptive fields, whereas when r = 0.1, detectors have large receptive fields.

4.2 Experimental results

The goal of the experiments was to measure the correctness of overall classification when the system was presented with composite images containing several patterns of different classes. Various experiments were performed to test our sparse image analysis system on composite images consisting of: (i) different numbers of patterns; (ii) with and without perturbations; and (iii) for detectors with different receptive fields (r = 0.5 and r = 0.1).

Table 3: Percentages correctly classified, with low noise. Test images were random combinations of training set images with peak locations randomly translated by [-1, +1].

Table 4: Percentages correctly classified, with high noise. Test images were random combinations of training set images with peak locations randomly translated by [-3, +3].

It is desirable to perform correct classification even in the presence of small errors or corrupted data. Hence, we tested our system with composite images produced by superimposing distorted versions of the training set images. Two series of experiments were performed, varying the amount of distortion in the test data, for small and large receptive fields (r = 0.5 and r = 0.1). In one set of experiments, distortions were introduced by adding a random integer in the range [-1, +1] to the coordinates of the peaks. The results are summarized in table 3. In another set of experiments, the distortion was increased by adding random integers in the range [-3, +3] to the coordinates of peaks. These results are summarized in table 4. With the small receptive field system (r = 0.5), the combined effect of distortion and multiple patterns causes classification accuracy to deteriorate substantially. On the other hand, the classification capabilities of the large receptive field system (r = 0.1) are less affected and degrade more gracefully with noise. This phenomenon may be contrasted with the observation that the small receptive field system performs marginally better on uncorrupted test data.
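The construction of distorted composite test images can be sketched as follows; the peak lists are hypothetical, and only the perturbation scheme follows the text.

```python
import random

rng = random.Random(1)

def perturb(peaks, max_shift):
    """Distort a test image as in section 4.2: add a random integer in
    [-max_shift, +max_shift] to each peak coordinate (intensity kept)."""
    return [(x + rng.randint(-max_shift, max_shift),
             y + rng.randint(-max_shift, max_shift), z)
            for x, y, z in peaks]

def superimpose(*images):
    """Composite test image: the union of the peaks of several
    single-class images."""
    return [p for img in images for p in img]

img_c = [(10, 10, 90), (40, 40, 70)]   # hypothetical class-c pattern
img_d = [(70, 20, 60)]                 # hypothetical class-d pattern
low_noise = perturb(superimpose(img_c, img_d), max_shift=1)
high_noise = perturb(superimpose(img_c, img_d), max_shift=3)
```

Feeding such composites to every module and comparing its yes/no answer against the known constituent classes yields the correct-classification percentages of tables 3 and 4.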

5 Concluding Remarks

In this paper, we have addressed the problem of analyzing images containing multiple sparse overlapped patterns. This problem arises naturally when analyzing the composition of organic macromolecules using data gathered from their NMR spectra. Using a neural network approach, we have obtained excellent results in analyzing the presence of various amino acids in protein molecules. We have achieved high correct classification percentages (about 87%) for images containing as many as five substantially distorted overlapping patterns.

The architecture of our system is modular: each module analyzes the input image and delivers a yes/no output regarding the presence of one class of patterns in the image. Each module contains two stages: a clustering filter, and a feedforward neural network. An unconventional aspect of our approach is the use of clustering to detect spatial features of patterns.

We performed a number of experiments to measure the correctness of overall classification when the system was presented with composite images containing several patterns of different classes. We tried two versions of the system, one with small receptive field detectors and the other with large receptive field detectors. In both cases, we observed that the rate of correct classification decreased as the number of patterns in the image was increased. To determine the ability of the system to cope with variations in the patterns, images with random perturbations to the patterns were presented to the system in another series of experiments. In this case, we observed that the classification abilities of the large receptive field system are less affected and degrade more gracefully.

The classification process described in this paper is only the first step in the analysis of NMR spectra. It is of considerable interest to chemists to determine the precise association of the peaks in the input image with different patterns. We are currently working on an extension to the system described in this paper to perform this task. We plan to refine the clustering algorithm to enable the use of feature-detectors with variable size receptive fields. We expect to improve performance by combining the evidence from multiple input sources, as is done in other NMR analysis methods.

Acknowledgments

We gratefully acknowledge the assistance received from Sandor Szalma and Istvan Pelczer of the NMR Data Processing Laboratory, Chemistry Department, Syracuse University. The images for the training set were generated by a program written by Sandor Szalma.

References

[1] Chen, C. "Statistical Pattern Recognition". Spartan Books, 1973.
[2] Fu, K. S. and Mui, J. K. "A Survey on Image Segmentation". Pattern Recognition, vol. 13, pp. 3-16, 1981.
[3] Fukushima, K. "Neural Network Model for Selective Attention in Visual Pattern Recognition and Associative Recall". Applied Optics, vol. 26, no. 23, pp. 4985-4992, 1987.
[4] Gross, K-H. and Kalbitzer, H. R. "Distribution of Chemical Shifts in 1H Nuclear Magnetic Resonance Spectra of Proteins". Journal of Magnetic Resonance, vol. 76, pp. 87-99, 1988.
[5] Jain, A. K. and Dubes, R. C. "Algorithms for Clustering Data". Prentice Hall, 1988.
[6] Kohonen, T. "Self-organized Formation of Topologically Correct Feature Maps". Biological Cybernetics, vol. 43, pp. 59-69, 1982.
[7] Kohonen, T., Barnes, G., and Chrisley, R. "Statistical Pattern Recognition with Neural Networks: Benchmarking Studies". Proc. IEEE Int. Conf. on Neural Networks, vol. 1, pp. 61-68, 1988.
[8] Nagao, M. and Matsuyama, T. "A Structural Analysis of Complex Aerial Photographs". Plenum Press, 1980.
[9] Rumelhart, D. E. and McClelland, J. L. "Parallel Distributed Processing, Volume 1". MIT Press, 1987.
[10] Strong, G. W. and Whitehead, B. A. "A Solution to the Tag-Assignment Problem for Neural Networks". Behavioral and Brain Sciences, vol. 12, pp. 381-433, 1989.
[11] Wuthrich, K. "NMR of Proteins and Nucleic Acids". John Wiley & Sons, New York, 1986.
