2009 International Conference of Soft Computing and Pattern Recognition
Object Recognition using Fourier Descriptors and Genetic Algorithm
M. Sarfraz
Department of Information Science, Kuwait University, Adailiya Campus, P.O. Box 5969, Safat 13060, Kuwait
e-mail: [email protected]

Mehmood-ul-Hassan and M. Iqbal
Department of Computer Science, COMSATS, Abbottabad, Pakistan
e-mail: [email protected]

Abstract- This work presents a study and experiments on object recognition for isolated objects. Similarity transformations, the presence of noise, and occlusion are all included in the study. For simplicity, the outlines of the objects, rather than the objects themselves, are used throughout the recognition process. Fourier Descriptors are used as features of the objects. From the analysis and results using Fourier Descriptors, the following questions arise: What is the optimum number of descriptors to be used? Are these descriptors of equal importance? To answer these questions, the problem of selecting the best descriptors is formulated as an optimization problem. A Genetic Algorithm is mapped to this problem and used successfully to build an object recognition system that uses a minimal number of Fourier Descriptors. The proposed method assigns to each of these descriptors a weighting factor that reflects its relative importance.

Keywords- Object recognition, Fourier descriptors, Genetic algorithm, image, noise.

978-0-7695-3879-2/09 $26.00 © 2009 IEEE DOI 10.1109/SoCPaR.2009.70

I. INTRODUCTION

Fourier descriptors [1, 2, 14], like moment descriptors [9], have been frequently used as features for image processing, remote sensing, shape recognition and classification. Fourier descriptors can provide characteristics of an object that uniquely represent its shape. Several techniques have been developed that derive invariant features from Fourier descriptors for object recognition and representation [1-5, 14]. These techniques are distinguished by their definition, such as the type of data exploited and the method for deriving invariant values from the image Fourier descriptors. Granlund [1] introduced Fourier descriptors using complex representation in 1972. This method ensures that a closed curve will correspond to any set of descriptors. Fourier descriptors have useful properties [3, 4]: they are invariant under similarity transformations such as translation, scaling and rotation. Objects that have undergone these kinds of transformations can be easily recognized by recognition algorithms that use Fourier descriptors as invariant features. For example, the use of Fourier descriptors of the boundary [11-13] for recognizing closed contours is proposed in [5]. However, despite its success in some applications, the approach has certain limitations. Occlusion is a severe shape distortion, and when the shape is distorted, Fourier descriptors do not work well for recognition [6-8].

This paper uses Fourier descriptors, in different combinations, for the recognition of objects captured by an imaging system that may transform the images, introduce noise, or produce occlusion. An extensive experimental study, similar to the one on moment invariants [9], has been made using various similarity measures in the recognition process. These measures include the Euclidean measure and the percentage error. A comparative study of the various cases has provided observations that may be useful for researchers as well as practitioners working on imaging and computer vision problems. Although the whole study has been made for bitmap images, it can be easily extended to gray-level images. From the analysis and results using Fourier descriptors, the following questions arise: What is the optimum number of descriptors to be used? Are these descriptors of equal importance? To answer these questions, the problem of selecting the best descriptors has been formulated as an optimization problem. A Genetic Algorithm has been mapped to this problem and used successfully to build an object recognition system that uses a minimal number of Fourier descriptors. The goal of the proposed optimization technique is to select the most helpful descriptors that will maximize the recognition rate. The proposed method assigns to each of these descriptors a weighting factor that reflects its relative importance.

The remainder of the paper is organized as follows. The acquisition of bitmap images and the extraction of their outlines are discussed in Sections II and III, respectively. Section IV deals with the study of Fourier descriptors. The concepts of similarity measures are explained in Section V. A detailed experimental study and analyses, for a simple approach, are presented in Section VI. Section VII deals with the proposed methodology together with observations made during the experimental study. Finally, Section VIII concludes the paper.
II. GETTING BITMAP IMAGE

A bitmap image of a character can be obtained by creating the character in a program such as Paint or Adobe Photoshop. Alternatively, an image drawn on paper can be scanned and stored as a bitmap. We used both methods. The quality of a bitmap image obtained directly from an electronic device depends on the resolution of the device, the type of image (e.g. bmp, jpeg, tiff), the number of bits selected to store the image, etc. The quality of a scanned image depends on factors such as the quality of the image on paper, the scanner, and the attributes set during scanning. Figure 1(a) shows the bitmap image of a character.

III. FINDING BOUNDARY

In order to find the boundary of a bitmap image, first its chain code is extracted [14, 15]. Chain codes are a notation for recording the list of edge points along a contour. The chain code specifies the direction of a contour at each edge in the edge list. From the chain-coded curve, the boundary of the image is found [16]. The selection of boundary points is based on their corner strength and contour fluctuations. The input to our boundary detection algorithm is a bitmap image. The algorithm returns the number of pieces in the image and, for each piece, the number of boundary points and the values of these boundary points $P_i = (x_i, y_i)$, $i = 1, 2, \ldots, N$. Figure 1(b) shows the detected boundary of the image of Figure 1(a).

Figure 1. (a) Bitmap image, (b) Outline of the image.

IV. FOURIER THEORY

To characterize objects we use features that remain invariant to translation, rotation and small modifications of the object's aspect. The invariant Fourier descriptors of the boundary [11-13] of the object can be used to identify an input shape, independent of the position or size of the shape in the image. Fourier transform theory has played a major role in image processing for many years. It is a commonly used tool in all types of signal processing and is defined for both one- and two-dimensional functions. In the scope of this research, the Fourier transform technique is used for shape description in the form of Fourier descriptors. The shape descriptors generated from the Fourier coefficients numerically describe shapes and are normalized to make them independent of translation, scale and rotation. Fourier transform theory can be applied in different ways for shape description. In this research, the procedure has been implemented in such a way that the boundary of the image is treated as lying in the complex plane, so the row and column co-ordinates of each point on the boundary can be expressed as a complex number. For details, the reader is referred to [5, 14].

V. SIMILARITY MEASURES

This paper implements two different simple classifiers that calculate different similarity measures of the corresponding Fourier descriptors of the input shape and each of the shapes contained in the database, as shown in Figure 2. The similarity measures attempted for the experimental studies are Euclidean Distance (ED) and Percentage Error (PE).

Figure 2. Pictorial Description of the method.

Given two sets of descriptors, how do we measure their degree of similarity? An appropriate classification is necessary if unknown shapes are to be compared to a library of known shapes. If two shapes, A and B, produce sets of values represented by a(i) and b(i), then the difference between them can be given as c(i) = a(i) - b(i). If a(i) and b(i) are identical then c(i) will be zero. If they are different then the magnitudes of the components of c(i) will give a reasonable measure of the difference. It proves more convenient to have one value to represent this rather than the set of values that make up c(i). The easiest way is to treat c(i) as a vector in a multi-dimensional space, in which case its length, which represents the distance between the objects, is given by the square root of the sum of the squares of the elements of c(i). The similarity measures attempted for the experimental studies are as follows:

1. Euclidean Distance (ED): $\sqrt{\sum_{i=1}^{n} \left( a(i) - b(i) \right)^{2}}$

2. Percentage Error (PE): $\sum_{i=1}^{n} \frac{a(i)}{b(i)}$

In this study, n is the number of FDs considered, a(i) is the ith FD of the template image, and b(i) is the ith FD of the test image. A tolerance threshold ρ is selected to decide whether a test object is recognized. The least value of the selected similarity measure is checked against this threshold.

VI. RESULTS AND ANALYSIS

The recognition system is tested by generating the test objects by translating, rotating, scaling, adding noise, and adding occlusion to the model objects contained in a database of different sizes. The test objects were randomly rotated, translated, and scaled. Some were considered without scaling relative to their model sizes. About 100 test objects were used for each of the experiments for testing similarity transformations. Salt & pepper noise [15-16] of different densities is added to the objects to generate the noisy test objects. A median filter was used in the experiment to filter the noise, so that noise remains only on the boundary of the object. Median filtering is a type of neighborhood processing that is particularly useful for removing 'salt and pepper' noise from an image. The median filter [15-17] considers each pixel in the image in turn and looks at its nearby neighbors to decide whether or not it is representative of its surroundings. Instead of simply replacing the pixel value with the mean of neighboring pixel values, it replaces it with the median of those values. The median is calculated by first sorting all the pixel values from the surrounding neighborhood into numerical order and then replacing the pixel being considered with the middle pixel value. As will be seen in the experiments, FDs are not promising for the recognition of occluded objects. Around 20% occlusion was added to the objects for these tests. We split the experiments into different categories, explained in the following paragraphs. The first series of experiments has been made to view results for different combinations of the Fourier Descriptors. Various experiments can be seen presenting different scenarios of the combination of Fourier Descriptors, similarity measures, and nature of data used.

For the results reported here, the test objects were generated by translating, rotating, scaling and adding noise to the model objects contained in a database of size 60. The test objects were randomly rotated, scaled and translated. Sixty test objects were used for each of the experiments testing similarity transformations, 16 test objects were used for noisy objects with similarity transformations, and 60 test objects were used for occluded objects. Salt & pepper noise of density 10% is added to the objects to generate the noisy test objects, and a median filter was used to filter the noise, so that noise remains only on the boundary of the object. The procedures taken to analyze and test the system are described in the numbered cases below.
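As a rough illustration of the recognition pipeline described in Sections IV and V, the sketch below computes descriptors from a closed boundary and matches them against templates using the Euclidean Distance and a threshold. It is not the authors' MATLAB implementation. The function names (`fourier_descriptors`, `euclidean_distance`, `recognize`) are hypothetical, and the normalization used here (dropping the zero-frequency term, taking magnitudes, and dividing by the largest magnitude) is one common choice assumed for illustration; the exact normalization used in the paper is the one given in [5, 14].

```python
import cmath
import math


def fourier_descriptors(boundary, n_fd):
    """Return n_fd normalised descriptor magnitudes for a closed boundary.

    boundary is a list of (x, y) points that do not all coincide;
    n_fd should be at most len(boundary) - 2.
    """
    z = [complex(x, y) for x, y in boundary]   # boundary in the complex plane
    n = len(z)
    mags = []
    for k in range(1, n):                      # skipping k = 0 removes translation
        coeff = sum(z[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        mags.append(abs(coeff))                # magnitude ignores rotation and start point
    scale = max(mags)                          # assumed normalisation: largest magnitude
    return [m / scale for m in mags[1:1 + n_fd]]   # ratios remove scale


def euclidean_distance(a, b):
    """Square root of the sum of squared differences (the ED measure)."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def recognize(test_fd, templates, threshold):
    """Return (name, distance) of the closest template.

    templates maps a name to a descriptor list. The name is None when even
    the smallest distance exceeds the threshold.
    """
    best_name, best_dist = None, float("inf")
    for name, fd in templates.items():
        d = euclidean_distance(test_fd, fd)
        if d < best_dist:
            best_name, best_dist = name, d
    if best_dist > threshold:
        return None, best_dist
    return best_name, best_dist
```

In this sketch a test object counts as recognized only when its smallest distance to any template falls within the threshold, which mirrors the role of the threshold ρ in Section V. The direct summation is quadratic in the number of boundary points; an FFT routine would normally replace it for long boundaries.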
1. The base case: The Fourier descriptors FD 1-6 (highlighted in Table I) are used as features and the Euclidean distance is used for comparison. The recognition rate recorded for similarity transformations alone is 83.33%. For similarity transformations with noise, it is 93.75%. It is worth noting that in the latter case only translation is considered as a similarity transformation, i.e., the test images are not rotated or scaled. The recognition rate for occluded objects is only 8.33%, which is very low. The Fourier Descriptors of an object were also examined under similarity transformations, noise and occlusion. It was found, by computation, that the Fourier descriptors do not change much under similarity transformations and noise. However, occlusion changed the values of the descriptors.

TABLE I. RECOGNITION RATES FOR DIFFERENT NUMBERS OF FOURIER DESCRIPTORS USING EUCLIDEAN DISTANCE.

Number of FDs Used   Transformations   Noise     Occlusion
4                    71.67%            75%       5%
6 (Base Case)        83.33%            93.75%    8.33%
11                   93.33%            93.75%    20%
18                   90%               93.75%    18.33%
22                   93.33%            93.75%    23.33%
29                   95%               93.75%    23.33%

TABLE II. RECOGNITION RATES FOR DIFFERENT NUMBERS OF FOURIER DESCRIPTORS USING PERCENTAGE OF ERRORS.

Number of FDs Used   Transformations   Noise     Occlusion
4                    70%               87.5%     8.33%
6 (Base Case)        80%               81.25%    11.67%
9                    86.67%            81.25%    13.33%
16                   75%               81.25%    8.33%
22                   68.33%            81.25%    6.67%
29                   68.33%            81.25%    11.67%
2. Fourier Descriptors and Euclidean distance: Experiments are made to obtain the recognition rates of transformed, noisy, or occluded images considering different numbers of FDs using Euclidean distance. The recognition rates for different numbers of FDs, ranging from 1 to 40, are computed. Some sample results are tabulated in Table I. For example, using 11 FDs improves the recognition rate of transformed images to 93.33% and improves the recognition rate of occluded images to 20%. From these results, it can be concluded that a good compromise between recognition performance and computational cost is achieved using 11 FDs. That is, increasing the number of FDs beyond 11 does not help much, as the maximum recognition rate obtained for transformed images using up to 40 FDs is 95%. Another observation is that the maximum recognition rate is achieved by using 29 FDs. Thus, using more FDs does not improve the recognition performance further.

3. Fourier Descriptors and percentage of errors: The recognition rates of transformed, noisy, or occluded images using the sum of percentage of error (PE) have also been obtained. The recognition rates for different numbers of FDs, ranging from 1 to 40, are summarized. Some of these numerical results are tabulated in Table II. It can be seen that using PE with FDs results in less efficient performance than using ED. Moreover, increasing the number of FDs does not necessarily guarantee a better performance. From Table II, it can also be observed that the best recognition rate of transformed images is achieved using 9 FDs. A larger number of FDs, however, gives rise to a lower performance.

VII. PROPOSED TECHNIQUE

One of the most important tasks in object recognition is deciding how many descriptors to use for a given object. The questions that arise are: What is the optimum number of descriptors to be used to achieve the maximum recognition rate? Are the descriptors of equal importance? These questions motivate selecting the best descriptors by applying an optimization technique, so that only the best descriptors are used in the recognition process. We have used an evolutionary optimization technique, the Genetic Algorithm (GA), for this purpose. The pictorial representation of the proposed technique is shown in Figure 3. It is clear from this figure that the optimized weights are used with FDs in the process of recognition. These weights, obtained from GA, show the relative importance of the descriptors.

Figure 3. Pictorial Description of the proposed approach.

A. Optimization of the Feature Vector using GA

The problem of selecting the best descriptors can be formulated as an optimization problem. The goal of the optimization is to select the most helpful descriptors that will maximize the recognition rate and to assign to each of these descriptors a weighting factor that reflects its relative importance. Since the problem of selecting the best descriptors can be formulated as an optimization problem, one needs to define an objective function. The objective function, in this case, is made up of the following two terms:
• the recognition rate,
• the number of useful descriptors.
In other words, it is required to maximize the recognition rate using the minimum number of descriptors. Our proposed technique works as follows:

i. Initialization: The first step of the Genetic Algorithm in the optimization process is initialization. In this step various parameters are initialized to their desired values. In our simulation we have set the following parameters.
• The bias is set between 0 and 0.2.
• The number of iterations is 40, but stopping criteria are also set. If the stopping criteria are met during the specified iterations the simulation ends; otherwise it runs for the specified number of iterations.
• The number of trials is between 10 and 20.
• The stopping criteria depend on the hits used in the simulation.

ii. Evaluate fitness function: The second step of the Genetic Algorithm is to evaluate the fitness of each individual. In our case the individuals (referred to as particles) are sets of weights, so we compute the fitness of each particle. The fitness of each particle shows the relative importance of those weights, and is given by $f = 1 - \min(PE)$, where f is the fitness and PE is the percentage of errors over all the training images for a given set of weights.

iii. Check stopping criteria: In the third step the stopping criteria are checked. If they are met, the best generation has been obtained and the GA terminates; otherwise, go to step iv.

iv. Crossover: Crossover is the most important operator of GA. Crossover is applied to the previous population. After applying crossover we get a new population, and the hit ratio is checked again. If the current hit ratio is better than the previous one, the population is replaced with the newly generated population; otherwise step iv is repeated until the best hit ratio is achieved.

v. Mutation: If the stopping criteria are met then terminate; otherwise apply mutation to the newly created population and check for hits.

vi. Best hit ratio: If the current hit ratio is better than the previous one, the population is replaced with the newly generated population; otherwise step vi is repeated until the best hit ratio is achieved.

vii. Termination: Check the stopping criteria again. If they are met then terminate; otherwise start searching from the beginning.
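The steps above can be sketched in code. The following is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the weights scale the squared differences inside a Euclidean distance, the fitness is taken to be the hit ratio (the fraction of training samples whose nearest template carries the correct label), crossover is single-point, mutation is a small Gaussian perturbation clipped to [0, 1], and the fitter half of the population is always kept. The names `weighted_distance`, `hit_ratio` and `optimise_weights`, and the parameters other than the 40 iterations, are hypothetical. The bias parameter mentioned in step i is not modelled.

```python
import math
import random


def weighted_distance(a, b, w):
    """Euclidean distance with one non-negative weight per descriptor."""
    return math.sqrt(sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w)))


def hit_ratio(weights, templates, samples):
    """Fraction of (label, descriptors) samples whose nearest template is correct."""
    hits = 0
    for label, fd in samples:
        best = min(templates,
                   key=lambda name: weighted_distance(fd, templates[name], weights))
        hits += (best == label)
    return hits / len(samples)


def optimise_weights(templates, samples, n_fd, pop_size=20, iterations=40,
                     target=1.0, mutation_rate=0.2, seed=0):
    """Search for descriptor weights; needs n_fd >= 2 and pop_size >= 4."""
    rng = random.Random(seed)

    def fitness(w):
        return hit_ratio(w, templates, samples)

    # Start from the un-weighted case plus random weight vectors in [0, 1).
    population = [[1.0] * n_fd]
    population += [[rng.random() for _ in range(n_fd)] for _ in range(pop_size - 1)]

    for _ in range(iterations):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) >= target:       # stopping criterion
            break
        parents = population[:pop_size // 2]       # selection biased to the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, n_fd)           # single-point crossover
            child = p1[:cut] + p2[cut:]
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                     if rng.random() < mutation_rate else g
                     for g in child]               # mutation, clipped to [0, 1]
            children.append(child)
        population = parents + children            # parents survive, so the best is kept

    best = max(population, key=fitness)
    return best, fitness(best)
```

Because the un-weighted vector is part of the initial population and the fitter half always survives, the returned hit ratio in this sketch can never fall below that of the un-weighted descriptors on the training samples. A weight that ends up near zero suggests that the corresponding descriptor contributes little to recognition.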
B. Genetic Algorithm

A population-based optimization approach, the Genetic Algorithm (GA), has been used. GA was first introduced in the 1960s by John Holland [18]. GA belongs to the family of evolutionary algorithms (EAs). Evolutionary algorithms are general-purpose stochastic methods simulating natural selection and evolution in the biological world. Like PSO [19], and unlike single-solution methods such as Simulated Annealing, GA maintains a population of potential solutions to the problem, and not just one solution.

Generally, GA works as follows: a population of individuals is initialized, where each individual represents a potential solution to the problem. The quality of each solution is evaluated using a fitness function. A selection process is applied during each iteration of GA in order to form a new population. The selection process is biased towards the fitter individuals to ensure that they will be part of the new population. Individuals are altered using the two main operators of GA, crossover and mutation. This procedure is repeated until the potential solution is reached. The best solution found is expected to be a near-optimum solution. For more detail about the general GA and about the steps and operators used in GA, the reader is referred to [18]. Here is the algorithm using GA:

Initialization:
  Set stopping criteria
  Set no. of iterations (stopping criteria)
  Set no. of trials
  Set threshold to compute goodness
1. Load database of descriptors.
2. Initialize an array of particles with random values as initial weights.
3. Evaluate goodness and check the stopping criteria.
4. If the stopping criteria are not met, then apply crossover on the weights and check for hits. If the current hits are better than the previous ones, replace the previous population with the new one and store the hits.
5. Repeat for all children after crossover.
6. Check the stopping criteria.
7. Apply mutation on each particle and check for hits. If the current hits are better than the previous ones, replace the previous population with the new one.
8. Check the stopping criteria.
9. If the best hit meets the stopping criteria then stop. Else
10. Repeat steps 3 to 9.
11. If the stopping criteria are not met for the number of trials then go to step 2.
End

C. Optimized weights used for Test Results using GA

The proposed GA-based approach was implemented using MATLAB 7.1. In our implementation, the bias is set between 0 and 0.2, the number of iterations is set to 40, and in each iteration we used 10 to 20 trials. A stopping criterion is also set and checked during these iterations; the search process stops early if the stopping criterion is met within the specified number of iterations. Table III shows the computed optimized weights for different numbers of Fourier descriptors. It also reports the number of optimized weights used for different numbers of Fourier descriptors and the recognition rate achieved. In the first experiment, when a database of 60 transformed objects was considered (Table III), one can see much better recognition results than with the un-weighted FDs in Table I. Our proposed approach recognizes objects 100% in some of the cases. In the second experiment, if we consider 11 descriptors then, after optimization, about 98% of the objects are recognized, which is better than using 11 un-weighted FDs. Experiment 3, with 6 FDs, shows generally much better results than using 6 un-weighted FDs. In the case of noisy objects, one can see much better recognition results than in Table I, i.e., objects are recognized 100% after optimization. However, in experiment 4, using occluded images, recognition improves up to 32%, which is much better than in Table I. Experiment 5, for a mixed data set of transformed, noisy and occluded objects, produced a recognition rate of 93.33%. Such an achievement has not been obtained by any number of un-weighted FDs.

TABLE III. OPTIMIZED WEIGHTS FOR DIFFERENT NUMBERS OF FOURIER DESCRIPTORS.

Experiment No.               1        2        3        4        5
Training set*                X        X        X        O        X, O, N
No. of FDs considered        11       11       6        6        11
Optimized weights obtained   0.19     0.149    0.135    0.116    0.141
                             0.21     0.1489   0.528    0.1457   0.7
                             0.2      0.1488   0.415    0.0841   0.0118
                             0.771    0.858    0.924    0.4544   0.0675
                             0.897    0.941    0.942    0.4418   0.277
                             0.96     0.7027   0.935    0.3533   0.4099
                             0.864    0.5466   -        -        0.7199
                             0.88     -        -        -        0.7568
                             0.7313   -        -        -        0.1592
                             0.1939   -        -        -        0.6508
                             0.9048   -        -        -        0.411
No. of FDs used              11       7        6        6        11
Recognition rate (%), X      100      98.33    96.67    93.33    93.33
Recognition rate (%), N      100      100      100      100      93.75
Recognition rate (%), O      26.67    23.33    21.67    20       16.67
* X = transformed objects, O = occluded objects, N = noisy objects.

TABLE IV. DIFFERENT NUMBERS OF FOURIER DESCRIPTORS AND THE RECOGNITION RATE USING PERCENTAGE ERROR (PE).

Experiment No.               1        2        3        4        5
Training set*                X        X        X        O        X, O, N
No. of FDs used              11       7        6        6        11
Recognition rate (%), X      80       80       75       78.33    72
Recognition rate (%), N      87.5     87.5     87.5     87.5     81.25
Recognition rate (%), O      15       7        11.67    13.33    16.67
* X = transformed objects, O = occluded objects, N = noisy objects.

Table IV shows the recognition results for different numbers of descriptors using Percentage Error. In this case a database of 60 objects is considered, and we can see that the results obtained from weighted descriptors are much better than those from the un-weighted descriptors in Table II.

VIII. CONCLUSION

This work reports a practical study of Fourier descriptors applied to object recognition. The implementation was done on a laptop using MATLAB 7.1. The results vary depending on the number of FDs selected, the similarity transformations, noise, occlusion, and data size. The variety of similarity measures and the different combinations of FD features used in the process make a difference to the recognition rate. The results have been tested using up to 40 FDs and databases of different sizes. Different combinations of these parameters gave different results. The two similarity measures, ED and PE, provided different recognition results. The images used are all bitmapped images; further investigations are being done with more complex images. It can be seen that using PE with FDs results in less efficient performance than using ED. Moreover, increasing the number of FDs does not necessarily guarantee a better performance. The images that should have been recognized, but were not recognized by most of the FD combinations, are to be analyzed further. This leads to the theory of optimization, to find out the features or attributes in the image that made it difficult to recognize. The methodology of GA has been utilized successfully for this purpose. Using GA to find the most suitable descriptors and to assign weights to these descriptors dramatically improved the recognition rate using the least number of descriptors.

ACKNOWLEDGMENT

The authors are grateful for the constructive comments of the anonymous reviewers towards the improvement of the paper.

REFERENCES

[1] G.H. Granlund, Fourier Preprocessing for hand print character recognition, IEEE Trans. Computers, Vol. C-21, 1972, pp. 195-201.
[2] A Project led by Julien Boeuf and Pascal Belin, and supervised by Henri Maître: http://www.tsi.enst.fr/tsi/enseignement/ressources/mti/descript_fourier/index.html.
[3] O. Betrand, R. Queval, H. Maître, Shape Interpolation by Fourier Descriptors with Application to Animation Graphics, Signal Processing, June 1981, 4:53-58.
[4] H. Maître, Le traitement des images, ENST, December 2000, pp. 70-72.
[5] C.T. Zahn, R.Z. Roskies, Fourier descriptors for plane closed curves, IEEE Trans. Comput. 21 (1972) 269-281.
[6] Thomas Bernier, Jacques-Andre Landry, A new method for representing and matching shapes of natural objects, Pattern Recognition 36 (2003), 1711-1723.
[7] N. Ansari, E.J. Delp, Partial Shape Recognition: a landmark based approach, IEEE Trans. PAMI 12 (1990), 470-483.
[8] J. Zhang, X. Zhang, H. Krim, G.G. Walter, Object representation and recognition in shape spaces, Pattern Recognition 36(5), 2003, pp. 1143-1154.
[9] M. Sarfraz, Object Recognition using Moments: Some Experiments and Observations: Geometric Modeling and Imaging – New Advances, Sarfraz, M. and Banissi, E. (Eds.), ISBN-10: 0-7695-2604-7, IEEE Computer Society, USA, 2006, pp. 189-194.
[10] John W. Gorman, O. Robert Mitchell, Frank P. Kuhl, Partial shape recognition using dynamic programming, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 10(2), March 1988.
[11] G. Avrahami and V. Pratt, Sub-pixel edge detection in character digitization, Raster Imaging and Digital Typography II, pp. 54-64, 1991.
[12] Hou Z. J., Wei G. W., A new approach to edge detection, Pattern Recognition Vol. 35, pp. 1559-1570, 2002.
[13] N. Richard, T. Gilbert, Extraction of Dominant Points by estimation of the contour fluctuations, Pattern Recognition Vol. 35, pp. 1447-1462, 2002.
[14] M. Sarfraz, Object Recognition using Fourier Descriptors: Some Experiments and Observations: Computer Graphics, Imaging and Visualization – Techniques and Applications, Banissi, E., Sarfraz, M., Huang, M. L., and Wu, Q. (Eds.), ISBN: 0-7695-2606-3, IEEE Computer Society, USA, 2006, pp. 281-286.
[15] Rafael Gonzalez, Richard Woods and Steven Eddins, Digital Image Processing Using Matlab, Prentice Hall, 2003.
[16] R. Jain, R. Kasturi, B. Schunk, Machine Vision, McGraw Hill, 1995.
[17] http://www.cee.hw.ac.uk/hipr/html/median.html.
[18] Melanie Mitchell, An introduction to genetic algorithms, The MIT Press, 1998.
[19] R. Eberhart and J. Kennedy, A new optimizer using particle swarm theory, Proc. the Sixth Intl. Symposium on Micro Machine and Human Science, MHS '95, 4-6 Oct 1995, pp. 39-43.