Grading Textured Surfaces with Automated Soft Clustering in a Supervised SOM

J. Martín-Herrero, M. Ferreiro-Armán, J. L. Alba-Castro

Artificial Vision Team, Department of Signal Theory and Communications, University of Vigo, E36200 Vigo, Spain [email protected], {mferreir, jalba}@gts.tsc.uvigo.es

Abstract. We present a method for the automated grading of texture samples. The sample is graded on the basis of a sequential scan of overlapping blocks, whose texture is classified using a soft-partitioned SOM in which the soft clusters have been generated automatically from a labelled training set. The method was devised as an alternative to the manual selection of hard clusters in a SOM for machine vision inspection of tuna meat. We take advantage of the sequential scan of the sample to perform a sub-optimal search in the SOM for the classification of the blocks, which allows real-time implementation.

1 Introduction

Grading of textured surfaces is a common task in machine vision applications for quality control (QC) inspection. Samples of a textured surface are presented to the system, which has to determine the degree of compliance of the surface with a standard, usually set by subjective knowledge about the expected appearance of the material in question. Texture grading therefore generally requires classifiers able to capture the knowledge of the human QC operators about the material, which usually can only be expressed through the labelling of samples or sub-samples. Industrial machine vision applications for quality control and assurance often have the additional constraint of speed: decisions about the sample being processed have to be taken in real time, usually on high-speed production lines, which, on the other hand, is the major factor justifying the use of automated QC systems.

Several recent works have shown the suitability of Self-Organizing Feature Maps (SOM) [1] for machine vision applications [2-4]. A SOM maps a high-dimensional space of feature vectors into a lower-dimensional map while preserving neighbourhood relations, such that feature vectors which are close together in the input space are also close to each other in the lower-dimensional output space, and feature vectors which are far apart in the input space are assigned separate nodes in the output map. Thus, a two-dimensional SOM can be used to visualize the input space to some extent, allowing the QC operator to define connected neighbourhoods in the output map that correspond to arbitrarily shaped connected regions in the higher-dimensional input space. This is achieved by assigning representative samples to each node in the map, which is presented to the human operator as a map of textures. Provided that the underlying feature vector adequately describes the texture of the samples for the problem at hand, the QC operator can then use his knowledge to define labelled neighbourhoods in the map that will be used for the classification of the samples in the production line (see, for instance, [4], [5]).

However, this use of the SOM has two drawbacks. The first is speed: every feature vector has to be compared with every node in the map to find the winner node (best matching unit, BMU), and the vector is then classified as belonging to the labelled neighbourhood where the BMU is located. This involves the computation of one Euclidean distance or dot product per node. Several methods have been proposed to speed up the winner search [6-8], but their suitability depends on the dimension of the feature vector, the size of the map, the required accuracy in the selection of the winner, and the shape of the surface generated by the distances from the sample feature vector to the feature vectors of the map nodes.

The second drawback is due to the interaction between the QC operator and the SOM visualized as an ordered set of samples from the training set. The training sample closest to each node in the SOM is chosen as its representative to build the visual map. We have detected several sources of problems during the field test phase of a prototype using this technique:

1. Quite different samples may correspond to the same node, so that a single sample will not be a good representative of every sample assigned to that node.
2. Some nodes may exist that are not the BMU for any sample in the training set; the sample closest to them actually belongs to another node and cannot be used as their representative, so they have no representative in the map, where a black hole thus appears.
3. Even when a good map is obtained through careful design and training such that 1 and 2 are minimized or prevented, the operator may (and we have observed them do it) pay attention to irrelevant features of the representatives which do not correspond to the actual factors affecting their classification. The operator may thus define classification neighbourhoods in the map which do not agree with the features for which it was designed, so that the SOM interface, devised for better interaction with the human operator, may in fact hinder the tuning of the system by the operator.

We have devised a method to automatically define the neighbourhoods in the SOM as soft clusters, thus eliminating the problems due to the subjective interpretation of the visual map, while still profiting from the dimensionality reduction and topology preservation inherent to the SOM to achieve better speed performance. Section 2 explains the method and Section 3 describes the results obtained with its implementation in a machine vision system for quality inspection of tuna meat.
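To make the cost argument above concrete, the following is a minimal sketch of the exhaustive winner search (one Euclidean distance per node). The names and array shapes are illustrative only and do not come from the original system; the 12×12 map of 7-dimensional reference vectors merely mirrors the configuration used later in Section 3.

```python
import numpy as np

def exhaustive_bmu(som_weights, v):
    """Exhaustive winner search: compute one (squared) Euclidean distance per
    node of the map (array of shape (rows, cols, d)) and return the map
    coordinates of the best matching unit (BMU) for feature vector v."""
    dist2 = np.sum((som_weights - v) ** 2, axis=-1)
    return np.unravel_index(np.argmin(dist2), dist2.shape)

# Illustrative usage: a random 12x12 map of 7-dimensional reference vectors.
rng = np.random.default_rng(0)
som_weights = rng.random((12, 12, 7))
print(exhaustive_bmu(som_weights, rng.random(7)))
```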

2 Method

The texture sample under inspection is divided into blocks with horizontal and vertical overlap, whose size is determined by the scale of the texture patterns of interest. From each block, a feature vector is extracted which suitably describes the texture for the problem at hand. Feature distribution vectors (histograms) of local texture measures have been shown to be suitable feature vectors for this kind of application [9].
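As an illustration of the block decomposition, a small sketch follows; the function and parameter names are ours, not the paper's, and the numbers in the usage example simply reuse those given in Section 3.

```python
import numpy as np

def overlapping_blocks(image, block_size, overlap):
    """Yield square blocks of side `block_size`, scanned left to right and top
    to bottom, with the given fractional overlap (0 <= overlap < 1)."""
    step = max(1, int(round(block_size * (1.0 - overlap))))
    rows, cols = image.shape[:2]
    for y in range(0, rows - block_size + 1, step):
        for x in range(0, cols - block_size + 1, step):
            yield image[y:y + block_size, x:x + block_size]

# Example with the numbers from Section 3: 512-pixel images, 50x50 blocks,
# 60% overlap.  In the real system only blocks falling inside the round can
# would be kept, which is presumably how the average of ~300 blocks per can
# reported there arises.
image = np.zeros((512, 512))
print(sum(1 for _ in overlapping_blocks(image, 50, 0.6)))
```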

The feature vectors of the blocks define the input or feature space, F ⊂ R^d, which will be mapped into a subspace, O ⊂ R^2, by a SOM, Ψ_{F→O}, which, if correctly trained, will preserve the intrinsic topology of F. For that purpose, a sufficient number of representative blocks first has to be extracted from samples and labelled by trained operators to make up the training set. Each block is labelled as belonging to one of a number of classes, C, ranging from "bad quality" to "good quality". The size of the training set has to be adequate for the size of the SOM, and the size of the SOM depends on the shape of the feature space within R^d and on the number of classes, such that enough separability between classes can be warranted in the resulting map. This is usually tuned via repeated trials with different configurations. After training, the SOM consists of an ordered two-dimensional set of nodes, each one characterized by its feature vector, n_ij ∈ R^d, thus mapping F into R^2. Note that the labels of the training set do not play any role during the training of the SOM.

In the classifying phase, to classify a block we extract its feature vector, v, and locate the BMU, the n_ij minimizing d(n_ij, v), where d denotes the Euclidean distance. The block is then said to correspond to node ij and, if labelled neighbourhoods have been defined in the map, it is assigned the same label as the neighbourhood to which node ij belongs.

2.1 Automated Clustering of the SOM

To automatically define labelled neighbourhoods in the SOM, we classify every block in the training set and record, for every node and each class, the number of times that it has been the BMU for a block of that class, w_ij^c. It is not rare for blocks of different classes to be assigned to the same node in the map, so that w_ij^c may be greater than zero for several classes c and the same node ij. This is what drove us to use soft clusters to label the neighbourhoods in O. For every node ij in the map, a sensitivity function is defined for every class c with w_ij^c > 0, as:

S_{ij}^c(x, y) = \log_{1.01}\left(w_{ij}^c\right) \, \frac{e^{-2\left((i-x)^2 + (j-y)^2 - d_{\max}\right)} - 1}{e^{2 d_{\max}} - 1}    (1)
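As a concrete reading of (1), the following sketch evaluates the sensitivity of one node at one map position. The helper names are ours; `w_counts[i, j, c]` is assumed to hold w_ij^c, and d_max is treated as the squared map distance at which the sensitivity vanishes, which is how the reconstructed formula behaves.

```python
import numpy as np

def sensitivity(w_counts, c, i, j, x, y, d_max):
    """Sensitivity S_ij^c(x, y) of node (i, j) for class c, evaluated at map
    position (x, y), following our reading of Eq. (1)."""
    w = w_counts[i, j, c]          # times node (i, j) was the BMU for class c
    if w <= 0:
        return 0.0                 # Eq. (1) is only defined for w_ij^c > 0
    radial = (i - x) ** 2 + (j - y) ** 2
    # Log base 1.01 of the win count, modulated by a factor that equals 1 at
    # the node itself and decays to 0 when the squared map distance reaches
    # d_max.
    return (np.log(w) / np.log(1.01)) * \
           (np.exp(-2.0 * (radial - d_max)) - 1.0) / (np.exp(2.0 * d_max) - 1.0)
```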

Then every node is assigned a vector p_uv ∈ R^C, whose c-th component is the membership of the node to class c, obtained from:

p_{uv}^c = \frac{\left( \sum_i \sum_j S_{ij}^c(u, v) \right)^2}{\sum_{k=0}^{C-1} \left( \sum_i \sum_j S_{ij}^k(u, v) \right)^2}, \qquad c = 0, \ldots, C-1.    (2)

This allows soft classification of the blocks: once the BMU for a block has been found, the block contributes its corresponding memberships to each quality degree, or class, to the final grading result of the sample. The final grading of the sample is thus the result of adding up the partial classifications of all the blocks extracted from it.
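A hedged sketch of (2) and of the grading step follows; `sensitivity` is the helper sketched after Eq. (1), and the remaining names are ours rather than the paper's.

```python
import numpy as np

def node_memberships(w_counts, u, v, d_max):
    """Membership vector p_uv over the C classes for node (u, v), following
    our reading of Eq. (2): squared class-wise sums of sensitivities,
    normalised so that the memberships add up to one."""
    rows, cols, n_classes = w_counts.shape
    totals = np.array([
        sum(sensitivity(w_counts, c, i, j, u, v, d_max)
            for i in range(rows) for j in range(cols)) ** 2
        for c in range(n_classes)
    ])
    return totals / totals.sum()

def grade_sample(block_features, som_weights, memberships):
    """Grade a whole sample by accumulating, over all of its blocks, the
    membership vector of each block's BMU (here found by exhaustive search;
    `memberships` has shape (rows, cols, C), one vector per node)."""
    grade = np.zeros(memberships.shape[-1])
    for v in block_features:
        dist2 = np.sum((som_weights - v) ** 2, axis=-1)
        bmu = np.unravel_index(np.argmin(dist2), dist2.shape)
        grade += memberships[bmu]
    return grade
```

In practice the membership vectors p_uv would be precomputed once per node after training, so grading a sample only requires one BMU search and C additions per block.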

2.2 Fast Sub-optimal Winner Search

The most common way of finding the BMU in the map is exhaustive winner search. However, exhaustive winner search may take too long for real-time applications. To reduce the BMU search time, we propose a sub-optimal BMU search based on the topology-preserving property of the SOM. Overlap and spatial correlation in the image cause the feature vectors of adjacent image blocks to be close in the feature space, and, provided that the SOM preserves the topology of the feature space F, they should also be close to each other in O. SOM training is designed to obtain a topology-preserving map Ψ_{F→O} between F ⊂ R^d and O ⊂ R^2, which means that adjacent reference vectors n_ij in F belong to adjacent nodes ij in O. When this condition holds, we can perform a BMU search that takes advantage of the image scanning to search only in a neighbourhood of the last winning node.

To ensure good results, it is necessary to quantify the topology preservation of the SOM. Probably the most referenced measure is the topographic product, P [10], such that P < 0 if dim(O) < dim(F); P → 0 if dim(O) = dim(F); and P > 0 if dim(O) > dim(F). A P value close to 0 therefore indicates that Ψ_{F→O} is able to map neighbouring feature vectors to neighbouring output nodes.

We thus take advantage of the sequential scanning of the blocks in the image to restrict the BMU search for each block to the neighbourhood of the BMU of the previous block. The neighbourhood search describes a spiral around the previous BMU, and the BMU for the current block is the nearest node to the block's feature vector within the given neighbourhood. The size of the neighbourhood, and thus the performance of the search, is given by the maximum search distance. Error curves graphing the distance between the real BMU (obtained through exhaustive search) and the approximate BMU for a set of blocks provide the grounds for choosing the maximum search distance.

Care has to be taken to reset the search from time to time in order not to accumulate excessive error due to the use of approximate BMUs as starting points for the search. The optimal moment to reset the search is at the beginning of each scan line of blocks in the image. Consequently, the search for the first block in each scan line is exhaustive, which guarantees that the accumulated error is eliminated. In any case, the first block in a scan line and the previous block (the last block in the previous line) are not adjacent in the image, so the assumption that the BMU of the former lies in the neighbourhood of the BMU of the latter does not hold, and the exhaustive search is compulsory there anyway.

The ratio of the neighbourhood search area to the whole of O gives the gain in time of the neighbourhood search with respect to the exhaustive search. There is no overhead due to the spiral search, because it can be performed at the same computational cost as the usual raster scan. This is achieved by using a displacement vector ∆v = (∆v_x, ∆v_y) ∈ O, which is rotated 90° at steps s_n = s_{n-1} + [n/2], where n is the step number and s_0 = 1 ([·] denotes the integer part). Rotating ∆v is achieved through ∆v′ = (∆v_y, −∆v_x). Thus, a single loop is enough to perform the entire search.
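A sketch of the spiral winner search follows, under the assumption that the ∆v-rotation rule above traces the usual outward square spiral; the bound expressed through `max_radius` and all names are ours, not taken from the original implementation.

```python
import numpy as np

def spiral_bmu(som_weights, v, start, max_radius):
    """Sub-optimal winner search: visit map nodes along a square spiral
    centred on `start` (the previous block's BMU) and return the best
    matching unit among the visited nodes.  A max_radius of 3 covers a 7x7
    neighbourhood (49 nodes); a max_radius of 4 covers 9x9 (81 nodes)."""
    rows, cols, _ = som_weights.shape
    best, best_d2 = start, np.inf
    y, x = start
    dy, dx = 0, 1                        # displacement vector, delta v
    seg_len, visited = 1, 0
    max_nodes = (2 * max_radius + 1) ** 2
    while True:
        for _ in range(2):               # two straight segments per spiral ring
            for _ in range(seg_len):
                if 0 <= y < rows and 0 <= x < cols:
                    d2 = float(np.sum((som_weights[y, x] - v) ** 2))
                    if d2 < best_d2:
                        best, best_d2 = (y, x), d2
                visited += 1
                if visited >= max_nodes:
                    return best
                y, x = y + dy, x + dx
            # Rotate delta v by 90 degrees: (dx, dy) -> (dy, -dx), as in the text.
            dy, dx = -dx, dy
        seg_len += 1
```

In the scanning loop, the first block of each line would use the exhaustive search, and every subsequent block would call `spiral_bmu` with the previous block's BMU as `start`.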

3 Experimental results

We implemented the method on an industrial machine vision system for quality inspection of tuna meat that used a SOM for interfacing with the QC operator [5], [11]. The samples under inspection are the contents of tuna cans. The cans are round, and the images are acquired such that the cans have a diameter of 512 pixels. The blocks fed to the SOM are 50×50 pixels, with 60% horizontal and vertical overlap, totalling an average of 300 blocks per can. The feature vectors consist of a mixture of local binary patterns (LBP) [12], the entropy of the co-occurrence matrix [13], and a measure of correlation between different affine transformations of the block, such that the dimension of the input space is d = 7. We used a 12×12 SOM, trained with 1000 training blocks belonging to three classes, C = 3, corresponding to "Good quality", "Medium quality", and "Bad quality". We used the usual training method [1]. The typical topographic product obtained was P ∈ [-0.0080, -0.0001], which means that the map achieved satisfactory neighbourhood preservation and allowed the fast BMU search described in Section 2.2.
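For completeness, the following is a sketch (ours, not the authors' implementation) of how the topographic product reported above can be computed from a trained map, following the standard definition in [10]; `weights` is assumed to hold the reference vectors n_ij as an array of shape (rows, cols, d), with no two nodes sharing identical reference vectors.

```python
import numpy as np

def topographic_product(weights):
    """Topographic product P for a 2-D SOM whose reference vectors are given
    as an array of shape (rows, cols, d), following Bauer et al. [10]."""
    rows, cols, d = weights.shape
    coords = np.array([(i, j) for i in range(rows) for j in range(cols)], float)
    w = weights.reshape(-1, d)
    n = len(w)
    # Pairwise distances in the output (map) space and in the input space.
    d_out = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    d_in = np.linalg.norm(w[:, None, :] - w[None, :, :], axis=-1)
    log_p = 0.0
    for j in range(n):
        # k-th nearest neighbours of node j in map space and in feature space,
        # excluding the node itself.
        nn_out = np.argsort(d_out[j])[1:]
        nn_in = np.argsort(d_in[j])[1:]
        q1 = d_in[j, nn_out] / d_in[j, nn_in]    # Q1(j, k)
        q2 = d_out[j, nn_out] / d_out[j, nn_in]  # Q2(j, k)
        cum = np.cumsum(np.log(q1) + np.log(q2)) # sum over l = 1..k
        k = np.arange(1, n)
        log_p += np.sum(cum / (2.0 * k))
    return log_p / (n * (n - 1))
```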

Fig. 1. A SOM of tuna meat. Note the black nodes which did not win for any block in the training set.

Figure 1 shows a typical map in which each node is represented by the closest block in the training set. Figure 2 shows the membership of every node in O to each of the three classes (Figures 2(a), 2(b) and 2(c)). The figures show that the darkest areas of each class occupy a different region of the map, and that a certain degree of overlap exists between the classes, which is accounted for by the soft classification scheme allowed by the use of soft memberships.


Fig. 2. Membership of every node in the map to each of the three classes: a) good quality tuna, b) medium quality tuna, c) bad quality tuna. Darker areas indicate a higher membership.

Next, we studied the spatial correlation between blocks in the image and their corresponding BMUs in O. Figure 3 shows three curves for three kinds of samples (tuna cans): generally good looking, generally medium looking, and generally bad looking. As expected, bad looking tuna shows more disorder, and thus neighbouring blocks are less similar than in good looking cans. In any case, the shape of the three curves supports our statement that neighbouring blocks in the image have their corresponding BMUs in nearby locations of the SOM, and thus the fast search method can be applied at low cost.

To evaluate this cost, we produced Figure 4, where we can see the error rates for the different classes and the average error distance in the map. The bars show, for each class, the percentage of blocks (N = 6000 blocks) which were assigned the wrong BMU due to the sub-optimal search, i.e. a BMU different from that found by exhaustive search. The line (secondary axis) shows the average distance in O between the approximate BMU and the real BMU. If we take a search neighbourhood of 49 nodes around the previous BMU (about 34% of the 144 nodes in the map), the sub-optimal search requires just 34% of the time needed by an exhaustive search, and we get an average error rate of about 7%. We can reduce this average error rate to 3% by increasing the search neighbourhood to 81 nodes; however, the sub-optimal search then takes 56% of the time required by an exhaustive search.

Field tests in which the cans thus graded were compared with the grading provided by QC operators showed that the level of performance of the system was maintained despite the time gain achieved and the automated generation of the map of classes. The cost of the improvement falls on the training phase, which now requires a labelled training set that has to be generated in collaboration with the QC operators.

Fig. 3. Correlation between distances between blocks in the image sample and distances between the corresponding BMU in the map for different types of cans. 90% percentiles are also shown.

Fig. 4. Error rates for each of the three classes due to the sub-optimal search. The bars show the percentage of blocks for which the approximate BMU and the real BMU (exhaustive search) differed. The line shows the average distance in the map between the approximate BMU and the real BMU (N = 6000 blocks).

References

1. Kohonen, T.: Self-Organizing Maps. Springer-Verlag, Berlin (1997)
2. Niskanen, M., Kauppinen, H., Silvén, O.: Real-time Aspects of SOM-based Visual Surface Inspection. Proceedings of SPIE, Vol. 4664 (2002) 123-134
3. Niskanen, M., Silvén, O., Kauppinen, H.: Experiments with SOM Based Inspection of Wood. International Conference on Quality Control by Artificial Vision (QCAV2001), Vol. 2 (2001) 311-316
4. Kauppinen, H., Silvén, O., Piirainen, T.: Self-organizing Map Based User Interface for Visual Surface Inspection. 11th Scandinavian Conference on Image Analysis (SCIA99) (1999) 801-808
5. Martín-Herrero, J., Ferreiro-Armán, M., Alba-Castro, J. L.: A SOFM Improves a Real Time Quality Assurance Machine Vision System. Accepted for International Conference on Pattern Recognition (ICPR04) (2004)
6. Cheung, E. S. H., Constantinides, A. G.: Fast Nearest Neighbour Algorithms for Self-Organising Map and Vector Quantisation. 27th Asilomar Conference on Signals, Systems and Computers, Vol. 2 (1993) 946-950
7. Kaski, S.: Fast Winner Search for SOM-Based Monitoring and Retrieval of High-Dimensional Data. 9th Conference on Artificial Neural Networks, Vol. 2 (1999) 940-945
8. Kushilevitz, E., Ostrovsky, R., Rabani, Y.: Efficient Search for Approximate Nearest Neighbor in High Dimensional Spaces. 30th ACM Symposium on Theory of Computing (1998) 614-623
9. Ojala, T., Pietikäinen, M., Harwood, D.: Performance Evaluation of Texture Measures with Classification Based on Kullback Discrimination of Distributions. Proc. 12th International Conference on Pattern Recognition, Vol. I (1994) 582-585
10. Bauer, H.-U., Herrmann, M., Villmann, T.: Neural Maps and Topographic Vector Quantization. Neural Networks, Vol. 12(4-5) (1999) 659-676
11. Martín-Herrero, J., Alba-Castro, J. L.: High Speed Machine Vision: The Canned Tuna Case. In: Billingsley, J. (ed.): Mechatronics and Machine Vision in Practice: Future Trends. Research Studies Press, London (2003)
12. Mäenpää, T., Ojala, T., Pietikäinen, M., Soriano, M.: Robust Texture Classification by Subsets of Local Binary Patterns. Proceedings of the 15th International Conference on Pattern Recognition (2000)
13. Gonzalez, R. C., Woods, R. E.: Digital Image Processing. Addison-Wesley, New York (1992)