Local Shape Association Based Retrieval of Infrared Satellite Images

IEEE International Workshop on Multimedia Information Processing and Retrieval, Irvine, CA, December 2005

Aiyesha Ma and Ishwar K. Sethi
Intelligent Information Engineering Laboratory
Department of Computer Science and Engineering
Oakland University, Rochester, Michigan 48309
[email protected], [email protected]

Abstract

This paper presents a shape-based retrieval system and its application to infrared satellite images. A complete system is presented, from region extraction of a full hemisphere scan to the actual retrieval mechanism. After region extraction, polygonal approximation is applied to the region shape, and local features of the polygons are hashed to provide an association space. This space becomes the indexing structure through which retrieval takes place. Although the indexing stage, consisting of region extraction and polygonal approximation, is slow, the actual retrieval is very fast. On average, retrieval of a query shape from a database of 1965 shapes takes 0.7 seconds for the more reduced representation, and 2.8 seconds for the less reduced representation consisting of 1914 shapes. The overall design is suited to a moderately sized database, and extensions could be made to apply the method to a massive database. The results show that the approach performs well, and that there is a substantial speed benefit to using the local association hashing method.

1 Introduction

Advances in hardware, in particular sensors and storage, have allowed vast amounts of data to be collected. Yet, software to retrieve meaningful subsets of such stored data has, unfortunately, not kept pace. This is particularly true of image data. Effective content-based image retrieval methods are necessary to facilitate the usage and understanding of the large amounts of data contained in image databases. For large image databases, such as those collected by remote sensors, manual classification of images to enable content-based search is infeasible [1, 2].

Current retrieval of weather satellite images is accomplished by specifying the date, time, place, and satellite by which the image was collected. Often these features do not provide a connection with the information desired, such as cloud shapes [3, 4]. For weather imagery, the images are often examined only as they are collected. A retrieval system to search past weather patterns would provide meteorologists with insights into the current weather system [5]. By analyzing past images, better models can be developed for the prediction of events such as severe weather, fires, droughts, and dust storms. Furthermore, the impact of clouds on climate will be better understood after further study of the properties of clouds [6].

Existing work pertaining to remotely sensed images has primarily focused on retrieval according to texture-based classification [7, 8, 9, 10]. Since retrieval based on classification is very limiting in terms of the types of searches [9], independent features rather than classes are desired [4]. Although analysis of remotely sensed images has focused on texture, shape is also an important feature, particularly in weather imagery [3]. Cloud type, determined by shape [11], along with cloud height, is used by meteorologists to determine the current and future weather [12]. In addition, more specific objects such as typhoons, hurricanes, showers, thunderstorms, likely tornadoes, and even volcanic ash can be identified from the shapes present in weather images [13, 12, 14].

In contrast to existing content-based remote sensing image retrieval methods, this paper presents a shape-based retrieval system for use on infrared images. Similar to standard document retrieval, an indexing structure is used in the retrieval process, rather than the direct comparison approach more common in content-based image retrieval systems. The proposed system uses region growing to segment out areas of interest, which are then characterized using polygonal approximation. Retrieval of similar polygons is performed using a local association hashing method, which is also a partial shape matching method.

A brief review of existing content-based retrieval research with a focus on remote sensing applications is presented in Section 2. An overview of the proposed retrieval system is given in Section 3. Sections 4, 5, and 6 describe the different components of the system: preprocessing and region extraction, shape characterization, and the retrieval method. Results are shown in Section 7, and the conclusion along with some future research directions are presented in Section 8.

2 Review of Existing Approaches

While search methods for hypertext are robust enough to return semantically meaningful results in response to a query, there is a lack of efficient, semantically meaningful search methods for images. Much of the early work concentrated on hand-annotating image databases and then searching by means of standard text-based query methods. However, two major problems have kept this method impractical. The first is that there is no consistent way to annotate large image databases, due to viewer subjectivity among other concerns. The second is that for a very large and expanding database, this method is thoroughly impractical from the standpoint of time and labor costs [1]. Instead, recent research has focused on automated and semi-automated methods of searching large image databases. Many methods rely on the extraction of color or texture descriptors, and then organize that data to determine the similarity of images to one another. Color information is popular because it can be mapped to a three-dimensional coordinate system that closely matches human perception, using the opponent color theory. Color, however, is unsuitable here, since many remote sensing images are primarily intensity or grayscale images. Some shape and texture measures exist, but each has its limitations [15, 16]. Examples of shape and texture measures include wavelets, fractals, Gabor filters, Fourier descriptors, and moments [15, 1].

In the area of remote sensing, a classification stage is sometimes performed first, and semantic retrieval is then accomplished by retrieving a particular class. Examples of this approach include [7] and [9]. The classification approach taken in [7] uses vectors of pixel values from AVHRR (Advanced Very High Resolution Radiometer) images to cluster spatially local regions. The cluster information is used to obtain various classifications such as snow, land, and cloud. Queries are by class or by pixel value. The algorithm described in [9] allows relational queries containing metadata and lower-level features from Landsat data, meaning that queries such as "agricultural areas within 2 kilometers of water bodies" are possible. New feature types can be added to the system by specifying an example. In [10], AVHRR images are still retrieved based on location, date, and time, but the system determines metadata regarding cloud cover. This allows a user to specify that the region of interest should be cloud free. In [17], users are allowed to retrieve Landsat images using generic features, either dynamic or determined a priori, such as urban, agricultural, or forest areas. A toolbox using Gibbs Random Fields, a texture-based approach, is described in [18]. This toolbox was developed to aid in the extraction of spatial information for retrieval applications. The authors propose using the toolbox for texture characterization, such as "clouds," "light clouds," and "forests." An infrastructure to facilitate analysis of large image databases is proposed in [19]. A mixture of texture and shape features is used on three levels of Landsat images: pixel, region, and tile. Gabor wavelets at the pixel level are used for texture. At the region level, shape features such as eccentricity, orientation along the major axis, and invariant moments are used.

Existing shape-based approaches to weather satellite image retrieval are rather limited. A deformable model approach to fit ellipses to clouds in meteorological images is proposed in [2]. The spatial distribution of the ellipses is then modeled using relational graphs. Queries are specified by a graph, and the retrieval determines the graph similarity. Speed is an issue with this approach. The relational graph of ellipses approach is applied to typhoons in [13]. Pixels are categorized into their respective cloud types, and relational graphs of ellipse locations are used to describe the typhoons. The typhoons are then characterized into a standard set of typhoon types, and retrieval is done by text queries specifying the desired type of typhoon. A shape-based method for retrieving Meteosat images is described in [3]. A point diffusion method is used to compare contours of point sets from extracted cloud events in Meteosat satellite images. The areas of interest were preselected from the raw scan data, meaning that no segmentation approach was described. Furthermore, the images were separable into two classes, hurricanes and non-hurricanes, and the retrieval results were presented only in this context.

3 Overall System Design

Figure 1 shows an overview of the system. As illustrated, within the indexing stage, the first step is preprocessing the image database. For these infrared satellite images, this consisted primarily of representing the image as blocks characterized by the mean and standard deviation of the intensity values. The second step is region extraction, which was performed using region-growing segmentation. In the third step, regions are characterized by performing polygonal approximation on the region shapes.

In the retrieval stage, if an image is specified as the query, then the above mentioned steps of preprocessing, region extraction, and region characterization must be performed on the query image. The resulting polygon then becomes the query shape in Figure 1. Otherwise, a query shape is specified directly, and the retrieval is performed by a polygon hashing method. The ranked results are then returned.

Figure 1. Overview of System Design

4 Region Extraction

Infrared images capture the temperature of objects in the scene, so clouds tend to be highlighted in satellite images of the earth. This section describes two slightly different region extraction methods used to extract these cloud structures. Instead of performing region growing directly on the grayscale image, the image is broken into blocks of n by n pixels, and within each block the mean and standard deviation are calculated. Region growing is then performed on these two features. Blocks of 10 by 10 pixels were used for the first region extraction method, while blocks of 5 by 5 pixels were used for the second region extraction method.

4.1 Preprocessing

In the first method, a simple mask consisting of the average value over all images was used to adjust for consistent intensity variations along the edges and poles. This resulted in artifacts, so for the second extraction method, preprocessing consisted of correcting for the distortion resulting from the orthographic projection. To correct for the distortion, the corresponding area on the sphere surface for each pixel must be calculated. Since the earth is not a perfect sphere, the horizontal and vertical radii differ; to simplify the calculation, the two radii were averaged to obtain one value. For a pixel centered at coordinates (h, v), and of unit area u^2, the corresponding sphere surface area was as follows:

$$S(p_{(h,v)}) = \int_{h-u/2}^{h+u/2} \int_{v-u/2}^{v+u/2} \sqrt{\frac{r^2}{r^2 - (x^2 + y^2)}}\, dx\, dy$$

The mean intensity, I, of each block, B, containing pixels p_1 to p_n, was then calculated as the mean weighted by the sphere surface area corresponding to those pixels:

$$\mathrm{Mean}(B) = \frac{\sum_{i=1}^{n} S(p_i)\, I(p_i)}{\sum_{i=1}^{n} S(p_i)}$$

Similarly, the standard deviation was calculated as follows:

$$\mathrm{StdDev}(B) = \sqrt{\frac{\sum_{i=1}^{n} S(p_i)\,\bigl(I(p_i) - \mathrm{Mean}(B)\bigr)^2}{\sum_{i=1}^{n} S(p_i)}}$$
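To make the preprocessing step concrete, below is a minimal Python sketch of the area-weighted block statistics. It approximates the surface-area integral by evaluating the orthographic-projection area factor at each pixel center, an assumption made for brevity; the function names (`surface_weight`, `block_stats`) are illustrative and not the authors' implementation.

```python
import numpy as np

def surface_weight(h, v, r):
    """Approximate sphere surface area of a unit pixel centered at (h, v),
    using the orthographic-projection area factor at the pixel center
    (a center-point approximation of the integral in Section 4.1)."""
    d2 = h * h + v * v
    if d2 >= r * r:
        return 0.0                      # pixel lies off the disc of the earth
    return np.sqrt(r * r / (r * r - d2))

def block_stats(image, block_size, r):
    """Area-weighted mean and standard deviation for each block of the image.
    Pixel coordinates are taken relative to the image center (sub-satellite point)."""
    rows, cols = image.shape
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
    means = np.zeros((rows // block_size, cols // block_size))
    stds = np.zeros_like(means)
    for bi in range(means.shape[0]):
        for bj in range(means.shape[1]):
            ys, xs = bi * block_size, bj * block_size
            block = image[ys:ys + block_size, xs:xs + block_size].astype(float)
            w = np.array([[surface_weight(x - cx, y - cy, r)
                           for x in range(xs, xs + block_size)]
                          for y in range(ys, ys + block_size)])
            if w.sum() == 0:
                continue                # block lies entirely off the disc
            mean = (w * block).sum() / w.sum()
            var = (w * (block - mean) ** 2).sum() / w.sum()
            means[bi, bj], stds[bi, bj] = mean, np.sqrt(var)
    return means, stds
```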

4.2 Region Growing and Extraction

A region growing method is used, in which a seed block is selected and the neighboring blocks are added to the region if they are similar enough by some threshold value. The region is denoted by a representative vector based on the mean and standard deviation of the blocks in the region. This representative vector is updated each time a block is added to a region. If a neighboring block is not similar enough, then that block forms a new region. After the entire image has been subdivided into regions, small regions are merged into neighboring larger regions.

Since the region growing method does not partition the image into foreground regions and background area, the regions are ordered according to mean and also according to standard deviation. To extract regions, the highest ranked areas above some threshold are selected:

$$w_m \cdot \mathrm{RankMean} + w_{sd} \cdot \mathrm{RankStdDev} > \mathrm{CutOff}$$

where w_m and w_{sd} are weights that give more importance to either higher means or higher standard deviations. This approach occasionally resulted in broken regions.
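The rank-based extraction rule can be sketched in Python as follows; the normalization of the rank scores and the weight and cutoff values shown are assumptions for illustration, not values reported in the paper.

```python
def extract_regions(regions, w_m=0.5, w_sd=0.5, cutoff=0.9):
    """regions: list of dicts with 'mean' and 'std' of the block features.
    Rank regions by mean and by standard deviation (highest first), combine the
    normalized rank scores with weights w_m and w_sd, and keep regions above the cutoff."""
    n = len(regions)
    by_mean = sorted(range(n), key=lambda i: regions[i]["mean"], reverse=True)
    by_std = sorted(range(n), key=lambda i: regions[i]["std"], reverse=True)
    # Normalized rank score in (0, 1]; the top-ranked region scores 1.
    rank_mean = {idx: (n - pos) / n for pos, idx in enumerate(by_mean)}
    rank_std = {idx: (n - pos) / n for pos, idx in enumerate(by_std)}
    return [regions[i] for i in range(n)
            if w_m * rank_mean[i] + w_sd * rank_std[i] > cutoff]
```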

4.3 Confidence Based Region Extraction

The chi-square confidence based design presented in [20] was the conceptual basis for the second region growing method. In the adapted confidence based region growing method used here, a block was grouped with its neighboring block according to the following equation:

$$\mathrm{Conf}(i, n_i) \cdot \mathrm{GPr}(i) > \mathrm{Threshold}$$

This equation has two parts: the confidence, Conf, and a probability that the block should be associated with its neighbors, GPr. The confidence was modified so that the confidence of the neighbor of i, n_i, is calculated with respect to i, rather than the region that i is in, R_i:

$$\mathrm{Conf}(i, n_i) = \left(\frac{\mu_{n_i} - \mu_i}{\sigma_i}\right)^2$$

The probability, GPr(i), is related to the variance of block i; it was added because a block with a low standard deviation, σ_i, indicates a more homogeneous area and is more likely to have a strong association with its neighbors:

$$\mathrm{GPr}(i) = \begin{cases} \dfrac{1}{|\sigma_i|} & \text{if } |\sigma_i| > 1 \\[4pt] 1 & \text{if } |\sigma_i| \le 1 \end{cases}$$

Again, small regions were forced to merge with the most similar neighboring large regions. In addition, sufficiently similar neighboring regions were also merged. The region extraction was performed by selecting all regions above a certain threshold. Because the poles are cold in general, two thresholds were used: one for the middle third of the image, and a lower one for the two outer thirds of the image. The location of a region was determined by its centroid. This approach performed well for a majority of the images, but could be improved further.
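A minimal sketch of the block-grouping test, combining the Conf and GPr terms above. The region-growing bookkeeping (seed selection, merging of small regions, the per-latitude thresholds) is omitted, and the comparison direction simply follows the inequality as printed; the function names are illustrative.

```python
def gpr(sigma_i):
    """Homogeneity term: low-variance blocks associate more strongly with neighbors."""
    return 1.0 / abs(sigma_i) if abs(sigma_i) > 1 else 1.0

def conf(mu_i, sigma_i, mu_neighbor):
    """Chi-square style confidence of a neighboring block with respect to block i."""
    return ((mu_neighbor - mu_i) / sigma_i) ** 2

def should_group(mu_i, sigma_i, mu_neighbor, threshold):
    """Group the neighbor with block i when Conf(i, n_i) * GPr(i) exceeds the threshold,
    following the inequality as given in the text."""
    return conf(mu_i, sigma_i, mu_neighbor) * gpr(sigma_i) > threshold
```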

5 Polygon Approximation

Polygonal approximation is a feature reduction method. Rather than select all points within an object, or all points along the edge of an object, polygonal approximation gives some subset of edge points that represents the overall object. Although there are many polygonal approximation methods described in the literature [21], a classical method was selected for simplicity and speed. In the merge method of polygonal approximation, the influence of each point in the polygon is calculated and the point with the minimum influence is removed. Influence is calculated as the change in polygon approximation error if the point were to be removed. Points are removed as long as the approximated polygon error remains within a certain error tolerance. This merge method approach was further extended by considering the error along each segment of the approximated polygon. This allowed a relative stopping criterion to be set based on the overall size of the original polygon. The stopping criterion was set independently for each region so that larger regions were allowed more variation along their edges than smaller regions.
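A simplified Python sketch of the merge-style point removal described above, assuming influence is measured as the perpendicular distance from a vertex to the chord joining its two neighbors; the paper does not fix the exact error measure, so this is one plausible choice, and the names are illustrative.

```python
import math

def influence(prev_pt, pt, next_pt):
    """Perpendicular distance from pt to the chord prev_pt -> next_pt, used as the
    approximation error contributed if pt were removed."""
    (x1, y1), (x2, y2), (x0, y0) = prev_pt, next_pt, pt
    chord = math.hypot(x2 - x1, y2 - y1)
    if chord == 0:
        return 0.0
    return abs((x2 - x1) * (y1 - y0) - (x1 - x0) * (y2 - y1)) / chord

def merge_approximate(points, tolerance):
    """Repeatedly remove the vertex with minimum influence while that influence stays
    within the tolerance; returns the reduced (closed) polygon."""
    pts = list(points)
    while len(pts) > 3:
        n = len(pts)
        infl = [influence(pts[i - 1], pts[i], pts[(i + 1) % n]) for i in range(n)]
        i_min = min(range(n), key=infl.__getitem__)
        if infl[i_min] > tolerance:
            break
        del pts[i_min]
    return pts
```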

6 Polygon Hashing

The indexing structure is based on work presented in [22] and [23]. In contrast to many shape comparison methods, the polygon hashing method maps all local features to a common space, denoted in Figure 2 as the index space. Properties such as the length ratios and vertex angles of three consecutive vertices of a polygon are used as local shape features. This representation is called a trigram. The index space provides an association between local features and the objects in the database. During retrieval, the local features of a query object are hashed and the associations are retrieved. These associations then vote on the candidate objects to provide a score, by which the results are ranked.

Figure 2. Polygon Hashing
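As an illustration, the following sketch extracts trigram features (an interior vertex angle and a scale-invariant ratio of the two adjacent side lengths) for each run of three consecutive vertices. The exact feature definition and normalization used in [22, 23] may differ.

```python
import math

def trigram_features(polygon):
    """For each triple of consecutive vertices (a, b, c) of a closed polygon,
    return (interior angle at b, ratio of the two adjacent side lengths)."""
    feats = []
    n = len(polygon)
    for i in range(n):
        a, b, c = polygon[i - 1], polygon[i], polygon[(i + 1) % n]
        v1 = (a[0] - b[0], a[1] - b[1])
        v2 = (c[0] - b[0], c[1] - b[1])
        len1, len2 = math.hypot(*v1), math.hypot(*v2)
        if len1 == 0 or len2 == 0:
            continue
        cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (len1 * len2)
        angle = math.acos(max(-1.0, min(1.0, cos_angle)))
        ratio = min(len1, len2) / max(len1, len2)   # scale-invariant length ratio
        feats.append((angle, ratio))
    return feats
```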

Vector hashing was used instead of string hashing. In vector hashing, rather than quantizing and bucketing the trigrams, a nearest neighbor approach is used directly on the trigrams, where all the nearest trigrams within some specified distance in the hash space are retrieved. Also, rather than use trigrams, n-grams of vertex angles and length ratios were used, and various values of n were tested. A weighting function that reduces the dependence on the number of vertices is used to tally the candidate polygons. In this function, a higher weight is given to n-grams from polygons whose number of vertices is similar to that of the query:

$$I(p) = \left|\, 1 - \frac{\text{Num Vertices in Polygon } p}{\text{Num Vertices in Query}} \,\right|, \qquad W(p) = \max(I) - I(p)$$

where max(I) is the maximum of I(p) over all candidate polygons.
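Below is a hedged sketch of the retrieval step: a brute-force nearest-neighbor search stands in for the vector hashing, and the polygon weight W(p) is multiplied into the raw vote count (the text does not spell out exactly how the weight and the votes are combined). The names `retrieve`, `index`, and `num_vertices` are illustrative.

```python
import math
from collections import defaultdict

def retrieve(query_feats, index, query_num_vertices, num_vertices, max_dist):
    """index: list of (feature_vector, polygon_id) pairs.
    num_vertices: mapping polygon_id -> vertex count.
    Returns polygon ids ranked by weighted vote."""
    votes = defaultdict(int)
    for qf in query_feats:
        for feat, pid in index:                  # brute-force stand-in for vector hashing
            if math.dist(qf, feat) <= max_dist:  # every stored n-gram within max_dist votes
                votes[pid] += 1

    # Weight votes so polygons with a vertex count close to the query's count more.
    imbalance = {pid: abs(1 - num_vertices[pid] / query_num_vertices) for pid in votes}
    max_i = max(imbalance.values(), default=0.0)
    scores = {pid: votes[pid] * (max_i - imbalance[pid]) for pid in votes}
    return sorted(scores, key=scores.get, reverse=True)
```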

7 System Results

Fifty-six full hemisphere infrared images taken by Meteosat satellites were used (Figure 3(a)). About half of these were from a satellite located at the prime meridian and equator, and the other half from a satellite over the Indian Ocean. The steps described in the preceding sections were performed on all images, resulting in 1965 polygonal regions for the region growing method (FREM) and 1914 polygonal regions for the confidence-based region growing method (CBREM). Intermediate results of the indexing stage are shown in Figure 3.

Figure 3. Intermediate stages, CBREM: (a) Full Hemisphere, © 1999 EUMETSAT; (b) Region Means; (c) Segmented Image; (d) Polygonal Region; (e) Edge Points of Region; (f) Polygonal Approximation of Region

Retrieval examples were performed with respect to each of these polygonal regions. In this case the input shape is derived from an existing shape in the data set. In an actual system the input would be derived from a new example, or sketched by hand. Some retrieval results are shown in Figures 4 – 6. For these examples the query shape is shown in the upper left of the image, and the top retrieved result is below that. Other retrieved results are in the second and third columns: results 2 through 4 in the middle column, and 5 through 7 in the rightmost column. The retrieval is performed on the approximated polygon, shown on the bottom right of each result. The original image along with the extracted region shape is also shown for each result.

Figure 4. Retrieval Result – FREM

Figure 5. Retrieval Result – CBREM

Figure 6. Retrieval Result – CBREM

For a quantitative assessment of performance, a hit-or-miss performance measure and the speed of the hashing retrieval method are discussed.

7.1 Hit-or-Miss

Although precision and recall are better measures of performance, they require a study of the images by an expert in the field. Until this expert study can be accomplished, a hit-or-miss measure is used to identify the performance effects of changes to the parameters. A retrieval result was considered a 'hit' if the top retrieved result had the same index as the query shape. Otherwise it was considered a 'miss'. This retrieval was performed on all the shapes, and the 'hit' percentages for several parameters are shown in Figure 7. Two parameters were tested: the maximum distance in the hashing stage (maxD) and the n of the n-grams. The length ratio and vertex angle n-grams were concatenated, resulting in 2n-grams.

Figure 7. Hit-or-miss accuracy

Accuracy appears to be limited to 87% for FREM. This is due to artifacts resulting from the preprocessing stage; an example is shown in Figure 8. The apparent cap in CBREM is because of an accidentally duplicated image in the image set. If the duplicated image were removed, the accuracy would reach 100%.

Figure 8. Retrieval Result – Artifact, FREM

7.2 Speed

When working with massive datasets, an important consideration is the speed of the algorithm. In a retrieval system the indexing stage has a less stringent speed requirement: the indexing only needs to be fast enough to be accomplished in the time it takes the data to be collected. In contrast, the retrieval stage should be as fast as possible and not leave the user waiting.

The hashing retrieval method described in this paper is extremely fast. Each polygonal region was used as a query, and the average retrieval time per query is depicted in relation to several parameters in Figure 9. These times were acquired on a 1.4 GHz Athlon processor. Further speedup could be accomplished by a strategic search through the index space when performing vector hashing.

Figure 9. Average retrieval speed

The authors of [3] state that a single comparison between two shapes takes 0.61 seconds on average on a 400 MHz Pentium II. This means that retrieval in an equivalent 1965-shape database would take approximately 1198 seconds, as opposed to approximately 0.7 seconds taken by the method presented here. Although a direct comparison is not possible, the processor difference alone does not account for the nearly 2000 times faster retrieval when using the vector hashing method. The slower retrieval times for CBREM can be attributed to the less reduced representation (5 by 5 blocks instead of 10 by 10). Since the speed of the hashing method is dependent on the number of n-grams, a less reduced representation results in more vertices in the approximated polygon, and therefore more n-grams.

8 Conclusion

This paper presented a shape-based retrieval system. Although the retrieval system could be applied to a variety of image types, the design and results were presented in the context of infrared satellite images. In the presented design, regions of interest were extracted from full hemisphere scans using a region growing method. The region shapes were then characterized by polygonal approximation. Vector hashing is performed on n-grams of the vertex angles and length ratios to develop associations in an indexing structure. When retrieval is performed, the local n-gram features are hashed and the associations are retrieved. These associations then vote on candidate shapes, which results in a score that can be used to rank the results.

This vector hashing retrieval method is extremely fast, due to the indexing structure approach, rather than the more common shape comparison approach. On average, retrieval of a query shape from a database of 1965 shapes takes 0.7 seconds for the more reduced representation, and 2.8 seconds for the less reduced representation consisting of 1914 shapes. On the other hand, the indexing stage is somewhat slow, due to the region growing stage being run in Matlab.

In conclusion, although future research could improve upon the presented system, the overall design is suited to a moderately sized database. Expansions to accommodate a massive database could incorporate a hierarchical approach, where a rough polygonal hashing pass filters the number of potential results, and a second level of polygonal hashing or another shape comparison approach discriminates more finely between the potential results. The results show that the approach performs well, and that there is a substantial speed benefit to using the local association hashing method.

Acknowledgment

This work was funded in part by a fellowship from the Michigan Space Grant Consortium.

References

[1] Y. Rui, T. Huang, and S. Chang, "Image retrieval: Past, present and future," in International Symposium on Multimedia Information Processing, 1997.

[2] A. Kitamoto and M. Takagi, "Retrieval of satellite cloud imagery based on subjective similarity," in Proc. 9th Scandinavian Conference on Image Analysis, 1995, pp. 449–456.

[3] F. Dell'Acqua and P. Gamba, "Query-by-shape in meteorological image archives using the point diffusion technique," IEEE Trans. Geoscience and Remote Sensing, vol. 39, no. 9, September 2001.

[4] M. Datcu, K. Seidel, S. D'Elia, and P. G. Marchetti, "Knowledge-driven information mining in remote-sensing image archives," ESA Bulletin, no. 110, pp. 26–33, May 2002.

[5] E. Jones and A. Roydhouse, "Intelligent retrieval of historical meteorological data," AI Applications, vol. 8, no. 3, pp. 43–54, 1994. [Online]. Available: citeseer.ist.psu.edu/jones94intelligent.html

[6] R. Davies, "MISR science goals and objectives – MISR's study of clouds," Jet Propulsion Laboratory. [Online]. Available: http://www-misr.jpl.nasa.gov/introduction/goals3.html

[7] A. Vellaikal, S. Dao, and C. J. Kuo, "Content-based retrieval of remote sensed images using a feature-based approach," in Proc. of SPIE Visual Information Processing IV, 1995. [Online]. Available: citeseer.ist.psu.edu/article/vellaikal95contentbased.html

[8] T. Bretschneider, "(RS)2I – retrieval system for remotely sensed imagery," project description. [Online]. Available: http://www.ntu.edu.sg/home/astimo/

[9] L. D. Bergman, V. Castelli, and C. Li, "Progressive content-based retrieval from satellite image archives," D-Lib Magazine, October 1997. [Online]. Available: www.dlib.org/dlib/october97/ibm/10li.html

[10] F. Artigas, R. Holowczak, S. Chun, J. Cho, and H. Stone, "An experimental study on content-based image classification for satellite image databases," IEEE Trans. Geoscience and Remote Sensing, accepted for publication. [Online]. Available: citeseer.nj.nec.com/holowczak01experimetal.html

[11] E. D. Conway, "An introduction to satellite image interpretation – identifying cloud types in weather satellite imagery," Maryland Space Grant Consortium, 1997. [Online]. Available: http://henry.pha.jhu.edu/ssip/asat int/clouds.html

[12] X. Wu, "Satellite image processing and classification." [Online]. Available: http://www.ualr.edu/ yxchan/WhitePaperrevise.htm

[13] A. Kitamoto, "The development of typhoon image database with content-based search," in Proceedings of the First International Symposium on Advanced Informatics, 2000, pp. 163–170.

[14] United Kingdom, "Detection of volcanic ash by satellite," in International Airways Volcano Watch Operations Group – First Meeting, March 2004.

[15] A. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, "Content based image retrieval at the end of the early years," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 11, December 2002.

[16] A. Vailaya, Y. Zhong, and A. K. Jain, "A hierarchical system for efficient image retrieval," in 13th International Conference on Pattern Recognition, August 1996.

[17] T. Bretschneider and O. Kao, "A retrieval system for remotely sensed imagery," in Proceedings of the International Conference on Imaging Science, Systems, and Technology, vol. 2, 2002, pp. 439–445.

[18] M. Schroder, M. Walessa, H. Rehrauer, K. Seidel, and M. Datcu, "Gibbs random field models: A toolbox for spatial information extraction," Computers and Geosciences, vol. 26, no. 4, 1999.

[19] K. Koperski and G. Marchisio, "Multi-level indexing and GIS enhanced learning for satellite imageries," in Proc. International Workshop on Multimedia Data Mining, August 2000.

[20] M. Li, I. K. Sethi, D. Li, and N. Dimitrova, "Region growing using online learning," in Proceedings of the International Conference on Imaging Science, Systems, and Technology, June 2003.

[21] A. Kolesnikov, "Efficient algorithms for vectorization and polygonal approximation," Ph.D. dissertation, University of Joensuu, Finland, 2003.

[22] N. Ramesh and I. K. Sethi, "A model based industrial part recognition system using hashing," in Proc. 22nd Intl. Symposium on Industrial Robots, Intl. Robots and Vision Automation Conf., October 1991, pp. 37–51.

[23] I. K. Sethi and N. Ramesh, "Local association based recognition of two-dimensional objects," Machine Vision and Applications, vol. 5, pp. 265–267, 1992.