Digital Information Retrieval
Chabane DJERABA, Marinette BOUET
2 rue de la Houssinière, BP 92208 - 44322 Nantes Cedex 3, France
e-mail: [email protected]

Abstract
In this paper, we present a digital information retrieval system. The digital information (images) includes both low-level features such as color, texture and shape, and high-level features such as spatial constraints of relevant regions and keywords. Features are represented robustly: for example, shape features are encoded for scale, rotation and translation invariance. Based on object technology, the digital information features and behaviors are modeled and stored in a database. Images can be retrieved by example (show me images similar to this image) or by selecting properties from pickers such as a sketched shape, a color histogram, a spatial constraint interface, a list of keywords, or a combination of these. The integration of high-level and low-level features in the object-oriented database is an important property of our work.
Permission to make digital/hard copies of all or part of this material for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication and its date appear, and notice is given that copyright is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires specific permission and/or fee.
CIKM 97 Las Vegas Nevada USA
Copyright 1997 ACM 0-89791-970-x/97/11..$3.50

1 Introduction
Over the last years, research in content-based retrieval of digital information has made significant progress. In a first step, this research resulted in systems based on text or attribute retrieval [Gud 95], such as MM1 [Gob 92] and Chabot [Ogl 95]. Typically, these systems depend on file identifiers, keywords or text associated with the images. Although powerful, they:
- don't allow queries based directly on the visual properties of the images,
- are dependent on the particular vocabulary used,
- don't provide queries for images similar to a given image, and
- require a tedious extraction process that rapidly increases the time cost of database creation.

In a second step, this research resulted in systems based on image analysis, such as Virage [Jai 95], implemented on ObjectStore and commercialized by Virage Inc., Ultimedia Manager commercialized by IBM, VisualSEEk [Smi 96], [Meh 95], [Sri 95], QBIC [Fal 93] commercialized by IBM, the Visual Intelligence Blade from Illustra Information Technologies, Inc., Photobook [Pen 94], Trademark, and others. The image features are extracted automatically or semi-automatically, using segmentation subsystems. Although automatic segmentation subsystems are interesting for specific application domains, they remain very difficult to apply in application domains where the extracted regions have no real-world meaning.

Our long-term objective is to:
- integrate the text-based and image analysis approaches, in order to allow powerful queries based on both low-level features (i.e. colors, textures, shapes) and high-level features (real-world concepts) of images,
- provide both similar-image and user-specification queries,
- make the queries, and more generally the system framework, extensible and independent of an application domain,
- support the extraction of knowledge (relationships between image features) from image data,
- support the previous functionalities efficiently in an object-oriented database.

In this context, we designed and implemented an image retrieval system that considers two kinds of features. The first kind is extracted semi-automatically. It includes significant shapes (regions of interest, ROI) and keywords. The second kind is extracted automatically. It includes textures, colors and spatial constraints. We introduced:
- a data model based on object technology and supported by an object-oriented database system,
- a useful query interface, and
- a domain-independent framework.

An important difference between our system and others is the querying of images by symbolic regions (show me images that contain regions of such color, texture and shape, with such spatial constraints). The considered regions are significant, so they have real-world meaning (i.e. mountain, river, forest, sun, person, etc.). Another difference is the mathematical representation and the real-world meaning of the shape. A shape corresponds to a significant region, and its representation is based on Fourier descriptors, which make it independent of rotation, translation and scale.
To the question "Show me the images that contain a cup shape", the system may return images that contain similar shapes independently of the scale, rotation and translation of the cup: the retrieved shapes may be bigger or smaller than the user-specified shape, and may or may not have the same orientation. Finally, the framework of the data model is based on object technology, which makes it extensible and flexible.

In this paper, we start by presenting the framework of the image retrieval system (section 2). The system consists of two parts. The first one extracts image features, models them with object-oriented concepts and stores them in the database (section 2.1). In that section, we pay particular attention to significant regions and shapes, because they are important features of our model. The second part supports the user queries (section 2.2). Finally, we present some database queries and results (section 3).
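The invariance of Fourier descriptors to translation, rotation and scale can be illustrated with a minimal sketch. It is not the system's implementation: the function names, the use of a plain discrete Fourier transform, and the normalisation by the first harmonic are assumptions chosen for illustration, and the contour is assumed to be a closed list of boundary points given counter-clockwise.

```python
import cmath
import math


def fourier_descriptors(contour, count):
    """Return `count` invariant descriptors of a closed contour.

    contour: list of (x, y) boundary points, listed counter-clockwise.
    """
    n = len(contour)
    if not 1 <= count <= n - 2:
        raise ValueError("count must be between 1 and len(contour) - 2")
    z = [complex(x, y) for x, y in contour]
    # Discrete Fourier transform of the complex boundary signal.
    coeffs = [
        sum(z[k] * cmath.exp(-2j * math.pi * u * k / n) for k in range(n))
        for u in range(n)
    ]
    # coeffs[0] carries the position (translation), so it is skipped.
    # Taking magnitudes discards the phase, which carries the rotation.
    # Dividing by |coeffs[1]| removes the scale.
    scale = abs(coeffs[1])
    if scale == 0:
        raise ValueError("degenerate contour: first harmonic is zero")
    return [abs(coeffs[u]) / scale for u in range(2, 2 + count)]


def shape_distance(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def transform(contour, angle, factor, dx, dy):
    """Rotate, scale, then translate a contour (used only for the demo)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(factor * (c * x - s * y) + dx, factor * (s * x + c * y) + dy)
            for x, y in contour]


# A hypothetical 2 x 3 rectangular shape sampled at 8 boundary points.
shape = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 3), (1, 3), (0, 3), (0, 1)]
moved = transform(shape, angle=0.7, factor=3.0, dx=5.0, dy=-2.0)
print(shape_distance(fourier_descriptors(shape, 5),
                     fourier_descriptors(moved, 5)))  # close to 0
```

The rotated, enlarged and shifted copy yields almost the same descriptor vector as the original, which is the property that lets a "cup shape" query match cups of any size, position and orientation.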
2 Framework
Designing a retrieval system based on image features involves four primary issues:
- Feature extraction. How can the content be extracted (automatically or semi-automatically) when an image is input to the database?
- Feature representation. How can an image feature be represented in terms of its properties (e.g., how can an object shape be represented in terms of its shape properties)?
- Retrieval method. How should the features in the database and their representations be organized to enable efficient search for features that are similar to a given query feature? In other words, what indexing mechanism should be employed?
- Similarity measure. Given a representation scheme, how should any two features (e.g., two shapes) be compared or matched? What measures should be employed to determine the visual similarity of two features?
In the second way, the user may insert a set of keywords that characterize the image from his or her point of view. After the images and keywords are stored, the system automatically extracts the dominant texture and color of the image. The user, generally an expert of the application domain, may then semi-automatically extract the shapes of relevant regions using a visual tool. The relevant regions are pixel zones that have real-world meaning in the perception of the user. When the user identifies real-world (relevant) regions such as rivers, persons, animals or mountains, and surrounds each of them with a closed border (shape), the system automatically extracts their colors and textures. It then stores the regions, their automatically extracted features (color, texture) and their semi-automatically extracted features (shapes, sets of keywords) in the database, using the object-oriented model of the database management system O2.
Figure 3. A part of the object-oriented model
Figure 1. Framework
Our retrieval system addresses these four primary issues. It proposes a framework composed of two important components: feature representation and retrieval method (Figure 1). In the first one, methods aided by the user identify relevant regions in images and compute features describing the color, texture and shape of these regions. In the second one, images can be retrieved by selecting properties such as a color, a sketched shape or a texture of image regions, or a combination of these. The system includes a visual query tool that lets users form a query by painting, sketching and selecting textures, colors and shapes. Finally, the retrieval process computes distances between source and target features, and sorts the result images from most to least similar.
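The retrieval step described above (compute distances between source and target features, then sort) can be sketched as follows. This is only an illustration under stated assumptions: the feature vectors, the file names, the weighted-sum combination of per-feature distances and the function names are all hypothetical, not taken from the system.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def combined_distance(query, candidate, weights):
    """Weighted sum of per-feature distances, over the features in the query."""
    return sum(weights.get(name, 1.0) * euclidean(vector, candidate[name])
               for name, vector in query.items())


def retrieve(query, database, weights, limit=15):
    """Return up to `limit` image names, most similar first."""
    scored = sorted((combined_distance(query, features, weights), name)
                    for name, features in database.items())
    return [name for _, name in scored[:limit]]


# Hypothetical stored features: a 3-bin color vector and a 2-value shape vector.
database = {
    "sunset.gif": {"color": [0.9, 0.1, 0.0], "shape": [0.2, 0.1]},
    "forest.gif": {"color": [0.1, 0.8, 0.1], "shape": [0.5, 0.4]},
    "river.gif": {"color": [0.1, 0.2, 0.7], "shape": [0.3, 0.1]},
}
query = {"color": [1.0, 0.0, 0.0], "shape": [0.2, 0.1]}
print(retrieve(query, database, weights={"color": 1.0, "shape": 0.5}))
```

Because the score only sums over the features present in the query, the same function covers a query on a single property (for example color only) and a query on a combination of properties.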
2.1 Feature representation
The content of images is stored when the images are inserted in the collection. The storage process proceeds in two ways. In the first one, the image values are stored in files (e.g. GIF or JPEG formats) outside the database. In the second one, the extracted content of images is inserted in the database.
shape using the visual tool. Secondly, the content of the shape is automatically filled.
Figure 9. Extraction of the shape
Object-oriented modeling in the database
In the object-oriented modeling, the class Shape is characterized by a set of attributes and methods. The first attribute models the coordinates of a pixel in the original image; it locates the shape in the original image. The second attribute models the Freeman code in the form of a list of values. The third attribute models the value n, which determines the accuracy of the mathematical description of the shape. The fourth attribute models the length of the Freeman code. The final attribute models a list of Fourier descriptors. The methods return respectively the Freeman code, the Fourier descriptors, the value of n, and the length of the Freeman code.

Shape class
class Shape inherit Object read
  type tuple(code_freeman: string,
             descriptor_fourier: list(Descriptor_Fourier),
             index: integer,
             length_code_freeman: integer,
             x_start: integer,
             y_start: integer)
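To make the role of the Freeman code and of the start pixel concrete, here is a minimal sketch of chain-code encoding and decoding. The direction numbering (0 along +x, then in 45-degree steps turning towards +y) and the function names are assumptions for illustration; the system may use a different convention.

```python
# Direction codes 0..7: 0 is +x, then 45-degree steps turning towards +y.
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1),
              (-1, 0), (-1, -1), (0, -1), (1, -1)]


def freeman_encode(points):
    """Encode a closed contour of 8-connected pixels as a chain-code string."""
    codes = []
    # Pair each point with its successor; the last point wraps to the first.
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError("consecutive points must be 8-connected neighbours")
        codes.append(str(DIRECTIONS.index(step)))
    return "".join(codes)


def freeman_decode(x_start, y_start, code):
    """Rebuild the contour points from the start pixel and the chain code."""
    points = [(x_start, y_start)]
    for digit in code[:-1]:  # the last step only closes the contour
        dx, dy = DIRECTIONS[int(digit)]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points


# A hypothetical 2 x 2 pixel square whose border starts at pixel (4, 7).
square = [(4, 7), (5, 7), (6, 7), (6, 8), (6, 9), (5, 9), (4, 9), (4, 8)]
code = freeman_encode(square)
print(code, len(code))
```

The start coordinates and the code string are enough to rebuild the contour, which matches the attributes stored in the Shape class (start pixel, code and code length).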
image. For example, the region may have a favored direction. The details of the formulas are presented in [Sav 96]. It is not a powerful texture representation, but it may be useful for the retrieval process when combined with other features (color, shape, keywords).

Object-oriented modeling in the database
In the first class, the texture of the region is represented by statistical formulas based on four moments (M1, M2, M3, M4). In the object-oriented modeling, we define the class Texture, structured by a set of attributes that model the four statistical moments M1, M2, M3, M4 and the number of pixels N of the region. The four moments are computed respectively by four methods M1, M2, M3, M4:

$$M_1 = \frac{1}{N}\sum_i \sum_j f(i,j), \qquad M_2 = \frac{1}{N}\sum_i \sum_j i\,j\,f(i,j),$$
$$M_3 = \frac{1}{N}\sum_i \sum_j i^2 j^2 f(i,j), \qquad M_4 = \frac{1}{N}\sum_i \sum_j i^3 j^3 f(i,j)$$
f(i, j) is the value of the pixel in the i-th row and j-th column.

In the second class, the texture of the region is represented by the histogram of gray-level differences. We consider a distance as a segment between two points. The angle (direction) and the length of this segment depend on the application domain. Based on the histogram, the system computes the coarseness, the variance, the contrast and the directionality.

Texture class
class Texture inherit Object read
  type tuple(histo_level_grey: Histogram_image_level_grey,
             DS1: list(real),
             DS2: list(real),
             DS3: list(real))
  method public init(histo: Histogram_image_level_grey, root_persistance: boolean): Texture,
         public suppression_texture
end;

The distance is computed between the target and source regions. Before submitting the query, the user may choose the distance.
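The histogram of gray-level differences behind the second texture class can be sketched as follows. The definitions of contrast and variance used here are common textbook ones and are assumptions, as are the function names; coarseness and directionality are not computed in this sketch, although comparing two displacement directions already hints at directionality.

```python
def difference_histogram(image, dx, dy, levels=256):
    """Normalised histogram of |f(i, j) - f(i + dy, j + dx)| over the image.

    image: list of rows of integer gray levels in 0..levels-1.
    (dx, dy): displacement, i.e. the direction and length of the segment.
    """
    rows, cols = len(image), len(image[0])
    counts = [0] * levels
    total = 0
    for i in range(rows):
        for j in range(cols):
            i2, j2 = i + dy, j + dx
            if 0 <= i2 < rows and 0 <= j2 < cols:
                counts[abs(image[i][j] - image[i2][j2])] += 1
                total += 1
    if total == 0:
        raise ValueError("displacement is larger than the image")
    return [c / total for c in counts]


def texture_features(hist):
    """Mean, contrast and variance of a gray-level difference histogram."""
    mean = sum(d * p for d, p in enumerate(hist))
    contrast = sum(d * d * p for d, p in enumerate(hist))
    variance = contrast - mean * mean  # second moment minus squared mean
    return {"mean": mean, "contrast": contrast, "variance": variance}


# Hypothetical region with vertical stripes: strong horizontal changes only.
stripes = [[0, 10, 0, 10],
           [0, 10, 0, 10]]
horizontal = texture_features(difference_histogram(stripes, dx=1, dy=0))
vertical = texture_features(difference_histogram(stripes, dx=0, dy=1))
print(horizontal["contrast"], vertical["contrast"])
```

The striped region has a high contrast along the horizontal displacement and none along the vertical one, which shows why the choice of direction and length matters for a given application domain.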
2.1.3 Color
Description
Color is the third important image feature, extracted automatically from an image or a region. In the first step of the extraction process, based on the raster format, the region or image color is extracted and represented in the RGB model. In the second step, the color is transformed from the RGB model to the HSV model, characterized by three components: H (hue, the color), S (saturation, the vividness of the color) and V (value, the brightness of the color). The HSV model is better suited than the RGB model, in which certain ambiguities appear between colors (such as yellow and green).
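The two extraction steps (RGB values, then HSV) can be sketched with Python's standard `colorsys` module. The quantisation into six hue bins centred on the primary and secondary hues, and the function name, are assumptions for illustration; the sketch ignores saturation and value when binning.

```python
import colorsys


def hue_histogram(pixels, hue_bins=6):
    """Normalised hue histogram of a list of (r, g, b) pixels in 0..255.

    Bins are centred on evenly spaced hues, so with 6 bins red, yellow and
    green each fall in the middle of their own bin.
    """
    if not pixels:
        raise ValueError("region has no pixels")
    counts = [0] * hue_bins
    for r, g, b in pixels:
        # Step 2 of the extraction: RGB to HSV (all components in 0..1).
        h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Round to the nearest bin centre; hues near 1.0 wrap back to bin 0.
        index = int(h * hue_bins + 0.5) % hue_bins
        counts[index] += 1
    return [c / len(pixels) for c in counts]


# Hypothetical region: two red pixels, one yellow pixel, one green pixel.
region = [(255, 0, 0), (255, 0, 0), (255, 255, 0), (0, 255, 0)]
print(hue_histogram(region))
```

In this representation yellow and green pixels land in different hue bins, which is the kind of separation the HSV model is chosen for.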
Figure 10. Extraction of colors

In theory, the results differ depending on the distance used, but our first experiments showed no significant differences between the first result images returned by the various distances. Differences appear only after the first result images (after the 5th to 10th image). Based on the sorted distances, the result images are ordered by similarity. Generally, we stop at the 15th image returned by the retrieval system.
HSV histogram class
class Histogram_image_HSV inherit Histogram read
  type tuple(domaine: list(Bin),
             mean_hue: real,
             ecart_type_hue: real,
             moment_degre3_hue: real,
             moyenne_saturation: real,
             ecart_type_saturation: real,
             moment_degre3_saturation: real,
             moyenne_value: real,
             ecart_type_value: real,
             moment_degre3_value: real)
  method private mean,
         private ecart_type,
         private moment_degre3,
         public init(frequency: list(real), niveaux: list(Bin), root_persistance: boolean): Histogram_image_HSV,
         public distance_L1_histo(histo: Histogram_image_HSV): real,
         public distance_euclidian_histo(histo: Histogram_image_HSV): real,
         public distance_intersection_histo(histo: Histogram_image_HSV): real,
         public distance_L1_moments(histo: Histogram_image_HSV): real,
         public display_histo_alpha,
         public suppression_histo,
         public display_moments_statistiques,
         public distance_quadratic_histo_origine(histo: Histogram_image_HSV, matrix_similarity: list(list(real))): real,
         public display_histo_graph,
         public matrix_similarity_HSV: list(list(real)),
         public distance_quadratic_histo(histo: Histogram_image_HSV, matrix_similarity: list(list(real))): real
end;
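The class above names four kinds of histogram distance: L1, Euclidean, intersection and quadratic. A minimal sketch of each follows, using their usual textbook definitions on normalised histograms; these definitions and the example similarity matrix are assumptions and may differ in detail from the methods of the class.

```python
import math


def distance_l1(h1, h2):
    """Sum of absolute bin differences."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def distance_euclidean(h1, h2):
    """Square root of the sum of squared bin differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


def distance_intersection(h1, h2):
    """One minus the histogram intersection (for histograms summing to 1)."""
    return 1.0 - sum(min(a, b) for a, b in zip(h1, h2))


def distance_quadratic(h1, h2, similarity):
    """sqrt(d^T A d), where d = h1 - h2 and A[i][j] is the similarity of bins i, j."""
    d = [a - b for a, b in zip(h1, h2)]
    total = sum(d[i] * similarity[i][j] * d[j]
                for i in range(len(d)) for j in range(len(d)))
    return math.sqrt(max(total, 0.0))  # guard against tiny negative rounding


# Two hypothetical 4-bin histograms that differ only in bins 1 and 2.
source = [0.5, 0.5, 0.0, 0.0]
target = [0.5, 0.0, 0.5, 0.0]
# Hypothetical similarity matrix: bins 1 and 2 are perceptually close (0.5).
similar = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.5, 0.0],
           [0.0, 0.5, 1.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]
print(distance_l1(source, target),
      distance_euclidean(source, target),
      distance_intersection(source, target),
      distance_quadratic(source, target, similar))
```

The quadratic distance is smaller than the Euclidean one here because the similarity matrix treats the two differing bins as related colors; with an identity matrix the two distances coincide.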
2.1.4 Spatial relations
The spatial relations modeled so far are not very powerful. They only locate the relations between image regions and compute the distance between two points of those regions. We enclose each region in a minimal bounding rectangle; the distance is computed between the top-left corners of the rectangles.
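The minimal rectangle and the corner-to-corner distance can be sketched as below. The rectangle layout (x, y, width, height), the image coordinate convention (y grows downwards, so "top" is the smallest y) and the textual relation labels are assumptions for illustration.

```python
import math


def minimal_rectangle(pixels):
    """Minimal bounding rectangle of a region as (x, y, width, height)."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


def top_left_distance(rect_a, rect_b):
    """Distance between the top-left corners of two rectangles."""
    return math.hypot(rect_a[0] - rect_b[0], rect_a[1] - rect_b[1])


def relative_position(rect_a, rect_b):
    """Coarse location of rectangle a with respect to rectangle b."""
    horizontal = ("left of" if rect_a[0] < rect_b[0]
                  else "right of" if rect_a[0] > rect_b[0] else "aligned with")
    vertical = ("above" if rect_a[1] < rect_b[1]
                else "below" if rect_a[1] > rect_b[1] else "level with")
    return horizontal, vertical


# Two hypothetical regions given as lists of pixel coordinates.
sun = [(10, 2), (11, 2), (10, 3), (11, 3)]
river = [(7, 6), (8, 6), (9, 7)]
sun_rect, river_rect = minimal_rectangle(sun), minimal_rectangle(river)
print(sun_rect, river_rect,
      top_left_distance(sun_rect, river_rect),
      relative_position(sun_rect, river_rect))
```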
Figure 11. Color histogram
Object-oriented modeling in the database
In the object-oriented modeling, we define two classes of colors, RGB and HSV. The first one, called RGB histogram, is structured by three attributes (red, green and blue) and two methods. Each attribute takes as its value an object structured by two attributes: the variation and the mean of the color. The two methods compute, for each color (red, green or blue), the mean and the variation of the color respectively. The second class, called HSV histogram, contains a structure part that includes the histogram of colors and a set of distances, and a method part that includes the methods that compute the attributes of the structure part. The colors of regions and images are quantized, so all color histograms have the same number of elements. We associate with the colors descriptions such as "ecart-type" (standard deviation), "moment-degree" and "mean", which may be helpful for more accurate color queries. Each element of the histogram represents the number of pixels that have the corresponding color. So, comparing the colors of two regions is equivalent to computing the distance between the histograms of the two regions.
Spatial constraints class
class Contraintes_spatiales inherit Object read
  type tuple(x_centre: integer,
             y_centre: integer,
             x_MBR: integer,
             y_MBR: integer,
             length_MBR: integer,
             width_MBR: integer,
             localization: string)
  method public init: Contraintes_spatiales,
         public display_features,
         public suppression_contraintes_spatiales,
         public resolution: integer,
         public offset(width: integer): integer,
         public dimension_ball(width: integer): integer,
         public xmax: integer,
         public ymax: integer,
         public position(XG: integer, YG: integer, image: Image)
end;
The gravity center of the region is denoted (x_g, y_g).
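A common way to compute a gravity center, assumed here for illustration, is the mean of the coordinates of the pixels of the region. The function name is hypothetical.

```python
def gravity_center(pixels):
    """Gravity center (x_g, y_g) as the mean of the region's pixel coordinates."""
    if not pixels:
        raise ValueError("region has no pixels")
    n = len(pixels)
    x_g = sum(x for x, _ in pixels) / n
    y_g = sum(y for _, y in pixels) / n
    return x_g, y_g


# Hypothetical region: the four corner pixels of a square.
print(gravity_center([(0, 0), (2, 0), (2, 2), (0, 2)]))
```

Unlike the top-left corner of the minimal rectangle, this point depends on how the pixels are distributed inside the region, so it moves towards the denser part of an irregular region.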
4 Conclusion
We presented a platform for the storage and content-based retrieval of images. The platform has resulted in a prototype with two major components: extraction and queries. We combine manual, semi-automatic and automatic approaches in order to extract image region features (color, texture, shape, spatial constraints, keywords). Regions, which are extracted semi-automatically, are significant and relevant for the final user. Shapes are represented by powerful mathematical formulas, so the representation of a shape is independent of rotation, scale and translation. The extracted features are modeled, using object technology, in the O2 database.

References
... In IEEE Transactions on Pattern Analysis and Machine Intelligence, July 1995.
[Far 97] Fargeaud P., "Retrieval and Extraction by Content ...
[Gob 92] ... "... Multimedia Information System", in Proc. of VLDB-92.
[Jai 95] Jain R., "Infoscopes: Multimedia Information Systems", Virage Inc. and University of California at San Diego, 1995.
[Meh 95] Mehrotra R. and Gary J. E., ..., in IEEE Computer, pages 57-62, September 1995.
[Mou 90] Moulet, "Segmentation process for both fixed and animated images", PhD thesis, IRESTE, Nantes University, 1990.
[Ogl 95] Ogle V. E., Stonebraker M., "Chabot: Retrieval from a Relational Database of Images", IEEE Multimedia (September 95), University of California at Berkeley.
[Pen 94] Pentland A., Picard R. W., Sclaroff S., ..., in Proc. of SPIE-94, pages 34-47, Bellingham, Washington, 1994.
[Sav 96] Savory I., "Retrieve Images by Textural Content", Report of Master degree in Computer Sciences, Nantes University, 21 June 1996.
[Smi 96] Smith J. R., Chang S. F., "VisualSEEk: a fully automated content-based image query system", ACM Multimedia'96, November 1996.
[Sri 95] Srihari R. K., "Automatic Indexing and Content-Based Retrieval of Captioned Images", in IEEE Computer, pages 49-56, September 1995.