DRAFT Published as: Strasunskas, D. and Tomassen, S.L. (2009) 'A Role of Ontology in Enhancing Semantic Search: the EvOQS Framework and its Initial Validation', Int. J. of Knowledge and Learning, Vol. 4, No. 4, pp. 398-414
A Role of Ontology in Enhancing Semantic Search: the EvOQS Framework and its Initial Validation

Darijus Strasunskas
Dept. of Industrial Economics and Technology Management, Norwegian University of Science and Technology, NO-7491 Trondheim, Norway
[email protected] http://folk.ntnu.no/dstrasun/
Stein L. Tomassen
Dept. of Computer and Information Science, Norwegian University of Science and Technology, NO-7491 Trondheim, Norway
[email protected] http://folk.ntnu.no/steint/

Abstract: Web search performance and efficiency are critical for many people and organizations in the emerging knowledge society. Nowadays, ontologies are applied in a number of ontology-based information retrieval systems in order to improve the performance of these systems, wherein the quality of the ontology plays an important role. An important body of work exists in both the information retrieval evaluation and ontology quality assessment areas. However, there is a lack of task- and scenario-based quality assessment methods. In this paper we discuss a framework to assess the fitness of an ontology for use in ontology-based search. We define metrics for ontology fitness to particular search tasks and metrics for ontology capability to enhance recall and precision. Further, we discuss preliminary results of an experiment showing the applicability of the proposed framework and the value of ontology quality in ontology-driven Web search.

Keywords: Ontology application, ontology quality, quality framework, information retrieval, semantic search.
1 Introduction
The Web is becoming a dominant information source for learning and acquiring new knowledge. Consequently, Web search is becoming one of the main means of accessing the required information, and the task of improving search performance is one of the most relevant for the knowledge society. Some approaches rely on semantic annotations (e.g., Yang (2006); Bergamaschi et al. (2007)) by adding additional metadata (Lytras & Sicilia, 2007); some enhance clustering of retrieved documents according to topic (e.g., Panagis et al., 2006); some develop powerful querying languages (e.g., Bry et al., 2005). Hence, many efforts are devoted to research on improving information retrieval (IR) with the help of ontologies that encode domain knowledge (Castells et al., 2007; Suomela & Kekalainen, 2005; Tomassen, 2008), i.e. by so-called advanced knowledge technologies (Kalfoglou, 2007). The literature reports improvements of search using ontology-based information retrieval tools (e.g., Suomela & Kekalainen (2005); Aitken & Reid (2000)), and also indicates that inexperienced users find an ontology helpful for comprehending the domain, familiarizing themselves with the terminology and formulating queries (Suomela & Kekalainen, 2005; Brasethvik, 2004). In such cases, visualization of the ontology is a certain quality of an ontology-based search system, but that concerns the graphical user interface, not the ontology itself. In addition, it was found that linguistic enhancements (inclusion of synonyms) close the gap between ontology concepts and document text (Aitken & Reid, 2004; Brasethvik, 2004), and enable the ontology to perform better for queries that are required to find only a small number of documents. However, there is a lack of systematic investigation of which ontology features enhance or impair search performance. On the other hand, the ontology's ability to capture the content of the universe of discourse at the appropriate level of granularity and precision, and to offer the application understandable, correct information, are important features that are addressed in many ontology quality frameworks (e.g., Burton-Jones et al., 2005; Alani et al., 2006; Gangemi et al., 2005). However, ontologies are not fixed specifications but always depend on the context of use. There are many different criteria proposed for ontology evaluation, but in order "to be meaningful and relevant, criteria need to be connected to scenarios of use, and these scenarios to be explained and further analyzed need to be connected to activity models" (Giboin et al., 2002). Therefore, the evaluation of an ontology also needs to take into account usage scenarios as well as the behaviour of the application. The objective of this paper is thus to analyze the role of ontology quality in an ontology-driven Web search application. Web search is characterized by a focus on retrieving documents rather than browsing knowledge or answering a question. This ontology application limits the ontology quality aspects that we consider and analyse here.
Typically, subclass hierarchies are considered sufficient for document retrieval, and any further ontology specification (properties and axioms) is required only for knowledge browsing and question answering (Gulla et al., 2007). However, here we consider the whole spectrum of ontology elements and show that ontology quality improvement (by specifying equivalent and disjoint classes, adding instances and properties) can significantly improve search results. The rest of the paper is structured as follows. First, we briefly review related work from two main areas: ontology-driven information retrieval and ontology quality. Second, we present a revised version of the framework for Evaluation of Ontology Quality for Search (EvOQS) by Strasunskas & Tomassen (2007). Then we elaborate on an experiment and an evaluation of ontology quality aspects and their role in search performance. The main results are presented and discussed. Finally, we conclude the paper and outline future work.
2 Related work
An increasing number of recent information retrieval systems make use of ontologies to help users clarify their information needs and expand their queries. First, we summarize ontology-driven information retrieval (OdIR) methods, taking a closer look at what role the ontology plays in the proposed methods. Second, we recapitulate the state of the art in ontology quality evaluation.

2.1 Ontology-driven Information Retrieval

In general, ontology-driven information retrieval is either built on top of a knowledge base or on top of a vector-space engine (i.e. a conventional search engine). Correspondingly, they have different targets: the former aims to answer questions and browse the knowledge, while the latter focuses on improving large-scale search results. Below we detail these two types of OdIR systems; for a more comprehensive overview the keen reader is referred to Mangold (2007).

Knowledge base based OdIR. These approaches use reasoning mechanisms and ontology querying languages to retrieve instances from a knowledge base. There, documents are treated either as instances or are annotated using ontology instances (Kiryakov et al., 2004; Paralic & Kostial, 2003; Rocha et al., 2004; Song et al., 2005), i.e. their focus is on retrieving instances rather than documents. The main disadvantages of these approaches are as follows: the use of formal ontology querying languages hampers their adoption by inexperienced users, and they require annotation of Web resources - a tedious process whose results may be misused by content providers for the purpose of giving the documents a misleadingly higher ranking by the search engines. These characteristics make the KB-based approaches problematic for large-scale Web search.

Integrated with the vector space model. These approaches combine OdIR with the traditional vector space model. Some start with semantic querying using ontology query languages (e.g. SPARQL, RDQL, OWL-QL) and use the resulting instances to retrieve relevant documents using the vector space model (Castells et al., 2007; Kiryakov et al., 2004; Nagypal, 2007; Tomassen, 2008). Castells et al. (2007) use weighted annotations when associating documents with ontology instances; the weights are based on the frequency of occurrence of the instances in each document, i.e. term frequency. Nagypal (2007) combines ontology usage with the vector-space model by extending a non-ontological query; there the ontology is used to disambiguate queries. A simple text search is run on the concepts' labels and users are asked to choose the proper term interpretation.

2.2 Evaluation of Ontology Quality

An important body of work exists in the ontology quality assessment area (e.g., Burton-Jones et al., 2005; Gangemi et al., 2005; Lozano-Tello & Gomez-Perez, 2004; Hu et al., 2007). Most of these aim at defining a generic quality evaluation framework and, therefore, do not take into account the specific application of the ontologies. For instance, the Ontometric methodology (Lozano-Tello & Gomez-Perez, 2004) defines a Reference Ontology that consists of metrics to evaluate the ontology, methodology, language and tool (used to develop the ontology) - 117 metrics in total. The OntoQA framework (Tartir et al., 2005) is proposed to evaluate ontologies and knowledge bases. Its metrics are divided into two categories: schema metrics and instance metrics. The first category evaluates the ontology design and its potential for rich knowledge representation. The second category evaluates the effective usage of the ontology to represent the knowledge modelled in it.
Analysis of the literature shows that ontologies are typically examined according to five aspects: syntax, vocabulary, structure, population of classes and usage statistics.

Syntax evaluation. Evaluation of syntax checks whether an ontology is syntactically correct. This quality aspect is important in any ontology-based application, since syntactic correctness is a prerequisite for being able to process an ontology at all. Syntactic quality is a central quality aspect in most quality frameworks (e.g., Burton-Jones et al. (2005); Lozano-Tello & Gomez-Perez (2004)).

Cohesion to domain and vocabulary. Congruence between an ontology and a domain is another important aspect in ontology quality evaluation. Here, ontology concepts (including taxonomical relations and properties) are checked against the terminology used in the domain. In the OntoKhoj approach (Patel et al., 2003) ontologies are classified into a directory of topics by extracting textual data from the ontology (i.e. names of concepts and relations). Similarly, Brewster et al. (2004) extracted a set of relevant domain-specific terms from documents. The amount of overlap between the domain-specific terms and the terms appearing in the ontology is then used to measure the fit between the ontology and the corpus. A similar lexical approach is taken in EvaLexon (Spyns & Reinberger, 2005), where recall/precision-type metrics are used to evaluate how well ontology triples were extracted from a corpus. Burton-Jones et al. (2005) define a metric called accuracy that is measured as a percentage of false statements in the ontology.

Structural evaluation. Structural evaluation deals with the assessment of taxonomical relations vs. other semantic relations, i.e. the ratio of IsA relationships to other semantic relationships in the ontology is evaluated. The presence of various semantic relationships indicates the richness of the ontology. In OntoSelect (Buitelaar et al., 2004) a metric called structure is used. The value of the structure measure is simply the number of properties relative to the number of classes in the ontology. Similarly, the Density Measure defined in (Alani et al., 2006) indicates how well a given concept is defined in the ontology, while relationship richness (Tartir et al., 2005) reflects the diversity of relations and the placement of relations in the ontology.

Population of classes. This quality aspect is based on instance-related metrics. Tartir et al. (2005) define class richness, which measures how instances are distributed across classes: the number of classes having instances is compared with the overall number of classes. Average population (Tartir et al., 2005) indicates the number of instances compared to the number of classes. It is used to determine how well the knowledge base has been populated.
Table 1. Summary of main approaches to ontology evaluation

Quality framework                                Syntax      Domain     Structural   Population   Usage
                                                 evaluation  cohesion   evaluation   of classes   statistics
AKTiveRank (Alani et al., 2006)                              X          X
OntoKhoj (Patel et al., 2003)                                X
Ontometric (Lozano-Tello & Gomez-Perez, 2004)    X                      X
OntoQA (Tartir et al., 2005)                                            X            X
OntoSelect (Buitelaar et al., 2004)                                     X
oQual (Gangemi et al., 2005)                                                                      X
Semiotic metrics (Burton-Jones et al., 2005)     X           X                                    X
Swoogle (Ding et al., 2004)                                                                       X
Usage statistics and metadata. Evaluation of this aspect focuses on the level of annotation of ontologies, i.e. the metadata about an ontology and its elements. Three basic levels of usability profiling are defined in (Gangemi et al., 2005) as follows: recognition annotations take care of user-satisfaction, provenance and versioning information; efficiency annotations deal with application-history information; and the last level concerns organizational-design information. Burton-Jones et al. (2005) define similar metrics: relevance assesses the number of statements that involve syntactic features marked as useful or acceptable to the user/agent, while history accounts for how many times a particular ontology has been accessed relative to other ontologies. Furthermore, the Swoogle approach (Ding et al., 2004) ranks retrieved ontologies based on references between them. An analogous metric, called authority - i.e. how many other ontologies use concepts from this ontology - is defined in (Burton-Jones et al., 2005). Table 1 summarizes the ontology evaluation approaches with respect to the five aspects discussed above.

In summary, cohesion to domain terminology, measured as a direct match of the vocabulary used to denote concepts in the ontology with the terminology used in the text corpora, has a positive impact on overall OdIR performance. Lexical fit allows better adoption of an ontology, both from the user and the document collection perspectives. Evaluation of the structural aspect determines the richness of an ontology, and is therefore important for both KB-based and vector-space-model-based OdIR. Consequently, some of the metrics and criteria discussed above are applicable and feasible for assessing the capability of ontologies to enhance information retrieval. However, there is a lack of a systematic framework to assess the fitness of ontologies for a particular search strategy and/or OdIR approach. Adequate optimality criteria should be selected to enable quality estimation of OdIR. These measures should be related to the users' information needs.
3 A Framework for Evaluation of Ontology Value in Search Application
In this section we present the EvOQS (Evaluation of Ontology Quality for Searching) framework, including the functional steps and assessment criteria as defined in Figure 1. The framework defines a stepwise ontology selection procedure and metrics. Ontology quality aspects are defined with respect to the search tasks and search enhancement requirements. The framework consists of three steps, as follows.

Step 1. Generic quality evaluation. This initial step concerns filtering out poor-quality (i.e. syntactically incorrect) and irrelevant ontologies. A more detailed account of this step is provided in subsection 3.1.

Step 2. Search task fitness. This step concerns evaluation of ontology fitness for a particular search task. Typical search tasks are discussed in subsection 3.2. For instance, the ratio of taxonomic vs. non-taxonomic relationships is important when selecting an appropriate ontology for exploratory and comprehensive search tasks.

Step 3. Search enhancement capability. This final step concerns evaluating the vocabulary of ontologies. Here we account for the availability of internal lexical resources in ontologies, i.e. the presence of specified synonyms and alternative labels that might potentially be used for query expansion. A more detailed account of this step is given in subsection 3.3.
Figure 1. The EvOQS framework for ontology fitness in Web search
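The three steps can be read as a filter-then-rank pipeline over candidate ontologies. The following is a minimal sketch of that control flow (the field names and metric values are hypothetical; the actual EvOQS prototype is implemented in Java on top of the OWL API, as noted in subsection 3.3):

```python
# Hypothetical sketch of the EvOQS three-step selection pipeline.
# Step 1 filters out syntactically incorrect/irrelevant ontologies;
# steps 2 and 3 rank the survivors for a given search task.

def select_ontologies(ontologies, task, min_generic=0.5):
    # Step 1: generic quality (syntax + domain fitness) acts as a filter
    candidates = [o for o in ontologies if o["generic_quality"] >= min_generic]
    # Step 2: fitness for the given search task (FFF, EXF or COF)
    # Step 3: search enhancement capability (REC, PEC) breaks ties
    return sorted(
        candidates,
        key=lambda o: (o["task_fitness"][task], o["enhancement"]),
        reverse=True,
    )

ontologies = [
    {"name": "wine",   "generic_quality": 0.9,
     "task_fitness": {"fact_finding": 0.7, "exploratory": 0.4}, "enhancement": 0.6},
    {"name": "travel", "generic_quality": 0.8,
     "task_fitness": {"fact_finding": 0.5, "exploratory": 0.9}, "enhancement": 0.3},
    {"name": "broken", "generic_quality": 0.1,
     "task_fitness": {"fact_finding": 0.9, "exploratory": 0.9}, "enhancement": 0.9},
]
ranked = select_ontologies(ontologies, "exploratory")
# "broken" is filtered out in step 1; "travel" ranks first for exploratory search
```

The point of the staged design is that the cheap generic check prunes the candidate set before the task-specific metrics of steps 2 and 3 are computed.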
3.1 Generic Quality Evaluation

This step evaluates syntactic correctness and domain fitness. For the syntactic correctness we define a trivial measure (Eq. 1).

SC = λ · 1/E    (1)
Where E is the number of error messages generated by a parser, and λ ∈ Δ, where Δ is the set of OWL sub-language1 preference weights, i.e. Δ = {0.0, 0.5, 1.0}. For instance, based on a particular implementation of OdIR, OWL DL might be the preferable ontology language, with OWL Lite as second choice. Correspondingly, an ontology in OWL DL is given a preference weight λ = 1.0; OWL Lite, λ = 0.5; and OWL Full, λ = 0.0. Furthermore, these coefficients can be related to a particular search task. For instance, an ontology specification in the form of a subject hierarchy/taxonomy is enough to support an exploratory search task (for more details, see the next subsection); therefore, an ontology specified in OWL Lite is appropriate for this task. For the domain fitness substep we adopt the AKTiveRank algorithm (Alani et al., 2006).

3.2 Search Task Fitness

We have adopted a classification of search tasks into three categories: fact-finding, exploratory, and comprehensive search tasks (Aula, 2003). In fact-finding, a precise set of results is important, while the number of retrieved documents is less important. In an exploratory search task, the user wants to obtain a general understanding of the search topic; consequently, high precision of the result set is not necessarily the most important thing, nor is a high level of recall (Aula, 2003). Finally, the concern of a comprehensive search task is to find as many documents as possible on a given topic, therefore recall and precision should both be as high as possible. Below we discuss what ontology features are needed to support these tasks.

Fact-finding. Here, high precision can be achieved by using precise terms or phrases in the query, and typically by formulating a query consisting of several terms. In order to enhance results in a fact-finding search task, the provided concepts need to be extended by their instances, object properties and datatype properties.

Exploratory search.
Here, the user may find topic-related documents by extending a simple keyword-based search with subclass concepts.

Comprehensive search. In order to cover a broader topic, hypernyms, sibling concepts and object properties are included in the query (in addition to hyponyms), to cover the most important aspects of the search topic.

Based on the above discussion we define metrics to measure ontology fitness for a particular search task. The metrics are defined for a cluster of concepts (i.e. a fragment of an ontology). Values computed for a cluster can be used to assess a particular query (i.e. the concepts used to formulate the query), or the notion of a cluster can be extended to the whole ontology. Consequently, evaluation of an ontology indicates the general fitness of the ontology for a particular search task, while metrics computed for a cluster allow evaluating an ontology-based query more rigorously with regard to its impact on search results. In other words, ontology evaluation should be used to pre-select existing ontologies, while cluster (query) evaluation is useful for a thorough analysis of search results. We define a cluster as a set of concepts of interest (e.g., the concepts used in a query to specify information needs). In evaluating a cluster we investigate the level of domain knowledge specified about each concept in the cluster, i.e. direct relationships (object and datatype properties, super- and sub-class relations) and associated instances. Namely, we define a coefficient for a cluster's Fact-Finding Fitness (FFF):

FFF_cl = α · I_cl / C_cl + β · (OP_cl + DP_cl) / C_cl    (2)

1 http://www.w3.org/TR/2004/REC-owl-guide-20040210/#OwlVarieties
Where I is the number of instances associated with the concepts in a cluster (cl), and OP and DP are the numbers of owl:ObjectProperty and owl:DatatypeProperty constructs, respectively. Here α and β are adjustment weights specific to the implementation; their purpose is discussed later. Fitness of an ontology for the exploratory search task is defined as the arithmetic average of subclass concepts associated with the concepts in the cluster under evaluation. The Exploratory search task Fitness (EXF) coefficient is defined in Eq. 3.

EXF_cl = SubC_cl / C_cl    (3)
Where SubC is the number of subclasses specified for the concepts in a cluster (cl) or, eventually, defined in an ontology. Finally, ontology fitness for the comprehensive search task is defined by the Comprehensive search task Fitness (COF) coefficient in Eq. 4.

COF_cl = (β · OP_cl + α · (SupC_cl + SubC_cl + SibC_cl)) / C_cl    (4)
Where C is the number of concepts in a cluster (cl), as above, OP is the number of object properties for the concepts in the cluster, and SupC, SubC and SibC are the numbers of super-, sub- and sibling concepts for a particular concept, respectively.

3.3 Search Enhancement Capability

In order to improve search results, query expansion is typically used, where a query is refined to improve both recall and precision. To enhance precision, the following ontology elements are handy for query expansion: sub-concepts, object properties and the concepts related by those properties, and disjoint concepts (to be used with the Boolean operator NOT). These elements provide lexical resources for the enhancement of precision. To enhance recall, we may use super-, sub- and sibling concepts, their instances, synonyms (specified by the rdfs:label construct) and closely related concepts (related by object properties). This set of ontology elements serves as a set of lexical entries providing related terms. Consequently, we define a recall enhancement capability (REC) that shows the average number of synonyms and related terms specified for the concepts in a cluster (or ontology) (see Eq. 5).

REC_cl = α · (L_cl + eC_cl) / C_cl + β · (uO_cl + iO_cl) / C_cl    (5)
Where L = rdfs:label, eC = (owl:equivalentClass + owl:sameAs), iO = owl:intersectionOf, uO = owl:unionOf, C = owl:Class, and α, β are adjustment weights specific to the implementation. A precision enhancement capability (PEC) is defined as follows (see Eq. 6).

PEC_cl = α · (cO_cl + dW_cl + uO_cl + iO_cl) / C_cl + β · (OP_cl + DP_cl) / C_cl    (6)

Where cO = owl:complementOf, dW = owl:disjointWith, uO = owl:unionOf, iO = owl:intersectionOf, C = owl:Class, and α, β are adjustment weights.
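Computationally, the cluster metrics above reduce to simple ratios over counts of OWL constructs. The following is a minimal sketch (the counts for the example cluster and the α = β = 0.5 weighting are illustrative assumptions; the framework only requires α + β = 1, and the actual prototype is implemented in Java on the OWL API):

```python
# Sketch of the EvOQS cluster metrics (Eqs. 2-6) over raw element counts.
# C: concepts; I: instances; OP/DP: object/datatype properties;
# SupC/SubC/SibC: super-/sub-/sibling concepts; L: rdfs:label entries;
# eC: equivalentClass/sameAs; uO/iO: unionOf/intersectionOf;
# cO: complementOf; dW: disjointWith. All counts are per cluster.

def fff(C, I, OP, DP, a=0.5, b=0.5):
    # Eq. 2: Fact-Finding Fitness - instances and properties per concept
    return a * I / C + b * (OP + DP) / C

def exf(C, SubC):
    # Eq. 3: Exploratory search task Fitness - subclasses per concept
    return SubC / C

def cof(C, OP, SupC, SubC, SibC, a=0.5, b=0.5):
    # Eq. 4: Comprehensive search task Fitness
    return (b * OP + a * (SupC + SubC + SibC)) / C

def rec(C, L, eC, uO, iO, a=0.5, b=0.5):
    # Eq. 5: Recall Enhancement Capability - synonyms and related terms
    return a * (L + eC) / C + b * (uO + iO) / C

def pec(C, cO, dW, uO, iO, OP, DP, a=0.5, b=0.5):
    # Eq. 6: Precision Enhancement Capability - discriminating constructs
    return a * (cO + dW + uO + iO) / C + b * (OP + DP) / C

# A hypothetical cluster of 4 concepts:
print(fff(C=4, I=6, OP=3, DP=1))                # 1.25
print(exf(C=4, SubC=8))                         # 2.0
print(cof(C=4, OP=3, SupC=2, SubC=8, SibC=5))   # 2.25
```

Note that these are the raw ratios; as described next, the coefficients are normalized to the range [0..1] before ontologies are compared.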
However, the applicability of the above-defined metrics depends a lot on the particular implementation of OdIR. Therefore, we include adjustment weights2 (α + β = 1) to tailor the metrics (by specifying the preferable OWL constructs) to a particular implementation. Furthermore, all coefficients are normalized to fall into the range [0..1]. A prototype of the EvOQS framework has been implemented in Java using the OWL API3.
4 Experiment
For the assessment of the role of ontology quality in an ontology-driven search application, we have conducted an experiment with four ontologies (from different domains) and two versions of each ontology. The experimental settings are detailed as follows.

4.1 Web information search tool

For the experiment we have used the WebOdIR system4 (see Figure 2), an ontology-driven Web search system (Tomassen, 2008). Users can specify one or more concepts related to a domain of interest when formulating a query. In addition, it is possible to specify a set of keywords to narrow the search even further. The Web user interface is meant to be as simple as possible for the end user.
Figure 2. An interface of WebOdIR
In WebOdIR, ontologies are used extensively in the search process. Each concept specified in an ontology is extended with a feature vector (fv) that relates the concept to the terminology used in a particular domain (a document collection). The feature vectors are constructed as follows. Documents for each concept and individual (including their neighbours) specified in an ontology are retrieved, and then the documents are clustered to group those having high similarity. For each cluster a set of candidate terms is created, the terms are ranked, and finally feature vectors are associated with the concept (for more details see Tomassen (2008)). In the current implementation, both concepts and instances are available for users to express their queries. The prototype used the Yahoo! Web Search API as the backend search engine.

4.2 Experiment settings and materials

The participants in our experiment were mainly 4th-year students. In total, 21 subjects participated; they were offered payment for the time used after full completion of the experiment. The experiment consisted of two parts. The first part included formulating search queries for both WebOdIR and Yahoo. The participants were presented with four domains, with two topics of interest for each domain (see Table 2). They had to formulate 16 queries in total, eight to be submitted to WebOdIR and eight to Yahoo. The participants were divided into two groups that used different ontologies for the same domain. The first group used the original ontology while the second group used a modified version of

2 See equations 2, 4, 5 and 6.
3 http://owlapi.sourceforge.net/
4 Prototype of WebOdIR, http://129.241.110.220
the original ontology. The original ontology was modified to include more relations and/or instances to see whether this would influence the search results. Different feature vectors were generated as a result of the modifications to the ontologies. Group 1 contained 10 participants, while group 2 had 11 participants. In total, users executed 81 queries using the original ontologies and 92 queries using the modified ontologies, and 152 simple keyword-based queries were executed directly against Yahoo. However, here we focus on analysing search performance using ontologies of different quality, therefore only ontology-based search is analyzed further. The keen reader is directed to Tomassen (2008) for a comparison of Yahoo and WebOdIR.

Table 2. Search tasks given to participants of the experiment

Food & Wine domain (http://www.w3.org/2001/sw/WebOnt/guide-src/wine.owl integrated with http://www.w3.org/2001/sw/WebOnt/guide-src/food.owl)
1. Explorative search task. Imagine that you are going to prepare a dinner for tonight. You plan to make beef curry and would like some wine to drink with this meal. Find out what grapes are used for wines suitable for this meal.
2. Fact-finding search task. Imagine that you are going to prepare a dessert as well. The main component of this dessert is chocolate, but it also contains some sweet fruits. You would like to find the perfect dessert wine but don't know which; try to find it.

Travel domain (http://protege.cim3.net/file/pub/ontologies/travel/travel.owl)
3. Comprehensive search task. Imagine that you are going on a vacation and would like to try a safari. You don't know yet which country or what kind of safari you would like. Try to get an overview of the kinds of safaris that are available.
4. Fact-finding search task. Suppose that you would like to see leopards and have decided to go on a leopard safari but don't know where. Explore the possibilities for a leopard safari.

Animal domain (http://nlp.shef.ac.uk/abraxas/ontologies/animals.owl)
5. Explorative search task. Imagine that you should write an article about jaguars but don't know very much about jaguars. Try to find some facts about jaguars.
6. Comprehensive search task. Imagine that you would also like to write an article about jaguars, leopards and similar kinds of cats. Try to get an overview of the cat family.

Autos domain (http://gaia.isti.cnr.it/~straccia/download/teaching/SI/2006/Autos.owl)
7. Fact-finding search task. Imagine that you have heard that your neighbour has bought a new car of the brand Saturn. Further, imagine that you have never heard of this brand before. Try to find some facts about this brand.
8. Comprehensive search task. Suppose your neighbour has recently bought a beautiful new car. Therefore, you would like to impress your neighbour as well by getting a bigger car, an SUV. However, you don't know much about cars; try to get an overview of what SUVs are.
The participants needed to mark each of the top 10 retrieved documents according to perceived relevance. The relevance score for each query has been calculated using the following equation:

Score_q = (1/2) · Σ_{i=1..10} PD_i × PP_i    (7)
Where PD_i is the individual score for document D_i, and PP_i is the weighting factor for position P_i. Document scores are as follows: -1 for trash; 0 for non-relevant or duplicate; 1 for related; and 2 for a good document. Document ranking positions are weighted as follows: 1st - 20; 2nd - 15; 3rd - 13; 4th - 11; 5th - 9; 6th & 7th - 8; 8th & 9th - 6; 10th - 4. Consequently, the final score falls into the range [-50, 100]. The relevance score substitutes for a conventional precision metric. We decided to focus on precision instead of recall since we targeted Web search, where precision (i.e. relevant documents at top positions) is more important. Consequently, in this experiment we focus on validating the metrics defined in the second and, partially, the third steps of the EvOQS framework. The first step (generic quality evaluation) was conducted manually when preparing the experiment and, furthermore, is outside the scope of this paper.
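The scoring scheme of Eq. 7 can be reproduced directly from the position weights and per-document scores. A small sketch (the usage example scores are the two boundary cases, not experimental data):

```python
# Eq. 7: query relevance score from per-document scores and position weights.
POSITION_WEIGHTS = [20, 15, 13, 11, 9, 8, 8, 6, 6, 4]  # ranks 1..10; sums to 100

def relevance_score(doc_scores):
    # doc_scores: perceived relevance of the top-10 documents, each in
    # {-1: trash, 0: non-relevant/duplicate, 1: related, 2: good}
    assert len(doc_scores) == 10
    return 0.5 * sum(pd * pp for pd, pp in zip(doc_scores, POSITION_WEIGHTS))

print(relevance_score([2] * 10))   # 100.0 (all documents "good")
print(relevance_score([-1] * 10))  # -50.0 (all documents "trash")
```

Since the position weights sum to 100, the two extreme markings confirm the stated score range of [-50, 100].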
5 Results
All four ontologies were modified by adding instances (all ontologies), specifying additional object properties (travel, animal and wine ontologies) and introducing equivalent classes (animal and autos ontologies). The difference in ontology fitness metrics and precision enhancement capability is displayed in Table 6 (see footnotes 5 and 6). Consequently, comparing relevance scores for the original ontologies vs. the modified ones, we have found an improvement in mean score of 10.6% (overall mean relevance score of 42.1 for the original ontologies vs. 46.6 for the modified ontologies); see Table 5 and Figure 3 for a comparison per search topic.

Table 5. Average scores depending on ontology version
Table 6. Normalized values of the EvOQS metrics for search queries
As we can see from Figure 4, the changes in the ontologies have resulted in differences in the corresponding metrics. A decrease in a metric's value is attributed to the heterogeneous queries specified by users (i.e. different concepts used). An increase/decrease of ontology quality (i.e., the quality of the concept clusters used in search) had a corresponding effect on the search results, with the exception of topic 4. This can be attributed to variance in users' perception of what constitutes a relevant document, since no explicit instance "leopard safari" was added, just indirect instances such as "Africa big 5 safari"; therefore some results could have been perceived as irrelevant. Consequently, this caused a bigger variance between participants: with ontology version 1 the mean score was 71.4, st.dev. 14.7, coefficient of variation 21%, while ontology version 2 resulted in a mean score of 63.0, st.dev. 25.2 and coefficient of variation 40%. In general, the inclusion of more instances and object properties improved the mean relevance score of fact-finding search tasks, while the addition of disjoint and equivalent concepts resulted in better performance on exploratory and comprehensive tasks (see Figure 4 a)). However, the smallest increase in search performance was observed in the comprehensive search tasks (topics 3, 6 & 8). There, performance decreased on topics 3 and 8, and only because of the dramatic increase of the result on topic 6 was an overall improvement achieved. This can be attributed to the significantly shorter concept-based queries used by participants in Group 2 (i.e. the ones using the modified ontologies): 17% and 39% shorter queries in topics 3 and 8, respectively.
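The reported spread for topic 4 can be double-checked directly, since the coefficient of variation is simply the standard deviation relative to the mean:

```python
# Coefficient of variation (in percent) for the topic-4 scores reported above.
def variation_coefficient(mean, stdev):
    return stdev / mean * 100

print(round(variation_coefficient(71.4, 14.7)))  # 21 (ontology version 1)
print(round(variation_coefficient(63.0, 25.2)))  # 40 (ontology version 2)
```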
5 The 4th row in the table contains the values of the fitness measure corresponding to the search task.
6 Values are normalized and computed as an average over the concept-based search queries.
Figure 3. Comparison of ontology quality and search performance
Figure 4 b) visualizes the results based on the number of concepts/instances used in the queries. The concepts in the modified ontologies had more precise knowledge specified about them (e.g., disjoint subclass relations), which helped to better discriminate the retrieved documents and improved the mean relevance scores.
[Figure 4 comprises two charts comparing ontology versions 1 and 2: a) mean score per search type (comprehensive, explorative, fact-finding); b) mean score per amount of ontology elements (concepts and instances) in the query.]
Figure 4. Analysis of ontology quality impact on the mean of relevance score
6 Conclusions and Future Work
The Web is a constantly growing repository of knowledge that is vital for the evolving knowledge society. In this article we have therefore focused on improving access to this repository through search; specifically, we have investigated which properties of an ontology can enhance the performance of ontology-driven Web search. We have proposed the EvOQS framework to assess ontology fitness and the capability of an ontology to improve ontology-based search. The framework consists of three functional steps that guide the selection of an appropriate ontology for a particular search task: the first step filters out syntactically incorrect and irrelevant ontologies; the second step classifies ontologies according to their fitness for a particular search task; and the last step classifies ontologies based on their capability to enhance recall and precision. Moreover, we have discussed preliminary results of an experiment showing how different ontology quality aspects can improve ontology-driven Web search performance. The experiment has shown that even slight modifications of the ontologies can significantly improve performance. However, a more controlled experiment should be conducted, since in the current experiment the participants were allowed to interpret the task, freely construct the query, and then interpret the relevance of the retrieved documents. Consequently, we found considerable variance between users' assessments.
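The three-step selection described above can be sketched as a simple pipeline. This is only a schematic illustration: all three predicate/scoring functions are hypothetical placeholders, standing in for a syntax/relevance filter, the fitness metrics, and the recall/precision enhancement metrics of the EvOQS framework.

```python
# Schematic sketch of the EvOQS three-step ontology selection pipeline.
# The callables are placeholders, not part of the published framework.
from typing import Callable, Iterable, List

def select_ontologies(
    candidates: Iterable[str],
    is_valid_and_relevant: Callable[[str], bool],  # step 1: filter
    fitness_for_task: Callable[[str], float],      # step 2: task fitness
    enhancement_score: Callable[[str], float],     # step 3: recall/precision
) -> List[str]:
    """Return candidate ontologies ranked by the three EvOQS steps."""
    filtered = [o for o in candidates if is_valid_and_relevant(o)]
    # Rank primarily by task fitness, then by enhancement capability.
    return sorted(
        filtered,
        key=lambda o: (fitness_for_task(o), enhancement_score(o)),
        reverse=True,
    )
```

For example, with three candidate ontologies where one fails the filter, the remaining two come back ordered by their task-fitness scores.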
References
Aitken, S., Reid, S.: Evaluation of an ontology-based information retrieval tool. In: Gomez-Perez, A. et al., eds.: Workshop on the Applications of Ontologies and Problem-Solving Methods, ECAI 2000, Berlin (2000)
Alani, H., Brewster, C., Shadbolt, N.: Ranking ontologies with AKTiveRank. In: ISWC 2006. LNCS 4273, Springer-Verlag (2006) 1-15
Aula, A.: Query formulation in Web information search. In: Proceedings of IADIS Int. Conf. WWW/Internet, IADIS (2003) 403-410
Bergamaschi, S., Bouquet, P., Giazomuzzi, D., Guerra, F., Po, L., Vincini, M.: An Incremental Method for the Lexical Annotation of Domain Ontologies. Int. J. on Semantic Web and Information Systems 3(3) (2007) 57-80
Brasethvik, T.: Conceptual modelling for domain specific document description and retrieval - An approach to semantic document modelling. PhD thesis, NTNU, Trondheim, Norway (2004)
Brewster, C., Alani, H., Dasmahapatra, S., Wilks, Y.: Data driven ontology evaluation. In: Proceedings of Int. Conf. on Language Resources and Evaluation, Lisbon, Portugal (2004)
Bry, F., Koch, C., Furche, T., Schaffert, S., Badea, L., Berger, S.: Querying the Web Reconsidered: Design Principles for Versatile Web Query Languages. Int. J. on Semantic Web and Information Systems 1(2) (2005) 1-21
Buitelaar, P., Eigner, T., Declerck, T.: OntoSelect: A dynamic ontology library with support for ontology selection. In: Proceedings of the Demo Session at ISWC 2004, Hiroshima, Japan (2004)
Burton-Jones, A., Storey, V., Sugumaran, V., Ahluwalia, P.: A semiotic metrics suite for assessing the quality of ontologies. Data and Knowledge Engineering 55(1) (2005) 84-102
Castells, P., Fernandez, M., Vallet, D.: An adaptation of the vector-space model for ontology-based information retrieval. IEEE Transactions on Knowledge and Data Engineering 19(2) (2007) 261-272
Ding, L., Finin, T., Joshi, A., Pan, R., Cost, R.S., Peng, Y., Reddivari, P., Doshi, V., Sachs, J.: Swoogle: A search and metadata engine for the Semantic Web.
In: Proceedings of CIKM 2004, ACM Press (2004) 652-659
Gangemi, A., Catenacci, C., Ciaramita, M., Lehmann, J.: Ontology evaluation and validation. An integrated formal model for the quality diagnostic task. Technical report, ISTC-CNR, Trento, Italy (2005)
Giboin, A., Gandon, F., Corby, O., Dieng, R.: Assessment of ontology-based tools: a step towards systemizing the scenario approach. In: EON2002 workshop (2002)
Gulla, J., Borch, H., Ingvaldsen, J.: Ontology learning for search applications. In: ODBASE 2007. LNCS 4803, Springer-Verlag (2007) 1050-1062
Hu, B., Dasmahapatra, S., Lewis, P.: Semantic metrics. Int. J. of Metadata, Semantics and Ontologies 2(4) (2007) 242-258
Kalfoglou, Y.: Knowledge society arguments revisited in the semantic technologies era. Int. J. of Knowledge and Learning 3(2/3) (2007) 225-244
Kiryakov, A., Popov, B., Terziev, I., Manov, D., Ognyanoff, D.: Semantic annotation, indexing, and retrieval. Journal of Web Semantics 2(1) (2004) 49-79
Lozano-Tello, A., Gomez-Perez, A.: Ontometric: A method to choose appropriate ontology. Journal of Database Management 15(2) (2004) 1-18
Lytras, M.D., Sicilia, M-A.: Where is the value in metadata? Int. J. of Metadata, Semantics and Ontologies 2(4) (2007) 235-241
Mangold, C.: A survey and classification of semantic search approaches. Int. J. of Metadata, Semantics and Ontologies 2(1) (2007) 23-34
Nagypal, G.: Possibly imperfect ontologies for effective information retrieval. PhD thesis, University of Karlsruhe (2007)
Panagis, Y., Sakkopoulos, E., Garofalakis, J., Tsakalidis, A.: Optimisation mechanism for web search results using topic knowledge. Int. J. Knowledge and Learning 2(1/2) (2006) 140-153
Paralic, J., Kostial, I.: Ontology-based information retrieval. In: Proceedings of the 14th Intl. Conf. on Information and Intelligent Systems (IIS 2003), Varazdin, Croatia (2003) 23-28
Patel, C., Supekar, K., Lee, Y., Park, E.: OntoKhoj: A Semantic Web portal for ontology searching, ranking and classification. In: Proceedings of the Workshop on Web Information and Data Management, ACM Press (2003) 58-61
Rocha, C., Schwabe, D., de Aragao, M.: A hybrid approach for searching in the Semantic Web. In: Proceedings of WWW 2004, ACM Press (2004) 374-383
Song, J.F., Zhang, W.M., Xiao, W., Li, G.H., Xu, Z.N.: Ontology-based information retrieval model for the Semantic Web. In: EEE 2005, IEEE Computer Society (2005) 152-155
Spyns, P., Reinberger, M.L.: Lexically evaluating ontology triples generated automatically from texts. In: Proceedings of ESWC 2005. LNCS 3532, Springer-Verlag (2005) 563-577
Strasunskas, D., Tomassen, S.: Web search tailored ontology evaluation framework. In: Advances in Web and Network Technologies, and Information Management. LNCS 4537, Springer-Verlag (2007) 372-383
Suomela, S., Kekalainen, J.: Ontology as a search-tool: A study of real user's query formulation with and without conceptual support. In: Proceedings of ECIR 2005. LNCS 3408, Springer-Verlag (2005) 315-329
Tartir, S., Arpinar, I., Moore, M., Sheth, A., Aleman-Meza, B.: OntoQA: Metric-based ontology quality analysis. In: IEEE Workshop on Knowledge Acquisition from Distributed, Autonomous, Semantically Heterogeneous Data and Knowledge Sources, Houston, TX, USA, IEEE Computer Society (2005) 45-53
Tomassen, S.: Document Space Adapted Ontology: Architecture and Evaluation. In: Proceedings of 1st World Summit on the Knowledge Society. LNCS, Springer-Verlag (2008) in press
Yang, H-C.: A method for automatic construction of learning contents in semantic web by a text mining approach. Int. J. Knowledge and Learning 2(1/2) (2006) 89-105