Semantic Characterizations of Navigational XPath

Maarten Marx and Maarten de Rijke
Informatics Institute, University of Amsterdam
Kruislaan 403, 1098 SJ, Amsterdam, The Netherlands
{marx,mdr}@science.uva.nl

Abstract

We give semantic characterizations of the expressive power of navigational XPath (a.k.a. Core XPath) in terms of first order logic. XPath can be used to specify sets of nodes and sets of paths in an XML document tree. We consider both uses. For sets of nodes, XPath is as expressive as first order logic in two variables. For paths, XPath can be defined using four simple connectives, which together yield the class of first order definable relations which are safe for bisimulation. Furthermore, we give a characterization of the XPath expressible paths in terms of conjunctive queries.

1 Introduction

XPath 1.0 [17] is a variable free language used for selecting nodes from XML documents. XPath plays a crucial role in other XML technologies such as XSLT [21], XQuery [20] and XML schema constraints [19]. The recently proposed XPath 2.0 language [18] is much more expressive. It contains variables which are used in if-then-else, for, and quantified expressions. The available axis relations are the same in both versions of XPath.

What is missing at present is a clear characterization of the expressive power of XPath, be it either semantical or with reference to some well established existing (logical) formalism. As far as we know, Benedikt, Fan and Kuper [2] were the first and only to give characterizations, but only for positive fragments of XPath, and without considering the sibling axis relations. Their analysis can rather simply be expanded with the sibling axis, but adding negation asks for a different approach. This paper aims at filling this gap.

Characterizations of the kind we are after are useful in understanding and (re-)designing the language. They are also useful because they allow us to transfer known results and techniques to the world of XPath. Vianu [16] provides several examples to this effect. All characterizations we give with respect to other languages are constructive and given in terms of translations. An important issue in such comparisons is the succinctness of one language with respect to another. We only touch on this briefly.

SIGMOD Record, Vol. 34, No. 2, June 2005

We use the abstraction to the logical core of XPath 1.0 (called Core XPath) developed in [7, 8]. Below we will often speak of XPath instead of Core XPath. Core XPath is interpreted on XML document tree models. The central expression in XPath is the location path axis::node_label[filter], which, when evaluated at node n, yields an answer set consisting of nodes n′ such that the axis relation goes from n to n′, the node tag of n′ is node_label, and the expression filter evaluates to true at n′. Alternatively, axis::node_label[filter] can be viewed as denoting a binary relation, consisting of all pairs of nodes (n, n′) which stand in the above relation.

XPath serves two purposes. First and foremost, it is used to select nodes from a document. This use is formalized by the notion of answer set. We study the expressive power of XPath with respect to defining answer sets in Section 3. Our main result is that Core XPath is as expressive as first order logic restricted to two variables in the signature with three binary relations corresponding to the child, descendant and following_sibling axis relations and unary predicates corresponding to the node tags.

The second use of XPath is as a set of binary atoms in more expressive languages with variables such as XQuery. For instance, we might want to select all nodes x satisfying

(1)    ∃y(x descendant::A y ∧ ¬ x descendant::B/descendant::* y).

That is, the set of all points which start a path without B nodes ending in an A node. We study this use in Sections 4 and 5. With respect to the expressive power of the relations expressed by Core XPath we establish the following:

1. The set of relations expressible in Core XPath is closed under intersection but not under complementation.
2. The Core XPath definable relations are exactly those that are definable as unions of conjunctive queries whose atoms correspond to the XPath axis relations and to XPath's filter expressions.
3. The Core XPath definable relations are exactly those that can be defined from its axis and node-tag tests by composition, union, and taking the counterdomain (see footnote 1) of a relation.

The paper is organized as follows. The next section defines Core XPath. Sections 3 and 4 are about the expressive power of XPath for selecting sets of nodes, and selecting sets of paths, respectively. Section 5 establishes a minimal set of connectives for XPath.

Related work The paper most closely related to this work is [2], which characterizes positive XPath without sibling axis as existential positive first order logic. Similar results, stated in terms of conjunctive queries are obtained in [9]. Characterizations in terms of automata models have been given in [3, 15, 13, 14]. Connections with temporal logic have been observed by [7, 12] which sketch an embedding of the forward looking fragment of XPath into CTL. [1] exploits embeddings of subsets of XPath into computation tree logic to enable the use of model checking for query evaluation. [11] defines an extension of XPath in which every first order definable relation can be expressed. Closure under complementation is the distinguishing property of such languages: for expansions of Core XPath, it is equivalent to having full first order expressivity. Several authors have considered extensions far beyond XPath 1.0, trying to capture all of monadic second order logic.

2 Core XPath

[8] proposes a fragment of XPath 1.0 which can be seen as its logical core, but lacks much of the functionality that accounts for little expressive power. In effect, it supports all XPath's axis relations, except for the attribute and namespace axis relations, it allows sequencing and taking unions of path expressions and full booleans in the filter expressions. It is called Core XPath, also referred to as navigational XPath. A similar logical abstraction is made in [2]. As the focus of this paper is expressive power, we discuss XPath restricted to its logical core.

For the definition of the XPath language and its semantics, we follow the presentation of XPath in [8]. The expressions obey the standard W3C unabbreviated XPath 1.0 syntax. The semantics is as in [2, 8], in line with the standard XPath semantics [22].

Footnote 1: The counterdomain of a binary relation R (notation: ∼R) is the set {(x, y) | x = y ∧ ¬∃z xRz}.


Definition 1 The syntax of the Core XPath language is defined by the grammar in Table 1, where "locpath" (pronounced as location path) is the start production, "axis" denotes axis relations and "ntst" denotes tags labeling document nodes or the star '*' that matches all tags (these are called node tests). The "fexpr" will be called filter expressions after their use as filters in location paths. By an XPath expression we always mean a "locpath."

The semantics of XPath expressions is given with respect to an XML document modeled as a finite node labeled sibling ordered tree (tree for short; see footnote 2). Each node in the tree is labeled with a set of primitive symbols from some alphabet Λ. Sibling ordered trees come with two binary relations, the child relation, denoted by R↓, and the immediate right sibling relation, denoted by R→. Together with their inverses R↑ and R← they are used to interpret the axis relations. We denote such trees as first order structures (N, R↓, R→, Pi)_{i∈Λ}.

Each location path denotes a binary relation (a set of paths). The meaning of the filter expressions is given by the predicate E(n, fexpr) which assigns a boolean value. Thus a filter expression fexpr is most naturally viewed as denoting a set of nodes: all n such that E(n, fexpr) is true. For examples, we refer to [8]. Given a tree M and an expression R, the denotation or meaning of R in M is written as [[R]]_M. Table 2 contains the definition of [[·]]_M.

As discussed, one of the purposes of XPath is to select sets of nodes. For this purpose, the notion of an answer set is defined. For R an XPath expression, and M a model, answer_M(R) = {n | ∃n′ (n′, n) ∈ [[R]]_M}. Thus the answer set of R consists of all nodes which are reachable by the path R from some point in the tree.

Even Core XPath contains a bit of syntactic sugar. From Table 2 it is immediately clear that both following and preceding are definable. Also, the use of / in front of an expression can be eliminated, as follows: /R ≡ ancestor_or_self::*[not parent::*]/R. As our analysis considers expressive power, we may safely assume that these three do not occur in expressions and we do so without mentioning it.
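The semantics of Table 2 can be prototyped directly on relations represented as sets of node pairs. The following is a minimal sketch, not the authors' implementation: the tuple-based syntax trees (`("step", axis, tag, filter)`, `("seq", p, q)`, `("union", p, q)`, `("abs", p)`), the function names, and the four-node example tree are all assumptions made for illustration.

```python
def compose(r, s):
    """Relational composition: (a, c) whenever a r b and b s c for some b."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def inverse(r):
    return {(b, a) for (a, b) in r}

def closure(r):
    """Transitive closure of a binary relation given as a set of pairs."""
    result = set(r)
    while True:
        extra = compose(result, result) - result
        if not extra:
            return result
        result |= extra

def axes(children, nodes):
    """Axis relations as in Table 2, built from child and right sibling."""
    child = {(p, c) for p, cs in children.items() for c in cs}
    right = {(cs[i], cs[i + 1]) for cs in children.values()
             for i in range(len(cs) - 1)}
    ident = {(n, n) for n in nodes}
    desc, fsib = closure(child), closure(right)
    dos = desc | ident
    ax = {"self": ident, "child": child, "parent": inverse(child),
          "descendant": desc, "descendant_or_self": dos,
          "ancestor": inverse(desc), "ancestor_or_self": inverse(dos),
          "following_sibling": fsib, "preceding_sibling": inverse(fsib)}
    ax["following"] = compose(compose(ax["ancestor_or_self"], fsib), dos)
    ax["preceding"] = compose(
        compose(ax["ancestor_or_self"], inverse(fsib)), dos)
    return ax

def paths(m, e):
    """[[e]]_M for a location path e."""
    kind = e[0]
    if kind == "step":                     # axis::ntst or axis::ntst[fexpr]
        _, axis, tag, flt = e
        return {(n, n2) for (n, n2) in m["ax"][axis]
                if (tag == "*" or tag in m["labels"][n2])
                and (flt is None or holds(m, n2, flt))}
    if kind == "seq":                      # locpath/locpath
        return compose(paths(m, e[1]), paths(m, e[2]))
    if kind == "union":                    # locpath | locpath
        return paths(m, e[1]) | paths(m, e[2])
    if kind == "abs":                      # /locpath
        hits = {n2 for (n, n2) in paths(m, e[1]) if n == m["root"]}
        return {(n, n2) for n in m["labels"] for n2 in hits}
    raise ValueError(kind)

def holds(m, n, f):
    """E_M(n, f) for a filter expression f."""
    kind = f[0]
    if kind == "not":
        return not holds(m, n, f[1])
    if kind == "and":
        return holds(m, n, f[1]) and holds(m, n, f[2])
    if kind == "or":
        return holds(m, n, f[1]) or holds(m, n, f[2])
    return any(a == n for (a, _) in paths(m, f))   # f is a location path

def answer(m, e):
    return {n2 for (_, n2) in paths(m, e)}

# Hypothetical example tree: r has children a1 (A) and b1 (B); b1 has child a2 (A).
children = {"r": ["a1", "b1"], "a1": [], "b1": ["a2"], "a2": []}
labels = {"r": {"doc"}, "a1": {"A"}, "b1": {"B"}, "a2": {"A"}}
M = {"ax": axes(children, labels), "labels": labels, "root": "r"}

child_A = ("step", "child", "A", None)
filtered = ("step", "descendant", "A", ("not", ("step", "parent", "B", None)))
root_step = ("step", "ancestor_or_self", "*",
             ("not", ("step", "parent", "*", None)))
```

On this tree, `answer(M, child_A)` contains both A nodes, while `paths(M, filtered)` keeps only the pair ending in the A node whose parent is not a B node. The last definition, `root_step`, is the left factor of the rewriting /R ≡ ancestor_or_self::*[not parent::*]/R, so `("abs", child_A)` and `("seq", root_step, child_A)` can be compared directly.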

locpath ::= axis '::' ntst | axis '::' ntst '[' fexpr ']' | '/' locpath | locpath '/' locpath | locpath '|' locpath
fexpr   ::= locpath | not fexpr | fexpr and fexpr | fexpr or fexpr
axis    ::= self | child | parent | descendant | descendant_or_self | ancestor | ancestor_or_self | following_sibling | preceding_sibling | following | preceding.

Table 1: Syntax of Core XPath.

[[axis::Pi]]_M            =  {(n, n′) | n [[axis]]_M n′ and Pi(n′)}
[[axis::Pi[e]]]_M         =  {(n, n′) | n [[axis]]_M n′ and Pi(n′) and E_M(n′, e)}
[[/locpath]]_M            =  {(n, n′) | (root, n′) ∈ [[locpath]]_M}
[[locpath/locpath]]_M     =  [[locpath]]_M ∘ [[locpath]]_M
[[locpath | locpath]]_M   =  [[locpath]]_M ∪ [[locpath]]_M

[[self]]_M                :=  {(x, y) | x = y}
[[child]]_M               :=  R↓
[[parent]]_M              :=  ([[child]]_M)⁻¹
[[descendant]]_M          :=  ([[child]]_M)⁺
[[descendant_or_self]]_M  :=  ([[child]]_M)*
[[ancestor]]_M            :=  ([[descendant]]_M)⁻¹
[[ancestor_or_self]]_M    :=  ([[descendant_or_self]]_M)⁻¹
[[following_sibling]]_M   :=  (R→)⁺
[[preceding_sibling]]_M   :=  ([[following_sibling]]_M)⁻¹
[[following]]_M           :=  [[ancestor_or_self]]_M ∘ [[following_sibling]]_M ∘ [[descendant_or_self]]_M
[[preceding]]_M           :=  [[ancestor_or_self]]_M ∘ [[preceding_sibling]]_M ∘ [[descendant_or_self]]_M

E_M(n, locpath) = true            ⟺  ∃n′ : (n, n′) ∈ [[locpath]]_M
E_M(n, fexpr1 and fexpr2) = true  ⟺  E_M(n, fexpr1) = true and E_M(n, fexpr2) = true
E_M(n, fexpr1 or fexpr2) = true   ⟺  E_M(n, fexpr1) = true or E_M(n, fexpr2) = true
E_M(n, not fexpr) = true          ⟺  E_M(n, fexpr) = false

Table 2: The semantics of Core XPath.

Footnote 2: A sibling ordered tree is a structure isomorphic to (N, R↓, R→) where N is a set of finite sequences of natural numbers closed under taking initial segments, and for any sequence s, if s·k ∈ N, then either k = 0 or s·(k − 1) ∈ N. For n, n′ ∈ N, n R↓ n′ holds iff n′ = n·k for k a natural number; n R→ n′ holds iff n = s·k and n′ = s·(k + 1).

3 The Answer Sets of XPath

We show that on ordered trees, Core XPath is as expressive as first order logic in two variables over the signature with predicates corresponding to the child, descendant, and following_sibling axis relations. More precisely, we show that for every XPath expression R, there exists an XPath filter expression A such that, on every model M,

(2)    answer_M(R) = {n | E_M(n, A) = true}.

Then, we show that every first order formula φ(x) in the signature just mentioned is equivalent to an XPath filter expression A in the sense that for every model M, and for every node n,

(3)    M |= φ(n) if and only if E_M(n, A) = true.

First, though, we fix our terminology. We work with first order logic over node labeled ordered trees in a signature with unary predicates from Λ = {P1, P2, ...} corresponding to the node tags, and with a number of binary predicates corresponding to "moves" in a tree. We use the predicates child, descendant and following_sibling. Let FO_tree be the first order language in this signature. FO²_tree ⊂ FO_tree denotes the set of first order formulas φ(x) in which at most x occurs free, and which contain at most two variables.

Theorem 2 For every formula φ(x) in FO²_tree with unary predicates from Λ, there exists a Core XPath expression R written with node tags Λ, such that on every tree M, answer_M(R) = {n | M |= φ(n)}, and conversely.

Proof. First we show that for any Core XPath expression R there exists a Core XPath filter expression A, whose size is linear in the size of R, such that for each model M,

(2)    answer_M(R) = {n | E_M(n, A) = true}.

Consider an arbitrary XPath expression R. Obtain A by applying the converse operator (·)⁻¹ as follows:

(S | T)⁻¹          ≡  S⁻¹ | T⁻¹
(S/T)⁻¹            ≡  T⁻¹/S⁻¹
(axis::Pi[B])⁻¹    ≡  self::Pi[B]/axis⁻¹::*,

with axis⁻¹ having the obvious meaning. Then (2) holds.

Now we can show the easy side of Theorem 2. Let R be a Core XPath expression and let A = R⁻¹. Apply the standard translation well-known from modal logic (cf. [4]) to A to obtain the desired first order formula. The translation is just the definition of E from Table 2 written in first order logic.

The hard direction follows more or less directly from the argument that Etessami, Vardi and Wilke [6] use to show a similar statement for linear orders, characterizing temporal logic with only unary temporal connectives. Let φ(x) be the first order formula. We will provide an XPath filter expression A such that (3) holds. Then /descendant_or_self::*[A] is the desired absolute XPath expression. The proof is a copy of the one for linear temporal logic in [6, Theorem 1]. The only real change needed is in the set of order types: they are listed in Table 3, together with the needed translations (A′ denotes the translation of A). The other change is rather cosmetic. For A an atom, A(x) needs to be translated using the self axis as self::A. Thus, for instance, ∃y(y child x ∧ A(y)) translates to parent::*[self::A]. Translating φ(x), the result of this process is a filter expression A for which in any model M, for every node n, E_M(n, A) equals true iff M |= φ(n). qed

We note that, as in [6], the size of the filter expression is exponential in the size of the first order formula. [6] shows that this is unavoidable, even on finite linear structures, so also on trees this bound is tight.

The first statement (2) in the above proof shows that Core XPath is as expressive as its filter expressions. Interestingly, Core XPath's filter expressions were introduced already in [5] for exactly the same purpose as the XPath language: specifying sets of nodes in finite ordered trees. The only difference is that the language of [5] does not have the asymmetry between the vertical and the horizontal axis relations: the immediate left and right sibling relations are also present. [5] provides a complete axiomatization, in a logic called LOFT (Logic Of Finite Trees), which might be of interest for query rewriting.
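The converse construction used for statement (2) is purely syntactic, so it is easy to try out. The sketch below is an illustration under assumptions, not code from the paper: it reuses a hypothetical tuple encoding of location paths (`("step", axis, tag, filter)`, `("seq", p, q)`, `("union", p, q)`), leaves out absolute paths and the following and preceding axes (which the paper treats as eliminable), and checks on one small example tree that the nodes where R⁻¹ holds as a filter are exactly the answer set of R.

```python
def compose(r, s):
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def inverse(r):
    return {(b, a) for (a, b) in r}

def closure(r):
    result = set(r)
    while compose(result, result) - result:
        result |= compose(result, result)
    return result

# Name of the inverse of each axis.
INV = {"self": "self", "child": "parent", "parent": "child",
       "descendant": "ancestor", "ancestor": "descendant",
       "descendant_or_self": "ancestor_or_self",
       "ancestor_or_self": "descendant_or_self",
       "following_sibling": "preceding_sibling",
       "preceding_sibling": "following_sibling"}

def converse(e):
    """Syntactic converse: (S|T)^-1, (S/T)^-1 and (axis::P[B])^-1."""
    kind = e[0]
    if kind == "union":
        return ("union", converse(e[1]), converse(e[2]))
    if kind == "seq":
        return ("seq", converse(e[2]), converse(e[1]))
    _, axis, tag, flt = e
    return ("seq", ("step", "self", tag, flt),
            ("step", INV[axis], "*", None))

def paths(ax, labels, e):
    kind = e[0]
    if kind == "seq":
        return compose(paths(ax, labels, e[1]), paths(ax, labels, e[2]))
    if kind == "union":
        return paths(ax, labels, e[1]) | paths(ax, labels, e[2])
    _, axis, tag, flt = e
    return {(n, n2) for (n, n2) in ax[axis]
            if (tag == "*" or tag in labels[n2])
            and (flt is None or holds(ax, labels, n2, flt))}

def holds(ax, labels, n, f):
    if f[0] == "not":
        return not holds(ax, labels, n, f[1])
    return any(a == n for (a, _) in paths(ax, labels, f))

# Hypothetical example tree: r has children a1 (A) and b1 (B); b1 has child a2 (A).
children = {"r": ["a1", "b1"], "a1": [], "b1": ["a2"], "a2": []}
labels = {"r": {"doc"}, "a1": {"A"}, "b1": {"B"}, "a2": {"A"}}
child = {(p, c) for p, cs in children.items() for c in cs}
right = {(cs[i], cs[i + 1]) for cs in children.values()
         for i in range(len(cs) - 1)}
ident = {(n, n) for n in labels}
desc, fsib = closure(child), closure(right)
AX = {"self": ident, "child": child, "parent": inverse(child),
      "descendant": desc, "ancestor": inverse(desc),
      "descendant_or_self": desc | ident,
      "ancestor_or_self": inverse(desc | ident),
      "following_sibling": fsib, "preceding_sibling": inverse(fsib)}

# R = descendant::A[not parent::B] | child::B/child::*
R = ("union",
     ("step", "descendant", "A", ("not", ("step", "parent", "B", None))),
     ("seq", ("step", "child", "B", None), ("step", "child", "*", None)))

answer_set = {n2 for (_, n2) in paths(AX, labels, R)}
filter_set = {n for n in labels if holds(AX, labels, n, converse(R))}
```

Each step of R becomes two steps in `converse(R)`, which matches the claim that A is linear in the size of R. A check on a single tree is of course no proof; it only exercises the three rewrite rules.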

4 The Paths of XPath

In the previous section, we characterized the answer sets of XPath. We now turn to the sets of paths that can be defined in XPath; they too admit an elegant characterization which we provide here. First, we define the appropriate first order language. A conjunctive path query is a conjunctive query of the form

Q(x, y) :− R1, ..., Rn, φ1, ..., φm,

in which the Ri are relations from the signature {descendant, child, following_sibling} and all of the φi are formulas in FO²_tree in one free variable. An example is

Q(x, y) :− z descendant x, z following_sibling z′, z′ descendant y, P1(z), P2(y),

which is equivalent to the XPath expression ancestor::P1/following_sibling::*/descendant::P2.

By a union of conjunctive path queries we mean a disjunction of such queries, all with the same two free variables x and y. For example, descendant::P2 | parent::*/ancestor::P1 is equivalent to the union of the two queries

Q1(x, y) :− x descendant y, P2(y).
Q2(x, y) :− z child x, z ancestor y, P1(y).

From Theorem 2 and some simple syntactic manipulation we immediately obtain

Proposition 3 Every XPath expression is equivalent to a union of conjunctive path queries.

The converse also holds, which gives us a characterization of the XPath definable sets of paths.

Theorem 4 For every union of conjunctive path queries Q(x, y) there exists a Core XPath expression R such that for every model M, {(n, n′) | M |= Q(n, n′)} = [[R]]_M.

Proof. By [2] or [9] every positive existential first order formula in two free variables is equivalent to a positive XPath expression. We can treat the first order formulas φi in a query as atomic symbols Pi, obtain the equivalent XPath expression and use (3) to substitute the Pi by XPath filter expressions which are equivalent to φi. qed

4.1 Structural Properties of XPath

Benedikt, Fan and Kuper [2] have given an in-depth analysis of a number of structural properties of fragments of XPath. Their fragments are all positive (no negations inside the filters) and restricted to the "vertical" axis relations defined along the tree order. All their fragments allowing filter expressions are closed under intersection, while none is closed under complementation. Here, we show that this is also true for full XPath. From Theorem 4 and Proposition 3 we obtain

Theorem 5 Core XPath is closed under intersections. That is, for every two Core XPath expressions A, B, there exists a Core XPath expression C such that on every model M, [[A]]_M ∩ [[B]]_M = [[C]]_M.
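One way to read the step from Proposition 3 and Theorem 4 to Theorem 5 is that the bodies of two conjunctive path queries over the same x and y can simply be concatenated (with the bound variables renamed apart), which again gives a conjunctive path query. The sketch below illustrates this on one small tree by brute force. It is an assumption-laden illustration: the tree, the second query and all function names are hypothetical, only tag tests are used as unary formulas φi, and no XPath expression C is constructed.

```python
from itertools import product

def compose(r, s):
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def inverse(r):
    return {(b, a) for (a, b) in r}

def closure(r):
    result = set(r)
    while compose(result, result) - result:
        result |= compose(result, result)
    return result

# Hypothetical example tree with tags P1 and P2.
children = {"r": ["p", "q"], "p": ["x0"], "q": ["y0"],
            "x0": [], "y0": ["y1"], "y1": []}
labels = {"r": set(), "p": {"P1"}, "q": set(),
          "x0": set(), "y0": {"P2"}, "y1": {"P2"}}
child = {(a, c) for a, cs in children.items() for c in cs}
right = {(cs[i], cs[i + 1]) for cs in children.values()
         for i in range(len(cs) - 1)}
REL = {"child": child, "descendant": closure(child),
       "following_sibling": closure(right)}

def evaluate(binary, unary):
    """Brute-force answers (x, y) of a conjunctive path query.

    binary: atoms (u, relation, v), read as "u relation v";
    unary:  atoms (tag, v), standing in for the formulas phi_i.
    """
    variables = sorted({v for (a, _, b) in binary for v in (a, b)}
                       | {v for (_, v) in unary} | {"x", "y"})
    out = set()
    for values in product(list(labels), repeat=len(variables)):
        env = dict(zip(variables, values))
        if (all((env[a], env[b]) in REL[r] for (a, r, b) in binary)
                and all(tag in labels[env[v]] for (tag, v) in unary)):
            out.add((env["x"], env["y"]))
    return out

def tagged(rel, tag):
    """Pairs of rel whose target carries the given tag."""
    return {(a, b) for (a, b) in rel if tag in labels[b]}

# The example query of Section 4 and its XPath expression
# ancestor::P1/following_sibling::*/descendant::P2.
Q_BINARY = [("z", "descendant", "x"), ("z", "following_sibling", "z2"),
            ("z2", "descendant", "y")]
Q_UNARY = [("P1", "z"), ("P2", "y")]
xpath_a = compose(compose(tagged(inverse(REL["descendant"]), "P1"),
                          REL["following_sibling"]),
                  tagged(REL["descendant"], "P2"))

# A second, hypothetical query for parent::*/following_sibling::*/child::P2.
B_BINARY = [("u", "child", "x"), ("u", "following_sibling", "u2"),
            ("u2", "child", "y")]
B_UNARY = [("P2", "y")]
xpath_b = compose(compose(inverse(child), REL["following_sibling"]),
                  tagged(child, "P2"))

# Concatenating both bodies yields a query for the intersection.
both = evaluate(Q_BINARY + B_BINARY, Q_UNARY + B_UNARY)
```

Here `both` coincides with `xpath_a & xpath_b`; by Theorem 4 the combined query has some equivalent Core XPath expression, which this sketch does not compute.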


τ(x, y)                         ∃y(τ(x, y) ∧ A(y))
x = y                           self::*[A′]
x child y                       child::*[A′]
y child x                       parent::*[A′]
x following_sibling y           following_sibling::*[A′]
y following_sibling x           preceding_sibling::*[A′]
x descendant y ∧ ¬x child y     child::*/descendant::*[A′]
y descendant x ∧ ¬y child x     parent::*/ancestor::*[A′]

Table 3: Order types and their translation

On the other hand, unfortunately,

Theorem 6 Core XPath is not closed under complementation.

Proof. Suppose it were; we derive a contradiction. Then (1) would be expressible. (1) is equivalent to the first-order formula

∃y(x descendant y ∧ A(y) ∧ ∀z((x descendant z ∧ z descendant y) → ¬B(z))).

A standard argument shows that this set cannot be specified using fewer than three variables. This contradicts Theorem 2, which states that the answer set of every XPath expression is equivalent to a first order formula in two variables. qed
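The first step of this proof, that (1) and the displayed first-order formula describe the same set of nodes, can be checked exhaustively on small trees. The sketch below does so for all trees with at most five nodes. It is only an illustration under assumptions: trees are encoded as hypothetical parent lists, each node carries exactly one tag from A, B, C (the paper allows sets of tags), and nothing here bears on the three-variable argument itself.

```python
from itertools import product

def descendants(parent):
    """Pairs (a, d) with d a proper descendant of a, for a parent list."""
    pairs = set()
    for node in range(len(parent)):
        up = parent[node]
        while up is not None:
            pairs.add((up, node))
            up = parent[up]
    return pairs

def via_xpath_atoms(parent, tags):
    """Nodes x satisfying (1), using the two XPath atoms as relations."""
    desc = descendants(parent)
    # x descendant::A y
    to_a = {(x, y) for (x, y) in desc if tags[y] == "A"}
    # x descendant::B/descendant::* y
    through_b = {(x, y) for (x, z) in desc if tags[z] == "B"
                 for (z2, y) in desc if z2 == z}
    return {x for (x, y) in to_a - through_b}

def via_first_order(parent, tags):
    """Nodes x satisfying the three-variable first-order formula."""
    desc = descendants(parent)
    nodes = range(len(parent))
    return {x for x in nodes for y in nodes
            if (x, y) in desc and tags[y] == "A"
            and all(tags[z] != "B" for z in nodes
                    if (x, z) in desc and (z, y) in desc)}

def all_trees(size):
    """All parent lists with node 0 as root and parent[i] < i."""
    for rest in product(*(range(i) for i in range(1, size))):
        yield [None, *rest]

# Compare both readings on every tree and labeling up to five nodes.
mismatches = [
    (parent, tags)
    for size in range(1, 6)
    for parent in all_trees(size)
    for tags in product("ABC", repeat=size)
    if via_xpath_atoms(parent, tags) != via_first_order(parent, tags)
]

# Small example: 0 has children 1 (B) and 3 (C); 1 has child 2 (A).
example = via_xpath_atoms([None, 0, 1, 0], ["C", "B", "A", "C"])
```

In the small example only node 1 qualifies: the single A node lies below the B node 1, so every path from the root to it passes through a B node.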

5 The Connectives of XPath

In this section we look at the connectives of XPath and argue that they are very well chosen. We disregard the following and preceding axis relations as well as absolute expressions (those are expressions starting with a /) as they are just syntactic sugar.

What are the connectives of XPath? This question is not trivial. Clearly, there is composition ('/') and union ('|') of paths. Then there is composition with a filter expression ('[F]'). And inside the filter expressions all boolean connectives are allowed. This set can be streamlined as follows. Consider the following definition of path formulas:

(4)    R ::= axis | ?Pi | R/R | R|R | ∼R,

for axis one of XPath's axis relations, Pi a tagname, and the following meaning for the two new connectives:

[[?Pi]]_M  =  {(x, x) | x is labelled with Pi}
[[∼R]]_M   =  {(x, y) | x = y and ¬∃z (x, z) ∈ [[R]]_M}.

We call this language SCX (short for Short Core XPath). ?Pi simply tests whether a node has tag Pi. Thus child::Pi can be written as child/?Pi. The unary operator ∼ is sometimes called counterdomain.


For instance, ∼child defines the set of all pairs (x, x) for x a leaf, and ∼parent the singleton {(root, root)}. Below we explain why this set of connectives is so nice. First we show that this definition is equivalent in a very strong sense to that of XPath.

Theorem 7 There exist linear translations t1, t2 with t1 : XPath → SCX and t2 : SCX → XPath such that for all models M, the following hold:
• for every XPath expression R, [[R]]_M = [[t1(R)]]_M,
• for every SCX expression R, [[R]]_M = [[t2(R)]]_M.

Proof. Because the counterdomain of a relation R is definable in XPath as self::*[not R], every relation defined in (4) can be expressed as an XPath formula. For the other side, first observe that axis::Pi and axis/?Pi are equivalent. As both languages are closed under composition and union, we only have to show that all filter expressions are expressible. With the following equivalences we can extend ? to all filter expressions (cf. [4, Lemma 2.82]):

?(axis::Pi)      ≡  ∼∼(axis/?Pi)
?(axis::*)       ≡  ∼∼(axis/(?Pi | ∼?Pi))
?(axis::Pi[A])   ≡  ∼∼(axis/?Pi/?A)
?(not A)         ≡  ∼?A
?(A and B)       ≡  ?A/?B
?(A or B)        ≡  ?A | ?B.

A simple semantic argument shows the correctness of these equations. qed

So we can conclude that the "true" set of XPath connectives consists of testing a node tag, composition, union and counterdomain. This set of connectives between binary relations is closely connected to the notion of bisimulation, as exemplified in Theorem 8 below. Before we state the result, we need a couple of definitions.

For P a set of tag names, and R a set of relation names, let B_{P,R} denote the P,R bisimulation relation. Let D, D′ be first order models and B_{P,R} ⊆ |D| × |D′|, with |D| denoting the domain of D. We call B_{P,R} a P,R bisimulation if, whenever x B_{P,R} y, then the following conditions hold, for all relations S ∈ R,

tag    x and y have the same tag names, for all tag names in P;
forth  if there exists an x′ ∈ D such that x S x′, then there exists a y′ ∈ D′ such that y S y′ and x′ B_{P,R} y′;
back   similarly for y′ ∈ D′.

Let α(x, y) be a first order formula in the signature with unary predicates P and binary relations R. We say that α(x, y) is safe for P,R bisimulations if the back and forth clauses of the bisimulation definition hold for α(x, y), for all P,R bisimulations. In words, if α(x, y) is safe for bisimulations, it acts like a morphism with respect to bisimulations. It is easy to see that all relations defined in (4) are safe for bisimulations respecting the node tags and the atomic axis relations. The other direction is known as Van Benthem's safety theorem (see [4, Theorem 2.83]):

Theorem 8 (Van Benthem) Let α(x, y) be as above. If α(x, y) is safe for P,R bisimulations it can be defined by the grammar in (4).

Why is this result so important? XPath is a language in which we can specify relations between nodes, and in several applications it is used in this way. Theorems 8 and 7 together guarantee that XPath is in a well defined sense complete: every relation which is safe for bisimulations respecting node tags and XPath's axis relations can be defined in XPath.

6 Conclusions

We have given semantic characterizations of navigational XPath in terms of natural fragments of first order logic. Besides that, we looked at the connectives of XPath and argued that they are nicely chosen. We conclude that the navigational part of XPath is a very well designed language. On ordered trees it corresponds to a natural fragment of first order logic. This holds both for the sets of nodes and the sets of paths definable in XPath.

The only negative aspect we discovered concerning XPath is that it is not closed under complementation. Thus first order logic is more expressive than XPath, both in defining sets of nodes and sets of paths. Marx [10] showed that expanding XPath with conditional axis relations (see footnote 3) yields expressive completeness for answer sets. Marx [11] shows that the same language is also complete for expressing every first order definable set of paths.

Footnote 3: A conditional axis relation is of the form (child::ntst[fexpr])* which denotes the reflexive and transitive closure of the relation denoted by child::ntst[fexpr]. Using this we can express the set of nodes in (1) by

self::*[(child::*[not self::B])*/child::A].

Acknowledgments

We thank Loredana Afanasiev, Jan Hidders, and Petrucio Viana for valuable feedback. Maarten Marx was supported by the Netherlands Organization for Scientific Research (NWO), under project numbers 612.000.106 and 017.001.190. Maarten de Rijke was supported by grants from NWO, under project numbers 365-20-005, 220-80-001, 612.069.006, 612.000.106, 612.000.207, 612.066.302, 264-70-050, and 017.001.190.

References

[1] L. Afanasiev, M. Franceschet, M. Marx, and M. de Rijke. CTL Model Checking for Processing Simple XPath Queries. In Proc. TIME 2004, 2004.
[2] M. Benedikt, W. Fan, and G. Kuper. Structural properties of XPath fragments. In Proc. ICDT 2003, 2003.
[3] G. Bex, S. Maneth, and F. Neven. A formal model for an expressive fragment of XSLT. Information Systems, 27(1):21–39, 2002.
[4] P. Blackburn, M. de Rijke, and Y. Venema. Modal Logic. Cambridge University Press, 2001.
[5] P. Blackburn, W. Meyer-Viol, and M. de Rijke. A proof system for finite trees. In CSL'96, pages 86–105, 1996.
[6] K. Etessami, M. Vardi, and Th. Wilke. First-order logic with two variables and unary temporal logic.
[7] G. Gottlob and C. Koch. Monadic queries over tree-structured data. In Proc. LICS, Copenhagen, 2002.
[8] G. Gottlob, C. Koch, and R. Pichler. The complexity of XPath query evaluation. In PODS'03, pages 179–190, 2003.
[9] G. Gottlob, C. Koch, and K. Schulz. Conjunctive queries over trees. In Proc. PODS, pages 189–200, 2004.
[10] M. Marx. Conditional XPath, the first order complete XPath dialect. In Proc. PODS'04, pages 13–22, 2004.
[11] M. Marx. First order paths in ordered trees. In T. Eiter and L. Libkin, editors, Proc. ICDT 2005, volume 3363 of LNCS, pages 114–128, 2005.
[12] G. Miklau and D. Suciu. Containment and equivalence for an XPath fragment. In Proc. PODS'02, pages 65–76, 2002.
[13] T. Milo, D. Suciu, and V. Vianu. Typechecking for XML transformers. In Proc. PODS, pages 11–22. ACM, 2000.
[14] M. Murata. Extended path expressions for XML. In Proc. PODS, 2001.
[15] F. Neven and T. Schwentick. Expressive and efficient pattern languages for tree-structured data. In Proc. PODS, pages 145–156. ACM, 2000.
[16] V. Vianu. A Web odyssey: from Codd to XML. In Proc. PODS, pages 1–15. ACM Press, 2001.
[17] W3C. XML path language (XPath): Version 1.0. http://www.w3.org/TR/xpath.html.
[18] W3C. XML path language (XPath): Version 2.0. http://www.w3.org/TR/xpath20/.
[19] W3C. XML schema part 1: Structures. http://www.w3.org/TR/xmlschema-1.
[20] W3C. XQuery 1.0: A query language for XML. http://www.w3.org/TR//xquery/.
[21] W3C. XSL transformations language (XSLT): Version 2.0. http://www.w3.org/TR/xslt20/.
[22] P. Wadler. Two semantics for XPath. Technical report, Bell Labs, 2000.
