Logic Programming With Sets

Gabriel M. Kuper
IBM T. J. Watson Research Laboratory
Yorktown Heights, New York

1 Introduction

In recent years there has been a large amount of interest in extending the relational model of a database to capture more of the structure of the data, e.g., [HY82] [JS82] [KV84] [OY85] [RR83] [SP82] [SS77]. Most of this work has focused on two issues: how to model the physical structure of the data, and how to define query languages on structures that are more general than flat relations. One popular model has been non-first-normal-form (non-1NF), or nested, relations. These are relations in which the components of tuples may be sets of objects rather than simple, atomic objects.

Another important direction of database research has been in using logic programming languages, such as Prolog, as database query languages [BMSU86] [GM78] [HN84] [Kow79] [Nai83] [Rei78] [Ull85]. Such languages provide a very natural way to express queries on a relational database. Furthermore, by allowing recursion, we get a language that is more powerful than the relational algebra of Codd [AU79].

There has been very little work, so far, on combining these two approaches [BK86] [TZ86] [BNR*87] [SN87]. By this we mean extending the data model to capture more of the semantics of the data, while at the same time using a logic programming language as the query language. In order to do this, we need to have some way to deal with complex objects in logic programs. Some types of data structures can be easily represented by Prolog structures. Others, in particular aggregation, i.e., forming sets of objects, cannot. Therefore, if we are to use a logic programming language as a query language for nested relations, the language has to be extended to handle sets of objects.

The subject of this paper is how to extend logic programming to handle aggregation. We regard such an extension as an essential prerequisite to extending logic programming to non-1NF databases.

Furthermore, the extension we propose may also be useful in other applications of logic programming besides database query languages.

To illustrate our approach, let us see how a programmer would normally deal with a set of objects in Prolog. Usually a set would be represented as a list. Predicates involving sets would then be defined by iteration on the elements of the list. For example, the membership predicate could be defined as follows.

$$member(x, [x|L]) \mathrel{:-} .$$
$$member(x, [y|L]) \mathrel{:-} member(x, L)$$

When a predicate involves more than one set, the rules can become quite complicated and unintuitive. For example, one way to define a predicate that holds when two sets are disjoint is as follows.

$$disj(L, []) \mathrel{:-} .$$
$$disj([], L) \mathrel{:-} .$$
$$disj([x|L_1], [y|L_2]) \mathrel{:-} x \neq y,\ disj([x|L_1], L_2),\ disj(L_1, [y|L_2])$$

Even if we replaced these rules by a more efficient definition, such an approach would still be contrary to the general philosophy of logic programming. Whenever possible, the programmer should not have to deal with the control structures in the program. In examples such as these, the logical definitions of the predicates are fairly simple. Despite this, the programmer has to specify a lot of details about the implementation, such as how to iterate over the sets.

We propose an extension to logic programming called LPS (Logic Programming with Sets). The LPS language has two types of objects: individual objects, and sets whose elements are individual objects. We first consider only one level of set nesting in order to concentrate on the key problems that arise, in as simple a framework as possible.

The rules in the LPS language are fairly similar to the Horn clauses of logic programming. The main difference between LPS rules and Horn clauses is that the right-hand side of an LPS rule may be preceded by restricted universal quantifiers. This means that an LPS rule has the form

$$A \mathrel{:-} (\forall x_1 \in X_1) \cdots (\forall x_n \in X_n)(B_1 \wedge \cdots \wedge B_m)$$

This is a fairly conservative extension of Horn clause logic, since whenever the sets $X_1, \ldots, X_n$ have known values, the body can be reduced to a normal Horn clause, i.e., the conjunction of the body (without the quantifiers) over all the elements of the sets. We shall see that our extension of Horn clause logic, unlike extensions that allow arbitrary quantification on the right-hand side [LT84] [BNR*87], preserves the semantics of Horn clause logic.

Example 1: The $disj$ predicate could be defined by the rule

$$disj(X, Y) \mathrel{:-} (\forall x \in X)(\forall y \in Y)(x \neq y)$$

($x \neq y$ could be defined as $\neg(x = y)$; negation will be discussed later on.)
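As a rough illustration of how the rule in Example 1 reads once the sets are known, the two restricted quantifiers become nested finite conjunctions. The following Python sketch is our own illustration, not part of LPS; it assumes sets are represented as Python sets, and the function name `disj` simply mirrors the predicate.

```python
def disj(X, Y):
    # (forall x in X)(forall y in Y)(x != y), read as a finite conjunction
    # over all pairs of elements; no explicit iteration order is specified.
    return all(x != y for x in X for y in Y)


print(disj({1, 2}, {3, 4}))
print(disj({1, 2}, {2, 3}))
print(disj(set(), {1}))  # the quantifier over an empty set is vacuously true
```

In contrast to the list-based Prolog definition, nothing here says how the sets are traversed.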

Example 2: LPS contains the membership predicate as a primitive. Using this predicate, we can define the subset relationship by the rule

$$subset(X, Y) \mathrel{:-} (\forall x \in X)(x \in Y)$$

Example 3: The predicate $union(X, Y, Z)$ means that $Z$ is the union of $X$ and $Y$. It is defined by the rule

$$union(X, Y, Z) \mathrel{:-} subset(X, Z) \wedge subset(Y, Z) \wedge (\forall z \in Z)(z \in X \vee z \in Y)$$

(Note that because this rule uses disjunction it is not really an LPS rule as defined above. In Section 4 we show how to convert such a rule into an equivalent set of LPS rules.)
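The declarative reading of Examples 2 and 3 can be checked on small finite sets. The sketch below is only an illustration in Python, assuming sets are Python sets and membership is the built-in `in`; it checks whether a given triple satisfies the rule and does not compute `Z` from `X` and `Y`.

```python
def subset(X, Y):
    # subset(X, Y) :- (forall x in X)(x in Y)
    return all(x in Y for x in X)


def union(X, Y, Z):
    # union(X, Y, Z) :- subset(X, Z) and subset(Y, Z)
    #                   and (forall z in Z)(z in X or z in Y)
    return subset(X, Z) and subset(Y, Z) and all(z in X or z in Y for z in Z)


print(union({1}, {2}, {1, 2}))
print(union({1}, {2}, {1, 2, 3}))
```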

Example 4: Let $R(x, Y)$ be a non-first-normal-form relation. The unnest operation of [JS82] can be defined by the rule

$$S(x, y) \mathrel{:-} R(x, Y) \wedge y \in Y$$

Example 5: Suppose that $X$ is a set of numbers, and that we want to compute their sum. Let $\text{disj-union}(X, Y, Z)$ mean that $Z$ is the disjoint union of $X$ and $Y$, i.e.,

$$\text{disj-union}(X, Y, Z) \mathrel{:-} union(X, Y, Z) \wedge disj(X, Y)$$

Then $sum(Z, k)$ can be defined by the recursive rule

$$sum(Z, k) \mathrel{:-} \text{disj-union}(X, Y, Z) \wedge sum(X, m) \wedge sum(Y, n) \wedge m + n = k$$

with the base case

$$sum(X, n) \mathrel{:-} X = \{n\}$$
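One way to read the two rules of Example 5 operationally is to pick one particular disjoint split of $Z$, namely a singleton and the rest. This Python sketch is an assumption-laden illustration: the rules themselves allow any disjoint split, and they give no clause for the empty set, which the hypothetical function `total` therefore rejects.

```python
def total(Z):
    Z = frozenset(Z)
    if not Z:
        # Neither rule of Example 5 applies to the empty set.
        raise ValueError("sum is not defined for the empty set by these rules")
    if len(Z) == 1:
        # Base case: sum(X, n) :- X = {n}
        (n,) = Z
        return n
    # Recursive rule: choose a disjoint union Z = X + Y with X a singleton.
    x = next(iter(Z))
    X, Y = frozenset({x}), Z - {x}
    return total(X) + total(Y)


print(total({1, 2, 3}))
```

Because addition is associative and commutative, every disjoint split leads to the same value of $k$, which is why the rule does not need to fix one.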

Example 6: Let $parts(x, Y)$ be a non-1NF relation that means that object $x$ is built from the component parts in set $Y$. Let $cost(x, n)$ be a relation that gives the costs of the component parts. Then the cost of object $x$ can be computed by the rules

$$\text{obj-cost}(x, n) \mathrel{:-} parts(x, Y) \wedge \text{sum-costs}(Y, n)$$
$$\text{sum-costs}(Z, k) \mathrel{:-} \text{disj-union}(X, Y, Z) \wedge \text{sum-costs}(X, m) \wedge \text{sum-costs}(Y, n) \wedge m + n = k$$
$$\text{sum-costs}(X, n) \mathrel{:-} X = \{p\} \wedge cost(p, n)$$

In the next section we define the LPS language formally. In Section 3 we discuss the formal semantics of LPS, and show that the language has model-theoretic and fixed point semantics similar to those of [vEK76]. In Section 4 we investigate the expressive power of the LPS language. We show that we can allow disjunction on the right-hand side of an LPS rule without increasing the power of the language, but, on the other hand, LPS is not powerful enough to construct the set of all objects that satisfy a given predicate. In Section 5 we show how to extend the LPS language to arbitrary finite sets, and finally, in Section 6 we compare our approach to other ways of adding sets to logic programming, such as that of [BNR*87].

2 The LPS Language

2.1 The Underlying Logic

The LPS language is based on a two-sorted logic. The two sorts, which will be denoted by $a$ and $s$, correspond to atomic objects and sets of atoms. The language contains function symbols, which, with one exception, are from sort $a$ to sort $a$.$^1$ This exception is for building sets, for which we have terms such as $\{x_1, x_2, x_3\}$, where the variables $x_1$, $x_2$ and $x_3$ are of sort $a$. These terms are defined using special function symbols $f_n$, where $f_n(x_1, \ldots, x_n)$ represents the set $\{x_1, \ldots, x_n\}$.

Definition 1: An LPS language $L$ consists of:

1. Predicate symbols $p_i^{\sigma_i}$, $i = 1, \ldots, n_1$, where $\sigma_i$ is a string of $a$'s and $s$'s of length $\geq 0$ that specifies the sort of the predicate. $L$ also contains three special "built-in" binary predicate symbols: $=_{aa}$, $=_{ss}$, and $\in_{as}$.
2. Function symbols $f_i^{m_i}$, $i = 1, \ldots, n_2$. $f_i^{m_i}$ is from sort $a^{m_i}$ to sort $a$. $L$ also contains the special function symbols $f_n$ for all $n \geq 0$. $f_n$ is from sort $a^n$ to sort $s$.
3. Constants $c_i^a$, $i = 1, \ldots, n_3$, all of sort $a$.
4. Variables $x_i^{\sigma_i}$, $i = 1, \ldots, n_4$. $x_i^{\sigma_i}$ is of sort $\sigma_i$.

In practice, the sorts of the functions and predicates will often be clear from the context, and then we shall not mention the sort explicitly. In particular, we shall usually not distinguish between the two types of equality. We use the convention that lower case letters near the end of the alphabet ($x$, $y$, $z$) represent variables of sort $a$, upper case letters ($X$, $Y$, $Z$) represent variables of sort $s$, and we often assume that predicate symbols are written with all the variables of sort $a$ preceding those of sort $s$. Normally, when we refer to predicate or function symbols in a language, we include the special "built-in" ones, unless we explicitly state otherwise.

$^1$ In Example 8 we explain why uninterpreted function symbols with other sorts are not allowed.

Definition 2: A term $t$ over an LPS language $L$ is one of:

1. A constant $c_i$. The sort of such a term is $a$.
2. A variable $x_i$. The sort of such a term is the sort of the variable.
3. $f(t_1, \ldots, t_k)$, where $f$ is a function symbol, and each $t_i$ is of sort $a$. The sort of the term is $s$ if $f$ is one of the $f_n$'s, and is $a$ otherwise.

We write $\{t_1, \ldots, t_n\}$ for the term $f_n(t_1, \ldots, t_n)$, and $\emptyset$ for the term $f_0$. We then define atomic formulas and well-formed formulas in the usual way.

An LPS model is a first-order model of the logic that satisfies several additional constraints. These constraints require that the model interpret sort $s$ as sets of elements of sort $a$, and that it interpret the special function and predicate symbols correctly. In other words, the equality and membership predicates are interpreted as identity and membership over the corresponding domains, and the function symbol $f_n$ is interpreted as a set constructor.

Definition 3: Let $L$ be an LPS language. An LPS model $M$ of an LPS language $L$ consists of

1. Two sets $D_a$ and $D_s$. $D_a$ is the interpretation of sort $a$, and $D_s$ is the interpretation of sort $s$. Because sort $s$ corresponds to sets of objects of sort $a$, we require that $D_s$ be a subset of $\mathcal{P}_{fin}(D_a)$.
2. Interpretations of the predicate symbols.
   (a) If $p_i^{\sigma_i}$ is a non-special predicate symbol of sort $\sigma_i = \underbrace{a \cdots a}_{k}\,\underbrace{s \cdots s}_{l}$, then $M(p_i^{\sigma_i})$ is a subset of $\underbrace{D_a \times \cdots \times D_a}_{k} \times \underbrace{D_s \times \cdots \times D_s}_{l}$
   (b) $M(=_{aa})$ is the equality relation on $D_a \times D_a$
   (c) $M(=_{ss})$ is the equality relation on $D_s \times D_s$
   (d) $M(\in)$ is the membership relationship between elements of $D_a$ and $D_s$
3. Interpretations of the function symbols.
   (a) If $f_i^n$ is an $n$-ary function symbol other than $f_n$, then $M(f_i^n)$ is a function from $\underbrace{D_a \times \cdots \times D_a}_{n}$ to $D_a$
   (b) $M(f_n)$ is defined as $M(f_n)(d_1, \ldots, d_n) = \{d_1, \ldots, d_n\}$, for all elements $d_1, \ldots, d_n$ of $D_a$
4. An interpretation $M(c_i) \in D_a$ of each constant.

Interpretations of terms and formulas are defined as usual. Note that the interpretations of quantifiers $(\forall x)$ and $(\forall X)$ are over $D_a$ and $D_s$, respectively. Free variables, closed formulas, validity, the symbol $\mathrel{:-}$, etc., are also defined as usual.

2.2 LPS Programs

Definition 4: A restricted universal quantifier is a quantifier of the form $(\forall x \in X)$.

The formula $(\forall x \in X)\phi$ is an abbreviation for $(\forall x)(x \in X \Rightarrow \phi)$. Note that $(\forall x \in X)\phi$ is true whenever $X$ is the empty set.
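The vacuous-truth remark is easy to see on finite sets. The short Python sketch below is only an illustration; the helper `forall_in` and the test predicate `is_even` are hypothetical names introduced here, with `forall_in` standing for the restricted quantifier read as an implication.

```python
def forall_in(X, phi):
    # (forall x in X) phi  abbreviates  (forall x)(x in X => phi(x)),
    # so only the elements of X need to be inspected.
    return all(phi(x) for x in X)


def is_even(n):
    return n % 2 == 0


print(forall_in({2, 4}, is_even))
print(forall_in({2, 3}, is_even))
print(forall_in(set(), is_even))  # no element can falsify the implication
```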

Definition 5: An LPS clause over $L$ is a formula of the form

$$A \mathrel{:-} (\forall x_1 \in X_1) \cdots (\forall x_n \in X_n)(B_1 \wedge \cdots \wedge B_m)$$

where each $B_i$ is an atomic formula, each $x_i$ is a variable of sort $a$, each $X_i$ is a variable of sort $s$, and $A$ is a non-special atomic formula.$^2$ The case $n = 0$ is allowed, in which case the clause is an ordinary Horn clause. This clause is an abbreviation for the closure of the formula

$$(\forall x_1 \in X_1) \cdots (\forall x_n \in X_n)(B_1 \wedge \cdots \wedge B_m) \Rightarrow A$$

$A$ has to be a non-special atomic formula, since otherwise we could write a clause that redefines equality or membership, and this could cause problems later on. Goals, empty clauses, substitutions, etc., are defined in the usual way.

Definition 6: An LPS program $P$ is a finite set of LPS clauses.

3 Semantics

3.1 Minimal Model Semantics

We first extend the notion of a Herbrand model to the LPS language.

Definition 7: The Herbrand universe has two components:

1. $U_L^a$, consisting of
   (a) All constants $c_i$.
   (b) All terms $f_i u_1 \cdots u_n$ where $u_1, \ldots, u_n$ are in $U_L^a$.
2. $U_L^s = \mathcal{P}_{fin}(U_L^a)$, i.e., all finite sets of elements of $U_L^a$.

A significant difference between our definition and the standard definition of a Herbrand model is that terms of the form $f_n(x_1, \ldots, x_n)$ are interpreted as finite sets of elements of $U_L^a$, rather than by the term that we would get by concatenating the function symbol $f_n$ to its arguments. We do this to ensure that a Herbrand model is also an LPS model.
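The difference between the two readings of a term can be made concrete. In this Python sketch, which is only an illustration with hypothetical helper names, ordinary Herbrand terms are modelled as tuples (symbol followed by arguments) and set terms as `frozenset` values, so that order and repetition of arguments do not matter for the latter.

```python
def fun_term(symbol, *args):
    # Standard Herbrand reading: the symbol concatenated with its arguments.
    return (symbol,) + args


def set_term(*elems):
    # LPS reading of f_n(u1, ..., un): the finite set {u1, ..., un}.
    return frozenset(elems)


print(fun_term("g", "a", "b") == fun_term("g", "b", "a"))
print(set_term("a", "b") == set_term("b", "a", "a"))
```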

Definition 8: The Herbrand base is the set of all atomic formulas of the form

1. $p_i^{\sigma_i}(u_1, \ldots, u_k, U_1, \ldots, U_l)$, where $\sigma_i = \underbrace{a \cdots a}_{k}\,\underbrace{s \cdots s}_{l}$.
2. $u_1 =_{aa} u_2$
3. $U_1 =_{ss} U_2$
4. $u_1 \in U_1$

where $u_1, \ldots, u_k$ are in $U_L^a$ and $U_1, \ldots, U_l$ are in $U_L^s$.

$^2$ i.e., one whose predicate symbol is not an equality or membership predicate.

Definition 9: A Herbrand model of $L$ is an LPS model of $L$ that satisfies

1. The domains of the sorts $a$ and $s$ are $U_L^a$ and $U_L^s$, respectively.
2. The interpretation of a constant is the constant itself.
3. The interpretation of a non-special function symbol is the function that consists of concatenating the function symbol to its arguments.

Lemma 1: Let $M_1$ and $M_2$ be Herbrand models of $L$, and let $\theta$ be a ground substitution for the formula $(x \in X)$. Then

$$M_1 \models (x \in X)\theta \iff M_2 \models (x \in X)\theta$$

Proof: The substitution $\theta$ replaces $x$ by some ground term $u$ in $U_L^a$, and replaces $X$ by some $U$ in $U_L^s$. By the definition of $U_L^s$, $U$ is a finite set $\{u_1, \ldots, u_n\}$ of ground terms of sort $a$. But then, for any Herbrand model $M$ (in fact, any LPS model), $M$ satisfies $(x \in X)\theta$ iff $u$ is equal to one of the terms $u_1, \ldots, u_n$.

A key property of Horn clauses is that any set of Horn clauses that has a model must have a Herbrand model. This result also holds for LPS clauses.

Lemma 2: Let $P$ be a set of LPS clauses, and let $M$ be an LPS model of $P$. Then $P$ has a Herbrand model $M^*$.

Proof: In order to define $M^*$, we have to define the interpretations of the non-special predicate symbols. $M^*(p_i^{\sigma_i})$ is defined as the set of all tuples $(u_1, \ldots, u_k, U_1, \ldots, U_l)$ such that each $u_i$ is in $U_L^a$, each $U_j$ is in $U_L^s$, and such that

$$(M(u_1), \ldots, M(u_k), M(U_1), \ldots, M(U_l)) \in M(p_i^{\sigma_i})$$

Whenever $A$ is a ground instance of an atomic formula, we have

$$M^* \models A \iff M \models A$$

We claim that any LPS clause that is satisfied by $M$ is also satisfied by $M^*$. To show this, assume that the clause

$$A \mathrel{:-} (\forall x_1 \in X_1) \cdots (\forall x_n \in X_n)(B_1 \wedge \cdots \wedge B_m)$$

is not satisfied by $M^*$. Then there is a ground substitution $\theta$ for which $M^* \not\models A\theta$ but

$$M^* \models (\forall x_1 \in X_1) \cdots (\forall x_n \in X_n)(B_1 \wedge \cdots \wedge B_m)\theta$$

Since $M \not\models A\theta$, we will complete the proof by showing that

$$M \models (\forall x_1 \in X_1) \cdots (\forall x_n \in X_n)(B_1 \wedge \cdots \wedge B_m)\theta$$

Let $d_1, \ldots, d_n$ be in the interpretation of sort $a$ in $M$ with the property that $M \models (x_i \in X_i)\theta\{x_1/d_1, \ldots, x_n/d_n\}$ for all $i$ from 1 to $n$. We have to show that

$$M \models (B_1 \wedge \cdots \wedge B_m)\theta\{x_1/d_1, \ldots, x_n/d_n\}$$

Let $u_i$ be the term that $\theta$ substitutes for the variable $X_i$. Since $u_i$ is ground, it must be of the form $\{u_i^1, \ldots, u_i^{k_i}\}$, and therefore $M(u_i)$ is equal to $\{M(u_i^1), \ldots, M(u_i^{k_i})\}$. But then $M \models (x_i \in X_i)\theta\{x_1/d_1, \ldots, x_n/d_n\}$ holds iff $d_i$ is equal to $M(u_i^{j_i})$ for some $j_i$. This implies that

$$M^* \models (x_i \in X_i)\theta\{x_1/u_1^{j_1}, \ldots, x_n/u_n^{j_n}\}$$

holds for all $i$, which in turn implies

$$M^* \models (B_1 \wedge \cdots \wedge B_m)\theta\{x_1/u_1^{j_1}, \ldots, x_n/u_n^{j_n}\}$$

Therefore, since $d_i$ is equal to $M(u_i^{j_i})$,

$$M \models (B_1 \wedge \cdots \wedge B_m)\theta\{x_1/d_1, \ldots, x_n/d_n\}$$

Example 7: It is well known that first-order formulas may have first-order models but not have Herbrand models. An example of this is the formula $p(a) \wedge (\exists x)\neg p(x)$ in a language with the single individual constant $a$. Since $(\exists x)\neg p(x)$ is the same as the Horn clause $\mathrel{:-} (\forall x)p(x)$, we might think that the LPS clauses $p(a)$ and $\mathrel{:-} (\forall x \in X)p(x)$ would have LPS models but not Herbrand models, contradicting the Lemma. The reason this does not happen is that $\mathrel{:-} (\forall x \in X)p(x)$ has no LPS models. $\mathrel{:-} (\forall x \in X)p(x)$ is equivalent to the closed formula $(\forall X)(\exists x \in X)\neg p(x)$, and $(\exists x \in X)\neg p(x)$ is always false when $X = \emptyset$.

Example 8: The requirement that the non-special function symbols always be of sort $a$ is essential for Lemma 2. If we were to allow function symbols from sort $a$ to sort $s$, and if $f$ were such a function symbol, then the clauses

$$\{\ \mathrel{:-} A(f(a));\quad A(X) \mathrel{:-} (\forall x \in X)B(x)\ \}$$

would have an LPS model, since we could interpret $f(a)$ by the set $\{a\}$, and make $B(a)$ false. However, these clauses do not have a Herbrand model, since $x \in f(a)$ is false in every Herbrand model, forcing $A(f(a))$ to be true in such a model.

Definition 10: Let $P$ be a set of LPS clauses. The least Herbrand model of $P$ is defined as the intersection $M_P$ of all the Herbrand models of $P$.

The next Theorem can be proved in the same way as for Horn clauses. Lemma 2 is the key result that makes use of the fact that our clauses are of a more general form.

Theorem 3:

1. $M_P$ is a Herbrand model of $P$.
2. $M_P$ consists of those formulas in the Herbrand base that are logical consequences of $P$.

3.2 Fixpoint Semantics

Lemma 4: Every ground instance of an LPS clause is logically equivalent to a ground instance of some Horn clause.

Proof: Let $\theta$ be a ground substitution for the LPS clause

$$A \mathrel{:-} (\forall x_1 \in X_1) \cdots (\forall x_n \in X_n)(B_1 \wedge \cdots \wedge B_m)$$

The ground term that $\theta$ substitutes for $X_i$ must be of the form $\{u_i^1, \ldots, u_i^{j_i}\}$. Since $x_i \in \{u_i^1, \ldots, u_i^{j_i}\}$ is logically equivalent to $x_i = u_i^1 \vee \cdots \vee x_i = u_i^{j_i}$, it follows that the ground instance of this LPS clause is logically equivalent to the Horn clause

$$A\theta \mathrel{:-} \bigwedge_{\substack{(k_1, \ldots, k_n) \\ 1 \leq k_i \leq j_i}} (B_1 \wedge \cdots \wedge B_m)\theta\{x_1/u_1^{k_1}, \ldots, x_n/u_n^{k_n}\}$$
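The expansion in the proof of Lemma 4 amounts to taking the Cartesian product of the ground sets and instantiating the body once per combination. The Python sketch below is a simplified illustration under our own assumptions: atoms are tuples of strings, the hypothetical function `expand_body` treats any argument that equals a quantified variable name as that variable, and the head is left out.

```python
from itertools import product


def expand_body(quantifiers, atoms):
    """Expand (forall x1 in S1)...(forall xn in Sn)(B1 and ... and Bm).

    quantifiers: list of (variable_name, ground_set) pairs.
    atoms: list of tuples (predicate, arg1, arg2, ...).
    Returns the list of ground atoms of the equivalent Horn clause body.
    """
    variables = [v for v, _ in quantifiers]
    domains = [sorted(s) for _, s in quantifiers]
    body = []
    for choice in product(*domains):  # one tuple (k1, ..., kn) per combination
        binding = dict(zip(variables, choice))
        for pred, *args in atoms:
            body.append((pred, *[binding.get(a, a) for a in args]))
    return body


# Body of disj({a, b}, {c}) :- (forall x in {a, b})(forall y in {c})(x != y)
print(expand_body([("x", {"a", "b"}), ("y", {"c"})], [("neq", "x", "y")]))
# An empty set yields an empty body, so the ground clause becomes a fact.
print(expand_body([("x", set())], [("p", "x")]))
```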

Definition 11: Let $P$ be an LPS program. We define a mapping $T_P$ from Herbrand interpretations to Herbrand interpretations as follows. $T_P(M)$ is the set of all formulas $A$ in the Herbrand base for which there is a ground instance $A \mathrel{:-} B_1 \wedge \cdots \wedge B_m$ of a clause in $P$, with the property that all the $B_i$'s are in $M$.

Using Lemma 4, we can modify the standard proof to show the following.

Theorem 5: Let $P$ be an LPS program. Then $M_P = \mathrm{lfp}(T_P) = T_P \uparrow \omega$.

The standard procedural semantics can also be extended to LPS. However, to do so, we have to use arbitrary unifiers, rather than the most specific one. For this reason, it is no longer a practical decision procedure.
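For a finite set of ground Horn clauses (for instance, ground LPS clauses already expanded as in Lemma 4), the iteration $T_P \uparrow \omega$ can be sketched directly. This Python fragment is an illustration under that assumption only; atoms are plain strings, clauses are (head, body) pairs, and the names `t_p` and `least_fixpoint` are ours.

```python
def t_p(clauses, model):
    # One application of the immediate-consequence operator: every head
    # whose body atoms are all already in the interpretation.
    return {head for head, body in clauses if all(b in model for b in body)}


def least_fixpoint(clauses):
    # Iterate from the empty interpretation. T_P is monotone, so the
    # sequence is increasing and stops once nothing new is derived.
    model = set()
    while True:
        nxt = t_p(clauses, model)
        if nxt == model:
            return model
        model = nxt


program = [
    ("p(a)", []),
    ("q(a)", ["p(a)"]),
    ("r(a)", ["q(a)", "s(a)"]),
]
print(sorted(least_fixpoint(program)))
```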

4 The Expressive Power of the Language

4.1 Using Disjunction in LPS Rules

Some of the examples in the introduction to this paper use clauses that do not fit the formal definition of LPS. For example, we defined union by the clause

$$union(X, Y, Z) \mathrel{:-} (\forall x \in X)(x \in Z) \wedge (\forall y \in Y)(y \in Z) \wedge (\forall z \in Z)(z \in X \vee z \in Y)$$

This is not an LPS rule, since the clause contains disjunction, and universal quantifiers were allowed only in front of the entire right-hand side. The second point might not seem to make a difference: after all, the formula $(\forall x)(A \wedge B)$ is equivalent to $A \wedge (\forall x)B$ whenever $x$ does not appear free in $A$. With restricted quantifiers, however, $(\forall x \in X)(A \wedge B)$ is not, in general, equivalent to $A \wedge (\forall x \in X)B$, since when $X$ is the empty set, the first formula is always true, while the second is equivalent to $A$.

Ignoring this problem for a moment, and considering only disjunction, we might try to handle it in a similar way to Horn clauses. The Horn clause $A \mathrel{:-} B \vee C$ is equivalent to the pair of clauses $A \mathrel{:-} B$ and $A \mathrel{:-} C$. If we try the same thing with the union predicate, we get (using $X \subseteq Y$ as an abbreviation for $(\forall x \in X)(x \in Y)$)

$$union(X, Y, Z) \mathrel{:-} X \subseteq Z \wedge Y \subseteq Z \wedge (\forall z \in Z)(z \in X)$$
$$union(X, Y, Z) \mathrel{:-} X \subseteq Z \wedge Y \subseteq Z \wedge (\forall z \in Z)(z \in Y)$$

But this is equivalent to

$$(X = Z \wedge Y \subseteq Z) \vee (Y = Z \wedge X \subseteq Z)$$

which is not what we wanted. However, if we allow the use of additional, auxiliary predicates, we can then express the union predicate in LPS by the clauses

$$union(X, Y, Z) \mathrel{:-} t_3(X, Z) \wedge t_3(Y, Z) \wedge t_1(X, Y, Z)$$
$$t_1(X, Y, Z) \mathrel{:-} (\forall z \in Z)\,t_2(X, Y, z)$$
$$t_2(X, Y, z) \mathrel{:-} z \in X$$
$$t_2(X, Y, z) \mathrel{:-} z \in Y$$
$$t_3(X, Y) \mathrel{:-} (\forall x \in X)(x \in Y)$$

We now show that this construction generalizes to any positive formula.

Definition 12: Positive formulas are defined by induction as follows.

1. An atomic formula.
2. $\phi \vee \psi$ and $\phi \wedge \psi$, where $\phi$ and $\psi$ are positive formulas.
3. $(\forall x \in X)\phi$ and $(\exists x \in X)\phi$ where $\phi$ is a positive formula, and $x$ and $X$ are variables of sorts $a$ and $s$, respectively.

Theorem 6: Let $P$ be a set of clauses of the form $A_i \mathrel{:-} B_i$, where each $A_i$ is an atomic formula and each $B_i$ is a positive formula over $L$. Then there is an extension $L^*$ of $L$ and a program $P^*$ over $L^*$ such that for every formula $\phi$ over $L$,

$$P \models \phi \iff P^* \models \phi$$

Proof: We assume that $P$ consists of the single clause $A \mathrel{:-} B$, as the construction can easily be extended to programs with more than one clause. We define the program $P^* = f(A \mathrel{:-} B)$ by induction on the size of $B$ as follows.

1. $B$ is an atomic formula. Then $f(A \mathrel{:-} B)$ consists of the clause $A \mathrel{:-} B$.

2. $B$ is $C_1 \wedge C_2$. Let $x_1, \ldots, x_n$ and $y_1, \ldots, y_m$ be the free variables in $C_1$ and $C_2$, respectively. Let $N_1$ ($n$-ary) and $N_2$ ($m$-ary) be new predicates.

$$f(A \mathrel{:-} B) = f(N_1(x_1, \ldots, x_n) \mathrel{:-} C_1) \cup f(N_2(y_1, \ldots, y_m) \mathrel{:-} C_2) \cup \{A \mathrel{:-} N_1(x_1, \ldots, x_n) \wedge N_2(y_1, \ldots, y_m)\}$$

3. $B$ is $C_1 \vee C_2$. Let $x_1, \ldots, x_n$ and $y_1, \ldots, y_m$ be the free variables in $C_1$ and $C_2$, respectively. Let $N_1$ ($n$-ary) and $N_2$ ($m$-ary) be new predicates.

$$f(A \mathrel{:-} B) = f(N_1(x_1, \ldots, x_n) \mathrel{:-} C_1) \cup f(N_2(y_1, \ldots, y_m) \mathrel{:-} C_2) \cup \{A \mathrel{:-} N_1(x_1, \ldots, x_n)\} \cup \{A \mathrel{:-} N_2(y_1, \ldots, y_m)\}$$

4. $B$ is $(\exists x \in X)C$. Let $x_1, \ldots, x_n, x$ be the free variables in $C$. Let $N$ be a new $(n + 1)$-ary predicate.

$$f(A \mathrel{:-} B) = f(N(x_1, \ldots, x_n, x) \mathrel{:-} C) \cup \{A \mathrel{:-} N(x_1, \ldots, x_n, x) \wedge x \in X\}$$

5. $B$ is $(\forall x \in X)C$. Let $x_1, \ldots, x_n, x$ be the free variables in $C$. Let $N$ be a new $(n + 1)$-ary predicate.

$$f(A \mathrel{:-} B) = f(N(x_1, \ldots, x_n, x) \mathrel{:-} C) \cup \{A \mathrel{:-} (\forall x \in X)N(x_1, \ldots, x_n, x)\}$$

Obviously, $P^*$ is an LPS program. By induction on the size of $B$, we show that $A \mathrel{:-} B$ is equivalent to the program $f(A \mathrel{:-} B)$. The base case of the induction is when $B$ is atomic, in which case $f(A \mathrel{:-} B)$ consists of the single clause $A \mathrel{:-} B$. For the general case, we show how to deal with the case $A \mathrel{:-} C_1 \vee C_2$. The other cases are similar. The definition of $f(A \mathrel{:-} C_1 \vee C_2)$ is

$$f(N_1(x_1, \ldots, x_n) \mathrel{:-} C_1) \cup f(N_2(y_1, \ldots, y_m) \mathrel{:-} C_2) \cup \{A \mathrel{:-} N_1(x_1, \ldots, x_n)\} \cup \{A \mathrel{:-} N_2(y_1, \ldots, y_m)\}$$

By the inductive hypothesis, the program $f(N_1(x_1, \ldots, x_n) \mathrel{:-} C_1)$ is equivalent to the rule $N_1(x_1, \ldots, x_n) \mathrel{:-} C_1$, and similarly for $N_2$ and $C_2$. Since all the new predicates that were introduced by our construction are distinct, it follows that any formula $\phi$ over $L$ does not contain any of the $N_i$'s, and hence that $\phi$ is a consequence of

$$\{A \mathrel{:-} N_1;\ A \mathrel{:-} N_2;\ N_1 \mathrel{:-} C_1;\ N_2 \mathrel{:-} C_2\}$$

if and only if $\phi$ is a consequence of $A \mathrel{:-} C_1 \vee C_2$.

Example 9: The definition of union in LPS that we gave above is somewhat simpler than the definition that we get from the general construction in the proof. The proof gives us the program

$$union(X, Y, Z) \mathrel{:-} N_1(X, Z) \wedge N_2(X, Y, Z)$$
$$N_1(X, Z) \mathrel{:-} (\forall x \in X)N_3(x, Z)$$
$$N_2(X, Y, Z) \mathrel{:-} N_4(Y, Z) \wedge N_5(X, Y, Z)$$
$$N_3(x, Z) \mathrel{:-} x \in Z$$
$$N_4(Y, Z) \mathrel{:-} (\forall y \in Y)N_6(y, Z)$$
$$N_5(X, Y, Z) \mathrel{:-} (\forall z \in Z)N_7(X, Y, z)$$
$$N_6(y, Z) \mathrel{:-} y \in Z$$
$$N_7(X, Y, z) \mathrel{:-} N_8(z, X)$$
$$N_7(X, Y, z) \mathrel{:-} N_9(z, Y)$$
$$N_8(z, X) \mathrel{:-} z \in X$$
$$N_9(z, Y) \mathrel{:-} z \in Y$$
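The gap between the naive splitting of the disjunction and the version with auxiliary predicates can be seen on a three-element example. The Python sketch below is our own illustration of the two readings from Section 4.1, with sets as Python sets; `union_naive` corresponds to the two split clauses and `union_aux` to the clauses using $t_1$, $t_2$ and $t_3$.

```python
def subset(X, Y):
    return all(x in Y for x in X)


def union_naive(X, Y, Z):
    # The disjunction is split across two clauses for union itself, so the
    # same disjunct has to hold for every z in Z.
    first = subset(X, Z) and subset(Y, Z) and all(z in X for z in Z)
    second = subset(X, Z) and subset(Y, Z) and all(z in Y for z in Z)
    return first or second


def t2(X, Y, z):
    # t2(X, Y, z) :- z in X      t2(X, Y, z) :- z in Y
    return z in X or z in Y


def t1(X, Y, Z):
    # t1(X, Y, Z) :- (forall z in Z) t2(X, Y, z)
    return all(t2(X, Y, z) for z in Z)


def t3(X, Y):
    # t3(X, Y) :- (forall x in X)(x in Y)
    return subset(X, Y)


def union_aux(X, Y, Z):
    return t3(X, Z) and t3(Y, Z) and t1(X, Y, Z)


print(union_naive({1}, {2}, {1, 2}))
print(union_aux({1}, {2}, {1, 2}))
```

With the auxiliary predicate, the choice between the two disjuncts is made separately for each element of $Z$, which is what the union requires.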

It turns out that auxiliary predicates are essential to the proof of this Theorem, since the union predicate cannot be defined without them.

Theorem 7: Let $L$ be a language whose only non-special predicate is a ternary predicate $p$. There is no LPS program $P$ with the property that for all sets $A$, $B$ and $C$ in $U_L^s$,

$$M_P \models p(A, B, C) \iff A \cup B = C$$

Proof: See Appendix A.

4.2 Construction of Sets

Suppose we are given a predicate $A(x)$ and we want to construct the set $S = \{x \mid A(x)\}$, i.e., the set of all those $x$'s that satisfy the predicate $A(x)$. Since LPS is a nonprocedural language, we have no mechanism to actually construct such a set, but could instead ask whether we can define a predicate $B(X)$ that is satisfied exactly when $X = S$. If we try to define $B(X)$ by the clause

$$B(X) \mathrel{:-} (\forall x \in X)A(x)$$

$B(S)$ would indeed hold, but $B(X)$ would also hold for all subsets $X$ of $S$. We now show that such a predicate $B(X)$ cannot be defined in LPS. The statement of the theorem is rather complicated, since we have to be careful not to allow the definition of $B(X)$ to redefine $A(x)$, or vice versa. The proof shows that such a predicate cannot be defined in any language with minimal model semantics like LPS.

Theorem 8: Let $A(x)$ be a unary predicate and let $P$ be a program that does not contain the predicate $B(X)$. There does not exist an LPS program $P^*$ such that $M_{P \cup P^*} \models B(U)$ if and only if $U$ is equal to the set of all elements $u$ in the Herbrand universe for which $M_P \models A(u)$.

Proof: Assume that such a program $P^*$ exists. Let $c_1$ and $c_2$ be two constants in $L$, let $P_1$ be the program $\{A(c_1)\}$, and let $P_2$ be the program $\{A(c_1), A(c_2)\}$. Then $M_{P_1} \models A(x)$ iff $x = c_1$, and therefore $M_{P_1 \cup P^*}$ should satisfy $B(\{c_1\})$ but should not satisfy $B(X)$ for any other set $X$. On the other hand, $M_{P_2} \models A(x)$ whenever $x = c_1$ or $c_2$, and therefore $M_{P_2 \cup P^*}$ should satisfy $B(\{c_1, c_2\})$ but should not satisfy $B(X)$ for any other set $X$. In particular $M_{P_2 \cup P^*} \not\models B(\{c_1\})$. Since every model of $P_2$ is a model of $P_1$, and since $M_{P_2 \cup P^*}$ is a Herbrand model of $P_2$ and of $P^*$, it follows that $M_{P_2 \cup P^*}$ is a Herbrand model of $P_1 \cup P^*$. But then $B(\{c_1\})$ is not satisfied in at least one Herbrand model of $P_1 \cup P^*$, and hence is not satisfied in the least Herbrand model $M_{P_1 \cup P^*}$, a contradiction.

In order to construct the set $\{x \mid A(x)\}$ we have to know, for each $x$, whether or not $A(x)$ is true. This is closely related to making the closed world assumption, since we need negative information (which $x$'s do not satisfy $A(x)$) as well as positive information (which $x$'s do). Since LPS is similar to Horn clause logic it only provides positive information, and not the negative information needed for set construction.

The relation between set construction and negation is very close. We can add negation to LPS in a straightforward way, losing of course the minimal model semantics. Extending the notion of a stratified program [ABW86] to LPS is also straightforward. We could then define the predicate $B(X)$ by the clauses

$$C(X) \mathrel{:-} X \subset Y \wedge (\forall y \in Y)A(y)$$
$$B(X) \mathrel{:-} (\forall x \in X)A(x) \wedge \neg C(X)$$

where $X \subset Y$ is defined by

$$X \subset Y \mathrel{:-} (\forall x \in X)(x \in Y) \wedge z \in Y \wedge \neg(z \in X)$$

Informally, $C(X)$ says that there is some set $Y$ that is larger than $X$, all of whose elements satisfy $A(x)$. $B(X)$ says that all of $X$'s elements satisfy $A$, but that there is no larger set with this property, which is equivalent to saying that $X = \{x \mid A(x)\}$.
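On a small finite universe, the maximality argument behind $B(X)$ and $C(X)$ can be checked by brute force. The Python sketch below is only an illustration: it assumes a finite universe passed in explicitly, enumerates all of its subsets, and uses hypothetical names (`powerset`, `b_holds`, `is_even`) that do not come from the paper.

```python
from itertools import combinations


def powerset(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def b_holds(X, A, universe):
    X = frozenset(X)

    def all_satisfy(S):
        return all(A(x) for x in S)

    # C(X): some proper superset Y of X has only elements satisfying A.
    c_x = any(X < Y and all_satisfy(Y) for Y in powerset(universe))
    # B(X): every element of X satisfies A, and C(X) fails.
    return all_satisfy(X) and not c_x


def is_even(n):
    return n % 2 == 0


universe = {1, 2, 3, 4}
print([sorted(X) for X in powerset(universe) if b_holds(X, is_even, universe)])
```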

5 Extending LPS to Arbitrary Sets

We outline how LPS can be extended to arbitrary finite sets, i.e., how to allow more than one level of set nesting. There are two ways we could do this. One way is to use a fully typed language, i.e., to have a type $n$ for all sets of nesting depth $n$. The other way is to use an untyped language. We choose the second approach, primarily to facilitate comparing our results to those of LDL in the next section. In an untyped language we must still be careful to avoid letting the values of function symbols contain any elements, since then the minimal model semantics would no longer hold. We therefore use a model of set theory that has atoms as well as sets.

We call the new language ELPS (Extended LPS). An ELPS language $L$ is an untyped first-order logic with equality, with special symbols $f_n$ and $\in$. An ELPS model $M$ is a model of $L$ over a domain $D$ that interprets $f_n$ and $\in$ as the corresponding set-theoretic constructors and relations. We require that if $f$ is a function symbol in $L$, the range of $M(f)$ must consist entirely of atoms, i.e., objects in $D$ that themselves have no elements. Similarly, constants must also be interpreted as atoms.

Definition 13: The Herbrand universe $U_L$ for $L$ is the smallest set that contains

1. All the constants $c_i$ in $L$,
2. All terms of the form $f_i u_1 \cdots u_n$ where $f_i$ is a function symbol in $L$, and $u_1, \ldots, u_n$ are in $U_L$, and
3. All finite subsets of $U_L$.

The Herbrand base and Herbrand models are defined in the same way as before. Our previous results then hold in ELPS.

Theorem 9: Let $P$ be an ELPS program.

1. There exists a minimal model $M_P$ that is a Herbrand model of $P$, and consists of those formulas in the Herbrand base that are logical consequences of $P$.
2. $M_P = \mathrm{lfp}(T_P) = T_P \uparrow \omega$

Theorems 6 and 8 can also be generalized to ELPS.

6 Comparison with Other Ways of Adding Sets to Logic Programming

We compare our approach to other ways of adding sets to logic programs, specifically that of LDL [BNR*87]. The LDL language has several types of clauses and built-in predicates, but for our purposes an LDL program will consist of Horn clauses and grouping rules (defined below). Other LDL primitives, such as scons, will be treated separately. We assume that all the languages contain the special predicates of ELPS.

Definition 14: An LDL grouping clause [BNR*87] over $L$ is a formula of the form

$$A(x_1, \ldots, x_n, \langle x \rangle) \mathrel{:-} B_1 \wedge \cdots \wedge B_m$$

The meaning of this clause is that the left-hand side is satisfied by a tuple $(d_1, \ldots, d_n, d)$ iff $d$ is equal to the set of those values of $x$ for which the body of the clause holds.
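Read over a finite relation, a grouping clause behaves much like a group-by that collects values into a set. The Python sketch below is a simplified illustration under our own assumptions: the body is given as an explicit list of tuples $(x_1, \ldots, x_n, x)$ that satisfy it, the hypothetical function `group` only produces tuples for keys that occur at least once, and the data are made up.

```python
from collections import defaultdict


def group(body_tuples):
    # Collect, for each (x1, ..., xn), the set of all x for which the body
    # holds, and return tuples (x1, ..., xn, {x, ...}).
    groups = defaultdict(set)
    for *key, x in body_tuples:
        groups[tuple(key)].add(x)
    return {key + (frozenset(values),) for key, values in groups.items()}


# Hypothetical flat relation: (object, part) pairs satisfying the body.
rows = [("car", "wheel"), ("car", "engine"), ("bike", "wheel")]
for row in sorted(group(rows), key=lambda t: t[0]):
    print(row[0], sorted(row[1]))
```

This is the inverse direction of the unnest rule in Example 4: unnesting flattens a nested relation, while grouping rebuilds the sets.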

Definition 15: Let $L$ be a first-order logic.

1. $L + union$ consists of $L$ with the additional predicate $union(x, y, z)$. Every model $M_{union}$ of $L + union$ is required to satisfy

$$M_{union} \models union(d_1, d_2, d_3) \iff d_3 = d_1 \cup d_2$$

2. $L + scons$ consists of $L$ with the additional predicate $scons(x, y, z)$. Every model $M_{scons}$ of $L + scons$ is required to satisfy

$$M_{scons} \models scons(d_1, d_2, d_3) \iff d_3 = d_1 \cup \{d_2\}$$

We always assume that the $union$ and $scons$ predicates are not predicates of $L$. Our $scons$ predicate is essentially the same as the scons operation of [BNR*87], but for technical reasons we prefer to define it as a predicate rather than as a function symbol. We further require that clauses never have the predicates $union$ or $scons$ in the head.

There are obvious mappings from models of any one of these languages to models of any other, and we can use these mappings to define equivalences between theories over $L$ and over $L + union$ or $L + scons$. We shall assume that each language also has some auxiliary predicates not shared by the other language. Equivalence will be relative only to the predicates that the languages have in common.

Definition 16: An LDL program over $L$ is a finite set of Horn and LDL grouping clauses over $L$.

6.1 Languages Without Negation

We first look at languages without negation, introduced either directly or indirectly (e.g., via a grouping operation). All of these languages have minimal model and least fixpoint semantics similar to ELPS.

Theorem 10: The following are equivalent:

1. ELPS programs over $L$
2. Horn programs over $L + union$
3. Horn programs over $L + scons$

Proof:

1. Any Horn clause over $L + union$ can be converted into a set of ELPS clauses over $L$ by replacing all occurrences of the $union$ predicate by a new predicate $p(x, y, z)$ that does not occur in the original program, and adding the clause

$$p(x, y, z) \mathrel{:-} (\forall w \in z)(w \in x \vee w \in y) \wedge (\forall w \in x)(w \in z) \wedge (\forall w \in y)(w \in z)$$

Note that we have to use Theorem 6 to eliminate the disjunction, and this construction introduces additional auxiliary predicates.

2. A Horn clause over $L + scons$ can be converted into a set of ELPS clauses by replacing $scons$ by a new predicate $r(x, y, z)$, and adding the clause

$$r(x, y, z) \mathrel{:-} (\forall w \in x)(w \in z) \wedge y \in z \wedge (\forall w \in z)(w \in x \vee w = y)$$

3. Let

$$A \mathrel{:-} (\forall x_1 \in y_1) \cdots (\forall x_n \in y_n)(B_1 \wedge \cdots \wedge B_m)$$

be an ELPS clause over $L$. This rule is equivalent to the rule

$$A \mathrel{:-} union(y_1', y_1'', y_1) \wedge A\{y_1/y_1'\} \wedge A\{y_1/y_1''\}$$

with the "base case"

$$A \mathrel{:-} \{x_1\} = y_1 \wedge (\forall x_2 \in y_2) \cdots (\forall x_n \in y_n)(B_1 \wedge \cdots \wedge B_m)$$

Repeating this construction $n$ times converts the original rule into a set of Horn clauses over $L + union$.

4. The same technique converts this clause into a set of Horn clauses over $L + scons$, i.e.,

$$A \mathrel{:-} scons(y_1', x_1, y_1) \wedge A\{y_1/y_1'\} \wedge (\forall x_2 \in y_2) \cdots (\forall x_n \in y_n)(B_1 \wedge \cdots \wedge B_m)$$

with the base case

$$A \mathrel{:-} scons(\emptyset, x_1, y_1) \wedge (\forall x_2 \in y_2) \cdots (\forall x_n \in y_n)(B_1 \wedge \cdots \wedge B_m)$$

 8
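The constructions in this proof can be checked informally on a small finite universe. The sketch below is in Python rather than in any of the languages discussed here, and reads each quantifier of the form (∀w ∈ z) as a loop over the set it ranges over. The names `p_holds`, `r_holds`, `derivable` and `is_even`, and the four-element universe, are assumptions made for illustration only. `derivable` follows the scons rules of part 4 for a clause with a single quantifier, always peeling off an element with y1' = y1 − {x1}; the comparison is restricted to nonempty sets, since the base case as written starts from a singleton.

```python
from itertools import combinations

def subsets(universe):
    """Yield every subset of a finite universe as a frozenset."""
    items = sorted(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def p_holds(x, y, z):
    """Body of the clause for p: intended to hold exactly when z = x ∪ y."""
    return (all(w in x or w in y for w in z)
            and all(w in z for w in x)
            and all(w in z for w in y))

def r_holds(x, y, z):
    """Body of the clause for r: intended to hold exactly when z = x ∪ {y}."""
    return (all(w in z for w in x)
            and y in z
            and all(w in x or w == y for w in z))

def derivable(body, y1):
    """A(y1) via the scons rules: base case y1 = {x1}, step y1 = y1' ∪ {x1}."""
    if len(y1) == 1:
        (x1,) = y1
        return body(x1)
    # For the empty set no rule applies, so any() over nothing gives False.
    return any(body(x1) and derivable(body, y1 - {x1}) for x1 in y1)

def is_even(e):
    """Hypothetical stand-in for the quantifier-free body B1 ∧ ... ∧ Bm."""
    return e % 2 == 0

universe = {1, 2, 3, 4}
all_sets = list(subsets(universe))

union_ok = all(p_holds(x, y, z) == (x | y == z)
               for x in all_sets for y in all_sets for z in all_sets)
scons_ok = all(r_holds(x, y, z) == (x | {y} == z)
               for x in all_sets for y in universe for z in all_sets)
elim_ok = all(derivable(is_even, y1) == all(is_even(e) for e in y1)
              for y1 in all_sets if y1)
print(union_ok, scons_ok, elim_ok)
```

All three checks are exhaustive only over this small universe; they illustrate the constructions and are not a substitute for the proof.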

6.2 Languages with Negation

We first look at unstratified negation. Since such programs do not have unique minimal, or even preferred, models, equivalence will be with respect to the class of all the models of the program.

Theorem 11: The following are equivalent:
1. ELPS programs with unstratified negation over L
2. Horn programs with unstratified negation over L + union
3. Horn programs with unstratified negation over L + scons
4. LDL programs over L

Proof: The equivalences 1-3 follow easily from the proof of Theorem 10. We can convert a Horn program over L + union into an LDL program over L by replacing the union predicate by a new predicate $q(x, y, z)$ defined by:

$$q(x, y, \langle z \rangle) \mathrel{:-} p(x, y, z)$$

$$p(x, y, z) \mathrel{:-} z \in x$$

$$p(x, y, z) \mathrel{:-} z \in y$$
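A minimal sketch of how the grouping clause above produces the union: p enumerates the members of x and of y, and the grouping term collects, for each pair (x, y), every z with p(x, y, z) into a single set. The helper name `group_last` and the sample sets are assumptions made for illustration; this is Python, not LDL syntax.

```python
from collections import defaultdict

def group_last(facts):
    """Mimic q(x, y, <z>) :- p(x, y, z): group the last argument by the rest."""
    grouped = defaultdict(set)
    for *key, z in facts:
        grouped[tuple(key)].add(z)
    return {key: frozenset(zs) for key, zs in grouped.items()}

x = frozenset({1, 2})
y = frozenset({2, 3})

# p(x, y, z) :- z in x.   p(x, y, z) :- z in y.
p_facts = [(x, y, z) for z in x] + [(x, y, z) for z in y]

q = group_last(p_facts)
print(q[(x, y)] == x | y)
```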

The resulting program may still use negation, but we can use the construction in [BNR*87] to express the negation in terms of grouping.³ Finally, we can convert an LDL program into an ELPS program by converting the LDL grouping clause

$$A(x_1, \ldots, x_n, \langle x \rangle) \mathrel{:-} B_1 \land \cdots \land B_m$$

into the ELPS clauses with negation

$$q(x, y) \mathrel{:-} (\forall z \in x)(z \in y) \land w \in y \land \neg\, w \in x$$

$$p(x_1, \ldots, x_n, y) \mathrel{:-} q(y, z) \land (\forall x \in z)(B_1 \land \cdots \land B_m)$$

and

$$A(x_1, \ldots, x_n, y) \mathrel{:-} (\forall x \in y)(B_1 \land \cdots \land B_m) \land \neg p(x_1, \ldots, x_n, y)$$

This is essentially the same technique used to construct sets at the end of Section 4.2: $q(x, y)$ holds when x is a proper subset of y, and $p(x_1, \ldots, x_n, y)$ holds when there is some proper superset of y all of whose elements satisfy the right-hand side of the original rule. Finally, the last rule says that all of y's elements satisfy the right-hand side of the original rule, and that there is no larger set with the same property.

Most of these equivalences also hold for stratified programs, since the corresponding proofs map stratified programs into stratified ones.

Theorem 12: The following are equivalent:
1. Stratified ELPS programs with negation over L
2. Stratified Horn programs with negation over L + union
3. Stratified Horn programs with negation over L + scons

³ We could have used instead the proof of [BNR*87] to convert an ELPS program into an LDL program. However, the proof we have given is much simpler, and does not depend on the existence of suitable function symbols.


Furthermore, all these languages are at least as powerful as stratified LDL programs over L. The question whether every stratified ELPS program is equivalent to some stratified LDL program remains open.
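The translation of an LDL grouping clause into ELPS clauses with negation, used in the proof of Theorem 11, can be illustrated on a finite universe. In the sketch below (Python, with hypothetical names `q`, `p`, `grouped` and a stand-in body that tests for even numbers), the free variable w in the clause for q is read existentially, and the grouped set is the one set whose elements all satisfy the body and that has no proper superset with the same property. The parameters x1, ..., xn of the original clause are omitted for brevity.

```python
from itertools import combinations

def subsets(universe):
    """Yield every subset of a finite universe as a frozenset."""
    items = sorted(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def q(x, y):
    """x is a proper subset of y: all of x is in y, and some w in y is not in x."""
    return all(z in y for z in x) and any(w not in x for w in y)

def p(y, body, all_sets):
    """Some proper superset z of y has every element satisfying the body."""
    return any(q(y, z) and all(body(e) for e in z) for z in all_sets)

def grouped(body, all_sets):
    """Sets y with (forall x in y) body(x) and not p(y): the maximal such sets."""
    return [y for y in all_sets
            if all(body(e) for e in y) and not p(y, body, all_sets)]

universe = {1, 2, 3, 4}
all_sets = list(subsets(universe))

# Hypothetical body B1 ∧ ... ∧ Bm: "x is even".
result = grouped(lambda e: e % 2 == 0, all_sets)
print(result)
```

Here the only set returned is {2, 4}, the set that a grouping clause with the same body would construct.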

A Proof of Theorem 7

Theorem 7: Let L be a language whose only non-special predicate is a ternary predicate p. There is no LPS program P with the property that for all sets A, B and C in $U_L^s$,

$$M_P \models p(A, B, C) \iff A \cup B = C$$

Proof: Assume that such a program P exists. We first show that whenever the head of a rule in P is of the form $p(t_1, t_2, Z)$, where $t_1$ and $t_2$ are terms of sort s that are different from the variable Z, we may assume that the rule contains no quantifiers in the body.
1. If the body contains a quantifier of the form $(\forall w \in W)$, and the variable W does not appear in the head, then the body is always true for $W = \emptyset$, and therefore the head is always true. This clause is therefore equivalent to the quantifier-free clause

$$p(t_1, t_2, Z).$$

In the remaining cases, we assume that the sets over which quantifiers in the body range actually appear in the head.
2. $t_1$ and $t_2$ are of the form $\{x_1, \ldots, x_n\}$ and $\{y_1, \ldots, y_m\}$. We can then replace the clause by

$$p(\{x_1, \ldots, x_n\}, \{y_1, \ldots, y_m\}, \{x_1, \ldots, x_n, y_1, \ldots, y_m\}).$$

Here we use the fact that P is assumed to define a union predicate. Therefore, if our clause implies anything not covered by the fact above, our assumption is false. If, on the other hand, it does not imply all these facts, some other clause in P must imply them, and adding them here as well does no harm.

3. $t_1 = \{x_1, \ldots, x_n\}$ and $t_2 = Y$ (or $t_1 = X$ and $t_2 = \{y_1, \ldots, y_m\}$). If a quantifier in the body of the clause ranges over the elements of Z, then $p(\{x_1, \ldots, x_n\}, Y, \emptyset)$ must hold for all Y, a contradiction. If a quantifier in the body ranges over the elements of Y, then $p(\{x_1, \ldots, x_n\}, \emptyset, Z)$ must hold for all Z, once more a contradiction.
4. $t_1 = X$ and $t_2 = Y$. If a quantifier in the body ranges over the elements of Z, then $p(X, Y, \emptyset)$ must hold for all X and Y, a contradiction. If a quantifier in the body ranges over the elements of X (or Y), then $p(\emptyset, Y, Z)$ must hold for all Y and Z (or $p(X, \emptyset, Z)$ for all X and Z), once more a contradiction.
5. $t_1 = t_2 = X$. If a quantifier in the body ranges over the elements of Z, then $p(X, X, \emptyset)$ must hold for all X. If a quantifier in the body ranges over the elements of X, then $p(\emptyset, \emptyset, Z)$ must hold for all Z, in both cases a contradiction.

Now, let A, B and C be three sets in $U_L^s$ that satisfy
1. $A \cup B = C$, $A \neq C$ and $B \neq C$.
2. C has more than 2N elements, where N is the largest number for which the function symbol $f_N$ is used in the program P.
3. There exists a derivation of $p(A, B, C)$ from P, and there exist no sets $A'$, $B'$ and $C'$ satisfying 1 and 2 such that $p(A', B', C')$ has a shorter derivation from P.

Let $\alpha$ be an element of $U_L^a$ that is not a member of C. We shall show that $M_P \models p(A, B, C \cup \{\alpha\})$. Since $A \cup B \neq C \cup \{\alpha\}$, this will contradict the definition of P and complete the proof. The last step of the derivation of $p(A, B, C)$ has to use a ground instance of the rule

$$p(t_1, t_2, Z) \mathrel{:-} B_1 \land \cdots \land B_k$$

to derive $p(A, B, C)$, after having already derived the ground instances of $B_1, \ldots, B_k$. The last step must use such a rule for the following reason. Since C has more than N elements, the third argument of p must be a variable. Since $A \neq C$ and $B \neq C$, neither $t_1$ nor $t_2$ can be equal to the variable Z. As shown above, we may therefore assume that the rule has no quantifiers in the body.

Let the free variables in this rule be

$$w_1, \ldots, w_n, W_1, \ldots, W_m$$

The last step of the derivation of $p(A, B, C)$ from P uses this rule to derive $p(A, B, C)$, having already derived $(B_1 \land \cdots \land B_k)\theta$, where the substitution $\theta$ is of the form

$$\theta = \{w_1/u_1, \ldots, w_n/u_n, W_1/U_1, \ldots, W_m/U_m\}$$

For any set U in $U_L^s$, let $U^*$ be $U \cup \{\alpha\}$ whenever $C \subseteq U$, and let $U^*$ be U otherwise, where $\alpha$ is the element of $U_L^a$ chosen above that is not a member of C. We claim that $(B_1 \land \cdots \land B_k)\theta^*$ can also be derived from P, where

$$\theta^* = \{w_1/u_1, \ldots, w_n/u_n, W_1/U_1^*, \ldots, W_m/U_m^*\}$$

This implies that $p(A^*, B^*, C^*)$ can be derived from P. Since A and B are both proper subsets of C, $A^* = A$, $B^* = B$, and $C^* = C \cup \{\alpha\}$, completing the proof.

We therefore have to show that each $B_i\theta^*$ can be derived from P. We do this by enumeration of the possible forms of $B_i$.
1. $w_i \in W_j$. $(w_i \in W_j)\theta$ is equal to $u_i \in U_j$, $U_j$ is always a subset of $U_j^*$, and therefore $u_i \in U_j^*$, i.e., $(w_i \in W_j)\theta^*$.
2. $w_i = w_j$. In this case $(w_i = w_j)\theta$ is identical to $(w_i = w_j)\theta^*$.
3. $W_i = W_j$. Since $U_i = U_j$ always implies $U_i^* = U_j^*$, $(W_i = W_j)\theta$ implies $(W_i = W_j)\theta^*$.
4. $p(t_1, t_2, t_3)$. There are six cases to consider.
(a) $t_1$, $t_2$ and $t_3$ are the variables $W_i$, $W_j$ and $W_k$. Since the derivation of $p(U_i, U_j, U_k)$ is part of the derivation of $p(A, B, C)$, it must be shorter. But then $U_i$, $U_j$ and $U_k$ cannot satisfy both conditions 1 and 2 in the definition of A, B and C. Since P satisfies $p(U_i, U_j, U_k)$, $U_i \cup U_j$ must be equal to $U_k$, and condition 1 can be violated only if $U_i = U_k$ (or if $U_j = U_k$). But $U_i = U_k$ implies $U_i^* = U_k^*$, and since $U_i \cup U_j = U_k$, we have $U_i^* \cup U_j^* = U_k^*$, which implies that P must imply $p(U_i^*, U_j^*, U_k^*)$. If condition 2 is violated, then $U_k$ must have $\leq 2N$ elements. Since C has more than 2N elements, C cannot be a subset of $U_k$, and therefore $U_i^* = U_i$, $U_j^* = U_j$ and $U_k^* = U_k$.

(b) $t_1$ is the term $\{w_{i_1}, \ldots, w_{i_a}\}$, $t_2$ is the variable $W_j$ and $t_3$ is the variable $W_k$. Then $p(\{u_{i_1}, \ldots, u_{i_a}\}, U_j, U_k)$ must have a shorter derivation than $p(A, B, C)$. As before, one of conditions 1 and 2 must be violated. If $\{u_{i_1}, \ldots, u_{i_a}\} = U_k$, then C is too large to be a subset of $U_k$, and therefore $U_k^* = U_k$, which implies the result. If $U_j = U_k$, or if $U_k$ has $\leq 2N$ elements, the same argument as in case (a) applies.
(c) $t_3$ is the term $\{w_{k_1}, \ldots, w_{k_c}\}$, $t_1$ is the variable $W_i$ and $t_2$ is the variable $W_j$. In this case the set $\{u_{k_1}, \ldots, u_{k_c}\}$, and hence $U_i$ and $U_j$, have $\leq N$ elements, and therefore $U_i^* = U_i$ and $U_j^* = U_j$.
(d) $t_1$ is the term $\{w_{i_1}, \ldots, w_{i_a}\}$, $t_2$ is the term $\{w_{j_1}, \ldots, w_{j_b}\}$, and $t_3$ is the variable $W_k$. In this case $U_k$ has $\leq 2N$ elements, and therefore $U_k^* = U_k$.
(e) $t_1$ is the term $\{w_{i_1}, \ldots, w_{i_a}\}$, $t_3$ is the term $\{w_{k_1}, \ldots, w_{k_c}\}$, and $t_2$ is the variable $W_j$. In this case $U_j$ has $\leq N$ elements, and therefore $U_j^* = U_j$.
(f) $t_1$ is the term $\{w_{i_1}, \ldots, w_{i_a}\}$, $t_2$ is the term $\{w_{j_1}, \ldots, w_{j_b}\}$, and $t_3$ is the term $\{w_{k_1}, \ldots, w_{k_c}\}$. In this case there are no set variables to replace.
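The set operation at the heart of this proof, which adds one fresh element to every set containing C and leaves all other sets unchanged, can be exercised on a small example. The sketch below is Python; the name `star`, the choice of 0 as the fresh element, and the sample sets are assumptions made for illustration, and the size conditions involving N are ignored. It checks the facts used in forms 1 and 3 and in case (a): every set is contained in its image, and the image of a union is preserved whenever the first and third arguments coincide.

```python
from itertools import combinations

def subsets(universe):
    """Yield every subset of a finite universe as a frozenset."""
    items = sorted(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def star(u, c, alpha):
    """Add alpha to u when c is a subset of u; otherwise return u unchanged."""
    return u | {alpha} if c <= u else u

alpha = 0                      # hypothetical fresh element, not a member of C
A = frozenset({1, 2})
B = frozenset({2, 3})
C = A | B                      # A and B are proper subsets of C
all_sets = list(subsets({1, 2, 3, 4}))

# Form 1: membership is preserved, since every set is contained in its image.
grows = all(u <= star(u, C, alpha) for u in all_sets)

# Case (a): if Ui ∪ Uj = Uk and Ui = Uk, the images still form a union.
case_a = all(star(ui, C, alpha) | star(uj, C, alpha) == star(uk, C, alpha)
             for ui in all_sets for uj in all_sets for uk in all_sets
             if ui | uj == uk and ui == uk)

# A and B are unchanged while C gains alpha, so A ∪ B differs from the new C.
fixed = (star(A, C, alpha) == A and star(B, C, alpha) == B
         and star(C, C, alpha) == C | {alpha})
not_union = (A | B) != star(C, C, alpha)
print(grows, case_a, fixed, not_union)
```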

References

[ABW86] Krzysztof R. Apt, Howard Blair, and Adrian Walker. Towards a Theory of Declarative Logic. Technical Report RJ11681, IBM, Watson Research Center, 1986.

[AU79] A. V. Aho and J. D. Ullman. Universality of data retrieval languages. In Conference Record of the Sixth Annual ACM Symposium on Principles of Programming Languages, pages 110-120, 1979.

[BK86] F. Bancilhon and S. Khoshafian. A calculus for complex objects. In Proc. Fifth Annual ACM Symposium on Principles of Database Systems, pages 53-59, ACM, Cambridge, Mass., 1986.

[BMSU86] F. Bancilhon, D. Maier, Y. Sagiv, and J. D. Ullman. Magic sets and other strange ways to implement logic programs. In Proc. Fifth Annual ACM Symposium on Principles of Database Systems, pages 1-15, ACM, Cambridge, Mass., 1986.

[BNR*87] Catriel Beeri, Shamim Naqvi, Raghu Ramakrishnan, Oded Shmueli, and Shalom Tsur. Sets and negation in a logic database language (LDL1). In Proc. Sixth Annual ACM Symposium on Principles of Database Systems, pages 21-37, ACM, 1987.

[GM78] H. Gallaire and J. Minker. Logic and Databases. Plenum Press, NY, 1978.

[HN84] L. J. Henschen and S. A. Naqvi. On compiling queries in recursive first-order databases. Journal of the ACM, 31(1):47-85, 1984.

[HY82] R. Hull and C. K. Yap. The format model: A theory of database organization. In Proc. First Annual ACM Symposium on Principles of Database Systems, pages 205-211, ACM, Los Angeles, CA, 1982.

[JS82] G. Jaeschke and H.-J. Schek. Remarks on the algebra of non first normal form relations. In Proc. First Annual ACM Symposium on Principles of Database Systems, pages 124-138, ACM, Los Angeles, CA, 1982.

[Kow79] R. Kowalski. Logic for Problem Solving. North Holland, Amsterdam, 1979.

[Kup87a] G. M. Kuper. An Extension of LPS to Arbitrary Sets. Technical Report to appear, IBM, Watson Research Center, 1987.

[Kup87b] G. M. Kuper. Logic programming with sets. In Proc. Sixth Annual ACM Symposium on Principles of Database Systems, San Diego, CA, 1987.

[Kup87c] G. M. Kuper. LPS: A Logic Programming Language for Nested Relations. Technical Report RC 12624, IBM, Watson Research Center, 1987.

[KV84] G. M. Kuper and M. Y. Vardi. A new approach to database logic. In Proc. Third Annual ACM Symposium on Principles of Database Systems, pages 86-96, ACM, Waterloo, Ontario, 1984.

[LT84] J. W. Lloyd and R. W. Topor. Making PROLOG more expressive. Journal of Logic Programming, 1(3):225-240, October 1984.

[Nai83] L. Naish. Automatic Generation of Control for Logic Programs. Technical Report TR 83/6, University of Melbourne, 1983.

[OY85] Z. M. Ozsoyoglu and L.-Y. Yuan. A normal form for nested relations. In Proc. Fourth Annual ACM Symposium on Principles of Database Systems, pages 251-260, ACM, Portland, OR, 1985.

[Rei78] R. Reiter. Deductive question answering in relational databases. In H. Gallaire and J. Minker, editors, Logic and Databases, pages 147-177, Plenum Press, 1978.

[RR83] M. Rafanelli and F. L. Ricci. A Data Definition Language for a Statistical Database. Technical Report TR-62, IASI-CNR, July 1983.

[SN87] Oded Shmueli and Shamim Naqvi. Set grouping and layering in Horn clause programs. In Proc. Fourth International Conference on Logic Programming, pages 152-177, 1987.

[SP82] H.-J. Scheck and P. Pistor. Data structures for an integrated data base management and information retrieval system. In Proc. Fourth Intl. Conf. on Very Large Data Bases, IEEE, 1982.

[SS77] J. M. Smith and D. C. P. Smith. Database abstractions: Aggregation and generalization. ACM Transactions on Database Systems, 2(2):105-133, 1977.

[TZ86] S. Tsur and C. Zaniolo. LDL: A logic-based data-language. In Proc. Twelfth Intl. Conf. on Very Large Data Bases, pages 33-41, IEEE, Kyoto, Japan, 1986.

[Ull85] J. D. Ullman. Implementation of logical query languages for databases. In Proc. ACM Int'l Conf. on Management of Data, ACM, Austin, TX, 1985.

[vEK76] M. H. van Emden and R. A. Kowalski. The semantics of predicate logic as a programming language. Journal of the ACM, 23(4):733-742, 1976.
