J. Eder / L. A. Kalinichenko (Eds): Advances in Databases and Information Systems (ADBIS'95), 185–203, Springer, 1995
Magic Sets vs. SLD-Resolution
Stefan Brass
Institut für Informatik, Universität Hannover, Lange Laube 22, D-30159 Hannover, Fed. Rep. Germany
Abstract
It is by now folklore that the bottom-up evaluation of a program after the "magic set" transformation is "as efficient as" top-down evaluation. There are a number of formalizations of this in the literature. However, the naive formalization is false: As shown by Ross, SLD-resolution can be much more efficient than bottom-up evaluation with magic sets on tail-recursive programs. We show that this happens only for tail-recursive programs, and that the only problem of magic sets is the materialization of "lemmas". So magic sets are always "as goal-directed as" SLD-resolution. These results are not surprising, but we believe that the variants given here are especially useful for teaching purposes. We also give rather simple proofs. Furthermore, we demonstrate that SLD-resolution can be directly simulated by bottom-up evaluable programs if all recursions are tail-recursive. This is based on the meta-interpreter approach of Bry and seems to be very promising.
1 Introduction
The field of deductive databases is mainly the story of query evaluation algorithms for recursive rules using database techniques [1]. Among these, the many variants of the "magic set" technique were the most successful and are now a standard component of any deductive database system (e.g., [7, 9]). An important reason for the success of the "magic set" technique was the claim that it is at least as efficient as the "top-down evaluation" known from logic programming. An often cited formalization of this has been proven by Ullman in a paper called "Bottom-Up beats Top-Down for Datalog" [14]. Less well known is the fact that the result applies only to Ullman's QRGT-resolution, not to the standard top-down algorithm, namely Prolog's SLD-resolution. To be fair, Ullman already stated in a footnote of [14]: "However, Prolog implementations usually use a form of tail recursion optimization that, for certain examples, such as the right-linear version of transitive closure, will avoid rippling answer tuples up the rule/goal tree, and thus can be faster than QRGT." Although it is true that Prolog implementations have a tail recursion optimization, its main goal is to save memory; the improvement of the running time is only a side effect. And in fact, the difference in performance can already be understood on the abstract level of SLD-trees; we do not have to look at the internal data structures of a Prolog system.
Example 1 Let us consider the standard program for computing the transitive closure:
  $\mathit{path}(X, Y) \leftarrow \mathit{edge}(X, Y).$
  $\mathit{path}(X, Z) \leftarrow \mathit{edge}(X, Y) \wedge \mathit{path}(Y, Z).$
For simplicity, let the edge-relation be one straight line of length $n$:
  $\mathit{edge} := \{(i-1, i) \mid 1 \le i \le n\}.$
Now the SLD-tree for the query $\mathit{path}(0, X)$ looks as follows:

                  path(0, X)
                 /          \
        edge(0, X)          edge(0, Y) ∧ path(Y, X)
            |                         |
          X = 1                   path(1, X)
                                      .
                                      .
                                  path(n, X)
                                 /          \
                        edge(n, X)          edge(n, Y) ∧ path(Y, X)

It has $4n + 3$, i.e. $O(n)$, nodes. The application of the two rules is possible in constant time, whereas looking up a value in the edge-relation may take $O(\log n)$ time (assuming that there is an index over the first argument). So we can say that a reasonable implementation of SLD-resolution should have the running time $O(n \log n)$. However, the standard magic set technique rewrites this program as follows:
  $\mathit{path}(X, Y) \leftarrow \mathit{m\_path}^{bf}(X) \wedge \mathit{edge}(X, Y).$
  $\mathit{path}(X, Z) \leftarrow \mathit{m\_path}^{bf}(X) \wedge \mathit{edge}(X, Y) \wedge \mathit{path}(Y, Z).$
  $\mathit{m\_path}^{bf}(0).$
  $\mathit{m\_path}^{bf}(Y) \leftarrow \mathit{m\_path}^{bf}(X) \wedge \mathit{edge}(X, Y).$
In this particular example, we can derive $\mathit{m\_path}^{bf}(i)$ for all $0 \le i \le n$, and therefore we also get $\mathit{path}(i, j)$ for all $0 \le i < j \le n$. So the behaviour is at least quadratic, i.e. $\Omega(n^2)$ (not counting the computations of joins, duplicate elimination, and so on). □
Ross has proposed in [10] a variant of the magic-set technique with a tail-recursion optimization, and Ramakrishnan and Sudarshan have compared in [8] the efficiency of this method with Prolog-evaluation. At least on an abstract level, the result was that this method is as efficient as Prolog. So the problem is in principle solved. However, in this paper, we are interested in the efficiency of the original magic set method. First, the method of Ross is far more complicated than
the original magic sets. So if one teaches deductive databases, one will most probably explain the original method first. Second, the new method is not always superior to the original one. It also has not found its way into actually implemented deductive database systems, as far as I know. So results on the original method are still of practical importance. Third, this paper contains proofs which can be demonstrated to students. Ullman's efficiency comparison is a lot more detailed than ours (besides relating to another algorithm). But this of course makes his proofs quite complicated, and we expect that the proofs of the results stated in [8] are also quite difficult. In this paper, we will consider a very simple cost measure, namely the number of nodes of the SLD-tree on one side, and the number of applicable rule instances on the other side. Certainly, a more detailed analysis is very valuable, but it is good to have a simple result first.
Related to our paper is also [13]: There, Seki compares the efficiency of "Alexander Templates" (a variant of magic sets) with a version of SLD-resolution with memoization. But precisely because of this memoization, the problem with tail-recursions does not occur.
Finally, an important reference for the comparison between top-down and bottom-up evaluation is [3]: There, Bry shows how top-down evaluation can be simulated by evaluating a meta-interpreter bottom-up. Many known query-evaluation methods, and especially the "magic sets", can be derived from this "backward fixpoint procedure". However, the memoization is implicit in this approach, so our problem again does not occur. Nevertheless, we show in section 7 that the idea of using a meta-interpreter can be adapted to get a complete simulation of SLD-resolution. So far, we have only been able to do the necessary partial evaluation in the case where all recursions are tail-recursive.
But even restricted to such programs, the method is already more general than earlier optimizations for tail-recursions. And, in contrast to the "magic set" technique, this transformation yields non-recursive programs if applied to non-recursive programs. It seems that this method can often be superior to magic sets, and certainly it will improve our understanding of query evaluation in deductive databases.
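The gap described in Example 1 can be reproduced with a small experiment. The following is a minimal sketch in Python, not part of the original paper: it evaluates the four magic-set rules of Example 1 by a naive fixpoint iteration on the chain $\mathit{edge} = \{(i-1, i)\}$ and returns the derived facts. The function name `evaluate_magic_path` and the set-based representation of relations are assumptions made only for this illustration.

```python
def evaluate_magic_path(n):
    """Naive bottom-up fixpoint of the magic-set program from Example 1.

    Returns the derived m_path^bf facts and path facts (hypothetical helper).
    """
    edge = {(i - 1, i) for i in range(1, n + 1)}   # chain 0 -> 1 -> ... -> n
    magic = {0}                                     # seed: m_path^bf(0)
    path = set()
    while True:
        # m_path^bf(Y) <- m_path^bf(X), edge(X, Y)
        new_magic = {y for (x, y) in edge if x in magic}
        # path(X, Y) <- m_path^bf(X), edge(X, Y)
        new_path = {(x, y) for (x, y) in edge if x in magic}
        # path(X, Z) <- m_path^bf(X), edge(X, Y), path(Y, Z)
        new_path |= {(x, z) for (x, y) in edge if x in magic
                     for (y2, z) in path if y2 == y}
        if new_magic <= magic and new_path <= path:
            return magic, path                      # fixpoint reached
        magic |= new_magic
        path |= new_path


for n in (5, 20):
    magic, path = evaluate_magic_path(n)
    # SLD-tree size 4n + 3 is the figure stated in Example 1.
    print(n, 4 * n + 3, len(magic), len(path))
```

The number of path facts grows as $n(n+1)/2$, whereas the SLD-tree size quoted in the example grows linearly.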
2 Basic Notions
In this paper we consider allowed DATALOG programs, i.e.
  - The atoms are of the form $p(t_1, \ldots, t_n)$, where $p$ is a predicate and the $t_i$ are variables or constants (so function symbols are excluded).
  - The rules contain no negation, i.e. they are of the form $A \leftarrow B_1, \ldots, B_m$, where $A$ and the $B_i$ are atoms.
  - Every variable of the head literal $A$ must also appear in a body literal $B_i$ (allowedness/range-restriction).
Of course, modern systems such as CORAL [7] allow more general Prolog rules, but the magic set technique was originally developed for the above class of programs, and we wish to avoid unnecessary complications here and concentrate on the efficiency problem.
For simplicity, we assume that the given query $Q$ is a single literal. This is no restriction, since for a more general query a rule like $\mathit{answer}(X_1, \ldots, X_n) \leftarrow \cdots$ can always be added. As usual, we distinguish two kinds of predicates:
  - IDB-predicates are defined by rules of the logic program, and
  - EDB-predicates are defined by facts in a database.
Given a program $P$ and a query $Q$, the goal of the magic sets transformation is to construct a new program $P'$ and a new query $Q'$ which are "more efficiently evaluable" (but see [11]) and answer-equivalent, i.e. for all databases $DB$ and all substitutions $\theta$ the following holds:
  $P \cup DB \vdash Q\theta \iff P' \cup DB \vdash Q'\theta.$
Since our main emphasis is on the program, we sometimes say that a fact is derivable from a program (and even use the notation $P \vdash A$), but we always mean "with respect to an arbitrary but fixed $DB$".
We call a rule $A \leftarrow B_1 \wedge \cdots \wedge B_m$ tail-recursive iff the predicate of its last body literal $B_m$ depends (possibly indirectly) on the predicate of its head literal $A$. As usual, we say that a predicate $p$ depends directly on a predicate $q$ iff there is a rule of the form $p(\ldots) \leftarrow \ldots q(\ldots) \ldots$.
Finally, we need the notion of a "lemma" in SLD-resolution. In this paper we consider only the "first literal" selection function of SLD-resolution (as known from Prolog). So the current goal operates as a stack of literals. If a node $A_1 \wedge \cdots \wedge A_k$ has a descendant node $(A_2 \wedge \cdots \wedge A_k)\theta_1 \cdots \theta_n$, where the $\theta_i$ are the applied most general unifiers (mgus), we say that $A_1\theta_1 \cdots \theta_n$ has been proven as a lemma during SLD-resolution. Conversely, we have the following proposition:
Lemma 2 Let $A_1 \wedge \cdots \wedge A_k$ be a node in an SLD-tree, and let $\theta$ be a ground substitution for $A_1$ such that $P \vdash A_1\theta$. Then this node has the descendant node $(A_2 \wedge \cdots \wedge A_k)\theta$.
Proof: By the completeness of SLD-resolution, there is an SLD-refutation $A_1, \Gamma_2, \ldots, \Gamma_n, \Box$ with applied substitutions $\theta_1, \ldots, \theta_n$ such that $\theta_1 \cdots \theta_n$ restricted to the variables of $A_1$ is $\theta$ (because of the range-restriction, the computed answer substitution must be a ground substitution). The applied rule variants can be chosen in such a way that the $\theta_i$ do not substitute any variables in $A_2 \wedge \cdots \wedge A_k$ except those occurring also in $A_1$. Now $A_1 \wedge \cdots \wedge A_k$ has the child node $\Gamma_2 \wedge (A_2 \wedge \cdots \wedge A_k)\theta_1$ and so on until finally $\Box \wedge (A_2 \wedge \cdots \wedge A_k)\theta_1 \cdots \theta_n$ is reached, which is exactly $(A_2 \wedge \cdots \wedge A_k)\theta$. □
3 Rectification
As noted by Ullman [14], the magic set approach can be much less efficient than SLD-resolution because it cannot handle variable-to-variable bindings. Ullman's example was recursive, but the following simpler example also demonstrates the problem:
Example 3 Consider the program consisting of the following two rules:
  $p(Y_1, Y_2, Y_3) \leftarrow q(X, X, Y_1, Y_2, Y_3).$
  $q(a, b, Y_1, Y_2, Y_3) \leftarrow r(Y_1) \wedge r(Y_2) \wedge r(Y_3).$
Let the query be $p(Y_1, Y_2, Y_3)$ and $r$ be an EDB-predicate with $n$ facts. In this case the magic set transformation treats $q$ as called without any arguments
bound, so the bottom-up evaluation of the transformed program really constructs $n^3$ facts about $q$. However, the SLD-tree for the same query contains only 2 nodes. □
This is in fact no problem, because any program can be "rectified":
Definition 4 (Rectified Program) A logic program $P$ is rectified iff no body atom contains the same variable twice, i.e. for every atom $p(t_1, \ldots, t_n)$ in the body of a rule the following holds: If $t_i = t_j$ for $i \ne j$, then $t_i$ is a constant. We also call a query $p(t_1, \ldots, t_n)$ rectified if it does not contain the same variable twice.
In [14, 15], a "subgoal-rectified" program is also not allowed to contain constants in the rule bodies. For our efficiency result, however, this is not important. Note that our condition also does not exclude multiple occurrences of the same variable in the rule head: because of the allowedness restriction, the variable will be bound to a constant before it can do any harm.
It is not difficult to rectify any given program. The idea [14, 15] is to introduce new predicates where equal arguments are merged into one argument. In the example, we replace $q(X, X, Y_1, Y_2, Y_3)$ by $q_{11234}(X, Y_1, Y_2, Y_3)$. In general, the $j$-th argument of a predicate $p$ of arity $n$ is represented as the $i_j$-th argument of $p_{i_1 \ldots i_n}$. Then the rules about $p$ are specialized to $p_{i_1 \ldots i_n}$ by unifying the head with $p(X_{i_1}, \ldots, X_{i_n})$. We will not give the algorithm here (the algorithm in [14, 15] produces an even stricter rectification than we need), but note the important properties:
  - The resulting program $P_{\mathit{rect}}$ and query $Q_{\mathit{rect}}$ are rectified.
  - $P_{\mathit{rect}}$ and $Q_{\mathit{rect}}$ are answer-equivalent to the original program and query.
  - The SLD-tree for $P_{\mathit{rect}} \cup \{\leftarrow Q_{\mathit{rect}}\}$ has the same number of nodes as the original SLD-tree. In fact, one can simply replace all literals of the form $p_{i_1 \ldots i_n}(t_1, \ldots, t_k)$ by their old counterparts $p(t_{i_1}, \ldots, t_{i_n})$ in order to get the SLD-tree for $P \cup \{\leftarrow Q\}$.
In general, the rectification can create a lot of new predicates and really "blow up" the program. However, in practice, this seldom happens. Furthermore, although the blowup can be more than exponential in the arity of the predicates, this arity is usually very small and fixed for any given program. Of course, if there are many different versions of one predicate, the naive bottom-up evaluation of the rectified program takes much more time than the evaluation of the original program. However, we will show that the magic set evaluation of the rectified program is as fast as SLD-resolution for the rectified program, and we already know that this is as fast as the SLD-resolution for the original program. So for our theorem the possible blowup due to the rectification does no harm, whereas not doing the rectification is harmful, as Example 3 showed.
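The merging of equal arguments can be illustrated for a single body atom. The sketch below is in Python and is not the algorithm of [14, 15]: it only computes the index string $i_1 \ldots i_n$ and the merged argument list for one atom, and it does not specialize the rules that define the predicate. The convention that variables are strings starting with an uppercase letter, the name `rectify_atom`, and the `_` separator in the new predicate name are assumptions of this illustration; index strings are only unambiguous for arities below 10.

```python
def is_var(term):
    """Assumed convention: variables are strings starting with an uppercase letter."""
    return isinstance(term, str) and term[:1].isupper()


def rectify_atom(pred, args):
    """Merge repeated variables of one body atom into a single argument."""
    merged = []      # argument list of the new predicate
    position = {}    # variable -> 1-based position in `merged`
    index = []       # i_j for every original argument position j
    for term in args:
        if is_var(term) and term in position:
            index.append(position[term])   # repeated variable: reuse its slot
            continue
        merged.append(term)
        if is_var(term):
            position[term] = len(merged)
        index.append(len(merged))
    if len(merged) == len(args):
        return pred, list(args)            # nothing to merge, atom unchanged
    return pred + "_" + "".join(map(str, index)), merged


# The body atom of Example 3:
print(rectify_atom("q", ["X", "X", "Y1", "Y2", "Y3"]))
```

For the body atom of Example 3 this yields the predicate name `q_11234` with arguments `X, Y1, Y2, Y3`, which corresponds to $q_{11234}(X, Y_1, Y_2, Y_3)$ in the text. Repeated constants are left alone, as Definition 4 permits them.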
4 Explicit Binding Patterns (Adornments)
There are many variations of the magic set technique in the literature. In order to be precise and self-contained, we give a quick review of the magic set transformation in this section and the next. We also prove all necessary properties, which are needed as lemmas for our efficiency result.
Before the magic set transformation itself can be applied, we have to make the binding patterns explicit which would occur during SLD-resolution. For instance, if we have the rule
  $p(X, Z) \leftarrow q(X, Y) \wedge r(a, Y, Z)$
and $p$ is called with the first argument bound, e.g. with a query of the form $p(a, X)$, then first $q$ is called with $X$ bound and $Y$ free, and then $r$ is called with the first two arguments bound ($a$ is a constant and $Y$ was bound by the call to $q$). This is denoted by "adornments" to the predicates as follows:
  $p^{bf}(X, Z) \leftarrow q^{bf}(X, Y) \wedge r^{bbf}(a, Y, Z).$
For simplicity of the presentation, we always assume that the body literals are called from left to right. Much research has been done on "sideways information passing (SIP)" strategies which select a better sequence. For instance, one can choose to evaluate next the literal with a maximal number of bound arguments, or a minimal number of unbound ones, and one can also use possible knowledge about functional dependencies (keys) or existing indexes. It is usually assumed that a standard relational query optimization will be done later, but in fact, the SIP strategy already determines a lot. Note also that, while in logic programming the query is usually fixed (so the programmer can order the body literals in an optimal way), this is not the case in deductive databases: There the same program can be run with very different queries, so it is necessary that the system itself determines a good order of body literals. However, for our purposes, we can ignore these problems.
Of course, the "selection function" of SLD-resolution must be compatible with the SIP-strategy, so we consider only the "first literal" selection function (known from Prolog) in this paper.
Definition 5 (Adorned Program) A binding pattern/adornment $\alpha$ for a literal $p(t_1, \ldots, t_n)$ is a string over $\{b, f\}$ of length $n$. If $\alpha[i] = b$, the $i$-th argument $t_i$ is called bound, otherwise it is free. Only literals with IDB-predicates are adorned.
Given a rule $A \leftarrow B_1 \wedge \cdots \wedge B_m$ and a binding pattern $\alpha$ for $A$, the binding patterns $\beta_i$ for the body literals $B_i \equiv q_i(t_{i,1}, \ldots, t_{i,n_i})$ are determined as follows: If $t_{i,j}$ is a constant or a variable which appeared already in a $B_k$ with $k < i$ or in a bound position of $A$, then $\beta_i[j] = b$, else $\beta_i[j] = f$.
Given a program $P$, its adorned version $P_{\mathit{adorn}}$ consists of the adorned versions of all rules in $P$ for every possible binding pattern for the head literal. The adornment $\alpha$ of a query $p(t_1, \ldots, t_n)$ is defined by: $\alpha[i] = b$ if $t_i$ is a constant, and $\alpha[i] = f$ otherwise.
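The left-to-right computation of binding patterns in Definition 5 can be sketched in a few lines. The Python code below is only an illustration under stated assumptions: atoms are pairs `(predicate, argument list)`, variables are strings starting with an uppercase letter, an adorned predicate is written with an underscore (`q_bf` for $q^{bf}$), and the function names `adorn_rule` and `adorn_query` are hypothetical.

```python
def is_var(term):
    """Assumed convention: variables are strings starting with an uppercase letter."""
    return isinstance(term, str) and term[:1].isupper()


def adorn_query(args):
    """Adornment of a query: 'b' for constants, 'f' for variables."""
    return "".join("f" if is_var(t) else "b" for t in args)


def adorn_rule(head, body, head_adornment, idb):
    """Adorn one rule for a given binding pattern of its head (Definition 5)."""
    head_pred, head_args = head
    # Variables in bound positions of the head are bound from the start.
    bound = {t for t, a in zip(head_args, head_adornment)
             if a == "b" and is_var(t)}
    adorned_body = []
    for pred, args in body:
        # Constants and variables seen earlier are bound, everything else is free.
        pattern = "".join("f" if is_var(t) and t not in bound else "b"
                          for t in args)
        # Only IDB-predicates carry an adornment.
        adorned_body.append((pred + "_" + pattern if pred in idb else pred, args))
        bound.update(t for t in args if is_var(t))
    return (head_pred + "_" + head_adornment, head_args), adorned_body


# The rule p(X, Z) <- q(X, Y), r(a, Y, Z) called with the pattern bf:
rule_head = ("p", ["X", "Z"])
rule_body = [("q", ["X", "Y"]), ("r", ["a", "Y", "Z"])]
print(adorn_rule(rule_head, rule_body, "bf", idb={"p", "q", "r"}))
```

With all three predicates treated as IDB-predicates, the result corresponds to $p^{bf}(X, Z) \leftarrow q^{bf}(X, Y) \wedge r^{bbf}(a, Y, Z)$ from the text.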
Theoretically, it is simplest to assume first that all possible binding patterns for all predicates are constructed, so there will be $2^n$ versions of a rule about a predicate $p$ of arity $n$. Of course, given a specific query $p^\alpha(\ldots)$, only rules about predicates on which $p^\alpha$ depends are needed, and all other rules can obviously be eliminated without changing the answer. Therefore, an implementation of adorning will do the two steps together and never construct the completely adorned program. But for the complete version $P_{\mathit{adorn}}$ it is especially simple to see that adorning does not change the set of correct answers:
Lemma 6
  - If $P \vdash p(c_1, \ldots, c_n)$, then $P_{\mathit{adorn}} \vdash p^\alpha(c_1, \ldots, c_n)$ for all adornments $\alpha$.
  - If $P_{\mathit{adorn}} \vdash p^\alpha(c_1, \ldots, c_n)$ for some adornment $\alpha$, then $P \vdash p(c_1, \ldots, c_n)$.
Proof:
  - This is proven by induction on the height of a bottom-up derivation tree for $p(c_1, \ldots, c_n)$ (i.e. the number of iterations in the fixpoint construction). Consider the last derivation step and let $p(t_1, \ldots, t_n) \leftarrow B_1 \wedge \cdots \wedge B_m$ be the applied rule and $\theta$ be the instantiation. Then $P_{\mathit{adorn}}$ contains a rule of the form $p^\alpha(t_1, \ldots, t_n) \leftarrow B_1^{\beta_1} \wedge \cdots \wedge B_m^{\beta_m}$. But if $P \vdash B_i\theta$, then $P_{\mathit{adorn}} \vdash B_i^{\beta_i}\theta$ by the induction hypothesis.
  - This is trivial: one only has to delete the adornments in a derivation of $p^\alpha(c_1, \ldots, c_n)$ from $P_{\mathit{adorn}}$ to get a derivation of $p(c_1, \ldots, c_n)$ from $P$. □
Lemma 7 The SLD-tree for the adorned program $P_{\mathit{adorn}}$ and query $Q_{\mathit{adorn}}$ has the same number of nodes as the SLD-tree for $P \cup \{\leftarrow Q\}$.
Proof: This is again trivial: Simply remove the adornments to get the SLD-tree for $P \cup \{\leftarrow Q\}$. □
For the preceding lemmas, it was completely irrelevant which binding patterns are assigned to the body literals. In order to understand the real meaning of the adornments, we must prove that they correspond to the binding patterns occurring during SLD-resolution. To formalize this, we assume that SLD-resolution is also applied to the adorned program. This is slightly unusual, because the adornments were only introduced as a preprocessing step for the magic set transformation. But due to Lemma 7 adorning does not influence SLD-resolution. Furthermore, the same was true for the rectification. So it suffices to consider rectified and adorned programs in the following, because these properties can be achieved without decreasing the efficiency of SLD-resolution.
Definition 8 (Correct Adornment) A literal $p^\alpha(t_1, \ldots, t_n)$ has the correct adornment iff the following holds:
  - If $\alpha[i] = b$, then $t_i$ is a constant.
  - If $\alpha[i] = f$, then $t_i$ is a variable.
  - No variable appears twice in $p^\alpha(t_1, \ldots, t_n)$.
Theorem 9 Let $P$ and $Q$ be rectified and adorned. Then the first literal of every node in the SLD-tree of $P \cup \{\leftarrow Q\}$ has the correct adornment.
Proof: The proof is by induction on the length of the SLD-derivation to that node (i.e. the distance from the root). In order to make the induction work, we prove the following statements about nodes $A_1 \wedge \cdots \wedge A_k$ of the SLD-tree:
  - Let $X$ be a variable which appears in $A_i$, but in no $A_j$ with $j < i$ (i.e. this occurrence is the first). Then $X$ appears in a free argument position in $A_i$. Furthermore, $X$ appears only once in $A_i$.
  - Let $X$ be a variable which appears in $A_i$ and in some $A_j$, $j < i$ (i.e. this occurrence is not the first). Then the argument position in $A_i$ is bound.
  - Constants appear only in bound argument positions.
For the first literal $A_1$, this directly implies the statement of the theorem. For the given query $Q$, the conditions hold by definition of the adornment (and because the query is rectified). Now let a rule (or EDB-fact) $p^\alpha(u_1, \ldots, u_n) \leftarrow B_1 \wedge \cdots \wedge B_m$ be applied to $A_1 \equiv p^\alpha(t_1, \ldots, t_n)$. As usual, we assume that the variables of the rule are disjoint from the variables of $A_1 \wedge \cdots \wedge A_k$. By the induction hypothesis, we know which $t_i$ are variables, so we can compute the most general unifier $\theta$ as follows:
  - If $\alpha[i] = f$, we substitute $t_i$ by $u_i$.
  - If $\alpha[i] = b$, and $u_i$ is a new variable, we substitute $u_i$ by $t_i$. (If $u_i$ is a constant, it must be equal to $t_i$. If $u_i = u_j$ with $j < i$, then $t_i = t_j$ must hold.)
Now we have to verify our conditions for the child node $(B_1 \wedge \cdots \wedge B_m \wedge A_2 \wedge \cdots \wedge A_k)\theta$: Since the variables of the $B_i$ and $A_j$ were disjoint, the $B_i$ are affected only by the second kind of substitution. So exactly the variables in bound argument positions of the head were substituted by constants, and the conditions directly follow from the construction of the adorned rule and the rectification. Now let us consider the $A_j$.
Since the conditions were satisfied before the derivation step, it suffices to look at the variables $t_i$ which were substituted. Occurrences of $t_i$ in $A_j$, $j \ge 2$, are not the first occurrence of these variables (the first occurrence was in $A_1$). So they occur only in bound argument positions. If $t_i$ was substituted by a constant, the condition is obviously satisfied. If $t_i$ was substituted by a variable $u_i$, this variable appears also in $B_1 \wedge \cdots \wedge B_m$ (due to the allowedness condition), so occurrences in $A_j$, $j \ge 2$, are again not the first occurrence. □
5 The Magic Sets Transformation
Now the magic set transformation itself can be done. The idea is to introduce for every IDB-predicate $p$ and binding pattern $\alpha$ a new predicate $\mathit{m\_p}^\alpha$ which contains the bindings for which we want to compute all corresponding $p$-facts. Then we make the rules applicable only if we really need their result.
Definition 10 (Magic Literals) Let
  $\mathit{magic}[p^\alpha(t_1, \ldots, t_n)] := \mathit{m\_p}^\alpha(t_{i_1}, \ldots, t_{i_k}),$
where $1 \le i_1 < \cdots < i_k \le n$ are the indices with $\alpha[i_j] = b$. Literals with the new predicates $\mathit{m\_p}^\alpha$ are called magic literals, literals with the original predicates $p^\alpha$ are called non-magic literals.
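Definition 11 (Magic Set Transformation) Let $P$ and $Q$ be rectified and adorned. Then the result $P_{\mathit{magic}}$ of the magic set transformation consists of:
  - For every rule $A \leftarrow B_1 \wedge \cdots \wedge B_m$ from $P$ the "modified rule"
      $A \leftarrow \mathit{magic}[A] \wedge B_1 \wedge \cdots \wedge B_m.$
  - For every rule $A \leftarrow B_1 \wedge \cdots \wedge B_m$ from $P$ and every $B_i$ with IDB-predicate the "magic rule"
      $\mathit{magic}[B_i] \leftarrow \mathit{magic}[A] \wedge B_1 \wedge \cdots \wedge B_{i-1}.$
  - For the query $Q$ the "seed":
      $\mathit{magic}[Q].$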
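Definitions 10 and 11 translate almost literally into code. The following Python sketch assumes that the input program is already rectified and adorned, that atoms are pairs `(predicate, argument list)`, and that an adorned IDB-predicate is written with a trailing `_<adornment>` (for example `path_bf`). These representation choices and the function names are assumptions of the illustration.

```python
def magic(atom):
    """magic[p^alpha(t1..tn)]: keep only the arguments in bound positions."""
    pred, args = atom
    adornment = pred.rsplit("_", 1)[1]          # assumed naming: name_adornment
    return ("m_" + pred, [t for t, a in zip(args, adornment) if a == "b"])


def magic_transform(rules, query, idb):
    """Return modified rules, magic rules and the seed (Definition 11)."""
    result = []
    for head, body in rules:
        # Modified rule: A <- magic[A], B1, ..., Bm
        result.append((head, [magic(head)] + body))
        for i, literal in enumerate(body):
            if literal[0] in idb:
                # Magic rule: magic[Bi] <- magic[A], B1, ..., B(i-1)
                result.append((magic(literal), [magic(head)] + body[:i]))
    result.append((magic(query), []))           # seed: magic[Q]
    return result


# Adorned transitive closure program of Example 1 with query path_bf(0, X):
rules = [
    (("path_bf", ["X", "Y"]), [("edge", ["X", "Y"])]),
    (("path_bf", ["X", "Z"]), [("edge", ["X", "Y"]), ("path_bf", ["Y", "Z"])]),
]
for rule in magic_transform(rules, ("path_bf", ["0", "X"]), idb={"path_bf"}):
    print(rule)
```

Applied to the transitive closure program, the sketch produces the two modified rules, the single magic rule and the seed shown in Example 1.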
The "modified rules" correspond to the old program; however, they are restricted to "useful" applications, which contribute to our goal. The "magic rules" formalize that if we want to know the head, we also want to know the body literals, as long as there is still the chance that the rule is applicable. The "seed" finally states that we want to know all derivable instances of the query. The correctness of the magic sets transformation can be verified as follows:
Lemma 12 Let $A$ be a non-magic literal.
  - If $P_{\mathit{magic}} \vdash A$, then $P \vdash A$.
  - If $P \vdash A$ and $P_{\mathit{magic}} \vdash \mathit{magic}[A]$, then $P_{\mathit{magic}} \vdash A$.
Proof:
  - This is trivial: The modified rules are only restricted versions of the original rules. So if they are applicable, the original rules can also be applied to derive $A$.
  - The proof is by induction on the number of iterations in the fixpoint computation of $P$ needed to derive $A$. Let $A$ be derived by the rule instance
      $A \leftarrow B_1 \wedge \cdots \wedge B_m.$
    Let $B_{i_1}, \ldots, B_{i_k}$ be the body literals with IDB-predicate (from left to right). Since $P_{\mathit{magic}} \vdash \mathit{magic}[A]$, we can for $j = 1, \ldots, k$ first apply the rule instance
      $\mathit{magic}[B_{i_j}] \leftarrow \mathit{magic}[A] \wedge B_1 \wedge \cdots \wedge B_{i_j - 1}$
    to get $P_{\mathit{magic}} \vdash \mathit{magic}[B_{i_j}]$ and then apply the inductive hypothesis to get $P_{\mathit{magic}} \vdash B_{i_j}$. (The literals with EDB-predicates are trivial, since the database is equal in both cases.) Now we only have to apply
      $A \leftarrow \mathit{magic}[A] \wedge B_1 \wedge \cdots \wedge B_m$
    to get $P_{\mathit{magic}} \vdash A$. □
Corollary 13 (Well-known from the literature) The result of the magic set transformation is answer-equivalent to the original program.
Proof: If $P_{\mathit{magic}} \vdash Q\theta$, then $P \vdash Q\theta$. Conversely, if $P \vdash Q\theta$, then $P_{\mathit{magic}} \vdash Q\theta$, since $\mathit{magic}[Q\theta] = \mathit{magic}[Q]$ is given as a fact. □
6 Efficiency of Magic Sets
We are now in a position to state and prove our main results on the efficiency of the magic set method. First, as intuitively expected, the magic facts really correspond to selected literals in the SLD-tree:
Definition 14 (Magic Facts vs. Goal Literals) A literal $p^\alpha(t_1, \ldots, t_n)$ matches a magic fact $\mathit{m\_p}^\alpha(c_1, \ldots, c_k)$ iff $p^\alpha(t_1, \ldots, t_n)$ has the correct adornment and $\mathit{magic}[p^\alpha(t_1, \ldots, t_n)] = \mathit{m\_p}^\alpha(c_1, \ldots, c_k)$.
Theorem 15 Let $P$ and $Q$ be rectified and adorned. Then for every magic fact $\mathit{m\_p}^\alpha(c_1, \ldots, c_n)$ derivable from $P_{\mathit{magic}}$ there is a node in the SLD-tree of $P \cup \{\leftarrow Q\}$ with matching first literal.
Proof: The proof is by induction on the number of derivation steps needed for $\mathit{m\_p}^\alpha(c_1, \ldots, c_n)$. The "seed" fact directly corresponds to the root of the SLD-tree. Now let $\mathit{m\_p}^\alpha(c_1, \ldots, c_n)$ be derived by a magic rule
  $\mathit{magic}[B_i] \leftarrow \mathit{magic}[A] \wedge B_1 \wedge \cdots \wedge B_{i-1}$
instantiated by $\theta$. So
  $\mathit{m\_p}^\alpha(c_1, \ldots, c_n) = \mathit{magic}[B_i]\theta = \mathit{magic}[B_i\theta].$
Let $\theta_0$ be the restriction of $\theta$ to the variables in $\mathit{magic}[A]$, $\theta_1$ be the restriction of $\theta$ to the variables which first appear in $B_1$, and so on. So $\theta$ can be written as $\theta = \theta_0 \cdots \theta_{i-1}$. Since $\mathit{magic}[A\theta_0]$ was derived with fewer steps, we can apply the inductive hypothesis to conclude that there is a node $A_1 \wedge \cdots \wedge A_k$ in the SLD-tree where $A_1$ matches $\mathit{magic}[A\theta_0]$. Now we of course resolve $A_1$ with $A \leftarrow B_1 \wedge \cdots \wedge B_m$. We compute the mgu as in the proof of Theorem 9, i.e. we substitute variables at bound argument positions in $A$ by the corresponding constants in $A_1$, which gives exactly $\theta_0$, and variables in free argument positions of $A_1$ by the corresponding values in $A$; let this substitution be called $\sigma$. Because $\theta_0$ is a ground substitution, the mgu can be written as $\sigma\theta_0$. Then the child node in the SLD-tree is
  $(B_1 \wedge \cdots \wedge B_m)\theta_0 \wedge (A_2 \wedge \cdots \wedge A_k)\sigma\theta_0.$
(Since the variables were disjoint before the resolution, $B_1 \wedge \cdots \wedge B_m$ is not affected by $\sigma$.) But now, since we know that $P_{\mathit{magic}} \vdash B_1\theta_0\theta_1$, we get $P \vdash B_1\theta_0\theta_1$ by Lemma 12, and then Lemma 2 ensures the existence of a descendant node
  $(B_2 \wedge \cdots \wedge B_m)\theta_0\theta_1 \wedge (A_2 \wedge \cdots \wedge A_k)\sigma\theta_0\theta_1.$
Continuing this, we finally get to
  $(B_i \wedge \cdots \wedge B_m)\theta_0\theta_1 \cdots \theta_{i-1} \wedge (A_2 \wedge \cdots \wedge A_k)\sigma\theta_0\theta_1 \cdots \theta_{i-1}.$
But now $B_i\theta_0\theta_1 \cdots \theta_{i-1}$, i.e. $B_i\theta$, matches $\mathit{m\_p}^\alpha(c_1, \ldots, c_n)$: By Theorem 9 it has the correct adornment, and $\mathit{magic}[B_i\theta] = \mathit{m\_p}^\alpha(c_1, \ldots, c_n)$ holds by construction. □
Next, let us look at the non-magic facts. We will show that they are proven as "lemmas" in the SLD-resolution. Suppose that we have a node $A_1 \wedge \cdots \wedge A_k$ in the SLD-tree. If it has a descendant node of the form $(A_2 \wedge \cdots \wedge A_k)\theta$, where $\theta$ is the composition of all applied substitutions, we say that $A_1\theta$ has been proven as a lemma.
Theorem 16 Let $P$ and $Q$ be rectified and adorned. Then every non-magic fact $p^\alpha(c_1, \ldots, c_n)$ derivable from $P_{\mathit{magic}}$ is proven as a lemma during SLD-resolution.
Proof: Let $p^\alpha(c_1, \ldots, c_n)$ be derived by the rule $A \leftarrow \mathit{magic}[A] \wedge B_1 \wedge \cdots \wedge B_m$ instantiated with $\theta$.
So $\mathit{magic}[A\theta]$ is derivable from $P_{\mathit{magic}}$, and Theorem 15 tells us that there is a node $A_1 \wedge \cdots \wedge A_k$ in the SLD-tree with matching first literal. We now proceed as in the proof of Theorem 15, i.e. compute the unifier $\sigma\theta_0$, and then use $P_{\mathit{magic}} \vdash B_i\theta$ together with Lemma 12 and Lemma 2 to resolve the $B_i$ away. This finally leads to a descendant node
  $(A_2 \wedge \cdots \wedge A_k)\sigma\theta_0 \cdots \theta_m.$
But since already $A_1\sigma\theta_0 = A\theta_0$, we can conclude that the proven lemma $A_1\sigma\theta_0 \cdots \theta_m$ is in fact $A\theta$. □
This proves our claim that bottom-up evaluation after the magic set transformation is "as goal-directed as" SLD-resolution. This is important, since it is usually considered to be the main problem of pure bottom-up evaluation that it is not goal-directed.
But let us now look at the efficiency results. As explained above, we compare the number of applicable rule instances in $P_{\mathit{magic}}$ to the number of nodes in the SLD-tree. Both are very high level (simple) cost measures. First, the magic rules never pose a problem; they correspond directly to nodes of the SLD-tree:
Theorem 17 Let $P$ and $Q$ be rectified and adorned. Then the number of applicable instances of magic rules in $P_{\mathit{magic}}$ is at most the number of nodes in the SLD-tree of $P \cup \{\leftarrow Q\}$.
Proof: Let us modify for the moment SLD-resolution in such a way that when it resolves with $A \leftarrow B_1 \wedge \cdots \wedge B_m$, it attaches $\mathit{magic}[B_i] \leftarrow \mathit{magic}[A] \wedge B_1 \wedge \cdots \wedge B_{i-1}$ to $B_i$ (and also applies all substitutions to these attachments). We now assign to each node the rule instance attached to its first literal. Now we of course have to verify that all applicable magic rule instances are actually assigned to some node of the SLD-tree, i.e. the mapping is surjective. Let us consider $\mathit{magic}[B_i] \leftarrow \mathit{magic}[A] \wedge B_1 \wedge \cdots \wedge B_{i-1}$ instantiated by $\theta$. If this rule instance is applicable, $\mathit{magic}[A\theta]$ is derivable, so by Theorem 15 there is a node with matching first literal which can be resolved with $A \leftarrow B_1 \wedge \cdots \wedge B_m$. At this point, the magic rule in question is assigned to $B_i$. Now if we do the following derivation as in the proof of Theorem 15, i.e. utilize $P_{\mathit{magic}} \vdash B_j\theta$ ($j = 1, \ldots, i-1$) together with Lemma 12 and Lemma 2, $B_i$ finally gets to the front. □
As Example 1 shows, a corresponding statement for non-magic facts does not hold. The problem is that one node in the SLD-tree can prove an unbounded number of lemmas. For instance, consider the success node $X = n$. It completes the proofs of the lemmas $\mathit{path}(i, n)$ for all nodes $\mathit{path}(i, X)$ above it.
However, it is not difficult to see that this can happen only if there is a tail-recursive rule. Otherwise, by resolving away a single literal, we can complete only a restricted number of rule bodies above it (namely, as many as we have predicates).
Theorem 18 Let $P$ and $Q$ be rectified and adorned. If $P$ contains no tail-recursive rules (i.e. the last literal of every rule body does not depend on the head), the number of applicable instances of non-magic (modified) rules in $P_{\mathit{magic}}$ is
  $O(\text{number of nodes in the SLD-tree of } P \cup \{\leftarrow Q\}).$
Proof: We can again visualize the applied rule instances in the SLD-tree. This time, when we apply $A \leftarrow B_1 \wedge \cdots \wedge B_m$ to a node $A_1 \wedge \cdots \wedge A_k$ in the SLD-tree, we put the corresponding modified rule as a comment into the child node (similar to the framed literals of OL-resolution):
  $(B_1 \wedge \cdots \wedge B_m \wedge [A \leftarrow \mathit{magic}[A] \wedge B_1 \wedge \cdots \wedge B_m])\theta_0 \wedge (A_2 \wedge \cdots \wedge A_k)\sigma\theta_0.$
As explained above, if the rule instance is applicable, the $B_i$ will be resolved away and the "framed" portion will come to the beginning of a descendant node. We delete it from the node at that time, because we need to access literals behind it, but we assign to every node the framed rule instances deleted at that node. There can be several such framed rule instances before the beginning of proper (non-framed) literals, but not more than there are predicates: Because we remove the framed rule instances once a literal has been completely proven, the situation
  $\ldots [A' \leftarrow \ldots] \wedge [A \leftarrow \mathit{magic}[A] \wedge B_1 \wedge \cdots \wedge B_m] \wedge \ldots$
can only occur if $B_m$ has called $A'$ (i.e. they have the same predicate). So if we had a chain which is longer than the number of predicates, we would have a tail recursion. □
So this verifies that the only problem with magic sets is the materialization of "lemmas" in the case of tail-recursive programs.
7 Translation of SLD-Trees into Rules
So magic sets are quite similar to SLD-resolution, but the correspondence is not exact. Let us therefore try to map SLD-resolution into bottom-up evaluable programs as directly as possible. We use Bry's idea to start with a meta-interpreter, which describes top-down evaluation, but is intended for bottom-up evaluation [3]. However, Bry's meta-interpreter corresponds to the QRGT-algorithm, therefore he got the standard magic-set transformation. In fact, [3] also contains a meta-interpreter which corresponds more directly to SLD-resolution, but it still implicitly uses the memoization of derived lemmas.
So we now only have to start with a meta-interpreter which describes SLD-resolution as directly as possible. We represent rules as facts about a binary predicate $\mathit{rule}$, where the first argument is the head literal and the second argument is the list of body literals. For instance, $\mathit{path}(X, Z) \leftarrow \mathit{edge}(X, Y) \wedge \mathit{path}(Y, Z)$ is stored as
  $\mathit{rule}(\mathit{path}(X, Z), [\mathit{edge}(X, Y), \mathit{path}(Y, Z)]).$
Of course, this fact is neither DATALOG (the old predicates are now function symbols) nor allowed (it contains variables, but does not have a rule body). Meta-interpreters always need a more general rule language, but this is only temporary, because we will later try to remove this by partial evaluation. EDB-facts are represented as $\mathit{db}(p(c_1, \ldots, c_n))$. The query $Q$ is represented by the fact
  $\mathit{query}([Q, \mathit{ans}(X_1, \ldots, X_n)]),$
where $X_1, \ldots, X_n$ are the answer variables. The task of the additional atom $\mathit{ans}(X_1, \ldots, X_n)$ is to store the substitution for these variables.
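Now the following program obviously computes the nodes of the SLD-tree:
\[
\begin{array}{l}
\mathit{node}(\mathit{Query}) \leftarrow \mathit{query}(\mathit{Query}).\\
\mathit{node}(\mathit{Child}) \leftarrow \mathit{node}([\mathit{Lit}\,|\,\mathit{Rest}]) \land \mathit{rule}(\mathit{Lit}, \mathit{Body}) \land \mathit{append}(\mathit{Body}, \mathit{Rest}, \mathit{Child}).\\
\mathit{node}(\mathit{Rest}) \leftarrow \mathit{node}([\mathit{Lit}\,|\,\mathit{Rest}]) \land \mathit{db}(\mathit{Lit}).\\
\mathit{answer}(X_1, \ldots, X_n) \leftarrow \mathit{node}\bigl([\mathit{ans}(X_1, \ldots, X_n)]\bigr).
\end{array}
\]
In the last rule, it is assumed that the number $n$ of answer variables is known. If not, the Prolog predicate =.. can be used.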
Example 19 This meta-interpreter can be evaluated bottom-up by systems like CORAL [7] which allow structured terms and non-ground facts. For instance, if we apply it to the transitive closure (Example 1) for $n = 1$, CORAL answers the query "?node(Node)" with
    Node=[path(0,_X0),ans(_X0)].
    Node=[edge(0,_X0),ans(_X0)].
    Node=[edge(0,_X0),path(_X0,_X1),ans(_X1)].
    Node=[ans(1)].
    Node=[path(1,_X0),ans(_X0)].
    Node=[edge(1,_X0),path(_X0,_X1),ans(_X1)].
    Node=[edge(1,_X0),ans(_X0)].
So this approach directly simulates SLD-resolution. □
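The fixpoint computed by the meta-interpreter can also be mimicked outside a deductive system. The following Python sketch is not CORAL code and not part of the original development; it is a minimal illustration under several assumptions: atoms are function-free tuples, variables are strings starting with an uppercase letter, variant nodes are identified by renaming their variables, and the program is the two-rule transitive closure suggested by the CORAL output above. All function names (`unify`, `normalize`, `sld_nodes`, ...) are hypothetical.

```python
def is_var(t):
    # Assumption: variables are strings starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    # Follow the bindings of substitution s until a constant or free variable.
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b):
    # Most general unifier of two function-free atoms, or None if none exists.
    if a[0] != b[0] or len(a) != len(b):
        return None
    s = {}
    for x, y in zip(a[1:], b[1:]):
        x, y = walk(x, s), walk(y, s)
        if x == y:
            continue
        if is_var(x):
            s[x] = y
        elif is_var(y):
            s[y] = x
        else:
            return None
    return s

def apply(atoms, s):
    return tuple((a[0],) + tuple(walk(t, s) for t in a[1:]) for a in atoms)

def normalize(node):
    # Rename variables to X0, X1, ... in order of first occurrence.
    names = {}
    def ren(t):
        return names.setdefault(t, "X%d" % len(names)) if is_var(t) else t
    return tuple((a[0],) + tuple(ren(t) for t in a[1:]) for a in node)

def rename_apart(atoms):
    # Prefix rule variables so they cannot clash with X0, X1, ... in a node.
    return tuple((a[0],) + tuple("R_" + t if is_var(t) else t for t in a[1:])
                 for a in atoms)

def sld_nodes(rules, facts, query):
    start = normalize(query)
    seen, todo = {start}, [start]
    while todo:
        node = todo.pop()
        if not node:
            continue
        lit, rest = node[0], node[1:]
        children = []
        for head, body in rules:      # resolve the first literal with a rule
            head, *body = rename_apart((head,) + tuple(body))
            s = unify(lit, head)
            if s is not None:
                children.append(apply(tuple(body) + rest, s))
        for fact in facts:            # resolve the first literal with an EDB fact
            s = unify(lit, fact)
            if s is not None:
                children.append(apply(rest, s))
        for child in map(normalize, children):
            if child not in seen:
                seen.add(child)
                todo.append(child)
    return seen

RULES = [(("path", "X", "Y"), [("edge", "X", "Y")]),
         (("path", "X", "Z"), [("edge", "X", "Y"), ("path", "Y", "Z")])]
FACTS = [("edge", 0, 1)]
QUERY = [("path", 0, "X"), ("ans", "X")]
nodes = sld_nodes(RULES, FACTS, QUERY)
answers = sorted(n[0][1:] for n in nodes if len(n) == 1 and n[0][0] == "ans")
```

For the single fact edge(0,1), `nodes` contains seven nodes, matching the seven answers listed in Example 19 up to variable names. Because nodes are stored in a set, each variant is derived only once, which is the set-oriented behaviour one would expect from a bottom-up evaluation.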
Of course, the use of lists and non-ground facts significantly decreases the performance. Therefore, we will try to eliminate them. The approach suggested by [3] is partial evaluation with respect to the given rules and the query. However, simply unfolding the body literal $\mathit{rule}(\mathit{Lit}, \mathit{Body})$ does not help much. The problem is of course the relation $\mathit{node}$: we need more information about the derivable $\mathit{node}$-facts in order to continue the partial evaluation. In the example, we see that if we abstract from the constants, there are only four "types" of nodes:
Definition 20 (Node Type) Let $X_i$ and $C_i$ ($i \in \mathbb{N}_0$) be disjoint sequences of variables. In the following, we will call the $C_i$ parameters. Let $a_1, \ldots, a_k$ be the constants appearing in the given program $P$. A node type $N$ is a conjunction $A_1 \land \cdots \land A_n$ of literals containing only variables from $\{X_0, X_1, \ldots\} \cup \{C_0, C_1, \ldots\}$ and constants in $\{a_1, \ldots, a_k\}$. Furthermore, if $X_i$ appears somewhere in the query, then $\{X_0, \ldots, X_{i-1}\}$ appear already to the left (and the same for $C_i$). A node $A_1 \land \cdots \land A_n$ has type $N$ if there is a mapping $\theta$ from the parameters $C_i$ to constants such that $N\theta = A_1 \land \cdots \land A_n$.
In the transitive closure example, the following four node types appear, independent of the EDB-relation $\mathit{edge}$:
\[
\begin{array}{l}
\mathit{path}(C_0, X_0) \land \mathit{ans}(X_0).\\
\mathit{edge}(C_0, X_0) \land \mathit{ans}(X_0).\\
\mathit{ans}(C_0).\\
\mathit{edge}(C_0, X_0) \land \mathit{path}(X_0, X_1) \land \mathit{ans}(X_1).
\end{array}
\]
Now the idea is to turn these node types into predicates with the parameters as arguments. For instance, we represent the query $\mathit{path}(0, X_0) \land \mathit{ans}(X_0)$ as
\[ \mathit{path\_C_0\_X_0\_ans\_X_0}(0). \]
Of course, it is essential for this method that a finite set of node types suffices to represent all nodes occurring in the SLD-tree. This can be guaranteed for the following class of programs:
Definition 21 (Restricted Recursion) A program $P$ is at most tail-recursive iff for every rule $A \leftarrow B_1 \land \cdots \land B_m$ in $P$ the predicates of $B_1, \ldots, B_{m-1}$ do not depend on the predicate of $A$, i.e. no body literal except possibly the last is recursive. Note that this class of programs is much larger than the class for which the "right recursion optimization" of [15] is applicable. We believe that a large percentage of the programs occurring in practice fall into our class. Also, generations of Prolog programmers have been taught to write such programs.
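The condition of Definition 21 can be checked mechanically on the predicate dependency graph. The following Python sketch is only an illustration: it assumes that a rule is given as a pair of the head predicate name and the list of body predicate names, and that "depends on" means reachability in the dependency graph. The names `depends_on` and `at_most_tail_recursive` are hypothetical.

```python
def depends_on(rules):
    """Map every predicate to the set of predicates it transitively depends on."""
    dep = {}
    for head, body in rules:
        dep.setdefault(head, set()).update(body)
        for b in body:
            dep.setdefault(b, set())
    changed = True
    while changed:                      # naive transitive closure
        changed = False
        for p in dep:
            reachable = set().union(*(dep[q] for q in dep[p]))
            if not reachable <= dep[p]:
                dep[p] |= reachable
                changed = True
    return dep

def at_most_tail_recursive(rules):
    """True iff no body literal except possibly the last depends on the head."""
    dep = depends_on(rules)
    return all(head not in dep[b]
               for head, body in rules
               for b in body[:-1])

# Right-linear transitive closure: recursion only in the last body literal.
RIGHT_LINEAR = [("path", ["edge"]), ("path", ["edge", "path"])]
# Left-linear variant: the recursive literal comes first.
LEFT_LINEAR = [("path", ["edge"]), ("path", ["path", "edge"])]
# Same generation: the recursive literal is in the middle.
SAME_GENERATION = [("sg", ["flat"]), ("sg", ["up", "sg", "down"])]
```

Under these assumptions, `at_most_tail_recursive(RIGHT_LINEAR)` holds, whereas the left-linear and same-generation programs are rejected. Mutual recursion is covered as well, because the check uses the transitive dependency relation rather than a comparison of predicate names.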
Lemma 22 Let $P$ be an at most tail-recursive program and $Q$ be a query. Then there is a finite set $\mathcal{N}$ of node types, such that for any values of the EDB-predicates, the nodes occurring in the SLD-tree of $P \cup \{\leftarrow Q\}$ have a type in $\mathcal{N}$.
Proof: In at most tail-recursive programs, a rule can only be applied recursively on its last body literal, so iterated recursive rule applications cannot make the stack of "to be proven" literals longer. Therefore, the length (number of literals) of nodes in the SLD-tree is bounded. An upper bound would be the sum of the lengths of all rule bodies. But for any fixed length of nodes, a finite $\mathcal{N}$ obviously suffices. □
The next question is of course how to compute the occurring node types $\mathcal{N}$ for a given program $P$ and query $Q$. When we do that, we will at the same time construct the partially evaluated meta-interpreter $M_P$.
Algorithm 23 (Computation of Node Types) Let $P$ be an at most tail-recursive program and $Q$ be any query. First, we initialize $\mathcal{N}$ and $M_P$ with the given query:
  • Rename the variables in $Q$ to $X_1, \ldots, X_n$ (in the order of appearance).
  • Append $\mathit{ans}(X_1, \ldots, X_n)$ to $Q$.
  • Replace every occurrence of a constant by a parameter $C_i$. Let $N$ be the result and $c_1, \ldots, c_k$ be the corresponding constants.
Now initialize $\mathcal{N} := \{N\}$, $M_P := \langle N \rangle(c_1, \ldots, c_k)$ (we use $\langle N \rangle$ to denote a predicate corresponding to the node type $N$). Then we repeat the following steps until $\mathcal{N}$ and $M_P$ cannot be further expanded:
  • Choose a node type $A_1 \land \cdots \land A_n$ from $\mathcal{N}$. Let $C_1, \ldots, C_k$ be its parameters.
  • If the predicate $p$ of $A_1$ is an IDB-predicate, then:
      • Choose a rule $A \leftarrow B_1 \land \cdots \land B_m$ about $p$ from $P$. Rename the variables to make them disjoint from the $X_i$ and $C_j$. Let the result be $A' \leftarrow B_1' \land \cdots \land B_m'$.
      • Compute a most general unifier $\theta$ of $A_1$ and $A'$, subject to the following restriction: Parameters $C_i$ may only be substituted by other parameters or constants, but not by any other variables. Since this restricts only the direction of variable-to-variable bindings, it is always possible.
      • Let $C_{i_1}, \ldots, C_{i_l}$ be the parameters remaining in $(B_1' \land \cdots \land B_m' \land A_2 \land \cdots \land A_n)\theta$ (in the order of first occurrence). Now rename $C_{i_j}$ into $C_j$ and also normalize the other variable names to $X_0, X_1, \ldots$ (again in the correct order). Let $N$ be the resulting node type.
      • Add $N$ to $\mathcal{N}$ and the following rule to $M_P$:
        \[ \langle N \rangle(C_{i_1}, \ldots, C_{i_l}) \leftarrow \langle A_1 \land \cdots \land A_n \rangle(C_1, \ldots, C_k)\theta. \]
  • If $p$ is an EDB-predicate of arity $m$, then:
      • Compute a most general unifier $\theta$ of $A_1$ and $p(C_{k+1}, \ldots, C_{k+m})$ (restricted as above).
      • Let $C_{i_1}, \ldots, C_{i_l}$ be the parameters in $(A_2 \land \cdots \land A_n)\theta$ (again in the order of first occurrence).
      • Let $N$ be the result of the normalization of parameter and variable names (as above).
      • Add $N$ to $\mathcal{N}$ and the following rule to $M_P$:
        \[ \langle N \rangle(C_{i_1}, \ldots, C_{i_l}) \leftarrow \langle A_1 \land \cdots \land A_n \rangle(C_1, \ldots, C_k)\theta \land p(C_{k+1}, \ldots, C_{k+m})\theta. \]
When this computation terminates, $\mathcal{N}$ contains all node types occurring during SLD-resolution, and $M_P$ simulates SLD-resolution in the sense that $\langle N \rangle(c_1, \ldots, c_k)$ is derivable from $M_P$ iff $N\{C_1/c_1, \ldots, C_k/c_k\}$ appears in the SLD-tree of $P$ and $Q$ (modulo a renaming of variables and the appended $\mathit{ans}$-literal). □
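The normalization used in the initialization step of Algorithm 23 (constants become parameters, variables are renamed in order of first occurrence) can be sketched in a few lines of Python. This is only an illustration of that single step, not of the whole algorithm. It assumes that atoms are tuples, that variables are strings starting with an uppercase letter, and that every occurrence of a constant receives a fresh parameter; program constants that Definition 20 allows to remain in a node type are not handled. The function name `node_type` and the underscore-based predicate naming are hypothetical, chosen to resemble the name path_C0_X0_ans_X0 used above.

```python
def is_var(t):
    # Assumption: variables are strings starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()

def node_type(node):
    """Return (type, predicate name, parameter values) for a node.

    Variables are renamed to X0, X1, ... and every constant occurrence is
    replaced by a fresh parameter C0, C1, ..., both in order of occurrence.
    """
    var_names, params, typed = {}, [], []
    for pred, *args in node:
        new_args = []
        for t in args:
            if is_var(t):
                new_args.append(var_names.setdefault(t, "X%d" % len(var_names)))
            else:
                new_args.append("C%d" % len(params))
                params.append(t)
        typed.append((pred, *new_args))
    name = "_".join("_".join(atom) for atom in typed)
    return tuple(typed), name, tuple(params)

# The query node path(0, X) AND ans(X) of the transitive closure example.
typ, name, params = node_type([("path", 0, "X"), ("ans", "X")])
```

Here `name` is the string path_C0_X0_ans_X0 and `params` is the tuple (0,), so the node is represented by the fact path_C0_X0_ans_X0(0), in line with the representation introduced before Definition 21.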
Example 24 In the transitive closure example, we get the following program (corresponding to the partial evaluation of our meta-interpreter wrt the node types):
\[
\begin{array}{l}
\mathit{path\_C_0\_X_0\_ans\_X_0}(0).\\
\mathit{edge\_C_0\_X_0\_ans\_X_0}(C_0) \leftarrow \mathit{path\_C_0\_X_0\_ans\_X_0}(C_0).\\
\mathit{edge\_C_0\_X_0\_path\_X_0\_X_1\_ans\_X_1}(C_0) \leftarrow \mathit{path\_C_0\_X_0\_ans\_X_0}(C_0).\\
\mathit{ans}(C_2) \leftarrow \mathit{edge\_C_0\_X_0\_ans\_X_0}(C_0) \land \mathit{edge}(C_0, C_2).\\
\mathit{path\_C_0\_X_0\_ans\_X_0}(C_2) \leftarrow \mathit{edge\_C_0\_X_0\_path\_X_0\_X_1\_ans\_X_1}(C_0) \land \mathit{edge}(C_0, C_2).
\end{array}
\]
Let us for instance explain in a little more detail how the last rule was computed. There we had the node type
\[ \mathit{edge}(C_0, X_0) \land \mathit{path}(X_0, X_1) \land \mathit{ans}(X_1). \]
Then we computed the mgu of the first literal $\mathit{edge}(C_0, X_0)$ and the generic EDB-fact $\mathit{edge}(C_1, C_2)$. Due to our restriction, $X_0$ must be substituted by $C_2$, whereas the binding between $C_0$ and $C_1$ can be chosen arbitrarily. We decided to substitute $C_1$ by $C_0$. Then we get the following child node in the SLD-tree:
\[ \mathit{path}(C_2, X_1) \land \mathit{ans}(X_1). \]
Now we only have to normalize the parameters and variables again in order to get the node type represented in the head literal. But of course, the argument of the head literal remains $C_2$, otherwise we would lose the connection to the rule body. Obviously, we can eliminate the pure copying rules in the above program. If we do that, we get the standard optimized version of this program [15, 6]:
\[
\begin{array}{l}
\mathit{m\_path}(0).\\
\mathit{m\_path}(Z) \leftarrow \mathit{m\_path}(X) \land \mathit{edge}(X, Z).\\
\mathit{ans}(Y) \leftarrow \mathit{m\_path}(X) \land \mathit{edge}(X, Y).
\end{array}
\]
But even without further improvements, bottom-up evaluation of the above program is as efficient as SLD-resolution in the sense that any applicable rule instance corresponds to an edge in the SLD-tree. Furthermore, we can guarantee termination, even if the $\mathit{edge}$-relation is cyclic. □
Of course, generalizations beyond "at most tail-recursive programs" are a topic of future research. In some more general cases we could apply "counting" techniques.
For instance, in the well-known "same generation" example, the nodes have the form
\[ \mathit{sg}(c, X_0) \land \mathit{down}(X_0, X_1) \land \cdots \land \mathit{down}(X_{n-1}, X_n) \land \mathit{ans}(X_n). \]
We can represent the information of this node by a fact $\mathit{sg\_down}(c, n)$. Of course, it is difficult to automate such an analysis, but it is nice that the counting technique as well as its generalizations [4] can be derived from our approach. However, if we leave the domain of tail-recursive programs, we also simulate the possible non-termination of SLD-resolution (which is a known problem of counting techniques). Note finally that our transformation yields non-recursive programs if applied to non-recursive programs. This is an important difference from the standard magic set technique. Also, values for anonymous variables are never explicitly represented. So this approach seems to be very promising.
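To make the termination claim of Example 24 concrete, the optimized program (m_path and ans) can be evaluated bottom-up by a simple delta iteration. The following Python sketch is only an illustration under assumptions: the edge relation is given as a set of pairs, the query constant is 0 by default, and the function name `evaluate` is hypothetical.

```python
def evaluate(edges, start=0):
    """Bottom-up evaluation of
         m_path(start).
         m_path(Z) <- m_path(X), edge(X, Z).
         ans(Y)    <- m_path(X), edge(X, Y).
    using a delta set, so every m_path fact is joined with edge only once."""
    succ = {}
    for x, y in edges:
        succ.setdefault(x, set()).add(y)
    m_path = {start}
    delta = {start}
    ans = set()
    while delta:
        derived = set()
        for x in delta:
            for y in succ.get(x, ()):
                ans.add(y)        # third rule
                derived.add(y)    # second rule
        delta = derived - m_path  # keep only facts that are really new
        m_path |= delta
    return m_path, ans

chain = evaluate({(0, 1), (1, 2), (2, 3)})
cycle = evaluate({(0, 1), (1, 2), (2, 0)})
```

On the cyclic relation the loop stops as soon as no new m_path fact is derived, whereas a plain SLD-tree for the same query would be infinite. Only edges reachable from the start constant are ever inspected, which reflects the goal-directedness of the transformed program.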
8 Conclusions
In the first part of this paper, we have clarified the efficiency of the original "magic set" method in comparison with the top-down query evaluation method, namely SLD-resolution. The result was that the only problem of magic sets is the materialization of lemmas, and that this only causes trouble if there is tail-recursion. Given earlier results for other variants of top-down evaluation, this might not be very surprising, but an explicit proof was still needed. Furthermore, the problem is of great practical importance, and should be treated in every course on deductive databases. We believe that the presentation given here is sufficiently simple for teaching purposes.
In the second part of this paper, we have demonstrated how the potential advantages of SLD-resolution can be utilized in a program transformation quite different from magic sets. Currently, the transformation is applicable only to non-recursive and tail-recursive programs, and in both cases it seems to have advantages over magic sets. Naturally, we plan to generalize our transformation to arbitrary programs and to support our efficiency claims with further investigations and practical experiments. Also, lower-level efficiency comparisons would be very helpful. It is a strange situation that what is currently probably the fastest "deductive database", namely XSB [12], uses Prolog technology and not database technology.
Furthermore, we should also consider negation. We believe that a generalization to stratified programs should not be too difficult. It is a known problem of the magic set method that it might destroy the stratification. However, in [5] it was shown that in this case a meta-interpreter (similar to the one by Bry, but extended to much more general rules) results in a locally stratified program. Also, since our method does not introduce new recursions, it even seems plausible that it will not destroy a stratification. A generalization to non-stratified programs should be possible using ideas of [2].
References
[1] F. Bancilhon and R. Ramakrishnan. An amateur's introduction to recursive query processing. In C. Zaniolo, editor, Proc. of SIGMOD'86, pages 16–52, 1986.
[2] S. Brass and J. Dix. A general approach to bottom-up computation of disjunctive semantics. In J. Dix, L. M. Pereira, and T. Przymusinski, editors, Nonmonotonic Extensions of Logic Programming, number 927 in LNAI, pages 127–155. Springer, 1995.
[3] F. Bry. Query evaluation in recursive databases: bottom-up and top-down reconciled. Data & Knowledge Engineering, 5:289–312, 1990.
[4] S. Greco and C. Zaniolo. Optimization of linear logic programs using counting methods. In A. Pirotte, C. Delobel, and G. Gottlob, editors, Advances in Database Technology – EDBT'92, 3rd Int. Conf., number 580 in LNCS, pages 72–87. Springer-Verlag, 1992.
[5] L. Kalinichenko and V. Zadorozhny. A generalized information resource query language and basic query evaluation technique. In C. Delobel, M. Kifer, and Y. Masunaga, editors, Deductive and Object-Oriented Databases, 2nd Int. Conf. (DOOD'91), number 566 in LNCS, pages 547–566. Springer, 1991.
[6] D. B. Kemp, K. Ramamohanarao, and Z. Somogyi. Right-, left- and multi-linear rule transformations that maintain context information. In D. McLeod, R. Sacks-Davis, and H. Schek, editors, Proc. Very Large Data Bases, 16th Int. Conf. (VLDB'90), pages 380–391. Morgan Kaufmann Publishers, 1990.
[7] R. Ramakrishnan, D. Srivastava, S. Sudarshan, and P. Seshadri. The CORAL deductive system. The VLDB Journal, 3:161–210, 1994.
[8] R. Ramakrishnan and S. Sudarshan. Top-down vs. bottom-up revisited. In V. Saraswat and K. Ueda, editors, Proc. of the 1991 Int. Symposium on Logic Programming, pages 321–336. MIT Press, 1991.
[9] K. Ramamohanarao. An implementation overview of the Aditi deductive database system. In S. Ceri, K. Tanaka, and S. Tsur, editors, Deductive and Object-Oriented Databases, Third Int. Conf. (DOOD'93), number 760 in LNCS, pages 184–203. Springer, 1993.
[10] K. A. Ross. Modular acyclicity and tail recursion in logic programs. In Proc. of the Tenth ACM SIGACT-SIGMOD-SIGART Symp. on Princ. of Database Systems (PODS'91), pages 92–101, 1991.
[11] Y. Sagiv. Is there anything better than magic? In S. Debray and M. Hermenegildo, editors, Proc. of the North American Conf. on Logic Programming, pages 235–254. MIT Press, 1990.
[12] K. Sagonas, T. Swift, and D. S. Warren. XSB as an efficient deductive database engine. In R. T. Snodgrass and M. Winslett, editors, Proc. of the 1994 ACM SIGMOD Int. Conf. on Management of Data (SIGMOD'94), pages 442–453, 1994.
[13] H. Seki. On the power of Alexander templates. In Proc. of the Eighth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS'89), pages 150–159, 1989.
[14] J. D. Ullman. Bottom-up beats top-down for Datalog. In Proc. of the Eighth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS'89), pages 140–149, 1989.
[15] J. D. Ullman. Principles of Database and Knowledge-Base Systems, Vol. 2. Computer Science Press, 1989.