A Logical Notion of Conditional Independence: Properties and Applications

Adnan Darwiche*
Rockwell Science Center
1049 Camino Dos Rios
Thousand Oaks, CA 91360
[email protected]

May 15, 1997

* Currently at: Department of Mathematics, American University of Beirut, PO Box 11-236, Beirut, Lebanon. Email: [email protected].

Abstract

We propose a notion of conditional independence with respect to propositional logic and study some of its key properties. We present several equivalent formulations of the proposed notion, each oriented towards a specific application of logical reasoning, such as abduction and diagnosis. We suggest a framework for utilizing logical independence computationally by structuring a propositional logic database around a directed acyclic graph. This structuring explicates many of the independences satisfied by the underlying database. Based on these structural independences, we develop an algorithm for a class of structured databases that is not necessarily Horn. The algorithm is linear in the size of a database structure and can be used for deciding entailment and computing abductions and diagnoses. The presented results are motivated by similar results in the literature on probabilistic and constraint-based reasoning.

1 Introduction

A major factor in slowing down logical computations is that reasoners tend to consider irrelevant parts of a given database when computing answers to queries. This has prompted a considerable amount of research on the notion of irrelevance, with the goal of identifying these irrelevant parts of a database so they can be avoided by logical reasoners [13, 15, 24, 23]. Irrelevance has also been the subject of extensive research in probabilistic reasoning, where it is typically referred to as independence [19]. The scope of independence in probability, however, seems to be bigger than its scope in logic, for at least two reasons. First, although there is a standard definition of independence in probability, there seems to be no agreement on a definition of irrelevance in logic. This also contributes to the lack of a comprehensive theory of logical irrelevance similar to the one for probabilistic independence. Second, the computational utility that irrelevance brings to logical reasoning is perceived as a luxury instead of a necessity, since influential algorithms for logical reasoning have not been based on irrelevance. This is contrary to what one finds in probabilistic reasoning, where independence is the building block of almost all state-of-the-art algorithms.

Motivated by the role of independence in probability, our goal in this paper is to show that independence can play a similar role in logical reasoning, at least in the context of propositional logic. Therefore, our definition of conditional independence with respect to propositional databases resembles the definition of conditional independence with respect to probability distributions. We use Logical Conditional Independence, LCI, to refer to the proposed definition. Our contribution here is not LCI per se, but rather (a) the framework we propose for exploiting it computationally and (b) the various formulations of LCI that we provide with respect to different reasoning tasks. We show that LCI introduces to propositional reasoning many of the tools and techniques that conditional independence has introduced to probabilistic reasoning.


This includes a paradigm for logical reasoning in which the amount of independence information decides the computational complexity of reasoning, just as independence controls the complexity of probabilistic reasoning. The various formulations of LCI that we present make it clear how independence can be computationally valuable to the corresponding reasoning tasks. We utilize these formulations by developing an algorithm that can be used for deciding entailment and computing abductions and diagnoses. The algorithm provides a good example of how to use independence information when deriving algorithms for logical reasoning. In the following section, we provide a more extended introduction to the proposed notion of independence, where we explain the choices we had to make in developing it. We also outline the structure of the paper in light of the results to be presented.

2 Key Choices for Independence

When formulating a notion of independence, one faces a number of choice points. We devote this section to enumerating some of these points and to presenting our position on them. This helps in relating our approach to the spectrum of other existing approaches. It also provides a good opportunity for outlining the structure of the paper.

What objects can appear in an independence relation? One finds proposals in which sentences, atomic propositions, predicates, even algorithms appear as part of an independence relation [13]. In our proposal, independence is a relation between four objects: a propositional database $\Delta$ and three sets of atomic propositions $X$, $Y$ and $Z$. Specifically, LCI decides whether the database $\Delta$ finds $X$ independent of $Y$ given $Z$. When the independence holds, we write $\mathit{Ind}_\Delta(X,Z,Y)$ and refer to it as an LCI assertion. This is similar to probabilistic independence, which tells us whether a probability distribution finds $X$ independent of $Y$ given $Z$.

What decides the correctness of a definition of independence? In considering the literature on independence (both probabilistic and logical), one finds two main positions:

1. A philosophical position that starts with postulating some intuitive properties of independence and then proposes a definition that adheres to these properties. In most of these approaches, independence is formulated in terms of belief change, that is, formalizing the irrelevance of certain information to certain beliefs [11, 19]. The probabilistic notion of independence can clearly be motivated on these grounds, where belief in a proposition corresponds to its probability.

2. A pragmatic position, where independence is not an absolute notion but rather a task-specific one. That is, there is no correct or incorrect definition of independence but rather a useful or not-very-useful one. For example, in deciding an entailment test $\Delta \models \alpha$, a definition of independence may target the identification of sentences in $\Delta$ that, if removed from $\Delta$, will not affect the test result.

LCI, at least as developed in this paper, is meant to be a pragmatic notion. In fact, we provide different and equivalent formulations of LCI, each explicating its usefulness to a specific reasoning task. The tasks we cover are: deciding entailment and satisfiability, which are dual tasks, and computing abductions and diagnoses, which are also dual tasks. This leads to a total of four formulations of LCI that are meant to explicate its computational role in these different reasoning tasks. LCI, however, does have a formulation based on belief change, which happens to be the closest one to probabilistic independence. We therefore start in Section 3 with this formulation and then lead into the other four formulations in Section 4 (entailment and satisfiability) and Section 5 (abduction and diagnosis).

How should independence be used computationally? If one is adopting a pragmatic approach to independence, then the usage of independence will depend on the reasoning task of interest. A very popular use of independence, though, is in pruning a database before attempting certain computations. For example, in testing for logical entailment, the current practice is to use independence for reducing an entailment test

$\Delta \models \alpha$ into a simpler test $\Delta' \models \alpha$, where $\Delta'$ is obtained by removing sentences from $\Delta$ that are irrelevant to the test. Although LCI will support such usage, its main computational role is different. In deciding entailment, for example, LCI will be used to decompose a global entailment test $\Delta \models \alpha$ into a number of local tests $\Delta_1 \models \alpha_1$, $\Delta_2 \models \alpha_2$, ..., $\Delta_n \models \alpha_n$, where each $\Delta_i$ is so small that its corresponding test $\Delta_i \models \alpha_i$ can be performed in constant time. To introduce the computational role of LCI, we point out that the following decompositions are generally not valid in logic:

1. Entailment: Decomposing an entailment test $\Delta \models \alpha \vee \beta$ into two simpler tests $\Delta \models \alpha$ and $\Delta \models \beta$.
2. Satisfiability: Decomposing a satisfiability test with respect to $\Delta \cup \{\alpha \wedge \beta\}$ into two tests with respect to the smaller databases $\Delta \cup \{\alpha\}$ and $\Delta \cup \{\beta\}$.
3. Abduction: Decomposing the abductions of a finding $\alpha \vee \beta$ into the abductions of $\alpha$ and those of $\beta$.
4. Diagnosis: Decomposing the diagnoses of an observation $\alpha \wedge \beta$ into the diagnoses of $\alpha$ and those of $\beta$.

We shall demonstrate, however, that each one of these decompositions becomes valid when certain independences hold between the atoms appearing in $\alpha$ and those appearing in $\beta$. In fact, Sections 4 and 5 provide examples of how such decompositions can be used to decompose a global computation into a number of local computations.

What is the source of independence information? Most existing approaches attempt to discover independence information by pre-processing a given database [15, 23, 17, 16]. Although this is consistent with our utilization of independence, we advocate a different strategy that is motivated by the following result: if a database $\Delta$ is graphically structured (that is, satisfies some conditions that are dictated by a directed acyclic graph), then the topology of the structure reveals many of the independences satisfied by the database. Therefore, instead of automatically discovering the independences satisfied by a database, we will propose explicating them by constructing a structured database in the first place. Such databases are defined precisely in Section 6, which also contains a key result on how to read independences from the topology of a database structure.

What is the measure of success when using independence? When using independence to prune a database, the measure of success is the degree of pruning it allows. But when using independence to decompose a global reasoning task into local tasks, the measure of success is the number of local tasks that result from the decomposition. In Section 7, we identify a class of structured databases for which we can decompose a reasoning task (entailment, abduction, diagnosis) into a number $n$ of local tasks, where $n$ is linear in the number of arcs and nodes of the database structure. We also discuss extensions of the algorithm to other classes of structured databases.

We start in the next three sections by providing five formulations of LCI, each oriented towards a specific reasoning task. The different formulations are then shown to be equivalent.

3 Belief Change Formulation

Common intuitions about independence suggest that it is strongly related to the notion of belief change. Therefore, many expect a formal definition of independence to be based on such a notion. Although we shall present a few formulations of LCI in this paper, it is for these intuitions that we start with a formulation in terms of belief change. This formulation also happens to be the one most resembling probabilistic independence. Before we present this first formulation, we need to define the notion of information.

Definition 1 Information about a set of atomic propositions $X$ is a propositional sentence constructed from these propositions. We use $\tilde{X}$ to denote information about propositions $X$.

For example, if $X$ contained the atoms $p$ and $q$, then $p \vee q$, $\neg p$ and $p \supset q$ are three pieces of information about $X$.

Definition 2 Full information about a set of atomic propositions $X$ is a conjunction of literals, one for each atomic proposition in $X$. We use $\hat{X}$ to denote full information about propositions $X$.

For example, there are four pieces of full information about atoms $p$ and $q$: $p \wedge q$, $p \wedge \neg q$, $\neg p \wedge q$ and $\neg p \wedge \neg q$. We will also use the term conjunctive clause over $X$ to mean full information about $X$.

Intuitively, the first formulation of LCI says that atoms $X$ are independent of $Y$ given $Z$ with respect to database $\Delta$ if obtaining information about $Y$ is irrelevant to entailing more information about $X$, given that we have already obtained full information about $Z$:

Definition 3 Let $X$, $Y$, and $Z$ be disjoint sets of atomic propositions. Database $\Delta$ finds $X$ independent of $Y$ given $Z$, written $\mathit{Ind}^b_\Delta(X,Z,Y)$, iff
$$\Delta \cup \{\hat{Z}\} \models \tilde{X} \quad\text{precisely when}\quad \Delta \cup \{\hat{Z}, \tilde{Y}\} \models \tilde{X}$$
for all $\tilde{X}$, $\tilde{Y}$, $\hat{Z}$ such that $\Delta \cup \{\hat{Z}, \tilde{Y}\}$ is consistent. If $Z$ is empty, we simply say that $X$ is independent of $Y$.

The database $\Delta \cup \{\hat{Z}\}$ results from adding the full information $\hat{Z}$ to $\Delta$, and the database $\Delta \cup \{\hat{Z}, \tilde{Y}\}$ results from adding the extra information $\tilde{Y}$. Definition 3 says that $X$ is independent of $Y$ given $Z$ if both of these databases entail the same information about $X$, that is, the information about $Y$ is irrelevant.

Examples

To further illustrate Definition 3, consider the following database:
$$\Delta = \{\mathit{it\ rained} \vee \mathit{sprinkler\ was\ on} \equiv \mathit{wet\ ground}\}.$$
Let $X = \{\mathit{sprinkler\ was\ on}\}$, $Y = \{\mathit{it\ rained}\}$ and $Z = \emptyset$. Then $\mathit{Ind}^b_\Delta(X,Z,Y)$ because
$$\Delta \not\models \mathit{sprinkler\ was\ on}$$
$$\Delta \cup \{\mathit{it\ rained}\} \not\models \mathit{sprinkler\ was\ on}$$
$$\Delta \cup \{\neg\mathit{it\ rained}\} \not\models \mathit{sprinkler\ was\ on}$$
and, moreover,
$$\Delta \not\models \neg\mathit{sprinkler\ was\ on}$$
$$\Delta \cup \{\mathit{it\ rained}\} \not\models \neg\mathit{sprinkler\ was\ on}$$
$$\Delta \cup \{\neg\mathit{it\ rained}\} \not\models \neg\mathit{sprinkler\ was\ on}.$$
That is, adding information about $\mathit{it\ rained}$ to the database does not change the belief in any information about $\mathit{sprinkler\ was\ on}$. We say in this case that database $\Delta$ finds $\mathit{sprinkler\ was\ on}$ independent of $\mathit{it\ rained}$.

However, if we let $Z = \{\mathit{wet\ ground}\}$, then we no longer have $\mathit{Ind}^b_\Delta(X,Z,Y)$ because
$$\Delta \cup \{\mathit{wet\ ground}\} \not\models \mathit{sprinkler\ was\ on}$$
$$\Delta \cup \{\mathit{wet\ ground}, \neg\mathit{it\ rained}\} \models \mathit{sprinkler\ was\ on}.$$
That is, in the presence of complete information about $\mathit{wet\ ground}$ in the database, adding information about $\mathit{it\ rained}$ does change the belief in some information about $\mathit{sprinkler\ was\ on}$. We say in this case that the database $\Delta$ finds $\mathit{sprinkler\ was\ on}$ dependent on $\mathit{it\ rained}$ given $\mathit{wet\ ground}$.

Consider now the database
$$\Delta = \{\mathit{it\ rained} \supset \mathit{wet\ ground},\ \mathit{wet\ ground} \supset \mathit{wet\ shoes}\}.$$

Let $X = \{\mathit{wet\ shoes}\}$, $Y = \{\mathit{it\ rained}\}$ and $Z = \emptyset$. Then we do not have $\mathit{Ind}^b_\Delta(X,Z,Y)$ because
$$\Delta \not\models \mathit{wet\ shoes}$$
$$\Delta \cup \{\mathit{it\ rained}\} \models \mathit{wet\ shoes}.$$
That is, adding information about $\mathit{it\ rained}$ to the database changes the belief in some information about $\mathit{wet\ shoes}$. We say in this case that database $\Delta$ finds $\mathit{wet\ shoes}$ dependent on $\mathit{it\ rained}$. However, if we let $Z = \{\mathit{wet\ ground}\}$, then we have $\mathit{Ind}^b_\Delta(X,Z,Y)$ because, in the presence of complete information about $\mathit{wet\ ground}$ in the database, adding information about $\mathit{it\ rained}$ does not change the belief in any information about $\mathit{wet\ shoes}$ (we leave this for the reader to verify). We say in this case that the database $\Delta$ finds $\mathit{wet\ shoes}$ independent of $\mathit{it\ rained}$ given $\mathit{wet\ ground}$.
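The examples above can be checked mechanically. The following is a minimal brute-force sketch of Definition 3 (not an algorithm from this paper): a database is represented by its set of models, and a piece of information about a set of atoms is identified, up to logical equivalence, with the set of assignments it allows. The helper names are our own, the connective in the sprinkler database is read as a biconditional, and the enumeration is exponential, so this is intended only for tiny examples.

```python
from itertools import chain, combinations, product

def models(atoms):
    """All truth assignments over `atoms`, each as the frozenset of true atoms."""
    return [frozenset(a for a, v in zip(atoms, bits) if v)
            for bits in product([False, True], repeat=len(atoms))]

def powerset(xs):
    xs = list(xs)
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

def lci(db_models, X, Y, Z):
    """Ind_Delta(X, Z, Y) per Definition 3, by brute force. Information about
    a set S is represented, up to equivalence, by the set of S-assignments it
    allows; full information about Z is a single Z-assignment."""
    def project(m, S):
        return frozenset(m) & set(S)
    def entails(ms, allowed, S):
        return all(project(m, S) in allowed for m in ms)
    x_infos = [set(s) for s in powerset(models(X))]
    y_infos = [set(s) for s in powerset(models(Y))]
    for z_full in models(Z):
        base = [m for m in db_models if project(m, Z) == z_full]
        for y_info in y_infos:
            extended = [m for m in base if project(m, Y) in y_info]
            if not extended:       # Definition 3 requires consistency
                continue
            for x_info in x_infos:
                if entails(base, x_info, X) != entails(extended, x_info, X):
                    return False
    return True

# Delta = {it_rained v sprinkler_on <=> wet_ground}, as in the first example:
ATOMS = ('rained', 'sprinkler', 'wet')
delta = [m for m in models(ATOMS)
         if (('rained' in m) or ('sprinkler' in m)) == ('wet' in m)]
print(lci(delta, X=('sprinkler',), Y=('rained',), Z=()))        # True
print(lci(delta, X=('sprinkler',), Y=('rained',), Z=('wet',)))  # False
```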

Properties of LCI

Definition 3 of LCI satisfies the following properties.

Theorem 1 Let $X$, $Y$, $Z$ and $L$ be disjoint sets of atomic propositions and let $\Delta$ be a propositional database. Then
1. $\mathit{Ind}^b_\Delta(X,Z,\emptyset)$
2. $\mathit{Ind}^b_\Delta(X,Z,Y)$ precisely when $\mathit{Ind}^b_\Delta(Y,Z,X)$
3. $\mathit{Ind}^b_\Delta(X,Z,Y)$ and $\mathit{Ind}^b_\Delta(L, Z \cup X, Y)$ precisely when $\mathit{Ind}^b_\Delta(X \cup L, Z, Y)$.

Properties (2) and (3) are known as the semi-graphoid axioms [19]. Property (1) was added to them recently under the name trivial independence [26]. Property (2) is known as symmetry. Property (3) is typically broken into three other properties:
- Decomposition: $\mathit{Ind}^b_\Delta(X \cup L, Z, Y)$ only if $\mathit{Ind}^b_\Delta(X,Z,Y)$,
- Weak union: $\mathit{Ind}^b_\Delta(X \cup L, Z, Y)$ only if $\mathit{Ind}^b_\Delta(L, Z \cup X, Y)$, and
- Contraction: $\mathit{Ind}^b_\Delta(X,Z,Y)$ and $\mathit{Ind}^b_\Delta(L, Z \cup X, Y)$ only if $\mathit{Ind}^b_\Delta(X \cup L, Z, Y)$.

The semi-graphoid axioms are important for at least two reasons. First, they are intuitive properties that independence is expected to satisfy [19]. Second, they lead to an important result about the identification of LCI assertions by examining the topology of a structured database, which is the subject of Section 6.

Structured Databases

We now introduce structured databases, which will be discussed in more detail in Section 6. Figures 1, 2 and 4 depict a number of structured databases. In general, a structured database has two parts: (1) a directed acyclic graph over atomic propositions, and (2) a number of local databases, one for each atom $n$ in the graph. The local database for atom $n$ can only refer to $n$, its parents, and atoms that do not appear in the directed graph. The local database is typically depicted next to the atom, as shown in Figures 1, 2 and 4.

If a database is structured, some of the LCI assertions it satisfies can be detected by visual inspection of the database structure, using a method that we shall describe in Section 6. For example, using this method, one can detect the following LCI assertions regarding the databases in Figure 1:¹
- $\mathit{Ind}^b_\Delta(\{A\}, \emptyset, \{B\})$ in Figure 1.1.
- $\mathit{Ind}^b_\Delta(\{A\}, \{B\}, \{C\})$ in Figure 1.2.
- $\mathit{Ind}^b_\Delta(\{B\}, \{A\}, \{C\})$ in Figure 1.3.

We provide the formal definition of structured databases together with some of their key properties in Section 6. The next two sections provide four other formulations of LCI, corresponding to its use in testing logical entailment and satisfiability, and in computing abductions and diagnoses.

¹ These assertions are meant to preview what will be covered in Section 6. The reader is not expected to understand how these assertions are detected at this stage.


[Figure 1: (1) A and B are parents of C; the local database for C is {A implies C, B implies C}; here A = rain, B = sprinkler was on, C = wet grass. (2) A chain A -> B -> C; the local databases are {A implies B} for B and {B implies C} for C; here A = rain, B = wet ground, C = slippery ground. (3) A diverging structure B <- A -> C; the local databases are {A implies B} for B and {not A implies not C} for C; here A = battery ok, B = lights on, C = car starts.]

Figure 1: Three structured databases. Each graph identifies LCI assertions that are satisfied by its associated database.

4 Entailment and Satisfiability Formulations

This section discusses the computational role of LCI in deciding logical entailment and satisfiability. For this purpose, we present two formulations of LCI, each oriented towards one of these dual tasks. We start with the formulation of LCI that is oriented towards testing logical entailment, that is, deciding whether some propositional sentence $\alpha$ is logically entailed by a database $\Delta$. We first present the formulation and then show how it can be applied to logical entailment when the database is structured.

The computational difficulty in testing for logical entailment (at least in the propositional case) stems from the inability to do decompositional testing. That is, although one can decompose the test $\Delta \models \alpha \wedge \beta$ into testing whether $\Delta \models \alpha$ and $\Delta \models \beta$, one cannot (in general) decompose the test $\Delta \models \alpha \vee \beta$ into testing whether $\Delta \models \alpha$ or $\Delta \models \beta$. To see this, note that if $\Delta = \{p \vee q\}$, then $\Delta \models p \vee q$ holds while neither $\Delta \models p$ nor $\Delta \models q$ holds. Therefore, the decomposition "$\Delta \models p \vee q$ precisely when $\Delta \models p$ or $\Delta \models q$" is not valid in this case.

If such a decomposition were valid, however, testing for entailment would be very easy. We would always be able to rewrite the test $\{\alpha_1, \ldots, \alpha_n\} \models \beta$ into the equivalent test $\neg\beta \models \neg\alpha_1 \vee \ldots \vee \neg\alpha_n$ and then decompose the latter into $n$ tests $\neg\beta \models \neg\alpha_1$, ..., $\neg\beta \models \neg\alpha_n$, which in turn are equivalent to $\alpha_1 \models \beta$, ..., $\alpha_n \models \beta$. This would make testing for logical entailment linear in the number $n$ of sentences in a database. Although the decomposition of a disjunctive test $\Delta \models \alpha \vee \beta$ is not valid in general, it can be valid when certain LCI assertions are satisfied by the database $\Delta$. The second formulation of LCI is meant to explicate these assertions.

Entailment formulation

We first need the following supporting definition.

Definition 4 A disjunctive clause over a set of atomic propositions $X$ is a disjunction of literals, one literal for each proposition in $X$. We use $\bar{X}$ to denote a disjunctive clause over propositions $X$.

For example, there are four disjunctive clauses over $p$ and $q$: $p \vee q$, $p \vee \neg q$, $\neg p \vee q$ and $\neg p \vee \neg q$. The following formulation of LCI is in terms of decomposing disjunctive tests. Theorem 2 proves the equivalence between this formulation and the first one, based on belief change.

Definition 5 Let $\Delta$ be a propositional database and let $X$, $Y$ and $Z$ be three disjoint sets of atomic propositions. Database $\Delta$ finds $X$ independent of $Y$ given $Z$, written $\mathit{Ind}^e_\Delta(X,Z,Y)$, iff
$$\Delta \models \bar{X} \vee \bar{Y} \vee \bar{Z} \quad\text{precisely when}\quad \Delta \models \bar{X} \vee \bar{Z} \ \text{or}\ \Delta \models \bar{Y} \vee \bar{Z}$$
for all disjunctive clauses $\bar{X}$, $\bar{Y}$, $\bar{Z}$.

[Figure 2: Inputs A and B; gate X has input A and output C; gate Y has inputs A and B and output D. The local database for C is {A => ~C, ~A => C} and the local database for D is {A & B => D, ~(A & B) => ~D}.]

Figure 2: A structured database representing a digital circuit.

Theorem 2 $\mathit{Ind}^b_\Delta(X,Z,Y)$ precisely when $\mathit{Ind}^e_\Delta(X,Z,Y)$.

Theorem 2 is thus showing the equivalence between
1. the validity of decomposing certain disjunctive tests, which is a computation-oriented formulation of LCI (Definition 5), and
2. the irrelevance of certain information to certain beliefs, which is an intuition-oriented formulation of LCI (Definition 3).

An important special case of Definition 5 is when $Z = \emptyset$. Here, false is the only disjunctive clause over $Z$, and we have $\mathit{Ind}^e_\Delta(X,\emptyset,Y)$ iff
$$\Delta \models \bar{X} \vee \bar{Y} \quad\text{precisely when}\quad \Delta \models \bar{X} \ \text{or}\ \Delta \models \bar{Y}$$
for all disjunctive clauses $\bar{X}$ and $\bar{Y}$.

Example

Consider the database $\Delta = \{p \supset r,\ r \supset s\}$. This database satisfies the LCI assertion $\mathit{Ind}^b_\Delta(\{s\}, \{r\}, \{p\})$. Therefore, the following decompositions are valid:
$$\Delta \models p \vee r \vee s \ \text{precisely when}\ \Delta \models p \vee r \ \text{or}\ \Delta \models r \vee s$$
$$\Delta \models \neg p \vee r \vee s \ \text{precisely when}\ \Delta \models \neg p \vee r \ \text{or}\ \Delta \models r \vee s$$
$$\Delta \models p \vee \neg r \vee s \ \text{precisely when}\ \Delta \models p \vee \neg r \ \text{or}\ \Delta \models \neg r \vee s$$
$$\vdots$$
$$\Delta \models \neg p \vee \neg r \vee \neg s \ \text{precisely when}\ \Delta \models \neg p \vee \neg r \ \text{or}\ \Delta \models \neg r \vee \neg s.$$

Note, however, that $\Delta$ does not satisfy the assertion $\mathit{Ind}^b_\Delta(\{s\}, \emptyset, \{p\})$. Therefore, it should not be surprising that the decomposition
$$\Delta \models \neg p \vee s \ \text{precisely when}\ \Delta \models \neg p \ \text{or}\ \Delta \models s$$
does not hold ($\Delta \models \neg p \vee s$ holds but neither $\Delta \models \neg p$ nor $\Delta \models s$ does).
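Definition 5 can likewise be verified by brute force on this example. The sketch below (our own illustrative code, with formulas encoded as Python predicates) enumerates all eight disjunctive tests over $p$, $r$, $s$ and confirms that each decomposes as listed above.

```python
from itertools import product

def assignments(atoms):
    return [dict(zip(atoms, bits)) for bits in product([False, True], repeat=len(atoms))]

ATOMS = ('p', 'r', 's')
delta = [lambda m: not m['p'] or m['r'],   # p => r
         lambda m: not m['r'] or m['s']]   # r => s

def entails(clause):
    """Delta |= clause, by checking every model of delta."""
    return all(clause(m) for m in assignments(ATOMS) if all(f(m) for f in delta))

# Ind^e({s}, {r}, {p}): every disjunctive test over p, r, s decomposes.
ok = True
for sp, sr, ss in product([True, False], repeat=3):
    whole = entails(lambda m: m['p'] == sp or m['r'] == sr or m['s'] == ss)
    split = entails(lambda m: m['p'] == sp or m['r'] == sr) or \
            entails(lambda m: m['r'] == sr or m['s'] == ss)
    ok = ok and (whole == split)
print(ok)   # True: all eight decompositions above hold
```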

The computational value of independence

We have shown so far that:
1. disjunctive tests of the form $\Delta \models \alpha \vee \beta$ cannot be decomposed in general;
2. if disjunctive tests were always decomposable, then testing for logical entailment would become linear in the number of sentences in a database;

[Figure 3: An and-or tree. The root AND branches on A into (Delta entails A v ~C v ~D) and (Delta entails ~A v ~C v ~D); each branch is an OR of a local test on Delta_1 and a further case analysis on B whose leaves are local tests on Delta_2. The six leaf values are f, t for Delta_1 and t, t, t, f for Delta_2, and the tree evaluates to true.]

Figure 3: A rewrite process for decomposing a global test for entailment into a number of local tests. Also shown is the evaluation of local tests.

3. disjunctive tests can be decomposed when certain LCI assertions are satisfied by the database.

When all disjunctive tests are decomposable, taking advantage of this decomposability is easy, as we have shown. But when only certain tests are decomposable (that is, limited independence is available), the issue is no longer that simple. The algorithm we present in Section 7 is a good example of how limited independence information can be utilized computationally. It decomposes a global test $\Delta \models \alpha$ into a number of local tests $\Delta_i \models \alpha_i$, which are assumed to require constant time. The decomposition is accomplished using two rewrite rules:

1. case-analysis: $\Delta \models \alpha$ is rewritten into $\Delta \models \alpha \vee \beta$ and $\Delta \models \alpha \vee \neg\beta$; and
2. decomposition: $\Delta \models \alpha \vee \beta \vee \gamma$ is rewritten into $\Delta \models \alpha \vee \gamma$ or $\Delta \models \beta \vee \gamma$.

The case-analysis rule is always valid because
1. $\alpha$ is equivalent to $(\alpha \vee \beta) \wedge (\alpha \vee \neg\beta)$, and
2. $\Delta \models (\alpha \vee \beta) \wedge (\alpha \vee \neg\beta)$ iff $\Delta \models \alpha \vee \beta$ and $\Delta \models \alpha \vee \neg\beta$.

However, the decomposition rule is valid only when certain LCI assertions hold, according to Definition 5. Figure 3 shows an example of this rewrite process as applied to the test $\Delta \models \neg C \vee \neg D$ with respect to the database in Figure 2. Specifically, the figure depicts a trace of a rewrite process that decomposes the global test $\Delta \models \neg C \vee \neg D$ into six local tests: $\Delta_1 \models \neg C \vee A$, $\Delta_1 \models \neg C \vee \neg A$, $\Delta_2 \models \neg D \vee A \vee B$, $\Delta_2 \models \neg D \vee A \vee \neg B$, $\Delta_2 \models \neg D \vee \neg A \vee B$, and $\Delta_2 \models \neg D \vee \neg A \vee \neg B$. Here $\Delta_1$ and $\Delta_2$ are the local databases
$$\Delta_1 = \{A \supset \neg C,\ \neg A \supset C\} \quad\text{and}\quad \Delta_2 = \{A \wedge B \supset D,\ \neg(A \wedge B) \supset \neg D\}.$$

The process proceeds top-down, rewriting each test into a logical combination of the tests below it. The case-analysis rule rewrites a test into a conjunction of other tests. The decomposition rule rewrites a test into a disjunction of other tests. Therefore, the rewrite process constructs an and-or tree, the root of which is the global test and the leaves of which are local tests. Note here that the two applications of the decomposition rule are validated by the LCI assertion $\mathit{Ind}^e_\Delta(\{C\}, \{A\}, \{D\})$, which follows from the structure in Figure 2. Figure 3 also depicts the result of evaluating local tests. It is clear that this tree evaluates to true, which is the answer computed for the global test $\Delta \models \neg C \vee \neg D$.

We have a few points to make with respect to this example. First, it exemplifies the computational paradigm that we are advocating in this paper: decompose a global computation into a number of local, constant-time computations. Second, for each of the computational tasks we shall consider, there will be corresponding case-analysis and decomposition rules. The case-analysis rule will always be valid, but the decomposition rule will be valid only when certain LCI assertions hold. Therefore, to develop an independence-based algorithm, one must

[Figure 4: The circuit of Figure 2 with assumables added: the local database for C is {A & okX => ~C, ~A & okX => C} and the local database for D is {A & B & okY => D, ~(A & B) & okY => ~D}.]

Figure 4: A structured database representing a digital circuit.

1. know when the decomposition rule is valid (soundness), and
2. control the rewrite process so that:
(a) it terminates (completeness), and
(b) it invokes the least number of rewrites possible (efficiency).

Structured databases, which we shall introduce in Section 6, are central to achieving these goals: the topology of a database tells us which decompositions are valid, and it can be used to control the rewrite process so that it terminates. The number of invoked rewrites decides the complexity of reasoning and is the parameter for measuring the superiority of one algorithm over another. As we shall see, for certain database structures, we can decompose a task using only a linear number of rewrites. These computational issues will be discussed in more detail later.
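To make the rewrite paradigm concrete, here is a small sketch (our own code, with formulas as Python predicates) that replays the essence of Figure 3 by brute force: the global test is checked directly, and then recomputed via one case-analysis on $A$ followed by the decomposition licensed by $\mathit{Ind}^e_\Delta(\{C\},\{A\},\{D\})$. The resulting tests are evaluated against the local databases $\Delta_1$ and $\Delta_2$ alone, which is justified by the pruning result (Theorem 10) of Section 6; the $B$ case-analysis of the figure is folded into the local test on $\Delta_2$.

```python
from itertools import product

def assignments(atoms):
    return [dict(zip(atoms, bits)) for bits in product([False, True], repeat=len(atoms))]

def entails(db, clause, atoms):
    """Brute-force propositional entailment: every model of db satisfies clause."""
    return all(clause(m) for m in assignments(atoms) if all(f(m) for f in db))

ATOMS = ('A', 'B', 'C', 'D')
d1 = [lambda m: not m['A'] or not m['C'],          # A => ~C
      lambda m: m['A'] or m['C']]                  # ~A => C
d2 = [lambda m: not (m['A'] and m['B']) or m['D'], # A & B => D
      lambda m: (m['A'] and m['B']) or not m['D']] # ~(A & B) => ~D
delta = d1 + d2

# The global test of Figure 3: Delta |= ~C v ~D.
print(entails(delta, lambda m: not m['C'] or not m['D'], ATOMS))   # True

# Case-analysis on A (always valid), then the decomposition licensed by
# Ind({C}, {A}, {D}); each local test runs against d1 or d2 alone.
branch_pos = entails(d1, lambda m: m['A'] or not m['C'], ATOMS) or \
             entails(d2, lambda m: m['A'] or not m['D'], ATOMS)
branch_neg = entails(d1, lambda m: not m['A'] or not m['C'], ATOMS) or \
             entails(d2, lambda m: not m['A'] or not m['D'], ATOMS)
print(branch_pos and branch_neg)   # True: agrees with the global test
```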

Satisfiability formulation

We now provide the third formulation of LCI, which is oriented towards testing satisfiability:

Definition 6 Let $\Delta$ be a propositional database and let $X$, $Y$ and $Z$ be three disjoint sets of atomic propositions. Database $\Delta$ finds $X$ independent of $Y$ given $Z$, written $\mathit{Ind}^s_\Delta(X,Z,Y)$, iff
$$\Delta \cup \{\hat{Z}, \hat{X}\} \ \text{and}\ \Delta \cup \{\hat{Z}, \hat{Y}\} \ \text{are both satisfiable precisely when}\ \Delta \cup \{\hat{Z}, \hat{Y}, \hat{X}\} \ \text{is satisfiable}$$
for all conjunctive clauses $\hat{Z}$, $\hat{Y}$ and $\hat{X}$.

This formulation is dual to the second formulation for testing entailment. The equivalence is established below:

Theorem 3 $\mathit{Ind}^s_\Delta(X,Z,Y)$ precisely when $\mathit{Ind}^e_\Delta(X,Z,Y)$.

Now that we have established the equivalence between $\mathit{Ind}^b_\Delta$, $\mathit{Ind}^e_\Delta$ and $\mathit{Ind}^s_\Delta$, we will sometimes drop the superscripts and simply write $\mathit{Ind}_\Delta$. The specific interpretation of $\mathit{Ind}_\Delta$ will then be chosen depending on the context.

5 Diagnosis and Abduction Formulations

This section presents two more formulations of LCI, oriented towards the dual tasks of diagnosis and abduction. We define the notion of a consequence for diagnosis tasks, which is a semantical characterization of all diagnoses. We also define the notion of an argument for abduction tasks, which is a semantical characterization of all abductions. We show that the difficulties in computing diagnoses and abductions are rooted in the inability to decompose the computations of consequences and arguments in general. We also show that independence assertions can validate this decomposition in certain cases. We present two more formulations of LCI to spell out these cases. The two formulations are dual, since consequences are dual to arguments. We start with the fourth formulation of LCI, which is oriented towards diagnostic applications. Before we present it, though, we need to review some basics of diagnosis [7].

Basics of diagnosis

In the diagnostic literature [7], a system is typically characterized by a tuple $(\Delta, P, W)$, where $\Delta$ is a database constructed from atomic propositions $P \cup W$. Moreover, a system observation is typically characterized by a sentence $\alpha$ constructed from atoms $P$. The atoms in $W$ are called assumables and those in $P$ are called non-assumables. The intention is that the database $\Delta$ describes the system behavior, the assumables $W$ represent the modes of system components, and the sentence $\alpha$ represents the observed system behavior. For example, in Figure 4, the assumables are $\mathit{okX}$ and $\mathit{okY}$, the non-assumables are $A$, $B$, $C$ and $D$, and a possible system observation is $C \wedge D$.

A diagnosis is defined as a conjunctive clause over the set of assumables $W$ that is consistent with $\Delta \cup \{\alpha\}$. Therefore, a diagnosis is an assignment of modes to components that is consistent with the system description and its observed behavior. In Figure 4, $\mathit{okX}$ and $\mathit{okY}$ are the assumables and $\mathit{okX} \wedge \neg\mathit{okY}$ is a potential diagnosis. One goal of diagnostic reasoning is to characterize all diagnoses compactly so they can be presented to a user. Another goal is to extract a subset of these diagnoses according to some preference criterion. We have shown elsewhere that these objectives can be achieved in two steps [4]. First, we compute the consequence of observation $\alpha$, which is a Boolean expression that characterizes all the diagnoses of $\alpha$. Second, we extract the most preferred diagnoses from the computed consequence. The consequence of an observation is defined formally below:

Definition 7 Let $\Delta$ be a propositional database, $\alpha$ be a propositional sentence and $W$ be a set of atomic propositions that do not appear in $\alpha$. The consequence of sentence $\alpha$ with respect to database $\Delta$ and atoms $W$, written $\mathit{Cons}^W_\Delta(\alpha)$, is a logically strongest sentence $\tilde{W}$ such that $\Delta \cup \{\alpha\} \models \tilde{W}$.²

² A sentence $\alpha$ is stronger than a sentence $\beta$ iff $\alpha \models \beta$; $\beta$ is said to be weaker than $\alpha$ in such a case.

When $\Delta$ and $W$ are clear from the context, we will write $\mathit{Cons}(\alpha)$ instead of $\mathit{Cons}^W_\Delta(\alpha)$ for simplicity. In Figure 4, for example, the consequence of observation $C \wedge D$ is $\neg\mathit{okX} \vee \neg\mathit{okY}$, because it is a logically strongest sentence (constructed from assumables) that can be concluded from the given observation and system description. A consequence characterizes all diagnoses in the following way:

Theorem 4 $\hat{W}$ is a diagnosis for system $(\Delta, P, W)$ and observation $\alpha$ iff $\hat{W} \models \mathit{Cons}^W_\Delta(\alpha)$.

The consequence $\neg\mathit{okX} \vee \neg\mathit{okY}$, for example, characterizes three diagnoses: $\neg\mathit{okX} \wedge \neg\mathit{okY}$, $\mathit{okX} \wedge \neg\mathit{okY}$ and $\neg\mathit{okX} \wedge \mathit{okY}$.
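For intuition, the consequence of Definition 7 can be computed semantically, if inefficiently: the models of $\mathit{Cons}^W_\Delta(\alpha)$ over $W$ are exactly the $W$-projections of the models of $\Delta \cup \{\alpha\}$. The sketch below (our own illustrative code, with the Figure 4 circuit encoded as Python predicates) recovers the consequence $\neg\mathit{okX} \vee \neg\mathit{okY}$ of the observation $C \wedge D$.

```python
from itertools import product

def assignments(atoms):
    return [dict(zip(atoms, bits)) for bits in product([False, True], repeat=len(atoms))]

def consequence(db, obs, atoms, W):
    """Cons_W(obs) per Definition 7, semantically: the W-projections of the
    models of db + obs. The disjunction of the returned full W-terms is the
    logically strongest W-sentence entailed by db + obs."""
    return {tuple(m[w] for w in W)
            for m in assignments(atoms)
            if all(f(m) for f in db) and obs(m)}

ATOMS = ('A', 'B', 'C', 'D', 'okX', 'okY')
W = ('okX', 'okY')
circuit = [lambda m: not (m['A'] and m['okX']) or not m['C'],         # A & okX => ~C
           lambda m: not (not m['A'] and m['okX']) or m['C'],         # ~A & okX => C
           lambda m: not (m['A'] and m['B'] and m['okY']) or m['D'],  # A & B & okY => D
           lambda m: not (not (m['A'] and m['B']) and m['okY']) or not m['D']]  # ~(A&B) & okY => ~D

print(consequence(circuit, lambda m: m['C'] and m['D'], ATOMS, W))
# every (okX, okY) term except (True, True) -- i.e., ~okX v ~okY
```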

Diagnosis formulation

Similar to testing for logical entailment, the difficulty with computing diagnoses is that it cannot be done compositionally. In particular, although $\mathit{Cons}(\alpha \vee \beta)$ is equivalent to $\mathit{Cons}(\alpha) \vee \mathit{Cons}(\beta)$, $\mathit{Cons}(\alpha \wedge \beta)$ is not equivalent to $\mathit{Cons}(\alpha) \wedge \mathit{Cons}(\beta)$ in general. The following formulation of LCI is in terms of decomposing consequences. It is followed by a theorem that shows the equivalence between this formulation and the one based on belief change, therefore identifying conditions under which decomposing consequences is valid.

Definition 8 Let $\Delta$ be a database and let $X$, $Y$, $Z$ and $W$ be disjoint sets of atomic propositions. The pair $(\Delta, W)$ finds $X$ independent of $Y$ given $Z$, written $\mathit{Ind}^c_{(\Delta,W)}(X,Z,Y)$, iff
$$\models \mathit{Cons}(\hat{X} \wedge \hat{Y} \wedge \hat{Z}) \equiv \mathit{Cons}(\hat{X} \wedge \hat{Z}) \wedge \mathit{Cons}(\hat{Y} \wedge \hat{Z})$$
for all conjunctive clauses $\hat{X}$, $\hat{Y}$ and $\hat{Z}$.

Theorem 5 $\mathit{Ind}^c_{(\Delta,W)}(X,Z,Y)$ precisely when $\mathit{Ind}^b_\Delta(X, Z \cup W, Y)$.

Given Theorem 5, we now have an equivalence between
1. the irrelevance of information to beliefs (Definition 3),


[Figure 5: An and-or tree decomposing the global consequence cons(C & D), via a case analysis on A and then on B, into local consequences over Delta_1 and Delta_2, together with their values: cons_1(A & C) = ~okX, cons_1(~A & C) = true, cons_2(A & B & D) = true, and cons_2(A & ~B & D), cons_2(~A & B & D), cons_2(~A & ~B & D) all equal ~okY.]
Figure 5: A rewrite process for decomposing a global consequence into a number of local consequences. Also shown are the values of local consequences.

2. the validity of decomposing entailment tests (Definition 5),
3. the validity of decomposing satisfiability tests (Definition 6), and
4. the validity of decomposing consequences (Definition 8).

Before we present an example of the computational use of this LCI formulation, we note the following special case of Definition 8. When $Z$ is empty, true is the only conjunctive clause over $Z$, and we have $\mathit{Ind}^c_{(\Delta,W)}(X,\emptyset,Y)$ iff
$$\models \mathit{Cons}(\hat{X} \wedge \hat{Y}) \equiv \mathit{Cons}(\hat{X}) \wedge \mathit{Cons}(\hat{Y})$$
for all conjunctive clauses $\hat{X}$ and $\hat{Y}$.

Example

We have presented elsewhere an algorithm for computing consequences with respect to a structured database [4]. The algorithm works by rewriting a global consequence of the form $\mathit{Cons}_\Delta(\alpha)$ into a Boolean expression that involves logical connectives and local consequences of the form $\mathit{Cons}_{\Delta_i}(\alpha_i)$, where $\Delta_i$ is a local database. The algorithm is based on the following two rewrites:
1. case-analysis: $\mathit{Cons}(\alpha)$ is rewritten into $\mathit{Cons}(\alpha \wedge \beta) \vee \mathit{Cons}(\alpha \wedge \neg\beta)$;³ and
2. decomposition: $\mathit{Cons}(\alpha \wedge \beta \wedge \gamma)$ is rewritten into $\mathit{Cons}(\alpha \wedge \gamma) \wedge \mathit{Cons}(\beta \wedge \gamma)$ when the corresponding LCI assertion holds.

We will now consider an example using these rewrites with respect to Figure 4. Our goal is to compute the consequence $\mathit{Cons}(C \wedge D)$, where the set $W$ contains the atoms $\mathit{okX}$ and $\mathit{okY}$. Figure 5 depicts a trace of the rewrite process that decomposes the global consequence $\mathit{Cons}(C \wedge D)$ into six local consequences: $\mathit{Cons}_{\Delta_1}(A \wedge C)$, $\mathit{Cons}_{\Delta_1}(\neg A \wedge C)$, $\mathit{Cons}_{\Delta_2}(D \wedge A \wedge B)$, $\mathit{Cons}_{\Delta_2}(D \wedge A \wedge \neg B)$, $\mathit{Cons}_{\Delta_2}(D \wedge \neg A \wedge B)$ and $\mathit{Cons}_{\Delta_2}(D \wedge \neg A \wedge \neg B)$. Here, $\Delta_1$ and $\Delta_2$ are the local databases
$$\Delta_1 = \{A \wedge \mathit{okX} \supset \neg C,\ \neg A \wedge \mathit{okX} \supset C\}$$
and
$$\Delta_2 = \{A \wedge B \wedge \mathit{okY} \supset D,\ \neg(A \wedge B) \wedge \mathit{okY} \supset \neg D\}.$$

³ The case-analysis rule follows because (a) $\alpha$ is equivalent to $(\alpha \wedge \beta) \vee (\alpha \wedge \neg\beta)$, and (b) $\mathit{Cons}((\alpha \wedge \beta) \vee (\alpha \wedge \neg\beta))$ is equivalent to $\mathit{Cons}(\alpha \wedge \beta) \vee \mathit{Cons}(\alpha \wedge \neg\beta)$.


The process proceeds top-down, rewriting each consequence into the nodes below it. The case-analysis rule rewrites a consequence into a disjunction of other consequences, while the decomposition rule rewrites a consequence into a conjunction of other consequences. Therefore, the rewrite process constructs an and-or tree, the root of which is the global consequence and the leaves of which are local consequences. Given the values of local consequences, the tree simplifies to $\neg\mathit{okX} \vee \neg\mathit{okY}$, which is the value of the global consequence $\mathit{Cons}(C \wedge D)$. Note that local consequences can be computed by operating on local databases, which is assumed to require constant time. Note also that the two applications of the decomposition rule are validated by the LCI assertion $\mathit{Ind}^c_{(\Delta,W)}(\{C\}, \{A\}, \{D\})$, which can be inferred by examining the database structure.

Abduction formulation

The fifth formulation of LCI is oriented towards the computation of abductions. There is no standard definition of an abduction, although existing definitions seem to agree on the following two properties:
1. Generality: An abduction should be as general as possible while still explaining the finding.
2. Scope: The language in which an abduction is phrased must be restricted in order to avoid trivial abductions ($\alpha$ being an abduction of itself).

The following definition gives a generic notion of an abduction, called an argument. The ATMS label of a proposition, for example, is only a syntactic variation on the argument for that proposition [6, 21].

Definition 9 Let $\Delta$ be a propositional database, $\alpha$ be a propositional sentence, and $W$ be a set of atomic propositions that do not appear in $\alpha$. The argument for sentence $\alpha$ with respect to database $\Delta$ and atoms $W$, written $\mathit{Arg}^W_\Delta(\alpha)$, is the logically weakest sentence $\tilde{W}$ such that $\Delta \cup \{\tilde{W}\} \models \alpha$.⁴

The intention here is that $\Delta$ represents background information, $\alpha$ represents a finding, and $W$ restricts the language used in phrasing an abduction. For example, if we choose $W$ to contain $\mathit{okX}$ and $\mathit{okY}$ in Figure 4, the argument for finding $\neg C \vee \neg D$ would then be $\mathit{okX} \wedge \mathit{okY}$. Arguments are dual to consequences, as the following theorem shows:

Theorem 6 $\mathit{Arg}^W_\Delta(\alpha) \equiv \neg\mathit{Cons}^W_\Delta(\neg\alpha)$.

This also explains, indirectly, why ATMS engines are typically used for computing diagnoses. Similar to logical entailment, the difficulty in computing abductions/arguments is related to the inability to decompose arguments. That is, although $\mathit{Arg}(\alpha \wedge \beta)$ is equivalent to $\mathit{Arg}(\alpha) \wedge \mathit{Arg}(\beta)$, $\mathit{Arg}(\alpha \vee \beta)$ is not equivalent to $\mathit{Arg}(\alpha) \vee \mathit{Arg}(\beta)$ in general. For example, if $\Delta = \{p \supset (r \vee s)\}$ and $W = \{p\}$, then $\mathit{Arg}(r \vee s) = p$ while $\mathit{Arg}(r) = \mathit{false}$ and $\mathit{Arg}(s) = \mathit{false}$. The following formulation of LCI is in terms of decomposing arguments. It is followed by a theorem that shows the equivalence between this formulation and previous ones, thus establishing conditions under which it is valid to decompose arguments.
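The duality of Theorem 6 also gives a direct, if brute-force, way of computing arguments: take the complement of the $W$-terms in $\mathit{Cons}(\neg\alpha)$. The self-contained sketch below (our own code, reusing the Figure 4 circuit from the previous sketch) recovers $\mathit{okX} \wedge \mathit{okY}$ as the argument for the finding $\neg C \vee \neg D$.

```python
from itertools import product

def assignments(atoms):
    return [dict(zip(atoms, bits)) for bits in product([False, True], repeat=len(atoms))]

def w_projections(db, cond, atoms, W):
    """The W-projections of the models of db restricted by cond; semantically,
    this is Cons_W(cond) as a set of full W-terms."""
    return {tuple(m[w] for w in W)
            for m in assignments(atoms)
            if all(f(m) for f in db) and cond(m)}

def argument(db, finding, atoms, W):
    """Arg_W(finding) via Theorem 6: Arg(alpha) == ~Cons(~alpha). The result
    is the set of full W-terms whose disjunction is the weakest argument."""
    all_terms = set(product([False, True], repeat=len(W)))
    return all_terms - w_projections(db, lambda m: not finding(m), atoms, W)

# The circuit of Figure 4, with assumables okX and okY:
ATOMS = ('A', 'B', 'C', 'D', 'okX', 'okY')
W = ('okX', 'okY')
circuit = [lambda m: not (m['A'] and m['okX']) or not m['C'],
           lambda m: not (not m['A'] and m['okX']) or m['C'],
           lambda m: not (m['A'] and m['B'] and m['okY']) or m['D'],
           lambda m: not (not (m['A'] and m['B']) and m['okY']) or not m['D']]

print(argument(circuit, lambda m: not m['C'] or not m['D'], ATOMS, W))
# {(True, True)} -- i.e., okX & okY, as stated above
```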

Definition 10 Let $\Delta$ be a database and let $X$, $Y$, $Z$ and $W$ be disjoint sets of atomic propositions. The pair $(\Delta, W)$ finds $X$ independent of $Y$ given $Z$, written $\mathit{Ind}^a_{(\Delta,W)}(X,Z,Y)$, iff
$$\models \mathit{Arg}(\bar{X} \vee \bar{Y} \vee \bar{Z}) \equiv \mathit{Arg}(\bar{X} \vee \bar{Z}) \vee \mathit{Arg}(\bar{Y} \vee \bar{Z})$$
for all disjunctive clauses $\bar{X}$, $\bar{Y}$, $\bar{Z}$.

Theorem 7 $\mathit{Ind}^a_{(\Delta,W)}(X,Z,Y)$ precisely when $\mathit{Ind}^c_{(\Delta,W)}(X,Z,Y)$.

⁴ We will typically drop $\Delta$ and $W$, thus writing $\mathit{Arg}(\alpha)$, when no confusion is expected.


Now that we have established the equivalence between $\mathit{Ind}^a_{(\Delta,W)}$ and $\mathit{Ind}^c_{(\Delta,W)}$, we will sometimes drop the superscripts and simply write $\mathit{Ind}_{(\Delta,W)}$. The specific interpretation will then be chosen depending on the context. We close this section with an important special case of Definition 10. When $Z$ is empty, false is the only disjunctive clause over $Z$, and we have $\mathit{Ind}^a_{(\Delta,W)}(X,\emptyset,Y)$ iff
$$\models \mathit{Arg}(\bar{X} \vee \bar{Y}) \equiv \mathit{Arg}(\bar{X}) \vee \mathit{Arg}(\bar{Y})$$
for all disjunctive clauses $\bar{X}$ and $\bar{Y}$.

6 Structured Databases

The different formulations of LCI show how independence information can validate the decomposition of a computation into smaller computations that can be performed in parallel, a decomposition that is not sound in general. Although this ability to decompose a computation could be valuable from a complexity viewpoint, exploiting such decompositions is not always straightforward. Structured databases make this utilization of independence more feasible. In particular, the structure of a database plays two important roles in designing independence-based algorithms:
1. it graphically explicates valid applications of the decomposition rule, and
2. it specifies a control flow that guarantees the termination of the decomposition process.

This is also the role that directed acyclic graphs have been playing in probabilistic reasoning, and our goal here is to extend that role to propositional logic.⁵ We will now provide the formal definition of structured databases, and then present two important operations on their structures: (1) reading independences off the database structure, and (2) pruning irrelevant parts of a database structure before performing certain computations.

6.1 The Syntax of a Structured Database

A structured database is a pair $(G, \Delta)$ where
- $G$ is a directed acyclic graph with nodes representing atomic propositions, and
- $\Delta$ is the union of local databases, one database for each node in $G$.

The atoms that appear in the database $\Delta$ but do not appear in the structure $G$ are called exogenous atoms. For example, in Figure 4, the exogenous atoms are $\mathit{okX}$ and $\mathit{okY}$. The local database corresponding to proposition $n$ in $G$, denoted by $\Delta_n$, must satisfy the following conditions:
1. locality: the only atoms in $G$ that $\Delta_n$ can mention are the family of atom $n$;⁶ and
2. modularity: if $\Delta_n$ entails a disjunctive clause that does not mention atom $n$, then the clause must be valid.

The first condition makes sure that each local database is restricted to specifying the relationship between an atom and its parents. The second condition ensures that a local database for atom $n$ is only concerned with specifying how the parents of $n$ determine the truth value of $n$. That is, the database should not be concerned with specifying a direct relationship between the parents of $n$. Note also that the modularity condition ensures the consistency of a local database, since the empty clause (falsehood) does not mention atom $n$.

⁵ However, we later discuss a key difference between the propositional and probabilistic frameworks.
⁶ Recall that the family of $n$ consists of $n$ and its parents in the database structure.
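One way to check the modularity condition semantically is the following: forgetting $n$ from $\Delta_n$ must leave a valid theory, i.e., every assignment to the remaining atoms of $\Delta_n$ must extend to a model of $\Delta_n$; otherwise $\Delta_n$ entails a non-valid clause that does not mention $n$. Below is a brute-force sketch of that test (our own code and naming, with formulas as Python predicates):

```python
from itertools import product

def assignments(atoms):
    return [dict(zip(atoms, bits)) for bits in product([False, True], repeat=len(atoms))]

def modular(local_db, n, atoms):
    """Semantic check of the modularity condition: every assignment to the
    atoms other than n must extend, for some truth value of n, to a model of
    the local database."""
    others = tuple(a for a in atoms if a != n)
    return all(any(all(f({**m, n: v}) for f in local_db) for v in (False, True))
               for m in assignments(others))

# The local database for C in Figure 2: {A => ~C, ~A => C}. It constrains
# C only, so it is modular.
dC = [lambda m: not m['A'] or not m['C'], lambda m: m['A'] or m['C']]
print(modular(dC, 'C', ('A', 'C')))    # True

# A local database that secretly constrains the parent A is not modular:
bad = [lambda m: m['A']]               # entails the clause A, which omits C
print(modular(bad, 'C', ('A', 'C')))   # False
```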


If the arcs of a structured database represent causal influences, then the conditions above are typically self-imposed. For example, suppose that a structured database is used to describe the functionality of a digital circuit, as shown in Figures 2 and 4. Each atom in the structure represents the state of a wire in the circuit, and the arcs point from the inputs of a gate into its output. The local database associated with atom $n$ specifies the behavior of the gate having output $n$. For this class of structured databases, the locality and modularity conditions are typically self-imposed since
1. the locality condition means that one should not mention any wire that is not an input or an output of the gate whose behavior we are specifying, and
2. the modularity condition means that one should not specify a relationship between the inputs of a gate in the process of specifying its behavior.

We have focused elsewhere on a causal interpretation of structured databases, referred to as symbolic causal networks, where we discussed non-computational applications of LCI that include reasoning about actions and identifying anomalous extensions of nonmonotonic theories [5].

6.2 Structure-Based Independence

By construction, a structured database satisfies some LCI assertions that can be easily detected from the database structure:

Theorem 8 Let $(G, \Delta)$ be a structured database, $W$ be its exogenous atoms, $n$ be a node in $G$, $U$ be its parents, and $N$ be its non-descendants. Then $\mathit{Ind}_\Delta(\{n\}, U \cup W, N)$.

That is, each atom in the database structure is independent of its non-descendants given its parents and exogenous atoms. With respect to the database in Figure 4, this theorem states that
1. $\{D\}$ is independent of $\{C\}$ given $\{A, B, \mathit{okX}, \mathit{okY}\}$,
2. $\{B\}$ is independent of $\{A\}$ given $\{\mathit{okX}, \mathit{okY}\}$, and
3. $\{C\}$ is independent of $\{D\}$ given $\{A, \mathit{okX}, \mathit{okY}\}$.

Theorem 8 brings up a key difference between structured databases and Bayesian networks: the independences characterized by this theorem are properties of a structured database, but their corresponding probabilistic independences are part of the definition of a Bayesian network. That is, the restrictions on a structured database are sufficient to imply these independences, but similar restrictions on a Bayesian network are not enough to imply the corresponding probabilistic independences. There are two implications of this:
- By constructing a structured database, one is making no explicit commitment to independences; one is committing only to the logical content of local databases. By constructing a Bayesian network, however, one is explicitly committing to independences which correspond to Theorem 8 [19].
- One can recover the independences satisfied by a structured database from only its local databases, that is, without having to consider the database structure.⁷

⁷ We do not investigate this direction in this paper, however.

Note that having an explicit structure is very useful in deducing further independences. Specifically, since LCI is a symmetric relation, the LCI assertion in (3) above implies
4. $\{D\}$ is independent of $\{C\}$ given $\{A, \mathit{okX}, \mathit{okY}\}$,

which cannot be detected by applying Theorem 8 directly. As this example illustrates, a structured database satisfies more LCI assertions than those characterized by Theorem 8. We now present a topological test, called d-separation, for identifying some of these additional assertions.

d-separation is a test on directed acyclic graphs that tells us whether two sets of nodes are d-separated by a third set [19]. According to this test, for example, the nodes $\{B, D\}$ are d-separated from the node $\{C\}$ by the node $\{A\}$ in Figure 4. Therefore, d-separation can be viewed as a relation $\mathit{Sep}_G$, where $\mathit{Sep}_G(X,Z,Y)$ holds precisely when $Z$ d-separates $X$ from $Y$ in the directed acyclic graph $G$.


The d-separation test has been instrumental in reasoning about independence in Bayesian networks. We shall use it analogously in structured databases. We will define d-separation later, but we first go over how it can be used to identify LCI assertions. The following result, which is basically a corollary of a theorem in [25], reveals the importance of d-separation:

Theorem 9 Let $(G, \Delta)$ be a structured database and let $W$ be its exogenous atoms. If $Z$ d-separates $X$ from $Y$ in $G$, then $\Delta$ finds $X$ and $Y$ independent given $Z \cup W$. That is, $\mathit{Sep}_G(X,Z,Y)$ implies $\mathit{Ind}_\Delta(X, Z \cup W, Y)$.

Therefore, d-separation allows one to infer LCI assertions satisfied by a given structured database by simply examining the topology of its structure. Following are the key computational implications of Theorem 9. If atoms $X$ and $Y$ are d-separated by $Z$ in the structure of database $(G, \Delta)$, then the following decompositions are valid with respect to database $\Delta$:
1. $\Delta \models \bar{X} \vee \bar{Z} \vee \bar{W} \vee \bar{Y}$ iff $\Delta \models \bar{X} \vee \bar{Z} \vee \bar{W}$ or $\Delta \models \bar{Z} \vee \bar{W} \vee \bar{Y}$,
2. $\Delta \cup \{\hat{X} \wedge \hat{Z} \wedge \hat{W} \wedge \hat{Y}\}$ is satisfiable iff $\Delta \cup \{\hat{X} \wedge \hat{Z} \wedge \hat{W}\}$ and $\Delta \cup \{\hat{Z} \wedge \hat{W} \wedge \hat{Y}\}$ are satisfiable,
3. $\mathit{Arg}(\bar{X} \vee \bar{Z} \vee \bar{Y})$ is equivalent to $\mathit{Arg}(\bar{X} \vee \bar{Z}) \vee \mathit{Arg}(\bar{Z} \vee \bar{Y})$, and
4. $\mathit{Cons}(\hat{X} \wedge \hat{Z} \wedge \hat{Y})$ is equivalent to $\mathit{Cons}(\hat{X} \wedge \hat{Z}) \wedge \mathit{Cons}(\hat{Z} \wedge \hat{Y})$,

for all disjunctive clauses $\bar{X}$, $\bar{Y}$, $\bar{Z}$, $\bar{W}$ and all conjunctive clauses $\hat{X}$, $\hat{Y}$, $\hat{Z}$, $\hat{W}$. These implications show why the complexity of reasoning in the framework we are proposing is decided by the topology of a structured database.

The d-separation test will be defined next. We first need the following supporting definition:

Definition 11 Let $G$ be a directed acyclic graph and let $X$, $Y$, and $Z$ be three disjoint sets of nodes in $G$. A path between $X$ and $Y$ is $Z$-active precisely when its nodes satisfy the following conditions:
1. each converging node belongs to $Z$ or has a descendant in $Z$, and
2. each diverging or linear node is outside $Z$.

When $Z$ is empty, we simply say the path is active. Figure 6 gives the definition of converging, diverging, and linear nodes.

In Figure 4, for example,
- the path $A \to D \leftarrow B$ is $\{D\}$-active,
- the path $C \leftarrow A \to D$ is active, and
- the path $A \to D \leftarrow B$ is not active.

Following is the definition of d-separation:

Definition 12 ([19]) In a directed acyclic graph $G$, nodes $Z$ d-separate $X$ from $Y$, written $\mathit{Sep}_G(X,Z,Y)$, precisely when there is no $Z$-active path between $X$ and $Y$ in $G$. When $Z$ is empty, we say that $X$ and $Y$ are d-separated.

The d-separation test is not complete, in the sense that it cannot identify all LCI assertions satisfied by a database. To illustrate this, consider the structured databases in Figure 7. In the first one, $\{A\}$ and $\{C\}$ are independent although they are not d-separated. And in the second, $\{B\}$ and $\{C\}$ are also independent although not d-separated. Intuitively, in the first database, there is no information about $A$ that could lead us to prove anything about $C$. Similarly, in the second database, neither $B$ nor $\neg B$ will help in proving anything about $C$.

Existing structure-based algorithms use only the database structure for identifying LCI assertions. It should be clear, then, that such algorithms cannot be optimal, because they are bound to miss some independences that could be useful computationally. This also seems to be the practice in the probabilistic literature, except possibly for some recent work on utilizing non-structural independence [1, 18].
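Definitions 11 and 12 translate directly into a small graph procedure. The sketch below (our own code; a brute-force enumeration of simple undirected paths rather than an optimized reachability test) checks $\mathit{Sep}_G(X,Z,Y)$ for the small structures in the figures, with a structure given as a map from each node to its parents.

```python
def descendants(G, n):
    """Descendants of n in the DAG G, where G maps each node to its parents."""
    children = {m: [c for c in G if m in G[c]] for m in G}
    seen, stack = set(), [n]
    while stack:
        for c in children[stack.pop()]:
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen

def d_separated(G, X, Z, Y):
    """Sep_G(X, Z, Y) per Definitions 11 and 12: no Z-active path between
    X and Y. Paths are enumerated by brute force, fine for small structures."""
    Z = set(Z)
    def active(path):
        for i in range(1, len(path) - 1):
            prev, node, nxt = path[i - 1], path[i], path[i + 1]
            if prev in G[node] and nxt in G[node]:      # converging node
                if node not in Z and not (descendants(G, node) & Z):
                    return False
            elif node in Z:                             # diverging or linear node
                return False
        return True
    def paths(a, goal, path):
        if a == goal:
            yield path
            return
        for b in (set(G[a]) | {c for c in G if a in G[c]}) - set(path):
            yield from paths(b, goal, path + [b])
    return not any(active(p) for x in X for y in Y for p in paths(x, y, [x]))

# The structure of Figure 4: arcs A -> C, A -> D, B -> D.
G = {'A': [], 'B': [], 'C': ['A'], 'D': ['A', 'B']}
print(d_separated(G, {'B', 'D'}, {'A'}, {'C'}))  # True, as claimed in the text
print(d_separated(G, {'A'}, {'D'}, {'B'}))       # False: A -> D <- B is {D}-active
```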


Figure 6: There are three types of intermediate nodes on a given path. The type of a node is determined by its relation to its neighbors. A node is diverging if both neighbors are children. A node is linear if one neighbor is a parent and the other is a child. A node is converging if both neighbors are parents.

[Figure 7: Two structured databases. (1) A chain A -> B -> C with local databases {A implies B} for B and {not B implies C} for C. (2) A node A with children B and C, and local databases {A implies B} for B and {A implies C} for C.]

Figure 7: Structured databases satisfying more independences than are revealed by their graphical structures.

6.3 Structure-Based Pruning

The modularity condition on local databases has strong implications that make structured databases attractive from computational complexity and knowledge acquisition viewpoints. To illustrate this, suppose that we are constructing a structured database incrementally by adding leaf nodes together with their local databases. When adding node $n$, the modularity condition ensures that the added database $\Delta_n$ does not contradict the structured database $\Delta$ constructed so far. For the database $\Delta_n$ to contradict $\Delta$, it must entail some clause that is inconsistent with $\Delta$. But this is impossible: every non-valid clause that is entailed by $\Delta_n$ must mention the atom $n$, and any clause that mentions $n$ cannot be inconsistent with $\Delta$, since $\Delta$ does not mention $n$ to start with. Therefore, structured databases are guaranteed to be consistent by construction, which is a very attractive property from a knowledge acquisition viewpoint.⁸

Another important implication of the modularity condition on local databases is the ability to prune certain (irrelevant) parts of a structured database before computing answers to certain queries. The following theorem identifies cases where this pruning is possible.

Theorem 10 Let $(G, \Delta)$ be a structured database with exogenous variables $W$, let $N$ be some atoms in structure $G$, and let $n$ be a leaf node in $G$ that does not belong to $N$. If $(G', \Delta')$ is the structured database (with exogenous variables $W'$) that results from removing node $n$ and its local database $\Delta_n$ from $(G, \Delta)$, then
1. Entailment: $\Delta \models \bar{N}$ iff $\Delta' \models \bar{N}$,
2. Satisfiability: $\Delta \cup \{\hat{N}\}$ is satisfiable iff $\Delta' \cup \{\hat{N}\}$ is satisfiable,
3. Abduction: $\mathit{Arg}^W_\Delta(\bar{N})$ is equivalent to $\mathit{Arg}^{W'}_{\Delta'}(\bar{N})$, and
4. Diagnosis: $\mathit{Cons}^W_\Delta(\hat{N})$ is equivalent to $\mathit{Cons}^{W'}_{\Delta'}(\hat{N})$,

for all disjunctive clauses $\bar{N}$ and conjunctive clauses $\hat{N}$.

⁸ Modularity is also a key property responsible for the desirable properties of directed constraint networks [10].

For example, when attempting to test for the logical entailment of a clause $\bar{N}$ that does not involve a leaf atom $n$, we can drop atom $n$ and its local database $\Delta_n$ without affecting the result of the test. Applying this theorem recursively may prune a significant part of a structured database in certain cases. The amount of pruning, however, depends on how the atoms of clause $\bar{N}$ are spread topologically in the structure. At one extreme, clause $\bar{N}$ may refer to all leaf nodes in the database structure, in which case no pruning is possible. At the other extreme, the clause may mention only root nodes in the structure, in which case all nodes except for those mentioned in the clause will be pruned. From a computational complexity viewpoint, the significance of Theorem 10 is in showing how the complexity of a reasoning task is affected by the specific query.

Theorem 10 presents a result which is close in spirit to the traditional usage of irrelevance information in the literature on logical reasoning [15, 24, 23]. That is, before applying some algorithm for testing entailment, we prune some parts of the given database, knowing that we will not compromise the soundness of the test. We have to mention, however, that although such pruning can lead to great computational savings in practice, it is only a secondary usage of independence information in the framework that we are proposing. This is also consistent with the way independence is used in probabilistic and constraint-based reasoning. The key usage of independence information is in decomposing a computation with respect to a global database into a number of independent computations with respect to local databases. This usage will be illustrated in the following section, which presents an independence-based algorithm for a class of structured databases.

We close this section with the following corollary, which says that we can prune a leaf node from a structured database without affecting the independences that hold between any of the remaining nodes.
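Applying Theorem 10 recursively amounts to repeatedly deleting leaf nodes that the query does not mention. A minimal sketch (our own code and naming; the local databases here are placeholders, since only the structure matters for pruning):

```python
def prune(G, local_dbs, query_atoms):
    """Recursively drop leaf nodes that the query does not mention, together
    with their local databases -- sound for entailment, satisfiability,
    abduction and diagnosis by Theorem 10. G maps each node to its parents."""
    G, local_dbs = dict(G), dict(local_dbs)
    while True:
        leaves = [n for n in G
                  if n not in query_atoms
                  and not any(n in G[c] for c in G)]   # n has no children left
        if not leaves:
            return G, local_dbs
        for n in leaves:
            del G[n]
            del local_dbs[n]

# Figure 8's structure; a query over {A, C} prunes E, then D, then B.
G = {'A': [], 'B': [], 'C': ['A'], 'D': ['A', 'B'], 'E': ['C', 'D']}
dbs = {n: [] for n in G}              # placeholder local databases
print(prune(G, dbs, {'A', 'C'})[0])   # {'A': [], 'C': ['A']}
```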

Corollary 1 Let $(G, \Delta)$ be a structured database with exogenous variables $W$. Let $X$, $Y$ and $Z$ be atoms in structure $G$, and let $(G', \Delta')$ be the structured database (with exogenous variables $W'$) that results from removing from $(G, \Delta)$ a leaf node that does not belong to $X \cup Y \cup Z$. Then $\mathit{Ind}_\Delta(X, Z \cup W, Y)$ precisely when $\mathit{Ind}_{\Delta'}(X, Z \cup W', Y)$.

7 Structure-Based Reasoning

This section presents an algorithm for computing arguments. The algorithm applies only to structured databases that are singly-connected, which are introduced in the following section. The algorithm can be generalized straightforwardly to arbitrary structured databases. The computational complexity of the algorithm and its extensions depends on the topology of the structured database to which it applies. This is why this class of algorithms is referred to as structure-based. Using this algorithm, one can also decide entailment and satisfiability in addition to computing consequences. We provide more details on this later.

Singly-connected structures

A singly-connected structure is one in which there is only one undirected path between any two nodes; that is, the structure has no undirected cycles. The database in Figure 4 has a singly-connected structure, but the one in Figure 8 has a multiply-connected structure. Singly-connected databases are important for at least two reasons:
1. Computation on a singly-connected structure is linear in the number of nodes and arcs of that structure, and exponential only in the size of a family.⁹

⁹ In the literature on structure-based reasoning, the size of a family is typically assumed to be small enough to justify the assumption that any computation involving only a family will require constant time.


[Figure 8: The circuit of Figure 4 extended with a gate Z whose inputs are C and D and whose output is E; the local database for E is {C & D & OK-Z => E, ~(C & D) & OK-Z => ~E}.]

Figure 8: A structured database representing a digital circuit. The structure of this database is multiply-connected because there is more than one undirected path between nodes $A$ and $E$.

2. There is a simple algorithm for reducing a computation on a multiply-connected structure into a number $n$ of computations on singly-connected structures, where $n$ is exponential in the size of a structure cutset [19, 2].¹⁰

A singly-connected database is not necessarily Horn. For example, the databases in Figures 2 and 4 are singly-connected and yet contain non-Horn clauses.

¹⁰ A cutset of a directed acyclic graph is a set of nodes in the graph that satisfies the following property: if we remove the arcs outgoing from every node in the cutset, the resulting graph becomes singly-connected.
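Whether a structure is singly-connected is easy to test: the undirected skeleton of a DAG is acyclic exactly when its number of edges equals its number of nodes minus its number of connected components. A small sketch of this test (our own code, with structures given as maps from nodes to parents):

```python
def singly_connected(G):
    """True iff the undirected skeleton of the DAG G has no cycles, i.e.,
    #edges == #nodes - #connected components."""
    edges = sum(len(ps) for ps in G.values())
    neighbors = {n: set(G[n]) | {c for c in G if n in G[c]} for n in G}
    seen, components = set(), 0
    for n in G:
        if n in seen:
            continue
        components += 1
        stack = [n]
        while stack:
            m = stack.pop()
            if m not in seen:
                seen.add(m)
                stack.extend(neighbors[m] - seen)
    return edges == len(G) - components

fig4 = {'A': [], 'B': [], 'C': ['A'], 'D': ['A', 'B']}
fig8 = {'A': [], 'B': [], 'C': ['A'], 'D': ['A', 'B'], 'E': ['C', 'D']}
print(singly_connected(fig4))  # True
print(singly_connected(fig8))  # False: two undirected paths from A to E
```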

The rewrite paradigm

Given a singly-connected database $(G, \Delta)$ and a disjunctive clause $\bar{O}$ over some atoms in $G$, the algorithm we shall present computes the argument for $\bar{O}$ with respect to database $\Delta$. The algorithm is symmetric to a well-known one in the probabilistic literature, known as the polytree algorithm [19], and has similar computational complexity: linear in the number of nodes and arcs in $G$ and exponential in the size of each family in $G$. The algorithm can also be used to compute the argument of an arbitrary sentence $\alpha$ by first converting the sentence into conjunctive normal form $\bar{O}_1 \wedge \ldots \wedge \bar{O}_n$, and then applying the algorithm to each of the conjuncts:
$$\mathit{Arg}(\alpha) \equiv \mathit{Arg}(\bar{O}_1 \wedge \ldots \wedge \bar{O}_n) \equiv \bigwedge_i \mathit{Arg}(\bar{O}_i).$$

The algorithm works by rewriting a global argument $\mathit{Arg}_\Delta(\bar{O})$ into an expression that includes logical connectives and local arguments of the form $\mathit{Arg}_{\Delta_i}(\alpha_i)$, where $\Delta_i$ is a local database. A local argument can be evaluated by operating on a local database, which is assumed to require constant time. The algorithm alternates between applying the case-analysis and decomposition rewrite rules:

1. case-analysis: $\mathit{Arg}(\alpha)$ is rewritten into $\mathit{Arg}(\alpha \vee \beta) \wedge \mathit{Arg}(\alpha \vee \neg\beta)$, and
2. decomposition: $\mathit{Arg}(\alpha \vee \beta \vee \gamma)$ is rewritten into $\mathit{Arg}(\alpha \vee \gamma) \vee \mathit{Arg}(\beta \vee \gamma)$ when the corresponding LCI assertion holds.

The alternation between the two rules takes place for the following reason. To decompose an argument $\mathit{Arg}(\bar{X} \vee \bar{Y})$, atoms $X$ and $Y$ must be independent. If they are not, the algorithm performs a case analysis on atoms $Z$ (that make $X$ and $Y$ independent) in order to apply the decomposition rule. That is,

1. the algorithm rewrites $\mathit{Arg}(\bar{X} \vee \bar{Y})$ into
$$\bigwedge_{\bar{Z}} \mathit{Arg}(\bar{X} \vee \bar{Y} \vee \bar{Z})$$
using the case-analysis rule, and then


2. rewrites the result into
$$\bigwedge_{\bar{Z}} \big[\, \mathit{Arg}(\bar{X} \vee \bar{Z}) \vee \mathit{Arg}(\bar{Y} \vee \bar{Z}) \,\big]$$

using the decomposition rule. Therefore, when an argument cannot be decomposed, it is first expanded using case-analysis and then decomposed. This process continues until we have a Boolean expression in which all arguments are local.

The algorithm

We now introduce some notation that is needed to state the algorithm as a set of rewrite rules. Let
- $x$ be an arbitrary node in the database structure,
- $u$ and $u'$ be distinct parents of $x$,
- $y$ and $y'$ be distinct children of $x$, and
- $U$ be the set of all parents of $x$.

The algorithm can be described as a recursive and deterministic¹¹ application of the following rewrite rules:

In the rules below, $\tilde{x}$ denotes a literal of node $x$, $\tilde{u}$ a literal of parent $u$, and $\bar{U}$ a disjunctive clause over the parents $U$:

1. $\mathit{Arg}(\bar{O}) \to (\pi(x) \vee \lambda(x)) \wedge (\pi(\neg x) \vee \lambda(\neg x))$, for some node $x$ in the structure $G$
2. $\pi(\tilde{x}) \to \bigwedge_{\bar{U}} \big[\, \mathit{Arg}_{\Delta_x}(\tilde{x} \vee \bar{U}) \vee \bigvee_{\tilde{u} \models \bar{U}} \pi_x(\tilde{u}) \,\big]$
3. $\lambda(\tilde{x}) \to \epsilon(\tilde{x}) \vee \bigvee_{y} \lambda_y(\tilde{x})$
4. $\pi_y(\tilde{x}) \to \epsilon(\tilde{x}) \vee \pi(\tilde{x}) \vee \bigvee_{y' \neq y} \lambda_{y'}(\tilde{x})$
5. $\lambda_x(\tilde{u}) \to \bigwedge_{\tilde{x}} \bigwedge_{\bar{U}:\, \tilde{u} \models \bar{U}} \big[\, \lambda(\tilde{x}) \vee \mathit{Arg}_{\Delta_x}(\tilde{x} \vee \bar{U}) \vee \bigvee_{\tilde{u}' \models \bar{U},\ u' \neq u} \pi_x(\tilde{u}') \,\big]$
6. $\epsilon(\tilde{x}) \to$ true if the complement of literal $\tilde{x}$ appears in $\bar{O}$; false otherwise.

The algorithm starts with the argument $\mathit{Arg}(\bar{O})$. It then keeps applying the rewrites given above until it reaches an expression that contains only the connectives $\wedge$ and $\vee$, in addition to true, false and local arguments of the form $\mathit{Arg}_{\Delta_x}(\tilde{x} \vee \bar{U})$. In its intermediate stages, the constructed expression will also include the intermediate terms $\pi$, $\lambda$, $\lambda_y$, $\pi_x$ and $\epsilon$, which will be rewritten into other expressions using the above rewrites. The algorithm is guaranteed to construct an expression that is free of these intermediate terms. Given that the database structure is singly connected, it is easy to verify that the rewrite process will terminate and that
- Rewrite 1 will be applied only once,
- Rewrites 2, 3 and 6 will be applied at most once per node in the database structure, and
- Rewrites 4 and 5 will be applied at most once for each arc in the database structure.

Therefore, the complexity of the algorithm and the size of the constructed Boolean expression are linear in the number of nodes and arcs, and exponential only in the size of the families of the database structure.

0

x

0

x

11

Except for the rst rewrite rule, which applies only once.
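Once the rewrite process terminates, the constructed expression contains only $\wedge$, $\vee$, $true$, $false$ and symbolic local arguments, and it can be simplified in a single linear pass, consistent with the complexity claim above. Here is a minimal sketch, assuming the hypothetical ('and', ...)/('or', ...) term encoding used in the sketches of this section, with local arguments left as opaque strings:

```python
# Minimal one-pass simplifier for the constructed Boolean expression.
# Constants are Python True/False; local arguments stay symbolic.
def simplify(expr):
    if expr in (True, False) or not isinstance(expr, tuple):
        return expr                  # constant or symbolic local argument
    op, args = expr[0], [simplify(a) for a in expr[1:]]
    unit, absorb = (True, False) if op == 'and' else (False, True)
    if absorb in args:
        return absorb                # false absorbs 'and'; true absorbs 'or'
    args = [a for a in args if a != unit]
    if not args:
        return unit
    return args[0] if len(args) == 1 else (op, *args)

# Example: ('and', ('or', 'Arg_x(x v u)', False), True) collapses to the
# single local argument.
print(simplify(('and', ('or', 'Arg_x(x v u)', False), True)))
```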


Figure 9: Decomposing a disjunctive clause in a singly-connected structure.

Deriving the rewrites

We will now go over the derivation of this algorithm. This discussion would typically be part of Appendix A, but we include it here to provide an example of the techniques involved in developing a structure-based algorithm. This is important because in certain applications one may arrive at database structures that imply strong independences and that may therefore be easy to solve using a specialized structure-based algorithm. To develop such an algorithm, one needs to go through an exercise similar to the following.

The algorithm starts by rewriting $Arg_\Delta(O)$ using the case-analysis rule into

$$Arg_\Delta(x \vee O) \wedge Arg_\Delta(\neg x \vee O)$$

for some atom x in the structure (we will assume, without loss of generality, that x does not appear in O). To decompose the argument $Arg_\Delta(x \vee O)$, the algorithm partitions the disjunctive clause O into two parts: a clause $\bar O^-_x$ mentioning only descendants of x and a clause $\bar O^+_x$ mentioning only non-descendants of x; see Figure 9. This validates the decomposition

$$Arg_\Delta(x \vee O) \;\mapsto\; Arg_\Delta(x \vee \bar O^+_x \vee \bar O^-_x) \;\mapsto\; \underbrace{Arg_\Delta(x \vee \bar O^+_x)}_{\pi(x)} \vee \underbrace{Arg_\Delta(x \vee \bar O^-_x)}_{\lambda(x)}$$

because x d-separates any of its descendants from any of its non-descendants.

To decompose the argument $Arg_\Delta(x \vee \bar O^-_x)$, the algorithm partitions the clause $\bar O^-_x$ into a number of clauses $\bar O^-_{xy}$, each about the nodes connected to one child y of x; see Figure 9. This validates the decomposition

$$Arg_\Delta(x \vee \bar O^-_x) \;\mapsto\; Arg_\Delta\Big(x \vee \bigvee_y \bar O^-_{xy}\Big) \;\mapsto\; \bigvee_y \underbrace{Arg_\Delta(x \vee \bar O^-_{xy})}_{\lambda_y(x)}$$

because x d-separates the nodes connected to one of its children from the nodes connected to its other children.

To decompose the argument $Arg_\Delta(x \vee \bar O^+_x)$, the algorithm applies the case-analysis rule:

$$Arg_\Delta(x \vee \bar O^+_x) \;\mapsto\; \bigwedge_{\bar U} Arg_\Delta(x \vee \bar U \vee \bar O^+_x),$$


where $\bar U$ is a conjunctive clause over U, the parents of x. Using Theorem 11 in the Appendix, we can decompose the argument $Arg_\Delta(x \vee \bar U \vee \bar O^+_x)$ into

$$Arg_{\Delta_x}(x \vee \bar U) \vee Arg_\Delta(\bar U \vee \bar O^+_x).$$

Note here that the first argument is local, that is, it can be computed using only the local database for x. To decompose the argument $Arg_\Delta(\bar U \vee \bar O^+_x)$, the algorithm partitions the clause $\bar O^+_x$ into a number of clauses $\bar O^+_{ux}$, each about the nodes connected to x through its parent u; see Figure 9. This allows the application of the decomposition rule:

$$Arg_\Delta(\bar U \vee \bar O^+_x) \;\mapsto\; Arg_\Delta\Big(\bigvee_{\bar U \models u} u \vee \bar O^+_{ux}\Big) \;\mapsto\; \bigvee_{\bar U \models u} \underbrace{Arg_\Delta(u \vee \bar O^+_{ux})}_{\pi_x(u)},$$

which is validated by d-separation: each parent u of x and the nodes in the clause $\bar O^+_{ux}$ are d-separated from every other parent u′ of x and the nodes in the clause $\bar O^+_{u'x}$. The rewrite for $\pi_y(x)$ and that for $\lambda_x(u)$ can be derived similarly. Moreover, given that x may appear in O, the term $\theta$ must be inserted into Rewrites 1-6 as shown.

The algorithm presented in this section can be extended to multiply-connected databases in a straightforward manner, leading to an algorithm called cutset conditioning that is exponential in the size of a structure cutset [20, 19, 2]. There are more sophisticated extensions to multiply-connected structures, but they are outside the scope of this paper [14, 22, 3]. Note that multiply-connected structures are not necessarily harder than singly-connected ones. For example, n-bit adders lead to multiply-connected structures that behave computationally like singly-connected ones.
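To make the straightforward extension concrete, here is a schematic sketch of cutset conditioning under a clausal encoding (literals as signed integers, a clause as a list). Case-analysis on the cutset atoms extends the query clause with one literal per cutset atom, and each of the resulting $2^{|cutset|}$ clauses is handled by a singly-connected procedure, left hypothetical here; the conjunction over the cases is what makes the method exponential in the cutset size.

```python
# Schematic sketch of cutset conditioning; `singly_connected_arg` is a
# hypothetical stand-in for the algorithm of this section, applied to the
# structure rendered singly-connected by the cutset (see footnote 10).
from itertools import product

def arg_by_conditioning(O, cutset, singly_connected_arg):
    """Arg(O) -> AND over all extensions of O by one literal per cutset
    atom, each computed on the induced singly-connected structure."""
    cases = product(*[(c, -c) for c in cutset])   # literal choice per atom
    return ('and', *(singly_connected_arg(list(O) + list(case))
                     for case in cases))
```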

Other reasoning tasks

The algorithm we just presented computes the argument for a given disjunctive clause O. Given the duality between arguments and consequences, the algorithm can also be used to compute the consequence of a conjunctive clause $\hat O$ using

$$Cons_\Delta(\hat O) \equiv \neg Arg_\Delta(\neg \hat O).$$

The algorithm can also be used to decide entailment, since $\Delta \models \alpha$ precisely when $Arg_{(\Delta,\emptyset)}(\alpha) \equiv true$. Therefore, if the structured database (G, Δ) has no exogenous variables (that is, all atoms that appear in Δ also appear in G), then one can compute arguments and declare a clause O as entailed by the database Δ iff the computed argument for O is equivalent to $true$. Given the duality between satisfiability and entailment in propositional logic, one can also use the algorithm for testing satisfiability.
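Stated as code, these dualities amount to two one-liners; arg is the clause-level procedure above and Not is symbolic negation, both hypothetical stand-ins here:

```python
def cons(O_hat, arg, Not):
    """Cons(O^) = ~Arg(~O^), by the argument/consequence duality."""
    return Not(arg(Not(O_hat)))

def entailed(alpha, arg):
    """With no assumables, the computed argument simplifies to the
    constant true exactly when the database entails alpha (assuming
    `arg` returns the simplified constant in that case)."""
    return arg(alpha) is True
```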

8 Conclusion

We have presented a notion of conditional independence with respect to propositional databases that resembles its probabilistic counterpart. We have also presented several formulations of logical independence, together with their applications to deciding logical entailment and computing abductions and diagnoses. We have demonstrated how structuring a database around a directed acyclic graph can lead to explicating the independences it satisfies, and how independence-based algorithms can then take advantage of this structure. Our proposed approach for utilizing independence ties the computational complexity of logical reasoning to the availability of independence information and therefore to database structure. This leads to a computational paradigm in which the complexity of reasoning is parameterized by the topology of a database structure. This structure-based approach is at the heart of causal networks in the probabilistic literature and constraint networks in the constraint-satisfaction literature [9, 12, 8]. In both cases, a graphical structure is the key aspect deciding the difficulty of a reasoning problem, and this structure is what users need to control in order to ensure an appropriate response time for their applications. The probabilistic literature, for example, contains techniques for tweaking this structure to ensure certain response times, most of which can be adopted by our proposed framework.


A Proofs

The proofs do not follow the same order as the theorems. We first establish the equivalence between all LCI formulations, and then prove the semi-graphoid axioms using the abduction formulation in Definition 10.

Proving equivalence between LCI definitions

We first prove the equivalence between dualities: entailment/satisfiability and diagnosis/abduction. We then prove the equivalence between the belief-change and abduction formulations, followed by the equivalence between the belief-change and entailment formulations. We will then have covered Theorems 2, 3, 5 and 7.

Equivalence between entailment and consistency formulations

Suppose that

$$\Delta \models \bar X \vee \bar Y \vee \bar Z \text{ precisely when } \Delta \models \bar X \vee \bar Z \text{ or } \Delta \models \bar Y \vee \bar Z$$

for all disjunctive clauses $\bar X$, $\bar Y$, $\bar Z$. Then $\Delta \cup \{\neg\bar Z, \neg\bar Y, \neg\bar X\}$ is inconsistent precisely when either $\Delta \cup \{\neg\bar Z, \neg\bar X\}$ or $\Delta \cup \{\neg\bar Z, \neg\bar Y\}$ is inconsistent, for all disjunctive clauses $\bar X$, $\bar Y$, $\bar Z$. This means that $\Delta \cup \{\hat Z, \hat Y, \hat X\}$ is satisfiable precisely when both $\Delta \cup \{\hat Z, \hat X\}$ and $\Delta \cup \{\hat Z, \hat Y\}$ are satisfiable, for all conjunctive clauses $\hat X$, $\hat Y$ and $\hat Z$. The other direction follows similarly.
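The step between entailment and inconsistency here is the standard refutation duality. As a hypothetical sanity check (not part of the proof), the following brute-force snippet confirms on a toy database, with literals encoded as signed integers, that $\Delta \models \bar X \vee \bar Z$ fails exactly when $\Delta \cup \{\neg\bar X, \neg\bar Z\}$ is satisfiable:

```python
# Brute-force check of the entailment/consistency duality on a toy case.
from itertools import product

def assignments(n):
    return (dict(zip(range(1, n + 1), bits))
            for bits in product([False, True], repeat=n))

def sat_clause(m, clause):
    return any(m[abs(l)] == (l > 0) for l in clause)

def sat(m, clauses):
    return all(sat_clause(m, c) for c in clauses)

D = [frozenset({-1, 2})]               # database: the single clause ~1 v 2
X, Z = frozenset({1}), frozenset({3})  # disjunctive clauses X and Z
entails = all(sat_clause(m, X | Z) for m in assignments(3) if sat(m, D))
# negating the disjunctive clause X v Z yields unit conjunctive clauses
refutable = any(sat(m, D + [frozenset({-l}) for l in X | Z])
                for m in assignments(3))
print(entails, refutable)              # False True: each negates the other
```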

Equivalence between diagnosis and abduction formulations

We now prove the equivalence between the diagnosis and abduction formulations of LCI, therefore proving the equivalence between Definitions 8 and 10 of independence. By Theorem 6, we have that $Arg^W_\Delta(\alpha) \equiv \neg Cons^W_\Delta(\neg\alpha)$. Therefore,

$$\models Arg_\Delta(\bar X \vee \bar Y \vee \bar Z) \equiv Arg_\Delta(\bar X \vee \bar Z) \vee Arg_\Delta(\bar Y \vee \bar Z)$$

is equivalent to

$$\models \neg Cons_\Delta(\neg\bar X \wedge \neg\bar Y \wedge \neg\bar Z) \equiv \neg Cons_\Delta(\neg\bar X \wedge \neg\bar Z) \vee \neg Cons_\Delta(\neg\bar Y \wedge \neg\bar Z),$$

which, by De Morgan's law, is equivalent to

$$\models \neg Cons_\Delta(\neg\bar X \wedge \neg\bar Y \wedge \neg\bar Z) \equiv \neg\big(Cons_\Delta(\neg\bar X \wedge \neg\bar Z) \wedge Cons_\Delta(\neg\bar Y \wedge \neg\bar Z)\big)$$

and to

$$\models Cons_\Delta(\neg\bar X \wedge \neg\bar Y \wedge \neg\bar Z) \equiv Cons_\Delta(\neg\bar X \wedge \neg\bar Z) \wedge Cons_\Delta(\neg\bar Y \wedge \neg\bar Z).$$

Since the negation of a disjunctive clause is a conjunctive clause, it follows that

$$\models Arg_\Delta(\bar X \vee \bar Y \vee \bar Z) \equiv Arg_\Delta(\bar X \vee \bar Z) \vee Arg_\Delta(\bar Y \vee \bar Z)$$

for all $\bar X$, $\bar Y$ and $\bar Z$ iff

$$\models Cons_\Delta(\hat X \wedge \hat Y \wedge \hat Z) \equiv Cons_\Delta(\hat X \wedge \hat Z) \wedge Cons_\Delta(\hat Y \wedge \hat Z)$$

for all $\hat X$, $\hat Y$ and $\hat Z$.

Equivalence between belief change and abduction formulations

We now prove that $Ind^b_\Delta(X, Z \cup W, Y)$ (Definition 3) is equivalent to $Ind^a_{(\Delta,W)}(X, Z, Y)$ (Definition 10).

The following observations (lemmas) are useful in understanding the proof of this theorem:

• Each sentence $\tilde X$ is equivalent to the conjunction of some disjunctive clauses $\bar X$. Example: $a \wedge b$ is equivalent to $(a \vee b) \wedge (a \vee \neg b) \wedge (\neg a \vee b)$.

• Each sentence $\tilde X$ is equivalent to the disjunction of some conjunctive clauses $\hat X$. Example: $a \supset b$ is equivalent to $(a \wedge b) \vee (\neg a \wedge b) \vee (\neg a \wedge \neg b)$.

• Suppose that there is no $\bar X$ and $\hat Y$ such that
  1. $\Delta \cup \{\hat Y\}$ is consistent,
  2. $\Delta \not\models \bar X$, and
  3. $\Delta \cup \{\hat Y\} \models \bar X$.
  Then there is no $\bar X$ and $\tilde Y$ such that
  1. $\Delta \cup \{\tilde Y\}$ is consistent,
  2. $\Delta \not\models \bar X$, and
  3. $\Delta \cup \{\tilde Y\} \models \bar X$.
  This holds because if $\tilde Y$ is consistent with $\Delta$ and $\Delta \cup \{\tilde Y\} \models \bar X$, then there must exist some $\hat Y$ (a disjunct of $\tilde Y$) such that $\hat Y$ is consistent with $\Delta$ and $\Delta \cup \{\hat Y\} \models \bar X$.

• Suppose that there is no $\tilde Y$ and $\bar X$ such that
  1. $\Delta \cup \{\tilde Y\}$ is consistent,
  2. $\Delta \not\models \bar X$, and
  3. $\Delta \cup \{\tilde Y\} \models \bar X$.
  Then there is no $\tilde Y$ and $\tilde X$ such that
  1. $\Delta \cup \{\tilde Y\}$ is consistent,
  2. $\Delta \not\models \tilde X$, and
  3. $\Delta \cup \{\tilde Y\} \models \tilde X$.
  This holds because if $\Delta \not\models \tilde X$ and $\Delta \cup \{\tilde Y\} \models \tilde X$, then there must exist some $\bar X$ (a conjunct of $\tilde X$) such that $\Delta \not\models \bar X$ and $\Delta \cup \{\tilde Y\} \models \bar X$.

• The following is always true: $Arg_\Delta(\bar Z \vee \bar X) \vee Arg_\Delta(\bar Z \vee \bar Y) \models Arg_\Delta(\bar Z \vee \bar X \vee \bar Y)$, because if $\Delta \cup \{\tilde W\} \models \bar Z \vee \bar X$ or $\Delta \cup \{\tilde W\} \models \bar Z \vee \bar Y$, then also $\Delta \cup \{\tilde W\} \models \bar Z \vee \bar X \vee \bar Y$.
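The first two observations can be checked mechanically; the following truth-table verification of their two examples is a hypothetical sanity check, not part of the proof:

```python
# Truth-table check: a & b as a conjunction of disjunctive clauses, and
# a -> b as a disjunction of conjunctive clauses.
from itertools import product

for a, b in product([False, True], repeat=2):
    cnf = (a or b) and (a or not b) and (not a or b)
    dnf = (a and b) or (not a and b) or (not a and not b)
    assert cnf == (a and b)
    assert dnf == (not a or b)       # the truth table of a -> b
print("both equivalences check out")
```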

The proof of the theorem is given below.

(⟹) Suppose that $Ind^a_{(\Delta,W)}(X, Z, Y)$ does not hold. That is, we have

$$Arg_\Delta(\bar Z \vee \bar X \vee \bar Y) \not\models Arg_\Delta(\bar Z \vee \bar X) \vee Arg_\Delta(\bar Z \vee \bar Y)$$

for some $\bar X$, $\bar Y$ and $\bar Z$. We want to show that $Ind^b_\Delta(X, Z \cup W, Y)$ does not hold either. By supposition, there must exist some $\hat W$ such that

1. $\Delta \cup \{\hat W\} \models \bar Z \vee \bar X \vee \bar Y$,
2. $\Delta \cup \{\hat W\} \not\models \bar Z \vee \bar X$, and
3. $\Delta \cup \{\hat W\} \not\models \bar Z \vee \bar Y$.

This means that

1. $\Delta \cup \{\hat W, \neg\bar Z, \neg\bar Y\} \models \bar X$,
2. $\Delta \cup \{\hat W, \neg\bar Z\} \not\models \bar X$, and
3. $\Delta \cup \{\hat W, \neg\bar Z, \neg\bar Y\}$ is consistent.

Therefore, $Ind^b_\Delta(X, Z \cup W, Y)$ does not hold. We have then proved that $Ind^b_\Delta(X, Z \cup W, Y)$ implies $Ind^a_{(\Delta,W)}(X, Z, Y)$.