Logical and Algorithmic Properties of Conditional Independence and Graphical Models. Author(s): Dan Geiger and Judea Pearl. Source: The Annals of Statistics, Vol. 21, No. 4 (Dec., 1993), pp. 2001-2021. Published by: Institute of Mathematical Statistics. Stable URL: http://www.jstor.org/stable/2242326
The Annals of Statistics 1993, Vol. 21, No. 4, 2001-2021
LOGICAL AND ALGORITHMIC PROPERTIES OF CONDITIONAL INDEPENDENCE AND GRAPHICAL MODELS¹
BY DAN GEIGER AND JUDEA PEARL
Technion-Israel Institute of Technology and University of California, Los Angeles

This article develops an axiomatic basis for the relationship between conditional independence and graphical models in statistical analysis. In particular, the following relationships are established: (1) every axiom for conditional independence is an axiom for graph separation, (2) every graph represents a consistent set of independence and dependence constraints, (3) all binary factorizations of strictly positive probability models can be encoded and determined in polynomial time using their correspondence to graph separation, (4) binary factorizations of non-strictly positive probability models can also be derived in polynomial time, albeit less efficiently, and (5) unconditional independence relative to normal models can be axiomatized with a finite set of axioms.
1. Introduction. A useful approach to multivariate statistical modeling is to first define the conditional independence constraints that are likely to hold in the domain, and then to restrict the analysis to probability functions that satisfy those constraints. An increasingly popular way of specifying independence constraints is through graphical models, such as Markov networks and Bayesian networks, where the constraints are encoded through the topological properties of the corresponding graphs [Lauritzen (1982), Lauritzen and Spiegelhalter (1988), Pearl (1988) and Whittaker (1990)]. The key idea behind these specification schemes is to utilize the correspondence between separation in graphs and conditional independence in probability; each node represents a variable and each missing edge encodes some conditional independence constraint. More specifically, if a set of nodes Z blocks all the paths between two nodes, then the corresponding two variables are asserted to be conditionally independent given the variables corresponding to Z.
The notions of graph separation and conditional independence, which at first glance seem to have little in common, share key properties which render graphs an effective language for specifying independence constraints. This article develops an axiomatic characterization of these properties, thus providing a theoretical basis for understanding the role of graphical models in statistical analysis.

The article is organized as follows. Section 2 provides preliminary definitions. Section 3 proves the existence of perfect probability models, that is, probability models that, given an arbitrary list of conditional independence statements, satisfy every statement on that list, every statement that logically follows from that list and none other. Using this result, Section 4 then shows that every axiom for conditional independence is an axiom for graph separation and that every graph represents a consistent set of independence and dependence constraints. In other words, graphs provide a "safe" language for encoding statistical associations; the set of conditional independencies and dependencies encoded by any graph is guaranteed to be realizable in some probability model.

Section 5 deals with special kinds of conditional independence relationships, those that permit the factorization of a probability model into a product of two functions. It is shown that graphs provide a parsimonious code (requiring polynomial space) for representing the entire set of binary factorizations that are realizable in strictly positive probability models. Graphs also facilitate a polynomial time algorithm for determining whether an arbitrary binary factorization logically follows from a given set of such factorizations.

The rest of the article provides a complete axiomatic characterization for special families of independence relationships. We first develop complete axiomatizations for saturated independence (Section 6) and marginal independence (Section 7) and then address the axiomatization of conditional independence in general (Section 8). Section 9 generalizes several results to qualitative independence, and Section 10 provides a tabulated summary of our results.

Received July 1989; revised November 1992.
¹Supported in part by NSF Grant IRI-8821444 while the first author was at UCLA. The revised version was prepared while the first author was at Northrop Research and Technology Center and completed at Technion.
AMS 1991 subject classifications. Primary 60A05, 60J99, 60G60; secondary 62A15, 62H25.
Key words and phrases. Conditional independence, Markov fields, Markov networks, graphical models.

2. Preliminaries. Throughout this article, let U be a finite set of distinct symbols $\{u_1, \ldots, u_n\}$, called attributes (or variable names). A domain mapping is a mapping that associates a set, $d(u_i)$, with each attribute $u_i$. This set is called the domain of $u_i$ and each of its elements is a value for $u_i$. An attribute combined with a domain is a variable. For example, the variable describing the age of a person will be characterized by the attribute age and may be assigned a domain such as $\{i \mid 0 \le i \le 120\}$ or {infant, child, young adult, other adult}. The distinction between attributes and variables allows us to associate several domains with the same variable name, as done in some of the following.

DEFINITION. A probability model over a finite set of attributes $U = \{u_1, \ldots, u_n\}$ is a pair $(d, P)$, where $d$ is a domain mapping that maps each $u_i$ to a finite domain $d(u_i)$, and $P: d(u_1) \times \cdots \times d(u_n) \to [0, 1]$ is a probability distribution having the Cartesian product of these domains as its sample space. The class of probability models over U is denoted by $\mathcal{P}$.
Unless stated otherwise, U and its domain are assumed to be finite.

DEFINITION. The expression $I(X, Y \mid Z)$, where X, Y and Z are disjoint subsets of U, is called an independence statement. Its negation $\neg I(X, Y \mid Z)$ is called a dependence statement. An independence or dependence statement is defined over $V \subseteq U$ if it mentions only attributes in V.

DEFINITION. Let $(d, P)$ be a probability model over U. An independence statement $I(X, Y \mid Z)$ is said to hold for $(d, P)$ if for every value $\mathbf{x}$, $\mathbf{y}$ and $\mathbf{z}$ of X, Y and Z, respectively,

\[ P(\mathbf{x}, \mathbf{y}, \mathbf{z}) \cdot P(\mathbf{z}) = P(\mathbf{x}, \mathbf{z}) \cdot P(\mathbf{y}, \mathbf{z}). \tag{1} \]

Equivalently, $(d, P)$ is said to satisfy $I(X, Y \mid Z)$. Otherwise, $(d, P)$ is said to satisfy $\neg I(X, Y \mid Z)$.

DEFINITION. When $I(X, Y \mid Z)$ holds for $(d, P)$, then X and Y are conditionally independent relative to $(d, P)$, and if $Z = \emptyset$, then X and Y are marginally independent relative to $(d, P)$.
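As a rough illustration of definition (1), the following Python sketch tests whether $I(X, Y \mid Z)$ holds for a small finite model by checking the product identity for every combination of values. The dictionary representation of P, the helper names `marginal` and `holds`, and the three-attribute model are assumptions made for this example; they are not notation from the article.

```python
from itertools import product

def marginal(P, order, subset, values):
    """Probability that the attributes in `subset` take the given `values`."""
    pos = [order.index(a) for a in subset]
    return sum(p for w, p in P.items()
               if all(w[i] == v for i, v in zip(pos, values)))

def holds(P, order, X, Y, Z, tol=1e-12):
    """Check equation (1): P(x,y,z) * P(z) == P(x,z) * P(y,z) for all values."""
    X, Y, Z = list(X), list(Y), list(Z)
    # Domains are read off the keys of P, which must list every combination.
    dom = {a: sorted({w[order.index(a)] for w in P}) for a in order}
    for x in product(*(dom[a] for a in X)):
        for y in product(*(dom[a] for a in Y)):
            for z in product(*(dom[a] for a in Z)):
                lhs = (marginal(P, order, X + Y + Z, x + y + z)
                       * marginal(P, order, Z, z))
                rhs = (marginal(P, order, X + Z, x + z)
                       * marginal(P, order, Y + Z, y + z))
                if abs(lhs - rhs) > tol:
                    return False
    return True

# Hypothetical model: u1 and u2 are independent fair bits and u3 copies u1.
order = ["u1", "u2", "u3"]
P = {(a, b, c): (0.25 if c == a else 0.0)
     for a, b, c in product((0, 1), repeat=3)}

print(holds(P, order, ["u1"], ["u2"], []))      # marginal independence of u1, u2
print(holds(P, order, ["u1"], ["u2"], ["u3"]))  # independence given u3
print(holds(P, order, ["u1"], ["u3"], []))      # u3 is a copy of u1
```

The check uses the product form of (1) rather than conditional probabilities, so it needs no special handling for value combinations of Z that have probability zero.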
DEFINITION. A probability model over U is strictly positive if every combination of U's values has a probability greater than 0. The class of strictly positive probability models is denoted by $\mathcal{P}^{+}$.

DEFINITION. A probability model over U is binary if it assigns every attribute in U a domain with only two values, say 0 and 1. The class of binary probability models is denoted by $\mathcal{P}^{B}$.
Equations (2) through (6) list some properties of conditional independence. Variants of them were first introduced by Dawid (1979) and further studied by Spohn (1980), Pearl and Paz (1985), Pearl (1988) and Geiger (1990).

Trivial independence:
\[ I(X, \emptyset \mid Y). \tag{2} \]
Symmetry:
\[ I(X, Y \mid Z) \Rightarrow I(Y, X \mid Z). \tag{3} \]
Decomposition:
\[ I(X, Y \cup W \mid Z) \Rightarrow I(X, Y \mid Z). \tag{4} \]
Weak contraction [the axiomatic theory of Pearl and Paz (1985) invoked a stronger version of this axiom which is not needed in the discussion of this article]:
\[ I(X \cup W, Y \mid Z),\ I(X, W \mid Z \cup Y) \Rightarrow I(X, Y \cup W \mid Z). \tag{5} \]
Weak union:
\[ I(X, Y \cup W \mid Z) \Rightarrow I(X, Y \mid Z \cup W). \tag{6} \]
DEFINITION. An independence Horn clause is an implication of the form

\[ I(X_1, Y_1 \mid Z_1),\ I(X_2, Y_2 \mid Z_2),\ \ldots,\ I(X_k, Y_k \mid Z_k) \Rightarrow I(X_{k+1}, Y_{k+1} \mid Z_{k+1}). \]
Each independence statement on the left of the implication is called an antecedent and the one on the right is called the consequence. Independence Horn clauses may also have no antecedents [as in (2)].

DEFINITION. An independence Horn clause is instantiated if each of the $X_i$'s, $Y_i$'s and $Z_i$'s is substituted with a specific subset of U [e.g., $I(\{u_1, u_2\}, \emptyset \mid \{u_3, u_4\})$ is an instance of trivial independence].

We use $\sigma$, possibly subscripted, to denote an independence statement, $\neg\sigma$ to denote the negation of $\sigma$, $\Sigma$ to denote a set of independence statements and $\mathcal{F}$ to denote a subset of $\mathcal{P}$ (i.e., a class of probability models over U such as $\mathcal{P}^{B}$ or $\mathcal{P}^{+}$).

DEFINITION. An independence Horn clause is sound relative to $\mathcal{F}$ iff for every instantiation of the clause, every probability model in $\mathcal{F}$ that satisfies the clause's antecedents also satisfies its consequence.

DEFINITION. When an independence Horn clause is sound relative to $\mathcal{F}$, it is called an axiom relative to $\mathcal{F}$. An axiom relative to $\mathcal{P}$ is simply called an axiom.

For example, (7) is an axiom relative to $\mathcal{P}^{+}$ but not relative to $\mathcal{P}$.

Intersection:
\[ I(X, Y \mid Z \cup W),\ I(X, W \mid Z \cup Y) \Rightarrow I(X, Y \cup W \mid Z). \tag{7} \]
Given a set of axioms $\mathcal{A}$, an independence statement $\sigma$ is derivable from a set of statements $\Sigma$, denoted $\Sigma \vdash \sigma$, if there exists a derivation sequence $\sigma_1, \ldots, \sigma_n$ such that $\sigma_n = \sigma$ and for each $\sigma_j$, either (1) $\sigma_j \in \Sigma$ or (2) $\sigma_j$ is the consequence of some instantiated axiom in $\mathcal{A}$ for which every antecedent is in $\{\sigma_1, \ldots, \sigma_{j-1}\}$.

DEFINITION. The closure of $\Sigma$ is the set of derivable statements, $\{\sigma \mid \Sigma \vdash \sigma\}$, and is denoted by $\Sigma^{+}$.

For example, $I(u_1, u_3 \mid \emptyset)$ is derivable from the set $\{I(\{u_1, u_3\}, u_2 \mid \emptyset),\ I(u_1, u_3 \mid u_2)\}$ using axioms (2) through (6) via the derivation sequence $I(\{u_1, u_3\}, u_2 \mid \emptyset)$, $I(u_1, u_3 \mid u_2)$, $I(u_1, \{u_2, u_3\} \mid \emptyset)$, $I(u_1, u_3 \mid \emptyset)$. The third and fourth statements in this sequence are derived from the previous ones by weak contraction and decomposition, respectively. [For simplicity, throughout, $I(u_i, u_j \mid u_k)$ stands for $I(\{u_i\}, \{u_j\} \mid \{u_k\})$.]

DEFINITION. An independence statement $\sigma$ is entailed by a set of statements $\Sigma$ relative to a set of probability models $\mathcal{F}$, denoted $\Sigma \models \sigma$, if every probability model in $\mathcal{F}$ that satisfies $\Sigma$ satisfies $\sigma$ as well. The set of entailed statements, $\{\sigma \mid \Sigma \models \sigma\}$, is denoted by $\Sigma^{*}$, keeping $\mathcal{F}$ implicit.
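The derivation example above can be reproduced mechanically. The sketch below is a naive forward-chaining closure under axioms (2) through (6) for a small attribute set; the tuple-of-frozensets encoding and the names `I`, `subsets` and `closure` are choices made for this illustration only, and the brute-force enumeration is exponential in the number of attributes, so it is meant for toy inputs.

```python
from itertools import combinations

def I(X, Y, Z):
    """Encode the statement I(X, Y | Z) as a hashable triple of frozensets."""
    return (frozenset(X), frozenset(Y), frozenset(Z))

def subsets(s):
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def closure(sigma, universe):
    """Close `sigma` under axioms (2)-(6) by repeated application."""
    U = frozenset(universe)
    stmts = set(sigma)
    # (2) trivial independence: I(X, {} | Y) for all disjoint X, Y.
    for X in subsets(U):
        for Y in subsets(U - X):
            stmts.add((X, frozenset(), Y))
    while True:
        new = set()
        for (X, Y, Z) in stmts:
            new.add((Y, X, Z))                    # (3) symmetry
            for Y1 in subsets(Y):
                new.add((X, Y1, Z))               # (4) decomposition
                new.add((X, Y1, Z | (Y - Y1)))    # (6) weak union
        # (5) weak contraction: I(X u W, Y | Z), I(X, W | Z u Y) => I(X, Y u W | Z)
        for (A, Y, Z) in stmts:
            for (X, W, ZY) in stmts:
                if A == X | W and ZY == Z | Y:
                    new.add((X, Y | W, Z))
        if new.issubset(stmts):
            return stmts
        stmts |= new

U = {1, 2, 3}
sigma = {I({1, 3}, {2}, ()), I({1}, {3}, {2})}
derived = closure(sigma, U)
print(I({1}, {2, 3}, ()) in derived)  # obtained by weak contraction
print(I({1}, {3}, ()) in derived)     # then by decomposition
```

Because every rule applied is one of the sound axioms, anything the function returns is also entailed, which is the content of Proposition 1 in this toy setting.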
PROPOSITION 1. Let $\mathcal{A}$ be a set of axioms relative to $\mathcal{F}$. For every set $\Sigma$ of independence statements, we have $\Sigma^{+} \subseteq \Sigma^{*}$, where $\Sigma^{+}$ is derived from $\Sigma$ using the axioms in $\mathcal{A}$, and $\Sigma^{*}$ is entailed relative to $\mathcal{F}$.

PROOF. The proof follows by induction on the length of a derivation sequence of each $\sigma$ in $\Sigma^{+}$, using the fact that the axioms in $\mathcal{A}$ are sound relative to $\mathcal{F}$. □
Equality of $\Sigma^{+}$ and $\Sigma^{*}$ holds only if no axioms are "missing."

DEFINITION. A set of axioms $\mathcal{A}$ is complete (relative to $\mathcal{F}$) if for every set $\Sigma$ of independence statements, $\Sigma^{*} = \Sigma^{+}$.

PROPOSITION 2. A set of axioms $\mathcal{A}$ is complete (relative to $\mathcal{F}$) if and only if for every set of statements $\Sigma$ and every statement $\sigma \notin \Sigma^{+}$ there exists a probability model $(d_\sigma, P_\sigma)$ in $\mathcal{F}$ that satisfies $\Sigma$ and does not satisfy $\sigma$.

PROOF. The proof follows immediately from the definition of completeness and Proposition 1. □

Next, we seek conditions under which, for every set $\Sigma$ of independence statements, there exists a probability model in a given class $\mathcal{F}$ that satisfies precisely the statements in $\Sigma^{*}$ and none other. Fagin (1982) spelled out such conditions and showed, in the context of database theory, that they imply the existence of an operator $\otimes$ that maps a set of probability models to a probability model, such that an independence statement holds in the latter if and only if it holds in every constituent of the former. The next section constructs such an operator.

3. Perfect probability models. The main result of this section is that, for any given set $\Sigma$ of independence statements, there exists a probability model in $\mathcal{P}$ that satisfies precisely $\Sigma^{*}$ and no other statements. (Fagin called models with this property "Armstrong models.") An immediate application of it, as we shall see, lies in determining whether a given set of independence and dependence statements is consistent.

DEFINITION. Let $\Sigma$ be a set of independence statements. A probability model is perfect for $\Sigma$ (relative to $\mathcal{F}$) if it satisfies precisely the set of statements $\Sigma^{*}$ entailed by $\Sigma$ (relative to $\mathcal{F}$) and none other.
The key idea in showing the existence of perfect probability models rests with the notion of direct product defined below, which extends Fagin's definition (1982) from database relations to probability models.

DEFINITION. The (binary) direct product for $\mathcal{F}$ is a mapping $\otimes: \mathcal{F} \times \mathcal{F} \to \mathcal{F}$, where $\mathcal{F}$ is a class of probability models over a finite set of attributes $\{u_1, \ldots, u_n\}$, and $(d, P) = (d_1, P_1) \otimes (d_2, P_2)$ is defined as follows: Let $d_1(u_i)$ and $d_2(u_i)$ be the domains associated with $u_i$ in $(d_1, P_1)$ and in $(d_2, P_2)$, respectively. Let $a_i$ and $b_i$ be values drawn respectively from these domains. Set the domain of $u_i$ in $(d, P)$ to be the Cartesian product $d_1(u_i) \times d_2(u_i)$, and let

\[ P(a_1 b_1, a_2 b_2, \ldots, a_n b_n) = P_1(a_1, a_2, \ldots, a_n) \cdot P_2(b_1, b_2, \ldots, b_n), \tag{8} \]

where $a_i b_i$ denotes a value of $u_i$ in $(d, P)$.

A notable property of $\otimes$ is the assignment of a new domain, $d_1(u_i) \times d_2(u_i)$, to each $u_i$. Thus $u_i$ is treated as an attribute rather than a variable with a fixed domain. We will show at the end of this section that if the domain of each attribute is fixed, then the existence of perfect models is not guaranteed.

The next lemma shows that the product form of (8) remains valid after marginalization.

LEMMA 3. Let $(d_1, P_1)$, $(d_2, P_2)$ and $(d, P)$ be probability models over U as in (8). Then, for every subset $\{u_{i_1}, \ldots, u_{i_l}\}$ of U,

\[ P(a_{i_1} b_{i_1}, a_{i_2} b_{i_2}, \ldots, a_{i_l} b_{i_l}) = P_1(a_{i_1}, a_{i_2}, \ldots, a_{i_l}) \cdot P_2(b_{i_1}, b_{i_2}, \ldots, b_{i_l}). \tag{9} \]
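PROOF. Assume without loss of generality that in (9), $i_1 = 1, i_2 = 2, \ldots, i_l = l$ (otherwise reorder $u_1, \ldots, u_n$ to meet this assumption). When $l = n$ this equation is identical to (8). We proceed by descending induction. Assume (9) holds for $l = k \le n$; then

\[
\begin{aligned}
P(a_1 b_1, \ldots, a_{k-1} b_{k-1})
&= \sum_{x_k} P(a_1 b_1, \ldots, a_{k-1} b_{k-1}, x_k) \\
&= \sum_{a_k \in d_1(u_k)} \sum_{b_k \in d_2(u_k)} P_1(a_1, \ldots, a_{k-1}, a_k) \cdot P_2(b_1, \ldots, b_{k-1}, b_k) \\
&= \Bigl( \sum_{a_k \in d_1(u_k)} P_1(a_1, \ldots, a_{k-1}, a_k) \Bigr) \Bigl( \sum_{b_k \in d_2(u_k)} P_2(b_1, \ldots, b_{k-1}, b_k) \Bigr) \\
&= P_1(a_1, \ldots, a_{k-1}) \cdot P_2(b_1, \ldots, b_{k-1}). \qquad \square
\end{aligned}
\]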
The key property of $\otimes$ is given in the following lemma.

LEMMA 4. Let $(d_1, P_1)$, $(d_2, P_2)$ and $(d, P)$ be probability models over U as in (8). Then, for any three disjoint subsets X, Y and Z of U,

\[ I(X, Y \mid Z) \text{ holds for } (d, P) \text{ iff } I(X, Y \mid Z) \text{ holds for } (d_1, P_1) \text{ and for } (d_2, P_2). \tag{10} \]
PROOF. Let $a_X, a_Y, a_Z$ be respective values of X, Y, Z in $(d_1, P_1)$ and $b_X, b_Y, b_Z$ be respective values of X, Y, Z in $(d_2, P_2)$.

The if part of (10) follows from

\[
\begin{aligned}
P(a_X b_X, a_Y b_Y, a_Z b_Z) \cdot P(a_Z b_Z)
&= P_1(a_X, a_Y, a_Z)\, P_1(a_Z) \cdot P_2(b_X, b_Y, b_Z)\, P_2(b_Z) \\
&= P_1(a_X, a_Z)\, P_1(a_Y, a_Z) \cdot P_2(b_X, b_Z)\, P_2(b_Y, b_Z) \\
&= P(a_X b_X, a_Z b_Z) \cdot P(a_Y b_Y, a_Z b_Z).
\end{aligned}
\]

(Note the implicit use of Lemma 3.) The only if part of (10) follows from

\[
\begin{aligned}
P_1(a_X, a_Y, a_Z)\, P_1(a_Z) \cdot P_2(b_X, b_Y, b_Z)\, P_2(b_Z)
&= P(a_X b_X, a_Y b_Y, a_Z b_Z) \cdot P(a_Z b_Z) \\
&= P(a_X b_X, a_Z b_Z) \cdot P(a_Y b_Y, a_Z b_Z) \\
&= P_1(a_X, a_Z)\, P_1(a_Y, a_Z) \cdot P_2(b_X, b_Z)\, P_2(b_Y, b_Z).
\end{aligned}
\]

By summing once over $a_X$ and once over $b_X$, it is evident that $I(X, Y \mid Z)$ holds for $(d_1, P_1)$ and for $(d_2, P_2)$. □
Next, we extend the direct product to be a mapping from families of probability models (rather than pairs) into probability models.

THEOREM 5. There exists an operator $\otimes$ that maps any nonempty finite family $\{(d_i, P_i) \mid i = 1, \ldots, n\}$ of probability models over a set of attributes U into a probability model over U, such that if $\sigma$ is an independence statement, then $\sigma$ holds for $\otimes\{(d_i, P_i) \mid i = 1, \ldots, n\}$ if and only if $\sigma$ holds for each $(d_i, P_i)$.

PROOF. Since the binary direct product is commutative and associative, it can be extended to sets as follows:

\[ \otimes\{(d_i, P_i) \mid i = 1, \ldots, n\} = \bigl(\cdots\bigl((d_1, P_1) \otimes (d_2, P_2)\bigr) \otimes (d_3, P_3) \cdots\bigr) \otimes (d_n, P_n). \]

Due to Lemma 4, $\sigma$ holds for $\otimes\{(d_i, P_i) \mid i = 1, \ldots, n\}$ iff $\sigma$ holds for every $(d_i, P_i)$, as stated by the theorem. □

Consequently, the existence of perfect probability models can be established [similar to Fagin (1982)].

COROLLARY 6. For every set of independence statements $\Sigma$ over the attributes of U, there exists a probability model $(d, P)$ in $\mathcal{P}$ such that $(d, P)$ satisfies every statement in $\Sigma^{*}$ and none other, that is, $(d, P)$ is a perfect model relative to $\mathcal{P}$.
PROOF. Let $(d, P)$ be $\otimes\{(d_\sigma, P_\sigma) \mid \sigma \notin \Sigma^{*}\}$, where $(d_\sigma, P_\sigma)$ is a probability model that satisfies $\Sigma^{*}$ but does not satisfy $\sigma$. By the definition of $\Sigma^{*}$, a probability model $(d_\sigma, P_\sigma)$ always exists except for the degenerate case where $\Sigma^{*}$ renders all variables mutually independent, in which case Corollary 6 holds trivially. (Also note that the set $\{\sigma \mid \sigma \notin \Sigma^{*}\}$ is finite because U is finite.) Due to Theorem 5, $(d, P)$ satisfies the statements in $\Sigma^{*}$ and none other because these are the only statements that hold for every $(d_\sigma, P_\sigma)$. □

The probability model $\otimes\{(d_i, P_i) \mid i = 1, \ldots, n\}$ is strictly positive whenever each $(d_i, P_i)$ is strictly positive. Consequently, we obtain the following result.

COROLLARY 7. For every set of independence statements $\Sigma$, there exists a strictly positive probability model $(d, P)$ such that $(d, P)$ satisfies every statement in $\Sigma^{*}$ (relative to $\mathcal{P}^{+}$) and none other, that is, $(d, P)$ is a perfect model relative to $\mathcal{P}^{+}$.
The existence of a perfect model implies that any algorithm that determines whether a given statement is entailed by $\Sigma$ can also determine whether a disjunction of statements is entailed by $\Sigma$. For example, to show that

\[ \{I(u_1, u_2 \mid \emptyset),\ I(u_1, u_2 \mid u_3)\} \not\models I(u_1, u_3 \mid \emptyset) \vee I(u_2, u_3 \mid \emptyset), \tag{11} \]

we will see that one must merely check that each disjunct is not entailed by itself. To refute the first disjunct, construct a probability model $(d_1, P_1)$ in which $u_1$ and $u_2$ are two independent binary variables and $u_1$ equals $u_3$. This probability model satisfies the antecedents but does not satisfy the first disjunct. To refute the second disjunct, construct a probability model $(d_2, P_2)$ in which $u_1$ and $u_2$ are two independent binary variables and $u_2$ equals $u_3$; it satisfies the antecedents but not the second disjunct. The probability model $(d_1, P_1) \otimes (d_2, P_2)$ satisfies the antecedents but does not satisfy the disjunction. Hence, the disjunction is not entailed by the antecedents.

Notably, if we fix the domain of $u_3$ to be binary, the antecedents of (11) do entail the disjunctive consequence [Pearl (1988), pp. 129 and 137]; the construction of $(d_1, P_1) \otimes (d_2, P_2)$ fails because $\otimes$ assigns a domain of size 4 to $u_3$. Consequently, we obtain the following result.

COROLLARY 8. There exists a set of independence statements $\Sigma$ for which no binary probability model is perfect.

PROOF. Let $\Sigma = \{I(u_1, u_2 \mid \emptyset),\ I(u_1, u_2 \mid u_3)\}$. Every binary probability model that satisfies $\Sigma$ satisfies either $I(u_1, u_3 \mid \emptyset)$ or $I(u_2, u_3 \mid \emptyset)$. However, neither statement in itself is entailed by $\Sigma$ (relative to $\mathcal{P}^{B}$) and therefore none is in $\Sigma^{*}$. □
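The refutation of (11) can be checked numerically. The following Python sketch builds the two binary models described above, forms their direct product according to (8), and tests each statement with the product identity (1). The dictionary encoding and the helper names `marginal`, `holds` and `direct_product` are assumptions of this illustration; the two models take $u_1$ and $u_2$ to be fair bits, which is one convenient choice among many.

```python
from itertools import product

def marginal(P, order, subset, values):
    """Probability that the attributes in `subset` take the given `values`."""
    pos = [order.index(a) for a in subset]
    return sum(p for w, p in P.items()
               if all(w[i] == v for i, v in zip(pos, values)))

def holds(P, order, X, Y, Z, tol=1e-12):
    """Check equation (1) for every combination of values of X, Y and Z."""
    X, Y, Z = list(X), list(Y), list(Z)
    dom = {a: sorted({w[order.index(a)] for w in P}) for a in order}
    for x in product(*(dom[a] for a in X)):
        for y in product(*(dom[a] for a in Y)):
            for z in product(*(dom[a] for a in Z)):
                lhs = (marginal(P, order, X + Y + Z, x + y + z)
                       * marginal(P, order, Z, z))
                rhs = (marginal(P, order, X + Z, x + z)
                       * marginal(P, order, Y + Z, y + z))
                if abs(lhs - rhs) > tol:
                    return False
    return True

def direct_product(P1, P2):
    """Equation (8): each attribute's value becomes a pair (a_i, b_i)."""
    return {tuple(zip(w1, w2)): p1 * p2
            for w1, p1 in P1.items() for w2, p2 in P2.items()}

order = ["u1", "u2", "u3"]
bits = list(product((0, 1), repeat=3))
P1 = {w: (0.25 if w[2] == w[0] else 0.0) for w in bits}  # u3 equals u1
P2 = {w: (0.25 if w[2] == w[1] else 0.0) for w in bits}  # u3 equals u2
P = direct_product(P1, P2)                               # u3 now has 4 values

for X, Y, Z in [("u1", "u2", None), ("u1", "u2", "u3"),
                ("u1", "u3", None), ("u2", "u3", None)]:
    cond = [Z] if Z else []
    print(X, Y, cond, holds(P, order, [X], [Y], cond))
```

In line with Lemma 4, the product satisfies a statement exactly when both factors do, so both antecedents of (11) survive while each disjunct is violated by one of the factors.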
Another application of Theorem 5 is facilitating tests for consistency.

DEFINITION. A set of independence statements $\Sigma_p$ and a set of negated independence statements (i.e., dependence statements) $\Sigma_n$ is consistent if there exists a probability model that satisfies $\Sigma_p \cup \Sigma_n$. The task of deciding whether a set of independence and dependence statements is consistent is called the consistency problem. The task of determining whether a set of independence statements entails an independence statement is called the implication problem.

The following algorithm determines whether or not $\Sigma_p \cup \Sigma_n$ is consistent: For every member $\neg\sigma$ of $\Sigma_n$, determine whether $\Sigma_p \models \sigma$. If the answer is negative for all members of $\Sigma_n$, then $\Sigma_p \cup \Sigma_n$ is consistent; otherwise it is not consistent [Geiger, Paz and Pearl (1991)].

This algorithm works when the following two conditions are met: (1) we can efficiently check whether or not $\Sigma_p \models \sigma$ and (2) entailment is taken with respect to a class of probability models that has perfect models (i.e., $\mathcal{P}^{+}$ but not $\mathcal{P}^{B}$). In Section 5 we examine a class of independence statements, called saturated, for which these conditions are met.

The correctness of the algorithm stems from the fact that if the negation of each member $\neg\sigma$ of $\Sigma_n$ is not entailed by $\Sigma_p$, that is, each member of $\Sigma_n$ is individually consistent with $\Sigma_p$, then there exists a probability model $(d_\sigma, P_\sigma)$ that satisfies $\Sigma_p$ and does not satisfy $\sigma$. The probability model $(d, P) = \otimes\{(d_\sigma, P_\sigma) \mid \neg\sigma \in \Sigma_n\}$ satisfies every statement in $\Sigma_p \cup \Sigma_n$ and therefore the algorithm's decision that the two sets are consistent is correct. In the other direction, namely, when the algorithm detects an inconsistent member of $\Sigma_n$, then the decision is obviously correct.

4. Graphs and independence. The use of graphs for representing probability distributions is well documented in the statistical literature [Whittaker (1990) and references therein]. The basis of these representation schemes is the similarity between separation in graphs and conditional independence in probability. We will show that these two concepts are related in a stronger
sense than was previously known; we will show that every axiom for conditional independence must also be an axiom for graph separation, and that the set of separation-connection conditions embodied in any graph always corresponds to a consistent set of independence-dependence statements in probability.

DEFINITION. An undirected graph is a pair $(U, E)$, where U is a finite set of attributes, called nodes, and E is a set of unordered pairs of distinct nodes, called edges. When $(u_1, u_2)$ is an edge, $u_1$ and $u_2$ are directly connected. A path between two nodes is a sequence of nodes for which every pair of adjacent nodes is directly connected and no node appears twice.

DEFINITION. Let X, Y and Z be disjoint subsets of nodes in a graph $G = (U, E)$. A separation statement $J(X, Y \mid Z)$ is said to hold for G if every path between a node in X and a node in Y includes a node in Z. Equivalently, we say that G satisfies $J(X, Y \mid Z)$ or X and Y are separated by Z in G.
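The separation test in this definition amounts to a reachability search that is not allowed to enter Z. A minimal Python sketch follows; the function name `separated`, the edge-list representation and the five-node chain are assumptions of this example rather than anything prescribed by the article.

```python
from collections import deque

def separated(edges, X, Y, Z):
    """Return True if J(X, Y | Z) holds: every path from X to Y meets Z."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    blocked = set(Z)
    frontier = deque(x for x in X if x not in blocked)
    seen = set(frontier)
    while frontier:
        u = frontier.popleft()
        if u in Y:
            return False          # reached Y without passing through Z
        for v in adj.get(u, ()):
            if v not in blocked and v not in seen:
                seen.add(v)
                frontier.append(v)
    return True

# Hypothetical five-node chain 1 - 2 - 3 - 4 - 5.
chain = [(1, 2), (2, 3), (3, 4), (4, 5)]
print(separated(chain, {1}, {3}, {2}))
print(separated(chain, {1}, {3}, set()))
# Weak union (6) read as a separation rule: J(1, {4,5} | 2) gives J(1, 4 | {2,5}).
print(separated(chain, {1}, {4, 5}, {2}), separated(chain, {1}, {4}, {2, 5}))
```

Breadth-first search visits each node and edge at most once, so a single query is linear in the size of the graph.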
Connection (negated separation) statements, separation Horn clauses and separation Horn axioms for a set of graphs are defined analogously to the corresponding concepts of independence defined in Section 2.

It is easy to see that axioms (2) through (7) remain sound when I is replaced with J; that is, whenever the antecedent of one of these axioms holds in some graph, its consequence holds as well. For example, if X and $Y \cup W$ are separated by Z in some graph G, then X and Y are also separated by $Z \cup W$ as dictated by the weak-union axiom (6). This correspondence between independence and graph separation is not a coincidence; we show next that every axiom of conditional independence is an axiom for separation. The converse does not hold [Pearl (1988)]. A preliminary definition and a lemma are needed.

DEFINITION. Let $(d, P)$ be a probability model over a finite set of attributes U, and let G be a graph whose nodes are the elements of U (i.e., each node is associated with an attribute). Then G is said to be a Markov network of $(d, P)$ if for every three disjoint subsets X, Y and Z of U,

\[ J(X, Y \mid Z) \text{ holds for } G \text{ implies that } I(X, Y \mid Z) \text{ holds for } (d, P). \]

For example, a language in which the probability of the $i$th letter is determined solely by the $(i-1)$th letter via $P(l_i \mid l_{i-1})$ can be represented by the Markov network of Figure 1. This graph shows, for example, that $l_1$ and $l_3$ are conditionally independent given $l_2$, since $l_2$ separates $l_1$ and $l_3$. Note that this independence statement holds regardless of the domain associated with each $l_i$ (i.e., the alphabet of the language need not be specified). Markov networks are discussed in Darroch, Lauritzen and Speed (1980) and Lauritzen (1982).

A variant of the next lemma was independently derived by Frydenberg (1990).
FIG. 1. A five-node chain.

LEMMA 9. Let G be an undirected graph with U as its set of nodes. Let X, Y and Z be disjoint subsets of U such that X and Y are not separated by Z. Then there exists a strictly positive probability model $(d, P)$ over a set of attributes U, such that G is a Markov network of $(d, P)$ and $I(X, Y \mid Z)$ does not hold for $(d, P)$.

PROOF. Since X and Y are not separated by Z, there exists a path $r_1, r_2, \ldots, r_l$ which contains no nodes of Z and which connects a node $r_1$ in X to a node $r_l$ in Y. Let every node $r_i$ be associated with a binary variable $v_i$ and every node not on the path be associated with a binary variable $s_i$. A probability model $(d, P)$ where

\[ P(v_1, \ldots, v_l, s_1, \ldots) = (1/2) \prod_{i=1}^{l-1} f(v_i, v_{i+1}) \prod_i g(s_i), \]

$g(s_i) = 1/2$, and

\[
f(v_i, v_{i+1}) =
\begin{cases}
1/2, & \text{if } v_i = 0, \\
1/4, & \text{if } v_i = 1,\ v_{i+1} = 0, \\
3/4, & \text{if } v_i = 1,\ v_{i+1} = 1,
\end{cases}
\]

satisfies the requirements; $I(v_1, v_l \mid Z)$ does not hold and if $J(X', Y' \mid Z')$ holds, $I(X', Y' \mid Z')$ holds as well. □

THEOREM 10. Every independence Horn clause $\sigma_1, \sigma_2, \ldots, \sigma_n \Rightarrow \sigma$ that is an axiom for independence relative to $\mathcal{P}^{+}$ is also an axiom for separation, where each $\sigma_i$ is interpreted as a separation statement.
PROOF. Suppose by contradiction that there exists a graph that satisfies $\Sigma = \{\sigma_1, \ldots, \sigma_n\}$ and does not satisfy $\sigma$. Then by Lemma 9 there exists a strictly positive probability model that satisfies $\Sigma$ and does not satisfy $\sigma$. Thus $\sigma_1, \sigma_2, \ldots, \sigma_n \Rightarrow \sigma$ is not sound relative to $\mathcal{P}^{+}$. □
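The proof above leans on the path model of Lemma 9, which can be examined on a short chain. The Python sketch below builds a three-node chain $v_1 - v_2 - v_3$ with a pairwise factor `f` and a leading factor 1/2, then checks that the model is strictly positive, that $v_1$ and $v_3$ are dependent when nothing is conditioned on, and that they become independent given the separating node $v_2$. The particular case table used for `f` (1/2 when the first argument is 0, otherwise 1/4 or 3/4) is an assumed reading of the lemma's construction, and all helper names are hypothetical.

```python
from itertools import product
from math import prod

def f(a, b):
    # Assumed case table: behaves like a transition probability P(b | a).
    if a == 0:
        return 0.5
    return 0.75 if b == 1 else 0.25

def chain_model(l):
    """P(v_1..v_l) = 1/2 * prod_i f(v_i, v_{i+1}) over binary variables."""
    return {v: 0.5 * prod(f(v[i], v[i + 1]) for i in range(l - 1))
            for v in product((0, 1), repeat=l)}

def marginal(P, subset, values):
    """Probability that the positions in `subset` take the given `values`."""
    return sum(p for w, p in P.items()
               if all(w[i] == v for i, v in zip(subset, values)))

def holds(P, X, Y, Z, tol=1e-12):
    """Equation (1) for position lists X, Y, Z of binary variables."""
    for x in product((0, 1), repeat=len(X)):
        for y in product((0, 1), repeat=len(Y)):
            for z in product((0, 1), repeat=len(Z)):
                lhs = marginal(P, X + Y + Z, x + y + z) * marginal(P, Z, z)
                rhs = marginal(P, X + Z, x + z) * marginal(P, Y + Z, y + z)
                if abs(lhs - rhs) > tol:
                    return False
    return True

P = chain_model(3)
print(min(P.values()) > 0)        # strictly positive
print(holds(P, [0], [2], []))     # v1, v3 connected: dependence expected
print(holds(P, [0], [2], [1]))    # v2 separates v1 from v3
```

Nodes off the path would each contribute an independent factor $g(s_i) = 1/2$; they are omitted here because they do not affect the three checks.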
Consequently, in particular, axioms (2) through (7) as well as those discussed by Studený (1992) are axioms for separation. A complete list of axioms for separation was found by Pearl and Paz (1985).

Each graph can be thought of as a specification language for independence and dependence statements; whenever a separation condition holds in the graph, the corresponding independence statement is asserted, and whenever a connection condition holds in the graph, the corresponding dependence statement is asserted. We will show next that, in any graph, the two sets of statements are always consistent. This result justifies the use of undirected graphs as a general language for encoding intricate patterns of statistical associations. Similar results hold for directed acyclic graphs [Geiger and Pearl (1988)].

THEOREM 11. For every graph G with U as its nodes, there exists a strictly positive probability model $(d, P)$ over U, such that for every three disjoint sets X, Y and Z of U,

\[ J(X, Y \mid Z) \text{ holds for } G \text{ if and only if } I(X, Y \mid Z) \text{ holds for } (d, P). \]

PROOF. Let $\Sigma$ be the set of separation statements that hold in G. For every statement $\sigma \notin \Sigma$, there exists a probability model $(d_\sigma, P_\sigma)$ that satisfies $\Sigma$ and does not satisfy $\sigma$, where $\Sigma$ and $\sigma$ are interpreted as independence statements (Lemma 9). Let $(d, P)$ be $\otimes\{(d_\sigma, P_\sigma) \mid \sigma \notin \Sigma\}$. (The set $\{\sigma \mid \sigma \notin \Sigma\}$ is finite because U is finite.) Due to Theorem 5, $(d, P)$ satisfies precisely the statements in $\Sigma$ and none other. □

Note, however, that $\otimes$ assigns to each attribute in U an arbitrary domain size. We conjecture that this arbitrariness is not needed.

CONJECTURE 1. For every graph G with $u_1, \ldots, u_n$ as its nodes and for every n integers $k_1, \ldots, k_n$ all greater than 2, there exists a strictly positive probability model $(d, P)$ over U, such that (1) $|d(u_i)| = k_i$ and (2) for every three disjoint sets X, Y and Z of U,

\[ J(X, Y \mid Z) \text{ holds for } G \text{ if and only if } I(X, Y \mid Z) \text{ holds for } (d, P). \]
5. Graphs and binary factorizations. The relationship between graph separation and conditional independence is even stronger than that shown so far if we restrict ourselves to strictly positive probability models and to saturated statements.

DEFINITION. An independence statement $I(X, Y \mid Z)$ or a separation statement $J(X, Y \mid Z)$ is saturated if $X \cup Y \cup Z = U$, where U is the finite set of attributes of interest.

In the following discussion we show that saturated independence statements (relative to $\mathcal{P}^{+}$) and saturated separation statements satisfy precisely the same axioms. This correspondence provides us with an efficient algorithm to determine all saturated independence statements entailed (relative to $\mathcal{P}^{+}$) by a given set of such statements.

Moreover, each saturated statement $I(X, Y \mid Z)$ holds for a probability model $(d, P)$ if and only if $(d, P)$ has a binary factorization, namely,

\[ P(X, Y, Z) = f(X, Z) \cdot g(Y, Z), \]

where g and f are any functions [Lauritzen (1982)]. Consequently, the proposed algorithm provides an efficient way to determine all binary factorizations entailed (relative to $\mathcal{P}^{+}$) by a given set of binary factorizations. [The terms saturated independence and binary factorizations are borrowed, respectively, from Lee and Buehler (1986) and Malvestuto (1992).]

We use the following theorem of Pearl and Paz (1985) which generalizes a result by Lauritzen (1982).

THEOREM 12. Let $\Sigma$ be a set of independence statements over a finite set of attributes U, and let $\Sigma^{+}$ be the closure of $\Sigma$ with respect to trivial independence, symmetry, decomposition, intersection and weak union. Let $G_0$ be the graph having U as its nodes and an edge between x and y if and only if $I(\{x\}, \{y\} \mid U \setminus \{x, y\}) \notin \Sigma^{+}$. Then (1) for every three disjoint subsets X, Y and Z of U,

\[ J(X, Y \mid Z) \text{ holds for } G_0 \text{ implies that } I(X, Y \mid Z) \in \Sigma^{+}, \]

and (2) if any edge is removed from $G_0$, property 1 ceases to hold.
Next, we strengthen Theorem 12 when $\Sigma$ consists of saturated independence statements.

THEOREM 13. Let $\Sigma$ be a set of saturated independence statements over a finite set of attributes U, and let $\Sigma^{+}$ be the closure of $\Sigma$ with respect to saturated trivial independence [i.e., all statements of the form $I(X, \emptyset \mid Z)$ where $X \cup Z = U$], symmetry, intersection and weak union. Let $G_0$ be the graph defined in Theorem 12. Then for every three disjoint subsets X, Y and Z of U, such that $X \cup Y \cup Z = U$,

\[ J(X, Y \mid Z) \text{ holds for } G_0 \text{ iff } I(X, Y \mid Z) \in \Sigma^{+}. \]

PROOF. The key point to notice is that $I(X, Y \mid Z) \in \Sigma^{+}$ if and only if $I(\{x\}, \{y\} \mid Z \cup (X \setminus \{x\}) \cup (Y \setminus \{y\}))$ is in $\Sigma^{+}$ for every $x \in X$ and $y \in Y$. Each of these independence statements is derivable from $I(X, Y \mid Z)$ by an application of weak union followed by symmetry, weak union and finally symmetry again. The statement $I(X, Y \mid Z)$ is derivable from them by repeated applications of intersection and symmetry. The same equivalence holds when I is replaced by J because separation satisfies the three axioms we have used in the preceding argument. Consequently, $J(X, Y \mid Z)$ holds for $G_0$ iff $J(\{x\}, \{y\} \mid Z \cup (X \setminus \{x\}) \cup (Y \setminus \{y\}))$ holds for $G_0$ for every $x \in X$ and $y \in Y$. By the definition of $G_0$, the latter set of statements holds if and only if $I(\{x\}, \{y\} \mid Z \cup (X \setminus \{x\}) \cup (Y \setminus \{y\}))$ is in $\Sigma^{+}$ for every $x \in X$ and $y \in Y$. In addition, these statements are in $\Sigma^{+}$ iff $I(X, Y \mid Z) \in \Sigma^{+}$. An additional minor observation is that each trivial independence statement holds in every graph $(U, E)$, in particular in $G_0$. □

Similarly, we obtain the following result.

THEOREM 14 (Completeness relative to $\mathcal{P}^{+}$). Let $\Sigma$ be a set of saturated independence statements, and let $\Sigma^{+}$ be the closure of $\Sigma$ with respect to saturated trivial independence, symmetry, intersection and weak union. Then, for every $\sigma \notin \Sigma^{+}$, there exists a strictly positive probability model $(d_\sigma, P_\sigma)$ over U, where U is the set of attributes that appears in $\Sigma$, that satisfies $\Sigma^{+}$ and does not satisfy $\sigma$.

PROOF. By Theorem 13 there exists a graph $G_0$ that satisfies $\Sigma^{+}$ and no other independence statement. By Lemma 9 there exists a strictly positive probability model $(d_\sigma, P_\sigma)$ that satisfies the statements that hold in $G_0$ and does not satisfy $\sigma$. Thus $(d_\sigma, P_\sigma)$ satisfies the requirement of the theorem. □
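Theorem 13 reduces membership of a saturated statement in $\Sigma^{+}$ to a separation query on $G_0$. A minimal Python sketch of that reduction is given below: each saturated input statement is split into its pairwise statements, an edge is drawn for every pair that is not covered, and queries are answered by a graph search. The names `build_g0` and `separated`, the triple encoding of statements and the four-attribute example are assumptions of this illustration; entailment is understood relative to strictly positive models, as in Theorem 14.

```python
from itertools import combinations

def build_g0(universe, sigma):
    """Edges of G_0: pairs {x, y} with no pairwise statement I(x, y | rest)."""
    U = frozenset(universe)
    independent = set()
    for X, Y, Z in sigma:
        if frozenset(X) | frozenset(Y) | frozenset(Z) != U:
            raise ValueError("every input statement must be saturated")
        # I(X, Y | Z) yields I({x}, {y} | U minus {x, y}) for all x in X, y in Y.
        independent.update(frozenset((x, y)) for x in X for y in Y)
    return [(x, y) for x, y in combinations(sorted(U), 2)
            if frozenset((x, y)) not in independent]

def separated(edges, X, Y, Z):
    """True if every path from X to Y in the graph passes through Z."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    stack = list(X)
    seen = set(stack)
    while stack:
        u = stack.pop()
        if u in Y:
            return False
        for v in adj.get(u, ()):
            if v not in Z and v not in seen:
                seen.add(v)
                stack.append(v)
    return True

# Hypothetical input over U = {1, 2, 3, 4}.
U = {1, 2, 3, 4}
sigma = [({1}, {3, 4}, {2}), ({1, 2}, {4}, {3})]
g0 = build_g0(U, sigma)
print(g0)
print(separated(g0, {1}, {3}, {2, 4}))   # a saturated query that is entailed
print(separated(g0, {1}, {2}, {3, 4}))   # a saturated query that is not
```

The graph is built once and then reused for every query, which is what makes the representation compact: $G_0$ has at most $n(n-1)/2$ edges regardless of how many saturated statements it encodes.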
Theorems13 and 14 togethershowthatsaturatedindependence statements and saturatedseparationstatementssharepreciselythe same axioms(relative to + This equivalencepermitsus to computethe set of all saturated independencestatementsentailedrelativeto 4+ by a givenset of saturated statements, usinga purelygraph-theoretic approach. The algorithmis simple:Givena set ofsaturatedindependence statements the graphGo = (U, E) as follows. X over U, construct Step 1. Replaceeach givenstatementI(X, YIZ) witha set ofindependence statements{I({x}, {ylIZ U (X\ {x}) u (Y\ {y}))Ix E X, y E Y}. Step2. Introducean edgebetweenx and y if I({x}, {y}IZ u (X\ {x}) u (Y\ {y})) is notamongthe statementsgeneratedin Step 1. Step 3. Output I(X', Y'IZ') E E+ if J(X', Y'IZ') holds in the graphproducedin Step 2. OtherwiseoutputI(X', Y'IZ') t E+. The algorithmrequires0(11 I n2) steps to constructG0 where n is the numberof attributesbecause it scans each statementof the inputonce and each statementmay require checkingn2 pairs of attributes.Once G0 is it permitsus to check whethera specificsaturatedstatement constructed, = I(X, YIZ) is entailed(relativeto F+) by E in only0(n) steps-the time neededto checkwhetherZ separatesX and Y in Go. This methodallowsus to representin polynomialspace (in the numberof entailed(relativeto 97+) bya attributes)the entireset ofbinaryfactorizations given set of binary factorizationsand to determine,in polynomialtime, whetheror not a specificbinaryfactorization is in this set. We will see next that a similarimplicationalgorithm,albeit less efficient, can be developed withoutthe assumptionofstrictpositiveness. 6. Saturated independence. The next completenesstheoremis the analog of Theorem 14 with weak contractionreplacingintersection.This changeis neededbecauseintersection is soundrelativeto 9+ but notrelative to 5 . THEOREM 15 (Completenessrelativeto "F . 
Let $\Sigma$ be a set of saturated independence statements over a finite set of attributes $U$, and let $\Sigma^+$ be the closure of $\Sigma$ with respect to saturated trivial independence, symmetry, weak contraction and weak union. Then, for every $\sigma \notin \Sigma^+$, there exists a probability model $(d_\sigma, P_\sigma)$ that satisfies $\Sigma^+$ and does not satisfy $\sigma$.

PROOF. Let $\sigma = I(X, Y \mid Z)$ be a saturated statement not in $\Sigma^+$, where $X \cup Y \cup Z = U$. At first we assume that $\sigma$ is maximal, that is, for all sets $X'X''$ and $Y'Y''$ partitioning $X$ and $Y$, respectively, the statement $I(X', Y' \mid ZX''Y'')$ is in $\Sigma^+$. (In this proof $AB$ stands for $A \cup B$.) At the end of the proof we relax this assumption.

Let each attribute in $U$ be associated with a binary domain $\{0, 1\}$. Denote all attributes in $X$ by $x_1, x_2, \ldots, x_l$, those in $Y$ by $y_1, y_2, \ldots, y_m$ and those in $Z$
by $z_1, z_2, \ldots, z_k$. The probability model $(d_\sigma, P_\sigma)$ is defined as follows:

$$P_\sigma(X, Y, Z) = \prod_{z_i \in Z} f(z_i) \cdot \begin{cases} 1/2, & \text{if all attributes in } X \cup Y \text{ are assigned } 0, \\ 1/2, & \text{if all attributes in } X \cup Y \text{ are assigned } 1, \\ 0, & \text{otherwise,} \end{cases}$$

where $f(z_i) = 1/2$. This probability model does not satisfy $\sigma$ because $P_\sigma(X = 0, Y = 1, Z = 0)$ is 0, while $P_\sigma(X = 0, Z = 0)$ and $P_\sigma(Y = 1, Z = 0)$ are not. It remains to show that every saturated statement in $\Sigma^+$ holds for $(d_\sigma, P_\sigma)$,
or equivalently that every saturated statement either holds for $(d_\sigma, P_\sigma)$ or does not belong to $\Sigma^+$. Any saturated statement $\gamma$ can be written as $I(X_1Y_1Z_1, X_3Y_3Z_3 \mid X_2Y_2Z_2)$, where $X = X_1X_2X_3$, $Y = Y_1Y_2Y_3$ and $Z = Z_1Z_2Z_3$ and the $X_i$'s, $Y_i$'s and $Z_i$'s are all disjoint. If $X_2Y_2 \neq \emptyset$, then $\gamma$ holds for $(d_\sigma, P_\sigma)$ because every instance of $X_1Y_1Z_1$ and of $X_3Y_3Z_3$ that is consistent with the values of $X_2Y_2$ has the same probability of occurring, namely, $1/2^{|Z_1|}$ and $1/2^{|Z_3|}$. If $X_1Y_1 = \emptyset$, then, again, $\gamma$ holds for $(d_\sigma, P_\sigma)$ because $Z_1$ is marginally and conditionally independent of any other set of attributes of $P_\sigma$. (Symmetrically when $X_3Y_3 = \emptyset$.) Otherwise, $\gamma$ is of the form $I(X_1Y_1Z_1, X_3Y_3Z_3 \mid Z_2)$, where $X_1Y_1 \neq \emptyset$ and $X_3Y_3 \neq \emptyset$. We continue by contradiction and show that in this case $\gamma$ does not belong to $\Sigma^+$.

Assume, by contradiction, that the statement $I(X_1Y_1Z_1, X_3Y_3Z_3 \mid Z_2)$ is in $\Sigma^+$. Then $I(X_1Y_1, X_3Y_3 \mid Z)$ is in $\Sigma^+$ as well because it can be derived by weak union and symmetry. To reach a contradiction, we show that the latter statement implies that $\sigma$ must be in $\Sigma^+$, contradicting our selection of $\sigma$. The proof uses weak contraction and symmetry to derive $I(X_1X_3, Y_1Y_3 \mid Z)$ (i.e., $\sigma$) from $I(X_1Y_1, X_3Y_3 \mid Z)$ by "joining" the $X$'s and the $Y$'s.

The following is a derivation of $\sigma$. First, $I(X_1, Y_1 \mid ZX_3Y_3)$ is in $\Sigma^+$ because $I(X, Y \mid Z)$ is maximal. Due to weak contraction,

$$I(X_1Y_1, X_3Y_3 \mid Z),\ I(X_1, Y_1 \mid ZX_3Y_3) \Rightarrow I(X_1, Y_1X_3Y_3 \mid Z),$$

we conclude that $I(X_1, YX_3 \mid Z) \in \Sigma^+$. Due to symmetry, we conclude $I(YX_3, X_1 \mid Z) \in \Sigma^+$ as well. $I(X_3, Y \mid ZX_1) \in \Sigma^+$ because $\sigma$ is maximal. Therefore, by symmetry, $I(Y, X_3 \mid ZX_1)$ is also in $\Sigma^+$. Using weak contraction, we obtain

$$I(YX_3, X_1 \mid Z),\ I(Y, X_3 \mid ZX_1) \Rightarrow I(Y, X_1X_3 \mid Z).$$
Thus $I(Y, X \mid Z) \in \Sigma^+$, and, by symmetry, $I(X, Y \mid Z) \in \Sigma^+$, a contradiction. (Note that if some sets out of $X_1$, $X_3$, $Y_1$ and $Y_3$ are empty, the derivation just described remains valid.)

If $\sigma = I(X, Y \mid Z)$ is not maximal, then either $I(X \setminus \{x\}, Y \mid Z \cup \{x\}) \notin \Sigma^+$ for some $x \in X$ or $I(X, Y \setminus \{y\} \mid Z \cup \{y\}) \notin \Sigma^+$ for some $y \in Y$. Without loss of generality assume the first statement is not in $\Sigma^+$. If this statement is maximal, denote it $\sigma'$. Otherwise, repeat the process of augmenting $Z$ with additional elements from $X$ and $Y$. When this process can no longer continue, we denote the resulting statement $\sigma' = I(R, S \mid T)$. Clearly, $\sigma'$ is maximal; it is
not in $\Sigma^+$ and for all sets $R'R''$ and $S'S''$ partitioning $R$ and $S$, respectively, the statement $I(R', S' \mid TR''S'')$ is in $\Sigma^+$. For a maximal statement $\sigma'$, we have shown how to construct a probability model $(d_{\sigma'}, P_{\sigma'})$ that satisfies $\Sigma^+$ and does not satisfy $\sigma'$. Due to symmetry and weak union, which hold for all probability models, any probability model that does not satisfy $\sigma'$ does not satisfy $\sigma$ as well. In particular, $(d_{\sigma'}, P_{\sigma'})$ does not satisfy $\sigma$ while satisfying $\Sigma^+$, as required by the theorem. $\square$

The probability model $(d_\sigma, P_\sigma)$ constructed previously has an additional property: each combination of values for $X \cup Y \cup Z$ has either zero probability or a constant probability of $1/2^{|Z|+1}$. Thus the probability model $(d_\sigma, P_\sigma)$ can be viewed as a database, categorically distinguishing between possible and impossible value combinations. Consequently, the proof of Theorem 15 shows that the previously mentioned axioms are also complete for MVD statements of relational databases [Fagin (1978)]. Indeed, the only difference between our axioms and the ones governing MVDs is that the latter allow overlapping sets $X$, $Y$ and $Z$ in $I(X, Y \mid Z)$ whereas we do not [Beeri, Fagin and Howard (1977)]. This equivalence permits the employment of a polynomial implication algorithm devised for MVDs [Beeri (1980)] to determine whether a saturated statement is entailed by a set of saturated statements, just as the equivalence between graph separation and conditional independence (relative to $\mathcal{P}^+$) provided us with an implication algorithm in the previous section. Malvestuto (1992) has independently observed this equivalence and used it to produce an indirect proof of Theorem 15 by showing that MVD and saturated independence statements must satisfy the same set of axioms.

The complexity of the implication algorithm for saturated statements relative to $\mathcal{P}$ [Beeri (1980)] differs from that needed relative to $\mathcal{P}^+$: the former requires $O(|\Sigma| \cdot n^2)$ operations to decide $\Sigma \models \sigma$ for each $\sigma$, while the latter requires only $O(n)$ operations, regardless of $|\Sigma|$.
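The build-once, query-many behaviour of the graph-based test relative to $\mathcal{P}^+$ (Steps 1-3 of the previous section) can be illustrated with a minimal sketch. It assumes each saturated statement is supplied as a triple of disjoint sets whose union is $U$; the function names `build_graph` and `separated` are hypothetical, and the breadth-first separation test is a generic choice that is not tuned to reach the $O(n)$ bound quoted in the text.

```python
from collections import deque
from itertools import combinations, product


def build_graph(universe, statements):
    """Steps 1-2: build G0 as a dict mapping each attribute to its neighbours."""
    universe = set(universe)
    independent_pairs = set()
    for X, Y, Z in statements:
        # Only saturated statements (X, Y, Z cover the universe) are handled.
        assert set(X) | set(Y) | set(Z) == universe
        # Step 1: one pairwise statement per (x, y); for a saturated
        # statement its conditioning set is the universe minus {x, y}.
        for x, y in product(X, Y):
            independent_pairs.add(frozenset((x, y)))
    adjacency = {u: set() for u in universe}
    # Step 2: connect every pair that Step 1 did not declare independent.
    for x, y in combinations(sorted(universe), 2):
        if frozenset((x, y)) not in independent_pairs:
            adjacency[x].add(y)
            adjacency[y].add(x)
    return adjacency


def separated(adjacency, X, Y, Z):
    """Step 3: True iff every path between X and Y passes through Z."""
    blocked = set(Z)
    seen = set(X) - blocked
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        if u in Y:
            return False
        for v in adjacency[u] - blocked - seen:
            seen.add(v)
            queue.append(v)
    return True


# Illustrative input (not from the paper): two saturated statements over {a, b, c, d}.
sigma = [({"a"}, {"c", "d"}, {"b"}), ({"a", "b"}, {"d"}, {"c"})]
g0 = build_graph("abcd", sigma)
print(separated(g0, {"a"}, {"d"}, {"b", "c"}))  # True
print(separated(g0, {"a"}, {"b"}, {"c", "d"}))  # False
```

For this input the constructed graph is the chain a-b-c-d, so the first query is answered positively and the second negatively without revisiting the input statements.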
These savings are achieved at the cost of investing $O(|\Sigma| \cdot n^2)$ steps in constructing a graphical representation of the closure of $\Sigma$ (relative to $\mathcal{P}^+$), but this cost is encountered only once. This difference in complexity can be significant since, in principle, $|\Sigma|$ can be exponential in $n$.

7. Marginal independence. This section summarizes two completeness results for statements of the form $I(X, Y \mid \emptyset)$ (marginal statements).

THEOREM 16 (Completeness). Let $\Sigma$ be a set of marginal statements, and let $\Sigma^+$ be the closure of $\Sigma$ with respect to axioms (12) through (15). Then for every marginal statement $\sigma = I(X, Y \mid \emptyset)$ not in $\Sigma^+$, there exists a binary probability model $(d_\sigma, P_\sigma)$ that satisfies $\Sigma^+$ and does not satisfy $\sigma$.

Marginal trivial independence:
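(12)    $I(X, \emptyset \mid \emptyset)$.

Marginal symmetry:

(13)    $I(X, Y \mid \emptyset) \Rightarrow I(Y, X \mid \emptyset)$.

Marginal decomposition:

(14)    $I(X, Y \cup W \mid \emptyset) \Rightarrow I(X, Y \mid \emptyset)$.

Marginal mixing:

(15)    $I(X, Y \mid \emptyset),\ I(X \cup Y, W \mid \emptyset) \Rightarrow I(X, Y \cup W \mid \emptyset)$.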
The proof of Theorem 16 uses the same technique as that of Theorem 15. It can be found in Geiger, Paz and Pearl (1991), together with an $O(|\Sigma| \cdot n^2)$ implication algorithm that is based on these axioms. The implication algorithm and the axiomatization hold relative to $\mathcal{PB}$ and $\mathcal{P}$.
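To make axioms (12) through (15) concrete, the following sketch computes the closure of a small set of marginal statements by naive forward chaining. It is only an illustration under stated assumptions: a statement $I(X, Y \mid \emptyset)$ is encoded as a pair of disjoint frozensets, the names `subsets` and `marginal_closure` are hypothetical, and the procedure is exponential in the number of attributes, so it is not the $O(|\Sigma| \cdot n^2)$ algorithm of Geiger, Paz and Pearl (1991).

```python
from itertools import combinations


def subsets(items):
    """All subsets of `items`, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def marginal_closure(universe, statements):
    """Closure of marginal statements under axioms (12)-(15), by brute force."""
    closure = {(frozenset(X), frozenset(Y)) for X, Y in statements}
    # (12) marginal trivial independence: I(X, empty) for every X.
    closure |= {(X, frozenset()) for X in subsets(universe)}
    changed = True
    while changed:
        new = set()
        for X, Y in closure:
            new.add((Y, X))                               # (13) symmetry
            new.update((X, Y1) for Y1 in subsets(Y))      # (14) decomposition
            for A, W in closure:                          # (15) mixing
                if A == X | Y:
                    new.add((X, Y | W))
        changed = not new <= closure
        closure |= new
    return closure


# Illustrative inputs (not from the paper), over attributes a, b, c.
c1 = marginal_closure("abc", [("a", "b"), ("ab", "c")])
print((frozenset("a"), frozenset("bc")) in c1)   # True: obtained by mixing

c2 = marginal_closure("abc", [("a", "b"), ("a", "c")])
print((frozenset("a"), frozenset("bc")) in c2)   # False: not derivable here
```

The second query shows that $I(a, b \mid \emptyset)$ and $I(a, c \mid \emptyset)$ do not yield $I(a, \{b, c\} \mid \emptyset)$ under axioms (12) through (15) alone.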
DEFINITION. A Gaussian model over a finite set of attributes $U = \{u_1, \ldots, u_n\}$ is a pair $(d, P)$, where $d$ is a domain mapping that maps each $u_i$ to $(-\infty, +\infty)$, and $P: d(u_1) \times \cdots \times d(u_n) \to [0, 1]$ is a multivariate Gaussian probability distribution. (For the sake of brevity, we will not define multivariate Gaussian probability distributions.) The class of Gaussian models is denoted by $\mathcal{PN}$.

Gaussian models share stronger properties for marginal independence than the ones listed previously; in particular, it is well known that Gaussian models satisfy the following additional property:

Marginal composition:

(16)    $I(X, Y \mid \emptyset),\ I(X, W \mid \emptyset) \Rightarrow I(X, Y \cup W \mid \emptyset)$.
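A small numerical sketch may help to see why marginal composition holds for Gaussian models but not in general. It relies on the standard fact that jointly Gaussian sets of variables are marginally independent exactly when all their cross-covariances vanish, and contrasts this with a binary model in which one variable is the exclusive-or of two fair coins. The covariance matrix, the XOR model and the helper names `gaussian_indep` and `discrete_indep` are illustrative assumptions, not taken from the paper.

```python
from itertools import product


def gaussian_indep(cov, X, Y, tol=1e-12):
    """I(X, Y | empty) in a Gaussian model: all cross-covariances are zero."""
    return all(abs(cov[i][j]) <= tol for i in X for j in Y)


def discrete_indep(joint, X, Y, tol=1e-12):
    """I(X, Y | empty) for a finite joint table; X and Y are index tuples."""
    def marginal(idx):
        out = {}
        for values, p in joint.items():
            key = tuple(values[i] for i in idx)
            out[key] = out.get(key, 0.0) + p
        return out
    pX, pY, pXY = marginal(X), marginal(Y), marginal(X + Y)
    return all(abs(pXY.get(a + b, 0.0) - pX[a] * pY[b]) <= tol
               for a in pX for b in pY)


# A positive definite covariance matrix (strictly diagonally dominant).
cov = [
    [1.0, 0.0, 0.0, 0.3],
    [0.0, 1.0, 0.3, 0.3],
    [0.0, 0.3, 1.0, 0.3],
    [0.3, 0.3, 0.3, 1.0],
]
print(gaussian_indep(cov, (0,), (1,)), gaussian_indep(cov, (0,), (2,)))  # True True
print(gaussian_indep(cov, (0,), (1, 2)))                                 # True

# Binary model over (x, y, w) with x = y XOR w and y, w independent fair coins.
xor = {(y ^ w, y, w): 0.25 for y, w in product((0, 1), repeat=2)}
print(discrete_indep(xor, (0,), (1,)), discrete_indep(xor, (0,), (2,)))  # True True
print(discrete_indep(xor, (0,), (1, 2)))                                 # False
```

In the Gaussian case the premises of (16) force the whole cross-covariance block between $X$ and $Y \cup W$ to vanish, whereas the XOR model satisfies both premises and violates the conclusion.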
Theorem 17 shows that marginal composition is the only axiom that was "missing" relative to Gaussian models.

THEOREM 17 (Completeness). Let $\Sigma$ be a set of marginal statements, and let $\Sigma^+$ be the closure with respect to marginal trivial independence, marginal symmetry, marginal decomposition and marginal composition. Then there exists a Gaussian model that satisfies all statements in $\Sigma^+$ and none other.

PROOF.
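Let $U = \{u_1, \ldots, u_n\}$ be the attributes of interest. Let $P$ be a zero-mean multivariate normal distribution, with the following covariance matrix:

$$\Gamma = (\rho_{ij}), \quad \text{where } \rho_{ij} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } \exists\, I(X, Y \mid \emptyset) \in \Sigma^+ \text{ s.t. } u_i \in X,\ u_j \in Y, \\ \rho, & \text{otherwise,} \end{cases}$$

where $\rho^2$ ... $0$ and $P(Z) > 0$ iff $P(X, Z) > 0$ and $P(Y, Z) > 0$,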
for every respective value of $X$, $Y$ and $Z$. This definition is identical to that of EMVD in database theory and is also discussed by Shafer, Shenoy and Mellouli (1988). Theorems 5, 14 and 15 hold when $I$ is replaced with the relation just defined. For details consult Geiger (1990).

10. Summary. Table 1 summarizes properties of classes of probability models versus classes of independence statements. A question mark means that the problem remains open as of the writing of this article. The symbol $\mathcal{PN}$ denotes the class of normal models and $\mathcal{PB}$ probability models over binary variables.

Some properties of Gaussian models are listed in Table 1 which have not been proven in this article. The axioms for saturated independence (relative to $\mathcal{PN}$) consist of trivial independence, symmetry, weak union and intersection [Geiger (1990)]. The fact that perfect Gaussian models do not exist for some sets of statements can be proven in the same way as in Corollary 8 (with the same $\Sigma$ selected). The nonexistence of a finite set of Horn axioms can be proven in the same way as in Theorem 18.

In addition, we have shown a strong relationship between graph separation and conditional independence. In particular, every undirected graph represents a consistent set of independence and dependence statements (Theorem 11), every axiom for conditional independence is also an axiom for graph separation (Theorem 10) and saturated separation and saturated independence (relative

TABLE 1
Properties of conditional independence

Properties                          Marginal      Saturated     Unrestricted
                                    statements    statements    statements

$\mathcal{P}$
Complete finite axiomatization      Yes           Yes           No
Polynomial implication algorithm    Yes           Yes           ?
Perfect models                      Yes           Yes           Yes

$\mathcal{P}^+$
Complete finite axiomatization      ?             Yes           ?
Polynomial implication algorithm    ?             Yes           ?
Perfect models                      Yes           Yes           Yes

$\mathcal{PN}$
Complete finite axiomatization      Yes           Yes           ?
Polynomial implication algorithm    Yes           Yes           ?
Perfect models                      Yes           Yes           No

$\mathcal{PB}$
Complete finite axiomatization      Yes           Yes           ?
Polynomial implication algorithm    Yes           Yes           ?
Perfect models                      ?             ?             No
to $\mathcal{P}^+$) share the same axiomatic structure (Theorems 13 and 14). Analogous correspondence exists between separation in directed acyclic graphs (d-separation) and conditional independence. See Geiger and Pearl (1988) and Verma (1986) for details.

Acknowledgments. Our notation and definitions were influenced by Beeri, Fagin and Howard (1977) and Fagin (1976). We are indebted to Ron Fagin for pointing out the usefulness of the notion of Armstrong models. We thank Azaria Paz for his help in proving Theorem 13 by referring us to Paz (1987), and Norman Dalkey and Thomas Verma for many useful discussions. We also thank Glenn Shafer and several reviewers for suggesting numerous improvements on earlier drafts.

REFERENCES

BEERI, C. (1980). On the membership problem for functional and multivalued dependencies in relational databases. ACM Trans. Database Systems 5 241-249.
BEERI, C., FAGIN, R. and HOWARD, J. H. (1977). A complete axiomatization of functional dependencies and multivalued dependencies in database relations. Proceedings of the 1977 ACM SIGMOD International Conference on Management of Data 47-61. ACM, New York.
DARROCH, J. N., LAURITZEN, S. L. and SPEED, T. P. (1980). Markov fields and log-linear interaction models for contingency tables. Ann. Statist. 8 522-539.
DAWID, A. P. (1979). Conditional independence in statistical theory. J. Roy. Statist. Soc. Ser. B 41 1-31.
FAGIN, R. (1976). Functional dependencies in a relational database and propositional logic. IBM J. Res. Develop. 21 262-278.
FAGIN, R. (1978). Multivalued dependencies and a new normal form for relational databases. ACM Trans. Database Systems 2 262-278.
FAGIN, R. (1982). Horn clauses and database dependencies. J. Assoc. Comput. Mach. 29 952-985.
FAGIN, R. and VARDI, M. Y. (1986). The theory of data dependencies - a survey. In Mathematics of Information Processing (M. Anshel and W. Gewirtz, eds.) Proc. Symp. Appl. Math. 34 19-72. Amer. Math. Soc., Providence, RI.
FRYDENBERG, M. (1990). Marginalization and collapsibility in graphical association models. Ann. Statist. 18 790-805.
GEIGER, D. (1990). Graphoids: a qualitative framework for probabilistic inference. Ph.D. dissertation, Computer Science Dept., Univ. California, Los Angeles.
GEIGER, D., PAZ, A. and PEARL, J. (1991). Axioms and algorithms for inferences involving probabilistic independence. Inform. and Comput. 91 128-141.
GEIGER, D. and PEARL, J. (1988). On the logic of causal models. Proceedings of the Fourth Workshop on Uncertainty in AI (R. D. Shachter, T. S. Levitt, L. N. Kanal and J. F. Lemmer, eds.) 136-147. North-Holland, Amsterdam.
LAURITZEN, S. L. (1982). Lectures on Contingency Tables, 2nd ed. Univ. Aalborg Press, Aalborg, Denmark.
LAURITZEN, S. L. and SPIEGELHALTER, D. J. (1988). Local computations with probabilities on graphical structures and their applications to expert systems. J. Roy. Statist. Soc. Ser. B 50 154-227.
LEE, Y. G. and BUEHLER, R. J. (1986). Independence relationships for multivariate distributions. Technical Report 464, Univ. Minnesota.
MALVESTUTO, F. M. (1992). A unique formal system for binary decompositions of database relations, probability distributions, and graphs. Inform. Sci. 59 21-52.
PAZ, A. (1987). A full characterization of pseudographoids in terms of families of undirected graphs. Technical Report R-95, Cognitive Systems Laboratory, Univ. California, Los Angeles.
PEARL, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Mateo, CA.
PEARL, J. and PAZ, A. (1985). Graphoids: a graph-based logic for reasoning about relevance relations. In Advances in Artificial Intelligence-II (B. D. Boulay, D. Hogg and L. Steels, eds.) 357-363. North-Holland, Amsterdam.
SAGIV, Y. and WALECKA, S. (1982). Subset dependencies and completeness results for a subset of EMVD. J. Assoc. Comput. Mach. 29 103-117.
SHAFER, G., SHENOY, P. and MELLOULI, K. (1988). Propagating belief functions in qualitative Markov trees. Internat. J. Approx. Reason. 1 349-400.
SPOHN, W. (1980). Stochastic independence, causal independence, and shieldability. J. Philos. Logic 9 73-99.
STUDENY, M. (1992). Conditional independence relations have no complete characterization. Transactions of the 11th Prague Conference on Information Theory, Statistical Decision Foundation and Random Processes Vol. B 377-396. Kluwer, Dordrecht.
VERMA, T. S. (1986). Causal networks: semantics and expressiveness. Technical Report R-65, Cognitive Systems Laboratory, Univ. California, Los Angeles. [Also in VERMA, T. S. and PEARL, J. (1990). Uncertainty in Artificial Intelligence (R. D. Shachter, T. S. Levitt, L. N. Kanal and J. F. Lemmer, eds.) 69-76. North-Holland, Amsterdam.]
WHITTAKER, J. (1990). Graphical Models in Applied Multivariate Statistics. Wiley, Chichester.

COMPUTER SCIENCE DEPARTMENT
TECHNION-ISRAEL INSTITUTE OF TECHNOLOGY
HAIFA 32000, ISRAEL

COMPUTER SCIENCE DEPARTMENT
6291 BOELTER HALL
UNIVERSITY OF CALIFORNIA, LOS ANGELES
LOS ANGELES, CALIFORNIA 90024-1596