In conjunction with qualitative probability
Tim Fernando
Revised October 1997
Abstract. Numerical probabilities (associated with propositions) are eliminated in favor of qualitative notions, with an eye to isolating what it is about probabilities that is essential to judgments of acceptability. A basic choice point is whether the conjunction of two propositions, each (separately) acceptable, must be deemed acceptable. Concepts of acceptability closed under conjunction are analyzed within Keisler's weak logic for generalized quantifiers or, more specifically, filter quantifiers. In a different direction, the notion of a filter is generalized so as to allow sets with probability non-infinitesimally below 1 to be acceptable.
Keywords. Qualitative probability, probabilistic logic, generalized quantifiers, non-monotonic reasoning
AMS classification codes. 03B48, 03C80
1 Introduction: weighing the evidence
Let $L$ be a set of "formulas" ($\varphi, \psi, \ldots$), and $V$ be a set of "possibilities" ($a, b, \ldots$), connected to $L$ by a relation $\models$, capturing the intuition that for all $a \in V$ and $\varphi \in L$,

$a \models \varphi$ iff $a$ "supports" $\varphi$.

We can, for concreteness, take the case of predicate logic, although we can also proceed more abstractly, assuming only some relation $\models\ \subseteq V \times L$. In any case, the point is to pick out a set $A \subseteq L$ of "acceptable" formulas on the basis of $\models$, so that for every $\varphi \in L$,

$\varphi$ is acceptable iff the "evidence" $\{a \in V : a \models \varphi\}$ supporting $\varphi$ "has enough weight"

or, in other words,

$\varphi \in A$ iff $\{a \in V : a \models \varphi\} \in H$   (1)

for some family $H \subseteq \mathrm{Pow}(V)$ of "heavy" subsets of $V$. Clearly, the safest choice is to equate $A$ with the $\models$-validities (i.e., the formulas $\varphi$ such that for every $a \in V$, $a \models \varphi$), which is to say, to set $H = \{V\}$. But suppose we took a chance on some other choice of $A$ (and $H$) that tolerates failures or exceptions. What are the rules for doing so rationally? "Logic" suggests the closure condition

(Up)   $\dfrac{\varphi \in A \quad \varphi \vdash \psi}{\psi \in A}$   $\dfrac{A \in H \quad A \subseteq B\ (\subseteq V)}{B \in H}$

where $\varphi \vdash \psi$ means that for every $a$, if $a \models \varphi$ then $a \models \psi$. (Up) alone introduces no element of risk into $A$ (or $H$) insofar as (Up) does not yield, from the assumption that $V \in H$, any more elements of $H$. On the other hand, (Up) does suggest how to proceed, namely by weakening $\vdash$ to some binary relation $\mathrel{|\!\sim}$ on $L$ or (turning to $H$) $\subseteq$ to some binary relation $\preceq$ on $\mathrm{Pow}(V)$:

(Up)$_{|\sim}$   $\dfrac{\varphi \in A \quad \varphi \mathrel{|\!\sim} \psi}{\psi \in A}$   (Up)$_{\preceq}$   $\dfrac{A \in H \quad A \preceq B}{B \in H}$.

It is natural to expect that the intuitions that come to play in developing the rule (Up)$_{|\sim}$ are syntactic (or proof-theoretic), whereas those for (Up)$_{\preceq}$ are semantic. An example where this distinction matters concerns the operation $\wedge$ of conjunction on $L$, for which it is understood that for every $a \in V$, $a \models \varphi \wedge \psi$ iff $a \models \varphi$ and $a \models \psi$. It is largely on this simple example that the present paper turns.
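Line (1) and the semantic half of (Up) can be tried out on a finite toy instance. The following is a minimal sketch, not part of the paper: the possibilities `V`, the formula names in `EVIDENCE`, and the two candidate families `H_safe` and `H_risky` are all hypothetical choices made for illustration.

```python
from itertools import combinations

# Hypothetical toy data: four possibilities and three named formulas,
# each formula given directly by the set of possibilities supporting it.
V = frozenset(range(4))
EVIDENCE = {
    "top": V,                      # supported everywhere (a validity)
    "p": frozenset({0, 1, 2}),     # one exception
    "q": frozenset({0, 1}),        # two exceptions
}

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def acceptable(heavy):
    """Line (1): a formula is acceptable iff its evidence set is heavy."""
    return {name for name, ev in EVIDENCE.items() if ev in heavy}

def satisfies_up(heavy):
    """(Up) for H: any superset (within V) of a heavy set is heavy."""
    return all(b in heavy for a in heavy for b in powerset(V) if a <= b)

H_safe = {V}                                            # validities only
H_risky = {s for s in powerset(V) if len(V - s) <= 1}   # at most one exception
```

Under these assumptions `H_safe` accepts only the validity, while `H_risky` also accepts the formula with a single exception; both families are closed under (Up).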
1.1 Content of paper

There is a certain plausibility to asserting

(And) if $\varphi$ and $\psi$ are both acceptable, then $\varphi \wedge \psi$ is acceptable

if only because (i) it takes a bit of sophistication to even sense the difference between "$\varphi$ and $\psi$" and "$\varphi \wedge \psi$" and (ii) after such sophistication is acquired, we learn that the difference does not (in a sense) matter, if acceptability is construed as validity (semantic or syntactic). (And) becomes problematic, however, as soon as we accept some exceptions. This is brought out most clearly perhaps by H. Kyburg's "lottery paradox": the proposition that one in, say, a million tickets in a lottery will win is acceptable, as are each of a million propositions asserting that a particular ticket will not win. The "paradox" vanishes after a moment's thought on the underlying semantics: were we to agree that

$A = \{\varphi \in L : \text{there is at most one } a \in V \text{ for which } a \not\models \varphi\}$,

we may run up against a pair $\varphi$ and $\psi$ of formulas in $A$ such that the one counter-example to $\varphi$ is different from the one counter-example to $\psi$, whence $\varphi \wedge \psi \notin A$. But as long as we concentrate on the syntactic side $A$, choosing the semantics $H$ to support the manipulations on $A$ that we have decided to legitimize, there is hope for (And). In this regard, it is worthwhile noting what Pearl 1994 calls a "long-standing tension between the logical and probabilistic approaches to dealing with such exceptions": whereas the former is "prescriptive" (insofar as logic is simply a record of "conversational conventions"), probabilities are "descriptive" (be they measures of objective frequencies or subjective beliefs).

Now, the question is how could probabilities describe the logical rule (And)? A natural way to proceed is to accept precisely the formulas with probability greater than some fixed threshold $\alpha \in [0, 1]$ (say, .999), given a probability function $pr$ from $L$ to the unit interval $[0, 1]$:

$A_{pr,\alpha} = \{\varphi \in L : pr(\varphi) > \alpha\}$.

The hitch is that exceptions add up: from $pr(\varphi) = 1 - \epsilon$ and $pr(\psi) = 1 - \delta$, one cannot, in general, do better than predict $pr(\varphi \wedge \psi) \geq 1 - (\epsilon + \delta)$. This suggests, assuming we stick with probability measures (rather than some alternative where, for example, conjunction is interpreted by the greatest lower bound operation in some lattice), that we

(I) replace the condition that $pr(\varphi) > \alpha$ by the requirement that $1 - pr(\varphi)$ be infinitesimal, where infinitesimals are assumed to be closed under addition: if $\epsilon$ and $\delta$ are infinitesimals, then so is $\epsilon + \delta$.

The notion of an infinitesimal here is exactly that introduced by A. Robinson in his non-standard reconstruction of calculus. Reversing chronological order, we could, as an alternative to (I),

(II) explain away infinitesimals by $\epsilon$-$\delta$-type limits (à la Bolzano-Weierstrass).

Pearl 1994 describes the work of Adams and Spohn in much this way, though without the emphasis on (And). I have decided here to focus on (And) because, together with (Up), it supports a very direct and general analysis of approaches (I) and (II), using no more structure than that implicated by line (1) above. This is made precise by Theorems 1 and 2 in §§2.1 and 2.2 respectively, where, in particular, no appeal is made to numbers, be they in the standard unit interval $[0, 1]$, or some nonstandard copy thereof. This is not to say that the analysis given is incompatible with a numerical approach; only that it allows us to avoid all kinds of arithmetical complications, not to mention the somewhat embarrassing question: what probability function?

Without specifying a particular probability function, we return, in §3, to probabilities, beefing up the rule (Up) to a rule (Up)$_{\preceq}$, where $\preceq$ means "at least as probable as." In this section, we refrain from making any commitment to the soundness or, for that matter, unsoundness of (And). (Up)$_{\preceq}$ applies not only to models of (And), but also to $A_{pr,\alpha} = \{\varphi \in L : pr(\varphi) > \alpha\}$, where $pr$ is a probability function, and $\alpha$ is some number in $[0, 1]$. (It is easy enough to introduce a rule restricting $\alpha$, say, to be above $\frac{1}{2}$.) As weak as the rule (Up)$_{\preceq}$ might be, it nevertheless constitutes a step to understanding what is involved qualitatively in accepting formulas with probabilities that fall non-infinitesimally short of 1.
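The way exceptions add up under a fixed threshold can be made concrete with a small computation. The sketch below is illustrative only: it assumes a uniform lottery over `n` tickets (a hypothetical model of Kyburg's example), and the function names `lottery` and `conjunction_lower_bound` are not from the paper.

```python
from fractions import Fraction

def lottery(n, alpha):
    """Uniform lottery with n tickets; possibility i means 'ticket i wins'.

    Returns a pair: (is every 'ticket i does not win' accepted at threshold
    alpha?, is their conjunction accepted at threshold alpha?).
    """
    worlds = range(n)

    def pr(event):
        return Fraction(len(event), n)

    # 'Ticket i does not win' is supported by every world except i.
    not_win = [frozenset(w for w in worlds if w != i) for i in worlds]
    each_accepted = all(pr(e) > alpha for e in not_win)
    conjunction = frozenset.intersection(*not_win)
    return each_accepted, pr(conjunction) > alpha

def conjunction_lower_bound(eps, delta):
    """Best general lower bound on pr(phi and psi) from 1-eps and 1-delta."""
    return max(Fraction(0), 1 - (eps + delta))
```

With 1000 tickets and threshold 99/100, each single proposition clears the threshold but the conjunction of all of them has probability 0, so (And) fails for this choice of acceptability.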
1.2 Related work
There is an evidently widespread belief that it is a non-trivial (if not hopeless) enterprise to reason qualitatively about formulas with probabilities greater than some fixed threshold non-infinitesimally short of 1. Pearl 1988 puts the matter as follows:

Probabilities that are infinitesimally close to 0 and 1 are very rare in the real world. Most default rules used in ordinary discourse maintain a certain percentage of exceptions, simply because the number of objects in every meaningful class is finite. Thus, a natural question to ask is, why study the properties of a logic that applies only to extreme probabilities? Why not develop a logic that characterizes moderately high probabilities, say probabilities higher than 0.5 or 0.9, or more ambitiously, higher than $\alpha$, where $\alpha$ is a parameter chosen to fit the domains of the predicates involved? The answer is that any such alternative logic would be extremely complicated and probably would need to invoke many axioms of arithmetic. [pages 493, 494]

A similar view can be found in Halpern and Rabin 1987, where "a logic to reason about likelihood" is developed relative to a non-probabilistic semantics. The probabilistic approach to likelihood presented in §3 (below) is not only qualitative but quite possibly simpler than might have been feared. It proceeds along lines similar to Scott 1964 and Segerberg 1971, as discussed further below.

The literature on non-monotonic formalisms that incorporate (And) as a basic or derived rule is vast and, for the uninitiated, downright bewildering. (See, for instance, Pearl 1994 and the references cited therein,¹ plus Kyburg 1970 for earlier work.) What the present paper offers is a logical approach that departs minimally (if at all) from standard practice in classical mathematical logic. The one possible point of departure is the appeal to the weak logic of generalized quantifiers in Keisler 1970 (which arguably belongs to the mainstream of logic) for formalizing line (1) above,² and even then, the introduction of generalized quantifiers can be eliminated according to Theorems 1 and 2 below, resulting in ordinary predicate logic. Some specialists in the logic of generalized quantifiers seem to consider Theorem 1 part of the subject's folklore. It is implicit in van Lambalgen 1991, and appears in a disguised form as Theorem 5 of Alechina and van Lambalgen 1996, the inessential notational differences being due to that work's somewhat novel syntax (involving modality) and semantics (motivated by proof theory).

A crucial syntactic point that ought to be stressed is the expulsion of numbers from formulas below, in contrast, that is, to the quantitative approaches in Keisler 1985 and Halpern 1990, where numbers appear explicitly in formulas. The idea behind minimizing (explicit) reference to probabilities is to isolate what it is about probabilities that is essential to judgments of acceptability; but by opening the door to alternative non-probabilistic interpretations of formulas, the challenge then becomes showing that only the probabilistic semantics need matter. (More concretely, the problem in establishing completeness is how to define a probability measure from syntactic entities that do not mention numbers.)

¹ A well-known example is Kraus, Lehmann and Magidor 1990, in which the rule (Up)$_{|\sim}$ described above leads dangerously to monotonicity (p. 180). Instead, a weak system of cumulative reasoning is developed there, from which a $|\!\sim$-form of (And) but not (Up)$_{|\sim}$ can be derived.
2 Reasoning according to preference: filters
A straightforward formalization of line (1) is provided by the weak logic $L(Q)$ for generalized quantifiers of Keisler 1970, where (i) $L$ is a first-order language and $Q$ is a generalized quantifier symbol, inducing formulas $Qx\varphi$ (in addition to the usual closure rules on first-order $L$-formulas) and (ii) an $L(Q)$-model is a pair $(M, q)$ consisting of a first-order $L$-model $M$ and a family $q \subseteq \mathrm{Pow}(|M|)$ of subsets of the universe $|M|$ of $M$, so that (relativizing (1) to $(M, q)$)

$(M, q) \models Qx\varphi[f]$ iff $\{a \in |M| : (M, q) \models \varphi[f^a_x]\} \in q$

for every function $f$ mapping variables to objects in $|M|$ (and where $f^a_x$ is the function that maps $x$ to $a$, but is otherwise identical to $f$). That is to say, (1) is analyzed by building an $L$-model $M$ around the set $V$ of possibilities so that the acceptability of a formula $\varphi$ can be evaluated by exposing (as it were) the "hidden variable" $x$, the instantiations of which are measured relative to $q$ ($= H_{M,q}$):

$\varphi \in A_{M,q}$ iff $(M, q) \models Qx\varphi$.

$L(Q)$ offers not only the expressive power of predicate logic (as well as the possibility of nested judgments of acceptability through iterations of $Q$), but also a natural model theory, relative to which a complete proof system can be obtained from a simple extension of one for first-order logic by axiom schemes for $\alpha$-equivalence and extensionality (Keisler 1970).

² A fine point about (1) and weak logic is that the extension of $A$ to a binary predicate $|\!\sim$ on $L$ can be treated by passing from unary to binary generalized quantifiers, at the cost only of notational clutter; see §2.3.
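The satisfaction clause for $Q$ in weak logic is easy to evaluate over a finite universe. The following is a rough sketch under stated assumptions: formulas are represented as Python predicates on variable assignments (dictionaries), and the universe `UNIVERSE`, the family `Q_FAMILY`, and the two sample formulas are hypothetical.

```python
from itertools import combinations

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def holds_Q(universe, q, phi, assignment, var="x"):
    """(M, q) |= Qx phi [f] iff {a : phi holds with x := a} belongs to q.

    phi is a predicate on assignments; `assignment` plays the role of f.
    """
    extension = frozenset(
        a for a in universe if phi({**assignment, var: a})
    )
    return extension in q

# Hypothetical weak model: five objects, and q = sets missing at most one.
UNIVERSE = frozenset(range(5))
Q_FAMILY = {s for s in powerset(UNIVERSE) if len(UNIVERSE - s) <= 1}

def differs_from_y(f):
    return f["x"] != f["y"]

def below_three(f):
    return f["x"] < 3
```

For instance, with `y` assigned to 2 the formula `differs_from_y` has an extension missing exactly one object, so `Qx` of it holds, whereas `below_three` misses two objects and fails.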
2.1 Filter quantifiers

Given a set $V$, a filter on $V$ is a non-empty family $H$ of subsets of $V$ satisfying (Up) and closed under intersections, viz., $V \in H$; whenever $A \in H$ and $A \subseteq B \subseteq V$, $B \in H$; and for all $A \in H$ and $B \in H$, $A \cap B \in H$ (thereby supporting (And)). These properties translate in $L(Q)$ to the (filter) schemes

(Q1) $Qx\, x = x$
(Q2) $\forall x(\varphi \supset \psi) \supset (Qx\varphi \supset Qx\psi)$
(Q3) $Qx\varphi \wedge Qx\psi \supset Qx(\varphi \wedge \psi)$.

Let us write $F[Q]$ for the $L(Q)$-theory induced by (Q1), (Q2) and (Q3), where $\varphi$ and $\psi$ are $L(Q)$-formulas with the same set of free variables. Note that $F[Q]$ holds for $Q = \forall$, or restricted universal quantification. The converse is not quite true: consider the standard model $M$ of arithmetic, and let $q$ be the family

$\{A \subseteq \{0, 1, \ldots\} : \{0, 1, \ldots\} - A \text{ is finite}\}$

of co-finite sets of natural numbers; then $(M, q)$ is a model of $F[Q]$, even though $q$ is a non-principal filter. (A filter $q$ on $V$ is principal if $\bigcap q \in q$, in which case $\bigcap q$ is said to generate $q$.) Nevertheless, the converse can be approximated through elementary extensions and some slick bookkeeping due to van Lambalgen 1991. Fix a relation symbol $R$ not in $L$ that accepts any finite positive number of arguments, and call an $L(Q)$-model $(M, q)$ relatively principal if under some expansion of $(M, q)$ to $R$,

(2) $Qx\varphi \equiv \forall x(R(x, y) \supset \varphi)$

holds for every $L(Q)$-formula $\varphi$ with free variables $x, y$ (where $y$ is, say, ordered according to some fixed well-ordering of variables). (2) says that $Q$ can be taken to be universal quantification $\forall$ restricted to some set $R$ of "generic" (or "normal") elements (modulo $y$). These generic elements are "transcendental" in $y$: assuming $\forall y\, Qx(x \neq y)$, they cannot be named by $L$-terms with free variables drawn from $y$. (Hence, the necessity of adding $y$ to $R$.)

Theorem 1.³ Every $L(Q)$-model of $F[Q]$ can be elementarily extended to a relatively principal $L(Q)$-model.

Proof. Fix an $L(Q)$-model $(M, q)$ of $F[Q]$, and a finite set $\Sigma_0$ of instances of (2). By the compactness theorem of weak logic (Keisler 1970), it suffices to show how to expand $(M, q)$ to a model of $\Sigma_0$. The idea is to interpret $R$ as the set

$\{(F_\psi(a), a) : \psi \text{ is an } L(Q)\text{-formula},\ a \in \mathrm{dom}(F_\psi)\}$,

for certain partial functions $F_\psi$ (to be defined presently) from the set of finite sequences of (the universe) $|M|$ (of $M$) to $|M|$. Let $n$ be the number of free variables in $Qx\psi$. The domain of $F_\psi$ consists exactly of the $n$-tuples $a \in |M|^n$ such that

($*$) $(M, q) \not\models Qx\psi[a]$.

Given such a sequence $a$, let $\varphi'$ be the conjunction

$\bigwedge \{\varphi : Qx\varphi \text{ has the same set of free variables as } Qx\psi,\ (M, q) \models Qx\varphi[a] \text{ and } \text{`}Qx\varphi \equiv \forall x(R(x, y) \supset \varphi)\text{'} \in \Sigma_0\}$

(appealing here to the finiteness of $\Sigma_0$). It is understood that an empty conjunction is some tautology. Using (Q3) in case the set is non-empty, it follows that

$(M, q) \models Qx\varphi'[a]$.

Hence, by assumption ($*$) and $Qx\varphi \wedge \neg Qx\psi \supset \exists x(\varphi \wedge \neg\psi)$, to which (Q2) can, as pointed out to me by N. Alechina, be rewritten,

$(M, q) \models \exists x(\varphi' \wedge \neg\psi)[a]$.

It remains to choose some such witness for the value of $F_\psi(a)$. $\Box$

Remark. The interpretation of $R$ suggested in the proof of Theorem 1 can be described roughly as follows: for every $L(Q)$-formula $\psi$ such that $(M, q) \not\models Qx\psi$, throw in a witness to the set

$\{\neg\psi\} \cup \{\varphi \in L(Q) : (M, q) \models Qx\varphi\}$

of formulas. What is "rough" about this description is that (a) we should be more careful to specify what variables and constants to allow in the formulas, and (b) the required witnesses may not exist in $M$. The first point is a simple matter of bookkeeping, while the second can be handled by appealing to the existence of elementary extensions that are $\omega$-saturated, suggesting a restatement of Theorem 1 as

Proposition 1′. Every $\omega$-saturated $L(Q)$-model of $F[Q]$ is relatively principal.

In fact, Proposition 1′ can be sharpened to so-called recursively saturated models (Barwise 1975), as the sets of formulas that must be realized can be given effectively as follows:

$\{(\neg Qx\psi) \supset \neg\psi\} \cup \{(Qx\varphi) \supset \varphi : \varphi \in L_n(Q)\}$

for every $\psi \in L_n(Q)$, where $L_n$ is an expansion of the language $L$ to $n$ fresh constants (abusing notation in identifying $L_n(Q)$ with the set of $L_n(Q)$-formulas with one free variable $x$). Note the similarity of $(\neg Qx\psi) \supset \neg\psi$ to Henkin expansions that witness existential statements. The next section adapts notions from a celebrated method, the force of which is to omit rather than to realize types.

³ See the note in §1.2 for bibliographic information.
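On a finite set the filter conditions of this subsection can be checked by brute force, and every filter found this way is principal, so that membership reduces to containing the generator (the finite analogue of restricting $\forall$ to a set $R$ of generic elements). The sketch below is only an illustration under that finiteness assumption; the helper names `is_filter`, `generator` and `is_principal` are hypothetical.

```python
from itertools import combinations

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_filter(V, H):
    """V in H, upward closure (Up), and closure under intersection."""
    subsets = powerset(V)
    return (
        V in H
        and all(b in H for a in H for b in subsets if a <= b)
        and all((a & b) in H for a in H for b in H)
    )

def generator(H):
    """Intersection of all members of a non-empty family H."""
    return frozenset.intersection(*H)

def is_principal(H):
    """A filter is principal iff the intersection of its members is a member."""
    return generator(H) in H

V = frozenset(range(4))
H_gen = {s for s in powerset(V) if frozenset({0, 1}) <= s}   # supersets of {0, 1}
H_one_exception = {s for s in powerset(V) if len(V - s) <= 1}
```

Here `H_gen` is a principal filter generated by `{0, 1}`, while the "at most one exception" family is upward closed but not closed under intersection, which is the lottery situation of §1.1.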
2.2 Graded normality
The proper extensions mentioned in Theorem 1 can be avoided, provided (2) is weakened to allow for varying grades (rather than an absolute, either/or, notion) of genericity: add an argument place to $R$, for a relation symbol $<$ to be used with the intuition that

$u < v$ iff $u$ is "more generic" than $v$.

More precisely, let $L$ be a countable first-order language, $<$ be a fresh binary relation symbol (not in $L$), and let $y \sqsubseteq z$ abbreviate $(y < z) \vee (y = z)$. Given an $L$-model $M$, let $L_M$ be $L$ together with a fresh constant symbol for every object in $M$. To simplify the notation, we will identify an object $a$ in $M$ with its constant symbol in $L_M$, and write $M$ for the $L_M$-model obtained by expanding the $L$-model $M$ to $L_M$, with every $a$ in $M$ interpreted as $a$.

Theorem 2. For every countable (or finite) $L(Q)$-model $(M, q)$ of $F[Q]$, there is a transitive binary relation $\prec\ \subseteq |M| \times |M|$ such that for every $L_M(Q)$-formula $\varphi$ with exactly one free variable $x$,

(3) $Qx\varphi \equiv (\forall z)(\exists y \sqsubseteq z)(\forall x < y)\, \varphi$

holds in $(M, q, \prec)$.

Remarks.

1. (3) describes an $\epsilon$-$\delta$-limit/asymptotic/cofinal-type quantification, with (2) falling out as the special case given by $u < v \equiv R(u)$ (i.e., the second argument $v$ in $<$ is vacuous).

2. The converse of Theorem 2 (the soundness of $F[Q]$ under (3)) is trivial: the schemes (Q1) and (Q2) follow from (3) alone, while (Q3) is a consequence of the transitivity of $\prec$ (as well as (3)). (I have not investigated what generality (3) buys beyond that of (2), in the absence of the assumption that $<$ is transitive.)

3. The restriction to one free variable is inessential, and is made only to simplify notation, allowing us to suppress the subscripts $y$ on $<$ in (3).

4. An interpretation of $<$ validating (3) in Theorem 2 is more complicated to describe than an interpretation of $R$ supporting (2) in Theorem 1. In this connection, it is interesting to note the sentiment "any fool can realize a type, but it takes a model-theorist to omit one" expressed in Sacks 1972. The twist in Theorem 2 is that the "ontological promiscuity" in saturation arguments is avoided by a purely combinatorial argument (without resorting to any of the model-theorist's tools, such as completeness).
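The soundness claim in Remark 2 can be sanity-checked exhaustively on a very small universe. The sketch below is an illustration, not the paper's argument: it reads the right-hand side of (3) as defining a family of subsets from a relation `rel` (with `(x, y) in rel` standing for "x < y"), enumerates every transitive relation on a three-element set, and tests the filter conditions. All function names are hypothetical.

```python
from itertools import combinations, product

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_filter(V, H):
    """V in H, upward closure, and closure under intersection."""
    subsets = powerset(V)
    return (
        V in H
        and all(b in H for a in H for b in subsets if a <= b)
        and all((a & b) in H for a in H for b in H)
    )

def family_from_order(V, rel):
    """Sets S with: for all z there is y (y < z or y = z) such that
    every x < y lies in S. This mirrors the right-hand side of (3)."""
    def below(y):
        return {x for x in V if (x, y) in rel}

    def accepted(S):
        return all(
            any(below(y) <= S for y in V if (y, z) in rel or y == z)
            for z in V
        )
    return {S for S in powerset(V) if accepted(S)}

def transitive_relations(V):
    """Enumerate all transitive binary relations on V (small V only)."""
    pairs = list(product(sorted(V), repeat=2))
    for r in range(len(pairs) + 1):
        for chosen in combinations(pairs, r):
            rel = set(chosen)
            if all((a, d) in rel
                   for (a, b) in rel for (c, d) in rel if b == c):
                yield rel

V = frozenset(range(3))
```

The special case in Remark 1 corresponds to taking `rel` to be all pairs `(u, v)` with `u` in a fixed set `R`; the resulting family is then exactly the supersets of `R`.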
Proof of Theorem 2. Fix an $L(Q)$-model $(M, q)$ of $F[Q]$, and partition the set $\Phi$ of $L_M(Q)$-formulas with exactly one free variable $x$ as follows:

$\Phi^+ = \{\varphi \in \Phi : (M, q) \models Qx\varphi\}$
$\Phi^- = \{\psi \in \Phi : (M, q) \not\models Qx\psi\}$.

If for every $\varphi \in \Phi^+$, $(M, q) \models \forall x\varphi$, then we can set $\prec$ to $\{(a, a) : a \in |M|\}$ and we are done. Otherwise, choose a $\varphi_0 \in \Phi^+$ and $a_0 \in |M|$ such that

$(M, q) \not\models \varphi_0[a_0]$,

and let $\{\varphi_0, \varphi_1, \varphi_2, \varphi_3, \ldots\}$ be an enumeration of $\Phi$. We will define $\prec$ by finite approximations $\prec_i$ (for $i \geq 0$), with

$\prec\ = \bigcup_{i \geq 0} \prec_i$.

The plan is to construct for each $i \geq 0$ a finite transitive relation $\prec_i$ such that the following four conditions hold, with the understanding that

$\Phi_i^+ = \Phi^+ \cap \{\varphi_j : j \leq i\}$
$\Phi_i^- = \Phi^- \cap \{\varphi_j : j \leq i\}$
$E_i = \{a \in \mathrm{dom}(\prec_i) : (\forall b \prec_i a)\ a \prec_i b\}$ (= the set of $\prec_i$-minimal points).

(C1) $\prec_i\ \subseteq\ \prec_{i+1}$.

(C2) $\mathrm{dom}(\prec_{i+1} - \prec_i) \subseteq E_{i+1}$.

(C3) For every $a \in E_i$, $(M, q) \models \bigwedge \Phi_i^+[x/a]$.

(C4) There is a "witness" map $w_i : \Phi_i^- \to E_i$ such that for every $\psi \in \Phi_i^-$,

$(M, q) \models \neg\psi[x/w_i(\psi)]$,

and for every $j > i$, and every $a \in |M|$,

$a \prec_j w_i(\psi)$ implies $w_j(\psi) \prec_j a$.

What makes (C1)-(C4) interesting is

Lemma A. If $\prec$ is the union $\bigcup_i \prec_i$ of finite transitive relations $\prec_i$ satisfying (C1), (C2), (C3) and (C4), then the required equivalence (3) holds.

Proof. Suppose (first) that $(M, q) \models Qx\varphi$. Given $a \in |M|$, either there is some $a' \in |M|$ such that $a' \prec a$, or not. If not, then

$(M, q, \prec) \models (\exists y \sqsubseteq a)(\forall x < y)\, \varphi$

holds vacuously (by the definition of $\sqsubseteq$ as the disjunction of $<$ with equality). Otherwise, choose an $i$ such that $\varphi \in \Phi_i^+$ and an $a' \prec_i a$ which, by the transitivity and finiteness of $\prec_i$, belongs to $E_i$. Then by (C2) and (C3), $(M, q, \prec) \models (\forall x < a')\, \varphi$. Next, assume $(M, q) \not\models Qx\varphi$. Choose an $i$ such that $\varphi \in \Phi_i^-$, and conclude from (C1) and (C4) that

$(M, q, \prec) \models (\forall y \sqsubseteq z)(\exists x < y)\, \neg\varphi\ [z/w_i(\varphi)]$.

$\Box$ (Lemma A)
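To push the construction of $\prec$ through, the following will be useful.

Lemma B. For all $\varphi \in \Phi^+$, $\psi \in \Phi^-$, and every finite subset $A_0$ of $\{a \in |M| : |M| - \{a\} \in q\}$,

$\{a \in |M| : (M, q) \models (\varphi \wedge \neg\psi)[x/a]\} \not\subseteq A_0$.

Proof. Repeated applications of (Q3) give

$(M, q) \models Qx(\varphi \wedge \bigwedge_{a \in A_0} x \neq a)$,

and therefore, if $\psi \in \Phi^-$ then (Q2) implies

$(M, q) \not\models \forall x((\varphi \wedge \bigwedge_{a \in A_0} x \neq a) \supset \psi)$,

i.e.,

$(M, q) \not\models \forall x((\varphi \wedge \neg\psi) \supset \bigvee_{a \in A_0} x = a)$,

as required. $\Box$ (Lemma B)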
Looking more closely at $\{a \in |M| : |M| - \{a\} \in q\}$, observe that (Q2) implies

$\{a \in |M| : |M| - \{a\} \notin q\} = \bigcap q$

and

Lemma C. For every $a \in \bigcap q$ and every $\varphi \in \Phi^+$, $(M, q) \models \varphi[x/a]$.

Proof. If $(M, q) \not\models \varphi[x/a]$, then

$\{b \in |M| : (M, q) \models \varphi[x/b]\} \subseteq |M| - \{a\}$,

so that $\varphi \in \Phi^+$ and (Q2) imply $|M| - \{a\} \in q$ (i.e., $a \notin \bigcap q$). $\Box$ (Lemma C)
Next, define

$I_i = \{a \in |M| - E_i : (\exists b \in |M|)\ b \prec_i a\}$ (= the image of $\prec_i$ minus $E_i$)

and add to the list (C1)-(C4)

(C5) For every $a \in I_i$, there is a $\varphi \in \Phi_i^+$ such that $(M, q) \not\models \varphi[a]$.

(C6) The witness map $w_i$ mentioned in (C4) is surjective (onto $E_i$) and has the additional property that for all $\psi, \psi' \in \Phi_i^-$ such that $\psi \neq \psi'$, if $w_i(\psi) = w_i(\psi')$, then $w_i(\psi) \in \bigcap q$.

Let us turn finally to the definition of $\prec_i$ (and $w_i$). The initial stage $i = 0$ is trivial: since $\varphi_0 \in \Phi^+$, we can set $\prec_0 = \emptyset$ (whence $E_0 = \emptyset = w_0$). Now, consider stage $i + 1$.

Case 1: $\varphi_{i+1} \in \Phi^+$. Let $N = \{a \in E_i : (M, q) \not\models \varphi_{i+1}[x/a]\}$. By Lemma C, for every $a \in N$, $a \notin \bigcap q$. Hence by (C6), each $a \in N$ has a unique $\psi_a \in \Phi_i^-$ such that $w_i(\psi_a) = a$. Next, apply (Q3) and Lemma B to define a function $new : N \to |M|$ such that for every $a \in N$,

$(M, q) \models ((\neg\psi_a) \wedge \bigwedge \Phi_{i+1}^+)[x/new(a)]$

(whence $new(a) \notin I_i$ by (C5)) and

$new(a) \in (E_i - N) \cup \{new(a') : a' \in N - \{a\}\}$ implies $new(a) \in \bigcap q$.

Then set $\prec_{i+1}$ to be the transitive closure of

$\prec_i \cup \{(new(a), a) : a \in N\} \cup \{(new(a), new(a)) : a \in N\}$

and define $w_{i+1} : \Phi_{i+1}^- \to E_{i+1}$ by

$w_{i+1}(\psi) = new(w_i(\psi))$ if $w_i(\psi) \in N$, and $w_{i+1}(\psi) = w_i(\psi)$ otherwise,

for every $\psi \in \Phi_{i+1}^-$ ($= \Phi_i^-$ by the case assumption).

Case 2: $\varphi_{i+1} \in \Phi^-$. Writing $\psi$ for $\varphi_{i+1}$, choose, by (Q3) and Lemma B, an $a$ such that

$(M, q) \models ((\neg\psi) \wedge \bigwedge \Phi_i^+)[x/a]$

and

$a \in \{w_i(\chi) : \chi \in \Phi_i^-\}$ implies $a \in \bigcap q$.

Then set

$\prec_{i+1} = \prec_i \cup \{(a, a_0)\} \cup \{(a, a)\}$
$w_{i+1} = w_i \cup \{(\psi, a)\}$

(recalling that $a_0$ was the element chosen at the outset satisfying $\neg\varphi_0$, and that $a \notin I_i$, by (C5)).

These two cases together yield the following picture of $\prec_i$, for $i > 0$. At the top is $a_0$ (chosen to satisfy $\neg\varphi_0$ where $\varphi_0 \in \Phi^+$). Coming out of $a_0$ are branches, each with exactly one tip (i.e., an element of $E_i$). Each tip satisfies all of $\Phi_i^+$. Moreover, each $\psi \in \Phi_i^-$ is refuted at some tip (given by $w_i$). For different $\psi$ and $\psi'$ in $\Phi_i^-$, either the corresponding branches meet only at $a_0$ or else the corresponding branches are the same, and will not grow further (in $\prec_j$, for $j > i$) because the tip of the branch satisfies all of $\Phi^+$ (Lemma C). A branch witnessing $\psi$ in $\Phi_i^-$ will grow further in $\prec_j$, where $j > i$, only in case the tip of that branch violates some $\varphi \in \Phi_j^+$ (which is not in $\Phi_i^+$). With this picture, verifying conditions (C1) through (C6) becomes routine. $\Box$
2.3 The binary case (from $A$ to $|\!\sim$)

Theorems 1 and 2 generalize to binary quantifiers (with only notational complications) as follows. To step from a unary quantifier up to a binary quantifier, an $L$-model $M$ is paired with a binary relation $q \subseteq \mathrm{Pow}(|M|) \times \mathrm{Pow}(|M|)$ such that

$(M, q) \models Qx(\varphi, \psi)[f]$ iff $\{a \in |M| : (M, q) \models \varphi[f^a_x]\}\ q\ \{a \in |M| : (M, q) \models \psi[f^a_x]\}$.

The weak completeness and compactness theorems of Keisler 1970 lift immediately to this setting (as worked out, for instance, in Westerståhl 1989). The filter schemes (Q1), (Q2) and (Q3) turn into

$Qx(\varphi, \varphi)$
$\forall x(\psi \supset \psi') \supset (Qx(\varphi, \psi) \supset Qx(\varphi, \psi'))$
$Qx(\varphi, \psi) \wedge Qx(\varphi, \psi') \supset Qx(\varphi, \psi \wedge \psi')$

respectively, with line (2) becoming

(4) $Qx(\varphi, \psi) \equiv \forall x((\varphi \wedge R_{\varphi,x}(x, y)) \supset \psi)$

where $y$ lists the free variables $\neq x$ in $\varphi$ as well as $\psi$. Line (4) supports a reading of the formula $Qx(\varphi, \psi)$ as "for all relevant $\varphi$-$x$'s, $\psi$." Theorem 1 can be lifted to these binary forms, under the additional condition that $R$ is "extensionalized" so that

$\forall x y(\varphi \equiv \varphi') \supset \forall x y(R_{\varphi,x}(x, y) \equiv R_{\varphi',x}(x, y))$

for all $L(Q)$-formulas $\varphi$ and $\varphi'$ with the same free variables $x, y$. Similar remarks apply to Theorem 2. (I.e., the relation symbol $<$ must also be relativized to the antecedent $\varphi$ and the quantified variable $x$, although its extension can be arranged to depend on $\varphi$ only up to $\equiv$.)
3 Between preferences and probabilities: quasi-filters
Having upgraded a \normality" predicate R into a \preference" relation 12 :
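$A \in H$ iff $\mu(A) > \mu(\overline{A})$

Given $\alpha \in [0, 1]$, call a family $H \subseteq F$ $\alpha$-sizable if there is a probability function $\mu$ on $F$ such that for every $A \in F$,

$A \in H$ iff $\mu(A) > \alpha$.

Let us record some properties of $\alpha$-sizable families differentiating them from filters.

Proposition 3. (i) Sizable families verify

(Half)   $\dfrac{\overline{A} \notin H \quad A \cup B \notin H}{A \cup \overline{B} \in H}$.

(ii) For $n \geq 1$, let

(Cov$_n$)   $\bigvee_{\sigma : \{1, 2, \ldots, n\} \to \{+, -\}} \Bigl(\bigcup_{i=1}^{n} A_i^{\sigma(i)}\Bigr) \in H$

where $A_i^+ = A_i$ and $A_i^- = \overline{A_i}$. Then (Cov$_n$) is valid for $\alpha$-sizable families iff $\alpha < 1 - 2^{-n}$.

Proof. Part (i) is trivial: $\mu(\overline{A}) \leq \frac{1}{2}$ and $\mu(A \cup B) \leq \frac{1}{2}$ imply $\mu(A \cup \overline{B}) = 1$, since

$\mu(A \cup \overline{B}) = \mu(A - \overline{B}) + \mu(\overline{B}) = \mu(B) - \mu(B - A) + \mu(\overline{B}) = 1 - \mu(B - A)$

where $\mu(B - A) = 0$ (because $\mu(\overline{A}) \leq \frac{1}{2}$ and $\mu(A \cup B) \leq \frac{1}{2}$).

Part (ii) can be proved by the following clever argument I owe to N. Alechina (considerably simplifying my original proof). The set $V$ can be partitioned into the family

$\{A_1^{\sigma(1)} \cap \cdots \cap A_n^{\sigma(n)} \mid \sigma : \{1, 2, \ldots, n\} \to \{+, -\}\}$

of $2^n$ disjoint pieces. Hence, for any probability function $\mu$, there must be at least one function $\sigma$ such that $\mu(A_1^{\sigma(1)} \cap \cdots \cap A_n^{\sigma(n)}) \leq 2^{-n}$, i.e., $\mu(A_1^{-\sigma(1)} \cup \cdots \cup A_n^{-\sigma(n)}) \geq 1 - 2^{-n}$ (where $-+ = -$ and $-- = +$). $\Box$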
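Both parts of Proposition 3 can be probed numerically on a four-point space. The sketch below rests on several assumptions that are not in the text: the measure is given by hypothetical point weights, "sizable" in part (i) is read as threshold $\frac{1}{2}$ (as the proof suggests), and (Half) is read as "if the complement of $A$ and $A \cup B$ both lie outside $H$, then $A$ together with the complement of $B$ lies in $H$".

```python
from fractions import Fraction
from itertools import combinations, product

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def sizable(V, weights, alpha):
    """H = {A : mu(A) > alpha} for the measure given by point weights."""
    def mu(A):
        return sum((weights[a] for a in A), Fraction(0))
    return {A for A in powerset(V) if mu(A) > alpha}

def half_holds(V, H):
    """(Half), as read above, checked over all pairs A, B of subsets."""
    subsets = powerset(V)
    return all(
        (A | (V - B)) in H
        for A in subsets for B in subsets
        if (V - A) not in H and (A | B) not in H
    )

def cov_holds(V, H, n):
    """(Cov_n): for every A_1..A_n, some union of signed A_i lies in H."""
    subsets = powerset(V)
    for As in product(subsets, repeat=n):
        if not any(
            frozenset().union(*(A if s else V - A
                                for A, s in zip(As, signs))) in H
            for signs in product([True, False], repeat=n)
        ):
            return False
    return True

V = frozenset(range(4))
SKEWED = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
UNIFORM = {a: Fraction(1, 4) for a in V}
```

With the uniform weights, (Cov$_2$) holds at threshold 7/10 but fails at 3/4, matching the bound $1 - 2^{-2}$; and (Half) holds at threshold 1/2 but fails at 4/5.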
3.2 A generalization of filters

To avoid (Half) and (Cov$_n$), we quantify $\alpha$ away as follows. Call $H$ additive if it is $\alpha$-sizable for some $\alpha$. That is, $H$ is additive iff for some probability function $\mu : F \to [0, 1]$, and $\alpha \in [0, 1]$, $H = \{A \in F : \mu(A) > \alpha\}$. Rewriting (Up) (from §1), with $\mu(A) \leq \mu(B)$ in place of $A \subseteq B$, we get

(Up)$_\mu$   $\dfrac{A \in H \quad \mu(A) \leq \mu(B)}{B \in H}$.

Note, however, that unless we know what $\mu$ is, we cannot assert (Up)$_\mu$. Could it be then that the best we could do to characterize additive families is to assert (Up)? The following counter-example shows that there is more structure in additive families to account for.

(†) Let $V = \{1, 2, 3, 4\}$ and $H = \{A \subseteq V : \{1, 3\} \subseteq A \text{ or } \{2, 4\} \subseteq A\}$. Although $H$ verifies (Up), it is not additive: were it induced by $\mu$, then as $\{1, 3\} \in H$ and $\{1, 2\} \notin H$, $\mu(\{3\}) > \mu(\{2\})$; but as $\{2, 4\} \in H$ and $\{3, 4\} \notin H$, $\mu(\{2\}) > \mu(\{3\})$.

To strengthen (Up), some notation is useful saying (roughly) that a sequence $A_1, A_2, \ldots, A_n$ of sets in $F$ is heavier than a sequence $B_1, B_2, \ldots, B_n$ (of equal length). With that in mind, let us write

$\sum_{i=1}^{n} A_i \geq \sum_{i=1}^{n} B_i$

to mean that for every $a \in V$,

$\sum_{i=1}^{n} \chi_{A_i}(a) \geq \sum_{i=1}^{n} \chi_{B_i}(a)$

where $\chi_A$ is the characteristic function of $A$ (mapping elements of $A$ to 1, and elements of $V - A$ to 0). Now, consider the condition

($*$) for all sequences $A_1, \ldots, A_n$ and $B_1, \ldots, B_n \in F$,

$\sum_{i=1}^{n} A_i \geq \sum_{i=1}^{n} B_i$ implies $(\exists i \in \{1, 2, \ldots, n\})\ A_i \in H$ or $B_i \notin H$.
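The condition just stated can be tested by brute force on the counter-example (†). The sketch below is a hypothetical illustration that assumes the reading "the $A$-sequence is pointwise at least as heavy as the $B$-sequence, yet no $A_i$ is in $H$ and every $B_i$ is in $H$" for a violation; the function names are not from the paper.

```python
from itertools import combinations, product

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# The family from counter-example (†).
V = frozenset({1, 2, 3, 4})
H = {A for A in powerset(V)
     if frozenset({1, 3}) <= A or frozenset({2, 4}) <= A}

def heavier(As, Bs, V):
    """Pointwise comparison of sums of characteristic functions."""
    return all(
        sum(a in A for A in As) >= sum(a in B for B in Bs)
        for a in V
    )

def star_violations(V, H, n):
    """Pairs of n-sequences (As, Bs) with As heavier than Bs,
    every A_i outside H, and every B_i inside H."""
    outside = [A for A in powerset(V) if A not in H]
    inside = [B for B in powerset(V) if B in H]
    return [
        (As, Bs)
        for As in product(outside, repeat=n)
        for Bs in product(inside, repeat=n)
        if heavier(As, Bs, V)
    ]
```

For $n = 1$ no violation exists, since the family is upward closed; for $n = 2$ the sequences $\{1,2\}, \{3,4\}$ and $\{1,3\}, \{2,4\}$ cover every point exactly once each and form a violation, in line with the non-additivity argument given for (†).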
(Up) is just the case $n = 1$ (as $A_1 \geq B_1$ just means $A_1 \supseteq B_1$). Example (†) above violates

$\dfrac{A \cup C \notin H \quad B \cup (C - A) \in H \quad A \cup (D - B) \in H}{B \cup D \in H}$

which follows from the case $n = 2$ of ($*$).

Theorem 4. Assume $F$ is finite. Then a family $H \subseteq F$ is additive iff it validates ($*$).

Proof. Let 0

Let $A$ be the $\hat{m} \times \hat{n}$ matrix $(\alpha_{ij})$, and appeal to the following well-known fact from linear algebra (e.g., Strang 1980, p. 333)

($**$) $Ax \geq b$ has a nonnegative solution $x$ iff $yA \geq 0$, $yb < 0$ has no solution $y \leq 0$.

But $Ax > 0$ has a solution iff $Ax \geq 1$ has a solution (since there are only finitely many variables $x_1, \ldots, x_{\hat{n}}$ constituting $x$). So, setting $b$ to 1 and multiplying the right hand side of ($**$) by $-1$, we can associate with (P) the "dual" (i.e., its negation)

(D) there exist real numbers $y_1, \ldots, y_{\hat{m}} \geq 0$, not all 0, such that

$\sum_{i=1}^{\hat{m}} y_i \alpha_{ij} \leq 0$ for every $j \in \{1, \ldots, \hat{n}\}$.

The reals in (P) and (D) can be assumed to be rationals, and after clearing the denominators to make the $y_i$'s in (D) positive integers, the negation of (D) can be expressed as

($*'$) for all sequences $A_1, \ldots, A_n$ and $B_1, \ldots, B_n \in F$, $(\forall i \in \{1, 2, \ldots, n\})\ A_i$