0. Actually the discussion of [4] is more general, concerning the entropy of a convex corner, i.e. a convex K ⊆ ( 0,
$\sum_j w_j = 1$ and $A_1 \prec \cdots \prec A_r$ distinct maximal antichains of $P$.
Proof. We first prove existence. Note that in any representation
\[ a = \sum_A w_A 1_A \tag{14} \]
of $a := a(P)$ as a convex combination of (indicator vectors of) antichains $A$, all $A$'s in the support of $w$ must be maximal, since expanding any of these antichains gives a strictly better $a$. Given a representation as in (14), choose, if possible, $A, A'$ incomparable under $\prec$ with $0 < w_A \le w_{A'}$. (If no such choice exists, then (14) is the desired representation.) Let $B = \min(A \cup A')$, $B' = \max(A \cup A')$ (where $\min X$ and $\max X$ are the sets of minimal and maximal elements of $X \subseteq P$). Then $B, B'$ are antichains, and each element of $P$ appears the same number of times in $B, B'$ as in $A, A'$. Thus with $w'$ given by
\[ w'_C = \begin{cases} w_C - w_A & \text{if } C = A \text{ or } A' \\ w_C + w_A & \text{if } C = B \text{ or } B' \\ w_C & \text{otherwise,} \end{cases} \]
$\sum_C w'_C 1_C$ again represents $a$ as a convex combination of (necessarily) maximal antichains. This will complete the proof of existence provided we can show this procedure doesn't cycle. One way to see this is to fix a linear extension $\propto$ of $\prec$, and order functions $u : A \to$ [...] $0\}$, $A = \min(P^+) \supseteq A_1$ and $\alpha = \min\{a(x) : x \in A\}$. Then
\[ A_1 = A, \quad w_1 = \alpha. \tag{15} \]
For suppose first that $A_1 \ne A$, and let $x \in A \setminus A_1$. Then $x \in A_i$ for some $i > 1$, contradicting the assumption $A_1 \prec A_i$. If instead $A_1 = A$ but $w_1 = \beta < \alpha$, then $\sum_{i=2}^{t} w_i 1_{A_i}$ is a laminar decomposition of the function $a' : P \to$ [...] $y_j\}$, $\min \emptyset = l + 1$,
\[ g(j) = \max\{i \in [l] : x_i < y_j\}, \quad \max \emptyset = 0, \]
\[ U(i) = \{j \in [t] : f(j) = i\}, \quad |U(i)| = u_i, \]
\[ V(i) = \{j \in [t] : g(j) = i\}, \quad |V(i)| = v_i. \]
Note $k_j > 0$ for all $j = 1, \dots, t$ since $C$ is a maximal chain.

Lemma 3.1 If $t < n/7$ and $P$ has no cut point (element comparable to all others), then there is $j \in [t]$ such that $k_j \ge 3$ and
\[ \sum_{i \in K(j)} (u_i + v_i) \le k_j . \tag{21} \]
Proof. Suppose this is false, and consider a minimal $T' \subset T$ for which
\[ \bigcup_{j : y_j \in T'} K_j = [l] . \]
We may assume $T' = \{y_1, \dots, y_r\}$. By our assumption we have
\[ \sum_{i \in K(j)} (u_i + v_i) \ge k_j - 2 \quad \text{for } j = 1, \dots, r, \]
so
\[ \sum_{j \in [r]} \sum_{i \in K(j)} (u_i + v_i) \ge \sum_{j \in [r]} k_j - 2r . \]
But the right hand side here is at least $l - 2t$ (since $\bigcup_{j \in [r]} K_j = [l]$), while the left hand side is at most $\sum_{i \in [l]} 2(u_i + v_i) \le 4t$ (since the minimality of $T'$ implies that no $i$ is in more than two of $K(1), \dots, K(r)$). This gives $6t \ge l$, contradicting the assumption $t < n/7$. □
Lemma 3.2 Suppose $P$ has no cutpoint and (as a set of relations) is maximal with given entropy. Then if $t < n/7$ there exist $j \in [t]$ and $i \in [l]$ such that $P' := P(x_i < y_j < x_{i+1})$ satisfies $e(P') \le (k_j - 1)^{-1} e(P)$ and
\[ nH(\bar P) \le nH(\bar P') + 2 \log(2k_j + 1) . \]

Proof. Let $y_j$ be as in Lemma 3.1 and $K_j = \{x_h < \dots < x_m\}$. Choose $i \in \{h, \dots, m-1\}$ with
\[ \Pr(x_i < y_j < x_{i+1}) := \frac{e(P(x_i < y_j < x_{i+1}))}{e(P)} \]
minimum, and set $P' = P(x_i < y_j < x_{i+1})$. Then clearly $e(P') \le (k_j - 1)^{-1} e(P)$. On the other hand, the maximality of $P$ implies that $P'$ differs from $P$ only in the at most $2k_j$ new relations involving $y_j$ (cf. (21)), that is, $z < y_j \Rightarrow z < x_{i+1}$ and $w > y_j \Rightarrow w > x_i$. (For suppose $z < y_j$. Note that by maximality there is an antichain $A$ in the laminar decomposition of $a(P)$ with $x_i, y_j \in A$. Since $x_{i+1} > x_i$, all antichains in the laminar decomposition which contain $x_{i+1}$ follow $A$, and similarly all those containing $z$ precede $A$. But then, again using maximality, we have $z < x_{i+1}$.) Thus, according to Theorem 2.1,
\[ nH(\bar P) \le nH(\bar P') + (2k_j + 1)\, H\!\left(\frac{1}{2k_j + 1}\right) \le nH(\bar P') + 2 \log(2k_j + 1) \]
(where $H(z) := -z \log z - (1 - z) \log(1 - z)$ is the entropy function). □

Proof of (20). We may, of course, assume that $P$ is maximal with given entropy. We retain the notation introduced above and induct on $n$ and $t$, the result being obvious if either $n = 1$ or $t = 0$. We assume therefore that $n > 1$ and $t > 0$. If $P$ has a cutpoint $x$, then we finish by induction since $nH(\bar P) = (n - 1)H(P \setminus \{x\})$ and $e(P) = e(P \setminus \{x\})$; so we assume this is not the case. We next observe that the easy inequality $e(P) \ge 2^t$ allows us to assume $t < n/7$, since otherwise (20) follows from (19). We now have the hypotheses of Lemma 3.2, so also its conclusion. Since (inducting on $t$) (20) is true for $P'$, we have
\[ \begin{aligned}
nH(\bar P) &\le nH(\bar P') + 2 \log(2k_j + 1) \\
&\le (1 + 7 \log e) \log e(P') + 4 \log(k_j + 1) \\
&\le (1 + 7 \log e) \log e(P) + (8 - (1 + 7 \log e)) \log(k_j - 1) \\
&\le (1 + 7 \log e) \log e(P) ,
\end{aligned} \]
completing the proof.
□

There are various possibilities for extending the lower bounds here, of which we mention just one:

Conjecture 3.3 If $l = l(P)$ is the length of a longest chain in $P$, then
\[ \mathrm{vol}(VP(P)) \ge (l^l / l!)\, 2^{-nH(P)} . \]
This would improve the constant in (20) to $1 + \log e$. Notice it is tight for any union of a chain and an antichain.
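The constants in the final chain of inequalities in the proof of (20) can be sanity-checked numerically. The following is a minimal sketch, assuming logarithms are base 2 (which agrees with the value 3 log 3 − 4 ≈ .755 quoted in the Remark of Section 4); the function and variable names are illustrative and do not come from the paper.

```python
import math

def binary_entropy(z: float) -> float:
    """H(z) = -z log z - (1 - z) log(1 - z), with logs base 2 (an assumption)."""
    if z <= 0.0 or z >= 1.0:
        return 0.0
    return -z * math.log2(z) - (1.0 - z) * math.log2(1.0 - z)

def entropy_step_ok(k: int) -> bool:
    """Check (2k+1) H(1/(2k+1)) <= 2 log(2k+1) <= 4 log(k+1) for one k >= 1."""
    m = 2 * k + 1
    return m * binary_entropy(1.0 / m) <= 2 * math.log2(m) <= 4 * math.log2(k + 1)

def absorb_step_ok(k: int) -> bool:
    """Check 4 log(k+1) <= 8 log(k-1); this is where k >= 3 is needed."""
    return 4 * math.log2(k + 1) <= 8 * math.log2(k - 1) + 1e-12

# Coefficient of log(k_j - 1) in the third line of the chain; it has to be
# negative for the last inequality to discard that term.
LOG2_E = math.log2(math.e)
leftover_coefficient = 8 - (1 + 7 * LOG2_E)
```

The second check fails for k = 2, which is consistent with the requirement k_j ≥ 3 in Lemma 3.1.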
4 Offense
Here we prove Theorem 1.2. To put the task of locating a good comparison in some perspective, let us first mention two curious examples:

Suppose $P$ consists of two disjoint and unrelated chains of size $k = n/2$. Comparison of the minima of the two chains then turns out to be a good comparison in our sense, forcing an entropy increase of about $1/n$. But comparison of the middle elements is not good (it gives only an $O(n^{-2})$ increase), even though it splits the extensions perfectly.

On the other hand, suppose $P$ is the poset on $\{x_1, \dots, x_k, y_1, \dots, y_k\}$ ($n = 2k$) with relations $x_i < y_j$ iff $i = j$ or $i = 1$. Then the comparison $x_1 : x_2$ is good in our sense, but does a poor job of splitting extensions.

For the proof of Theorem 1.2, it's more natural to work in the complement, showing that we can decrease $H(P)$ by some specified amount (say $\varepsilon$), since for this we only need to exhibit some $b' \in VP(P')$ for which
\[ -\frac{1}{n} \sum_k \log b'_k \le H(P) - \varepsilon \]
(with $P'$ the new poset). For example, suppose $x_i, x_j$ are minimal in $P$,
\[ b = b(P) = \sum_{m=1}^{s} z_m 1_{B_m} \]
(with each $B_k$ a chain of $P$), and let $P' = P(x_i < x_j)$,
\[ B'_k = \begin{cases} B_k \cup \{x_i\} & \text{if } x_j \in B_k \\ B_k & \text{otherwise.} \end{cases} \]
Then
\[ b' = \sum_{m=1}^{s} z_m 1_{B'_m} \in VP(P'), \]
$b'_i = b_i + b_j$, and $H(P') \ge H(P) + \log(1 + b_j/b_i)$. This already gives Theorem 1.2 if there are minimal $x_i, x_j$ with the ratios $b_i/b_j$, $b_j/b_i$ bounded. In general, if the new covering relation is $x_i < x_j$, we may modify the weight function $z$ by transferring some fraction (say $\mu$) of the weight on each chain $B$ (of $P$) containing $x_j$ to a chain (of $P'$) obtained by replacing the portion of $B$ below $x_j$ by a chain with largest element $x_i$. The effect of this procedure is quantified in

Lemma 4.1 For any incomparable $x_i, x_j \in P$ and $\mu \in [0, 1]$, and $w_k$'s as in Proposition 2.4, the entropy of $P' = P(x_i < x_j)$ satisfies
\[ nH(P') \ge nH(P) + \log\Bigl(1 + \mu \sum_{k=1}^{\beta_i} w_k / a_j\Bigr) + \log\Bigl(1 - \mu \sum_{k=1}^{\alpha_j - 1} w_k / a_j\Bigr) \]
(assuming the right hand side is defined).

Proof. Let
\[ b = b(P) = \sum_{m=1}^{s} z_m 1_{B_m} \]
with $B_1, \dots, B_s$ chains of $P$ and $x_j \in B_m$ iff $m \in [t]$. Also, denote
\[ b_{q,j} = \sum_{m : x_q, x_j \in B_m} z_m \]
and for $m \in [t]$
\[ C_m = B_m \setminus \{x \in P : x < x_j\} . \]
Now fix a chain $C = \{x_{i_1} < \dots < x_{i_h} = x_i\}$ such that
\[ \sum_{p=1}^{h} a_{i_p} = \sum_{k=1}^{\beta_i} w_k , \tag{22} \]
and consider the chains $B'_m$ and weights $z'_m$ given by
\[ \begin{aligned}
B'_m &= B_m, & 1 \le m \le s, \\
B'_{m+s} &= C \cup C_m, & 1 \le m \le t, \\
z'_{m+s} &= \mu z_m, \quad z'_m = (1 - \mu) z_m, & 1 \le m \le t, \\
z'_m &= z_m, & t + 1 \le m \le s.
\end{aligned} \]
(That is, we transfer a $\mu$-fraction of the $z$-weight of each $B_m$ containing $x_j$ to the associated chain $C \cup C_m$.) Then
\[ b' = \sum_{m=1}^{s+t} z'_m 1_{B'_m} \in VP(P') \]
is easily seen to satisfy
\[ b'_q = \begin{cases} b_q + \mu (b_j - b_{q,j}) & \text{if } x_q \in C \\ b_q - \mu b_{q,j} & \text{if } x_q \notin C,\ x_q < x_j \\ b_q & \text{otherwise.} \end{cases} \]
Thus by the definition of $H(\bar P)$, we have
\[ \begin{aligned}
nH(P') - nH(P) = nH(\bar P) - nH(\bar P') &\ge -\sum_{q=1}^{n} (\log b_q - \log b'_q) \\
&\ge \sum_{q : x_q \in C} \log(1 + \mu b_j / b_q) + \sum_{q : x_q < x_j} \log(1 - \mu b_{q,j} / b_q) \\
&\ge \log\Bigl(1 + \mu b_j \sum_{q : x_q \in C} 1/b_q\Bigr) + \log\Bigl(1 - \mu \sum_{q : x_q < x_j} b_{q,j} / b_q\Bigr)
\end{aligned} \]
where in the second inequality we use $\log(1 + u - v) \ge \log(1 + u) + \log(1 - v)$ for nonnegative real numbers $u, v$, and in the third inequality we inductively use $\log(1 + u) + \log(1 + v) \ge \log(1 + u + v)$ for all real numbers $u, v$ with $uv \ge 0$.
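Both elementary inequalities follow from expanding a product: $(1 + u)(1 - v) = 1 + u - v - uv \le 1 + u - v$ when $u, v \ge 0$, and $(1 + u)(1 + v) = 1 + u + v + uv \ge 1 + u + v$ when $uv \ge 0$. As a hedged illustration, the sketch below spot-checks them on random inputs; the helper names, sampling ranges and tolerance are assumptions made for the example, with ranges chosen so that every logarithm is defined.

```python
import math
import random

def split_inequality_holds(u: float, v: float) -> bool:
    """log(1 + u - v) >= log(1 + u) + log(1 - v) for u >= 0 and 0 <= v < 1."""
    return math.log(1 + u - v) >= math.log(1 + u) + math.log(1 - v) - 1e-12

def merge_inequality_holds(u: float, v: float) -> bool:
    """log(1 + u) + log(1 + v) >= log(1 + u + v) for uv >= 0 and 1 + u + v > 0."""
    return math.log(1 + u) + math.log(1 + v) >= math.log(1 + u + v) - 1e-12

rng = random.Random(0)  # fixed seed so the check is reproducible

split_ok = all(
    split_inequality_holds(rng.uniform(0, 10), rng.uniform(0, 0.99))
    for _ in range(10_000)
)
merge_ok_nonnegative = all(
    merge_inequality_holds(rng.uniform(0, 10), rng.uniform(0, 10))
    for _ in range(10_000)
)
# Both arguments nonpositive, kept small enough that 1 + u + v stays positive.
merge_ok_nonpositive = all(
    merge_inequality_holds(-rng.uniform(0, 0.49), -rng.uniform(0, 0.49))
    for _ in range(10_000)
)
```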
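On the other hand, using $a_i b_i = 1/n$ and (12),
\[ \sum_{q : x_q \in C} 1/b_q = \sum_{q : x_q \in C} n a_q = n \sum_{k=1}^{\beta_i} w_k , \]
and
\[ \begin{aligned}
\sum_{q : x_q < x_j} b_{q,j} / b_q &= n \sum_{q : x_q < x_j} a_q b_{q,j} \\
&= n \sum_{q : x_q < x_j} \; \sum_{k : x_q \in A_k} w_k b_{q,j} \\
&= n \sum_{k=1}^{\alpha_j - 1} w_k \sum_{x_q \in A_k} b_{q,j} \\
&\le n \sum_{k=1}^{\alpha_j - 1} w_k b_j
\end{aligned} \]
where the inequality holds since $A_k$ is an antichain. Therefore,
\[ \begin{aligned}
nH(P') - nH(P) &\ge \log\Bigl(1 + \mu b_j \sum_{q : x_q \in C} 1/b_q\Bigr) + \log\Bigl(1 - \mu \sum_{q : x_q < x_j} b_{q,j} / b_q\Bigr) \\
&\ge \log\Bigl(1 + \mu n \sum_{k=1}^{\beta_i} w_k b_j\Bigr) + \log\Bigl(1 - \mu n \sum_{k=1}^{\alpha_j - 1} w_k b_j\Bigr) \\
&= \log\Bigl(1 + \mu \sum_{k=1}^{\beta_i} w_k / a_j\Bigr) + \log\Bigl(1 - \mu \sum_{k=1}^{\alpha_j - 1} w_k / a_j\Bigr) .
\end{aligned} \]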
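□

Also, we need the following easy lemma.

Lemma 4.2 Given $0 < \varepsilon_1, \varepsilon_2 < 1$, choose $i$ with $a_i$ as large as possible subject to
\[ \sum_{k=1}^{\alpha_i - 1} w_k \le \varepsilon_1 a_i \]
and let $t$ be the smallest number for which
\[ \sum_{k=\alpha_i}^{t} w_k \ge \varepsilon_2 a_i . \]
Then for any $x_j \in A_t \setminus \{x_i\}$, $a_j$ [...] $a_i$. Then by the choices of $a_i$ and $t \ge \alpha_j$,
\[ \varepsilon_1 a_j < \sum_{k=1}^{\alpha_j - 1} w_k = \sum_{k=1}^{\alpha_i - 1} w_k + \sum_{k=\alpha_i}^{\alpha_j - 1} w_k < \varepsilon_1 a_i + \varepsilon_2 a_i . \]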
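□

Proof of Theorem 1.2. Notice first of all that we may assume $P$ has no cut point, since if it does then the Theorem follows by induction using the fact that (for any cut point $x$)
\[ nH(\bar P) = (n - 1) H(P \setminus \{x\}) . \]
For $\varepsilon_1 = 1/4$, $\varepsilon_2 = 1/3$, take $x_i, x_j$ as in Lemma 4.2. Also, let $\delta := \sum_{k=1}^{\alpha_i - 1} w_k / a_i \le \varepsilon_1$. Then by Lemmas 4.1 and 4.2, for $P' = P(x_j > x_i)$ and
\[ \mu := \frac{\varepsilon_1 a_j}{(\varepsilon_1 + \varepsilon_2) a_i} \le 1 , \]
\[ \begin{aligned}
nH(P') - nH(P) &\ge \log\Bigl(1 + \mu \sum_{k=1}^{\beta_i} w_k / a_j\Bigr) + \log\Bigl(1 - \mu \sum_{k=1}^{\alpha_j - 1} w_k / a_j\Bigr) \\
&\ge \log\Bigl(1 + \mu (\delta + 1) \frac{a_i}{a_j}\Bigr) + \log\Bigl(1 - \mu (\delta + \varepsilon_2) \frac{a_i}{a_j}\Bigr) \\
&\ge \log\Bigl(1 + \frac{\varepsilon_1 - \varepsilon_1 \varepsilon_2 - \varepsilon_1^2 - \varepsilon_1^3}{\varepsilon_1 + \varepsilon_2}\Bigr) \\
&= \log\Bigl(1 + \frac{17}{112}\Bigr) .
\end{aligned} \]
On the other hand, for $P'' = P(x_i > x_j)$ and $\mu = 1$, Lemma 4.1 (with the roles of $i$ and $j$ exchanged) and the choice of $x_j$ imply
\[ \begin{aligned}
nH(P'') - nH(P) &\ge \log\Bigl(1 + \sum_{k=1}^{\beta_j} w_k / a_i\Bigr) + \log\Bigl(1 - \sum_{k=1}^{\alpha_i - 1} w_k / a_i\Bigr) \\
&\ge \log(1 + (\delta + \varepsilon_2)) + \log(1 - \delta) \\
&\ge \log\Bigl(1 + \frac{3}{16}\Bigr) ,
\end{aligned} \]
completing the proof. □

Remark As shown by the poset with three elements and one relation, the value of $c$ in Theorem 1.2 cannot be increased beyond $3 \log 3 - 4 \approx .755$.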
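The numerical constants appearing in the proof of Theorem 1.2 and in the Remark can be reproduced with exact rational arithmetic. The sketch below assumes the parameter values 1/4 and 1/3 for the two epsilons and base-2 logarithms; the variable names and the grid over delta are illustrative choices, not part of the paper.

```python
from fractions import Fraction
import math

EPS1 = Fraction(1, 4)  # assumed value of epsilon_1
EPS2 = Fraction(1, 3)  # assumed value of epsilon_2

# First case: (eps1 - eps1*eps2 - eps1^2 - eps1^3) / (eps1 + eps2).
first_case_gain = (EPS1 - EPS1 * EPS2 - EPS1**2 - EPS1**3) / (EPS1 + EPS2)

def second_case_factor(delta: Fraction) -> Fraction:
    """(1 + delta + eps2) * (1 - delta), the product inside the logarithm."""
    return (1 + delta + EPS2) * (1 - delta)

# delta ranges over [0, eps1]; scan a grid that contains both endpoints.
delta_grid = [EPS1 * Fraction(i, 1000) for i in range(1001)]
worst_second_case = min(second_case_factor(d) for d in delta_grid)

# Constant from the Remark, with the logarithm taken base 2 (an assumption).
remark_constant = 3 * math.log2(3) - 4
```

On this grid the minimum of the second factor is attained at delta = 1/4, the largest admissible value, which is where the bound 1 + 3/16 comes from.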
5 Defense
Here we prove Theorem 1.3. The reader may check that the Theorem is sharp whenever $x, y$ are isolated elements of $P$. The proof of Theorem 1.4 is again based on the laminar decomposition of $a(P)$. The effect on this decomposition of adding a relation $x < y$ is that some of the $A_l$'s may no longer be antichains (in the new partial order). However when this happens, because of the nature of the decomposition, at least one of $A_l \setminus \{x\}$, $A_l \setminus \{y\}$ will be an antichain. The proof consists of showing that for at least one of the answers to the comparison $x : y$ we may modify the decomposition by such deletions to produce an $a'$ in the chain polytope of the new poset with $-(1/n) \sum_i \log a'_i \le H(P) + 2/n$. (In most cases, the correct procedure is to replace $A_l$ by the two antichains $A_l \setminus \{x\}$, $A_l \setminus \{y\}$, dividing the weight $w_l$ between them.) For the proof we use $x_1$ and $x_2$ in place of $x$ and $y$, and retain the notations $A_k$, $w_k$, $\alpha_k$ and $\beta_k$ used earlier.

Proof of Theorem 1.3. Without loss of generality we may assume $\alpha_1 \le \alpha_2$. We consider three cases.

Case 1: $\alpha_2 > \beta_1$. Set $P' := P(x_2 > x_1)$. Then for all $x_k \le x_1$ and $x_l \ge x_2$, we have
\[ \alpha_l \ge \alpha_2 > \beta_1 \ge \beta_k . \]
Thus $A_1, \dots, A_r$ are still antichains of $P'$. This implies $H(P') = H(P)$.

Case 2: $\alpha_1 \le \alpha_2 \le \beta_1 \le \beta_2$. Set $P' := P(x_2 > x_1)$ and consider
\[ \begin{aligned}
A'_m &= A_m \setminus \{x_1\}, \quad A''_m = A_m \setminus \{x_2\} && \text{if } \alpha_2 \le m \le \beta_1 \\
A'_m &= A_m && \text{otherwise.}
\end{aligned} \]
Since
\[ \alpha_l > \beta_2 \ge \beta_1, \quad \beta_k < \alpha_1 \le \alpha_2 \quad \text{if } x_k < x_1,\ x_l > x_2 , \]
the sets defined above are antichains of $P'$. Now define $w'$ by
\[ \begin{aligned}
w'_m &= w''_m = w_m / 2 && \text{if } \alpha_2 \le m \le \beta_1 \\
w'_m &= w_m && \text{otherwise.}
\end{aligned} \]
Then
\[ a' := \sum_m w'_m 1_{A'_m} + \sum_{\alpha_2 \le m \le \beta_1} w''_m 1_{A''_m} \]
belongs to $VP(P')$ and satisfies $a'_1 \ge a_1/2$, $a'_2 \ge a_2/2$ and $a'_k = a_k$ if $k \ne 1, 2$. Thus
\[ H(P') \le -\frac{1}{n} \sum_i \log a'_i \le -\frac{1}{n} \sum_i \log a_i + \frac{2}{n} . \]

Case 3: $\alpha_1 \le \alpha_2 \le \beta_2 \le \beta_1$. Without loss of generality, we may assume
\[ \sum_{k=\alpha_1}^{\alpha_2 - 1} w_k \ge \sum_{k=\beta_2 + 1}^{\beta_1} w_k . \]
Again set $P' = P(x_2 > x_1)$ and
\[ \begin{aligned}
A'_m &= A_m \setminus \{x_1\}, \quad A''_m = A_m \setminus \{x_2\} && \text{if } \alpha_2 \le m \le \beta_2 \\
A'_m &= A_m \setminus \{x_1\} && \text{if } \beta_2 < m \le \beta_1 \\
A'_m &= A_m && \text{otherwise.}
\end{aligned} \]
Since for all $x_k < x_1$ and $x_l > x_2$, we have
\[ \beta_k < \alpha_1 \le \alpha_2 < \alpha_l , \]
the sets defined above are antichains of $P'$. Now define $w'$ by
\[ \begin{aligned}
w'_m &= w''_m = w_m / 2 && \text{if } \alpha_2 \le m \le \beta_2 \\
w'_m &= w_m && \text{otherwise.}
\end{aligned} \]
Then the vector
\[ a' := \sum_m w'_m 1_{A'_m} + \sum_{\alpha_2 \le m \le \beta_2} w''_m 1_{A''_m} \]
belongs to $VP(P')$ and satisfies $a'_1 \ge a_1/2$, $a'_2 = a_2/2$ and $a'_k = a_k$ if $k \ne 1, 2$. Thus
\[ H(P') \le -\frac{1}{n} \sum_i \log a'_i \le -\frac{1}{n} \sum_i \log a_i + \frac{2}{n} . \]
□
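The additive term $2/n$ in Cases 2 and 3 is what halving two coordinates of $a$ costs in the quantity $-(1/n) \sum_i \log a_i$. The sketch below illustrates this arithmetic on a made-up vector; the vector, the helper names and the use of base-2 logarithms are assumptions for the example, and the vector is not claimed to be $a(P)$ for any particular poset.

```python
import math

def neg_mean_log2(a):
    """Return -(1/n) * sum(log2 a_i) for a vector of positive numbers."""
    return -sum(math.log2(x) for x in a) / len(a)

def halve_coordinates(a, indices):
    """Return a copy of `a` with the coordinates listed in `indices` halved."""
    return [x / 2 if i in indices else x for i, x in enumerate(a)]

# Hypothetical weight vector, for illustration only.
a = [0.5, 0.25, 1.0, 0.75, 0.5]
a_prime = halve_coordinates(a, {0, 1})

# Halving two coordinates raises the quantity by 2/n when logs are base 2.
increase = neg_mean_log2(a_prime) - neg_mean_log2(a)
```

In the proof the modified vector satisfies $a'_1 \ge a_1/2$ and $a'_2 \ge a_2/2$, so the increase computed here is the worst case.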
Another Proof of Theorem 1.3. Set
\[ \begin{aligned}
U &= \{x \in P : x < x_1\}, & V &= \{x \in P : x > x_1\}, \\
W &= \{x \in P : x < x_2\}, & Z &= \{x \in P : x > x_2\}
\end{aligned} \]
and choose a chain $K \subseteq U$ of $P$ with the weight
\[ w(K) := \sum_{x_i \in K} a_i \]
as large as possible. Similarly, choose chains $L \subseteq V$, $M \subseteq W$ and $N \subseteq Z$ with maximum weights. Then since the chain polytope of $P$ is $VP(P)$,
\[ w(K) + w(L) + a_1 \le 1 , \]
\[ w(M) + w(N) + a_2 \le 1 . \]
Therefore,
\[ w(K) + w(N) + (a_1 + a_2)/2 \le 1 \tag{23} \]
or
\[ w(L) + w(M) + (a_1 + a_2)/2 \le 1 . \tag{24} \]
Without loss of generality we may assume that (23) is true. It is enough to show that the vector $a'$ with
\[ a'_i = \begin{cases} a_i / 2 & \text{if } i = 1, 2 \\ a_i & \text{otherwise} \end{cases} \]
is in the chain polytope of $P' := P(x_1 < x_2)$. Suppose $Q$ is a maximal chain of $P'$. If $\{x_1, x_2\} \not\subseteq Q$ then it is easy, by maximality of $Q$, to see that $Q$ is a chain of $P$. Thus $a'_i \le a_i$ for all $i$ implies
\[ w'(Q) := \sum_{i : x_i \in Q} a'_i \le w(Q) \le 1 . \]
If $\{x_1, x_2\} \subseteq Q$ then set $K' = \{x \in Q : x <_{P'} x_1\}$, $N' = \{x \in Q : x >_{P'} x_2\}$. Note that $K' \subset U$, $N' \subset Z$ are chains of $P$ and $Q = K' \cup N' \cup \{x_1, x_2\}$ since there is no element $x$ such that $x_1$