arXiv:1405.2466v1 [math.PR] 10 May 2014
ASYMPTOTIC STRUCTURE AND SINGULARITIES IN CONSTRAINED DIRECTED GRAPHS

DAVID ARISTOFF AND LINGJIONG ZHU

Abstract. We study the asymptotics of large directed graphs, constrained to have certain densities of edges and/or outward p-stars. Our models are close cousins of exponential random graph models (ERGMs), in which edges and certain other subgraph densities are controlled by parameters. The idea of directly constraining edge and other subgraph densities comes from Radin and Sadun [24]. Such modeling circumvents a phenomenon first made precise by Chatterjee and Diaconis [3]: that in ERGMs it is often impossible to independently constrain edge and other subgraph densities. In all our models, we find that large graphs have either uniform or bipodal structure. When edge density (resp. p-star density) is fixed and p-star density (resp. edge density) is controlled by a parameter, we find phase transitions corresponding to a change from uniform to bipodal structure. When both edge and p-star density are fixed, we find only bipodal structures and no phase transition.
Date: 10 May 2014. Revised: 10 May 2014.
2000 Mathematics Subject Classification. 05C80, 82B26, 05C35.
Key words and phrases. dense random graphs, exponential random graphs, graph limits, entropy, phase transitions.

1. Introduction

In this article we study the asymptotics of large directed graphs with constrained densities of edges and outward directed p-stars. Large graphs are often modeled by probabilistic ensembles with one or more adjustable parameters; see e.g. Fienberg [6, 7], Lovász [19] and Newman [20]. The exponential random graph models (ERGMs), in which parameters are used to tune the densities of edges and other subgraphs, are one such class of models; see e.g. Besag [2], Frank and Strauss [9], Holland and Leinhardt [12], Newman [20], Rinaldo et al. [27], Robins et al. [28], Snijders et al. [29], Strauss [30], and Wasserman and Faust [32].

It has been shown that in many ERGMs the subgraph densities actually cannot be tuned. For example, for the class of ERGMs parametrized by edges and p-stars, large graphs are essentially Erdős-Rényi for all values of the parameters. (See Chatterjee and Diaconis [3] for more complete and precise statements.) An alternative to ERGMs was introduced by Radin and Sadun [24], where instead of using parameters to control subgraph counts, the subgraph densities are controlled directly; see also Radin et al. [25], Radin and Sadun [26] and Kenyon et al. [14]. This is the approach we take in this article. We also consider models which split the difference between the two approaches.

We find that, in all our models, graphs have either uniform or bipodal structure as the number of nodes becomes infinite. Our approach, following refs. [14, 24, 25, 26], is to study maximizers of the entropy or free energy of the model as the number of nodes becomes infinite. When we constrain both edge and p-star densities, we find only bipodal structure (except when the p-star density is fixed to be exactly equal to the pth power of the edge density). When we constrain either edge or p-star densities (but not both), we find both uniform and bipodal structures, with a sharp change at the interface. This is in contrast with the situation in the ERGM version of the model, in which one finds only uniform structures, albeit with sharp changes in their densities along a certain curve in parameter space; see Aristoff and Zhu [1].

Such sharp changes are called phase transitions. Phase transitions have recently been proved rigorously for ERGMs; see e.g. Yin [33] and especially Radin and Yin [23] for a precise definition of the term. Some earlier works using mean-field analysis and other approximations include Häggström and Jonasson [11] and Park and Newman [21, 22]. The terminology is apt, in that ERGMs and our models are inspired by certain counterparts in statistical physics: respectively, the grand canonical ensemble, and the microcanonical and canonical ensembles. See the discussion in Section 2.

This article is organized as follows. In Section 2, we describe our models and compare them with their statistical physics counterparts. In Section 3, we state our main results. In Section 4, we prove a large deviations principle for edge and p-star densities. We use the large deviations principle to give proofs, in Section 5, of our main results.
2. Description of the models

A directed graph on n nodes will be represented by a matrix $X = (X_{ij})_{1\le i,j\le n}$, where $X_{ij} = 1$ if there is a directed edge from node i to node j, and $X_{ij} = 0$ otherwise. For simplicity, we allow for $X_{ii} = 1$, though this will not affect our results. Let e(X) (resp. s(X)) be the probability that a random map from a single edge (resp. an outward p-star) into X is a homomorphism. More precisely,

$$e(X) := n^{-2}\sum_{1\le i,j\le n} X_{ij}, \qquad s(X) := n^{-p-1}\sum_{1\le i,j_1,j_2,\ldots,j_p\le n} X_{ij_1}X_{ij_2}\cdots X_{ij_p}. \tag{2.1}$$
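As a small illustration of (2.1), the following sketch computes both densities for an adjacency matrix stored as a list of 0/1 rows. The function names are hypothetical and not from the paper. The only fact used beyond (2.1) is that the sum over $j_1,\ldots,j_p$ factorizes, so the p-star count of node i is the pth power of its out-degree.

```python
def edge_density(X):
    """e(X): fraction of ordered pairs (i, j) with X[i][j] == 1."""
    n = len(X)
    return sum(sum(row) for row in X) / n**2


def star_density(X, p):
    """s(X): the sum over j_1..j_p in (2.1) factorizes into (out-degree)**p."""
    n = len(X)
    return sum(sum(row) ** p for row in X) / n ** (p + 1)


# A 3-node example: node 0 points to nodes 1 and 2, node 2 points to node 0.
X = [
    [0, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
]
print(edge_density(X))     # 3 edges out of 9 ordered pairs
print(star_density(X, 2))  # (2**2 + 0**2 + 1**2) / 3**3
```

By Jensen's inequality the two outputs always satisfy s(X) ≥ e(X)^p, which is why only part of the unit square is attainable as a pair (e, s).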
We call e(X) and s(X) the edge and outward p-star homomorphism densities of X. Here, p is an integer ≥ 2. Let $P_n$ be the uniform probability measure on the set of directed graphs on n nodes. Thus, $P_n$ is the uniform probability measure on the set of n × n matrices with entries in {0, 1}. For e, s ∈ [0, 1] and δ > 0, define

$$\psi_n^\delta(e,s) = \frac{1}{n^2}\log P_n\big(e(X)\in(e-\delta,e+\delta),\ s(X)\in(s-\delta,s+\delta)\big).$$

We are interested in the limit

$$\psi(e,s) := \lim_{\delta\to 0^+}\lim_{n\to\infty}\psi_n^\delta(e,s). \tag{2.2}$$
The quantity in (2.2) will be called the limiting entropy density. For β1, β2 ∈ R, e, s ∈ [0, 1] and δ > 0, define

$$\psi_n^\delta(e,\beta_2) = \frac{1}{n^2}\log E_n\left[e^{n^2\beta_2 s(X)}\,1_{\{e(X)\in(e-\delta,e+\delta)\}}\right],$$

$$\psi_n^\delta(\beta_1,s) = \frac{1}{n^2}\log E_n\left[e^{n^2\beta_1 e(X)}\,1_{\{s(X)\in(s-\delta,s+\delta)\}}\right].$$

We will also be interested in the limits

$$\psi(e,\beta_2) := \lim_{\delta\to 0^+}\lim_{n\to\infty}\psi_n^\delta(e,\beta_2), \qquad \psi(\beta_1,s) := \lim_{\delta\to 0^+}\lim_{n\to\infty}\psi_n^\delta(\beta_1,s). \tag{2.3}$$

The quantities in (2.3) will be called limiting free energy densities. We will prove in Section 5 that the limits in (2.2) and (2.3) exist. It turns out that our models are closely related to their ERGM cousin, in which the quantity of interest is

$$\psi(\beta_1,\beta_2) := \lim_{n\to\infty}\frac{1}{n^2}\log E\left[e^{n^2(\beta_1 e(X)+\beta_2 s(X))}\right]. \tag{2.4}$$
The limit in (2.4) (also called a limiting free energy density) was studied extensively in Aristoff and Zhu [1], where it was shown that ψ(β1, β2) is analytic except along a certain phase transition curve β2 = q(β1), corresponding to an interface across which the edge density changes sharply. See Radin and Yin [23] for similar results in the undirected graph setting. In this article we investigate properties of the limiting entropy density and free energy densities, which we find are closely related to the limit (2.4).

In statistical physics modeling there is a hierarchy almost completely analogous to (2.2)-(2.3)-(2.4), with (for example) particle density and energy density in place of e(X) and s(X), and temperature and chemical potential in place of β1 and β2. The statistical physics versions of (2.2)-(2.3)-(2.4) are called the microcanonical, canonical and grand canonical ensembles, respectively. In that setting, there are curves like q along which the free energy densities are not analytic, and these correspond to physical phase transitions, for example the familiar solid/liquid and liquid/gas transitions; see Gallavotti [10]. (There is no proof of this statement, though it is widely believed; see however Lebowitz et al. [16].) It is well-known that the models in this hierarchy have a very simple relationship involving Legendre transforms. In the random graph setting, it has been shown (see Radin and Sadun [24]) that this part of the analogy fails. Still, we find that our models (2.2)-(2.3) are closely related to (2.4).

3. Results

To state our results we need the following. For x ∈ [0, 1], define

$$\ell(x) = \beta_1 x + \beta_2 x^p - x\log x - (1-x)\log(1-x),$$

$$I(x) = x\log x + (1-x)\log(1-x) + \log 2.$$

Of course ℓ depends on β1 and β2, but we omit this to simplify notation. Clearly, ℓ and I are analytic in (0, 1) and continuous on [0, 1]. The function ℓ is essential to understanding the ERGM limiting free energy density [1, 23].
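The two functions just defined are easy to evaluate numerically. The sketch below is illustrative only: the helper names (`xlogx`, `argmax_on_grid`) are assumptions introduced here, and the grid search is a crude stand-in for a proper optimizer. It uses the convention 0 log 0 = 0, which matches the continuity of ℓ and I on [0, 1].

```python
import math


def xlogx(x):
    """x * log(x) with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0.0 else x * math.log(x)


def I(x):
    """I(x) = x log x + (1 - x) log(1 - x) + log 2 on [0, 1]."""
    return xlogx(x) + xlogx(1.0 - x) + math.log(2.0)


def ell(x, beta1, beta2, p):
    """ell(x) = beta1 x + beta2 x^p - x log x - (1 - x) log(1 - x)."""
    return beta1 * x + beta2 * x**p - xlogx(x) - xlogx(1.0 - x)


def argmax_on_grid(beta1, beta2, p, m=10001):
    """Crude global maximizer of ell on a uniform grid of m points in [0, 1]."""
    xs = [k / (m - 1) for k in range(m)]
    return max(xs, key=lambda x: ell(x, beta1, beta2, p))


# With beta1 = beta2 = 0, ell is the binary entropy, maximized at x = 1/2.
print(argmax_on_grid(0.0, 0.0, 2))
```

Note that ℓ(x) = β1 x + β2 x^p − I(x) + log 2, so maximizing ℓ is the same as maximizing β1 x + β2 x^p − I(x).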
Theorem 1 (Radin and Yin [23], Aristoff and Zhu [1]). For each (β1, β2) the function ℓ has either one or two local maximizers. There is a curve β2 = q(β1), β1 ≤ β1^c, with the endpoint

$$(\beta_1^c,\beta_2^c) = \left(\log(p-1) - \frac{p}{p-1},\ \frac{p^{p-1}}{(p-1)^p}\right),$$

such that off the curve and at the endpoint, ℓ has a unique global maximizer, while on the curve away from the endpoint, ℓ has two global maximizers $0 < x_1 < x_2 < 1$. The curve q is continuous, decreasing, convex, and analytic for β1 < β1^c, with

$$q'(\beta_1) = -\frac{x_2 - x_1}{x_2^p - x_1^p}.$$

We consider x1 and x2 as functions of β1 (or β2) for β1 < β1^c (or β2 > β2^c); as such, x1 and x2 are analytic. Moreover, x1 (resp. x2) is increasing (resp. decreasing) in β1, with

$$\lim_{\beta_1\to-\infty} x_1 = 0, \qquad \lim_{\beta_1\to-\infty} x_2 = 1, \qquad \lim_{\beta_1\to\beta_1^c} x_1 = \frac{p-1}{p} = \lim_{\beta_1\to\beta_1^c} x_2. \tag{3.1}$$
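For p = 2 the objects in Theorem 1 can be computed explicitly, which gives a quick sanity check. The sketch below rests on an observation that is not stated in the text: for p = 2 and β1 = −β2, ℓ(x) = β2 x(x − 1) plus the binary entropy, which is symmetric about x = 1/2, so the two global maximizers satisfy x1 + x2 = 1 and the curve is q(β1) = −β1. The function names and the bisection bracket (which limits the routine to roughly 2 < β2 < 30) are assumptions of this sketch.

```python
import math


def critical_point(p):
    """Endpoint (beta1^c, beta2^c) of the phase transition curve in Theorem 1."""
    return (math.log(p - 1) - p / (p - 1), p ** (p - 1) / (p - 1) ** p)


def ell(x, beta1, beta2, p):
    """ell(x) for 0 < x < 1."""
    return beta1 * x + beta2 * x**p - x * math.log(x) - (1 - x) * math.log(1 - x)


def maximizers_p2(beta2):
    """Global maximizers x1 < x2 of ell for p = 2 on the curve beta1 = -beta2.

    Assumes 2 < beta2 < 30 so that the bracket below contains the root.
    """
    if not 2.0 < beta2 < 30.0:
        raise ValueError("this sketch assumes 2 < beta2 < 30")

    def ell_prime(x):
        # ell'(x) with beta1 = -beta2 and p = 2
        return beta2 * (2.0 * x - 1.0) - math.log(x / (1.0 - x))

    lo, hi = 0.5 + 1e-9, 1.0 - 1e-15  # ell' > 0 at lo and ell' < 0 at hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ell_prime(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    x2 = 0.5 * (lo + hi)
    return 1.0 - x2, x2


print(critical_point(2))
x1, x2 = maximizers_p2(3.0)
print(x1, x2, -(x2 - x1) / (x2**2 - x1**2))  # last value is q'(beta1)
```

For p = 2 the slope formula gives q′(β1) = −1/(x1 + x2) = −1, consistent with the straight line β2 = −β1 ending at the critical point.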
The curve in Theorem 1 will be called the phase transition curve, and its endpoint the critical point. Theorem 1 will be used extensively in most of our proofs; because of this, we will often not refer to it explicitly. (We comment that the last statement of Theorem 1, not made explicit in [23] and [1], is proved in Proposition 7 in Section 5 below.) Note that the last part of the theorem implies 0 < x1
Theorem 3. (i) There is a U-shaped region

$$U_e = \{(e,\beta_2) : x_1 < e < x_2,\ \beta_2 > \beta_2^c\}$$

whose closure has lowest point

$$(e^c,\beta_2^c) = \left(\frac{p-1}{p},\ \frac{p^{p-1}}{(p-1)^p}\right)$$

such that the limiting free energy density ψ = ψ(e, β2) is analytic outside ∂U_e. The limiting free energy density has the formula

$$\psi(e,\beta_2) = \begin{cases} \beta_2 e^p - I(e), & (e,\beta_2)\in U_e^c, \\[4pt] \beta_2\left[\dfrac{x_2-e}{x_2-x_1}\,x_1^p + \dfrac{e-x_1}{x_2-x_1}\,x_2^p\right] - \left[\dfrac{x_2-e}{x_2-x_1}\,I(x_1) + \dfrac{e-x_1}{x_2-x_1}\,I(x_2)\right], & (e,\beta_2)\in U_e, \end{cases} \tag{3.3}$$

where $0 < x_1 < x_2 < 1$ are the global maximizers of ℓ at the point $(q^{-1}(\beta_2),\beta_2)$ on the phase transition curve. In particular, ∂ψ/∂e has jump discontinuities across ∂U_e away from (e^c, β2^c), and ∂⁴ψ/∂e⁴ is discontinuous at (e^c, β2^c).

(ii) There is a U-shaped region

$$U_s = \{(\beta_1,s) : x_1^p < s < x_2^p,\ \beta_1 < \beta_1^c\}$$

whose closure has rightmost point

$$(\beta_1^c,s^c) = \left(\log(p-1) - \frac{p}{p-1},\ \left(\frac{p-1}{p}\right)^p\right)$$

such that the limiting free energy density ψ = ψ(β1, s) is analytic outside ∂U_s. The limiting free energy density has the formula

$$\psi(\beta_1,s) = \begin{cases} \beta_1 s^{1/p} - I(s^{1/p}), & (\beta_1,s)\in U_s^c, \\[4pt] \beta_1\left[\dfrac{x_2^p-s}{x_2^p-x_1^p}\,x_1 + \dfrac{s-x_1^p}{x_2^p-x_1^p}\,x_2\right] - \left[\dfrac{x_2^p-s}{x_2^p-x_1^p}\,I(x_1) + \dfrac{s-x_1^p}{x_2^p-x_1^p}\,I(x_2)\right], & (\beta_1,s)\in U_s, \end{cases} \tag{3.4}$$

where $0 < x_1 < x_2 < 1$ are the global maximizers of ℓ at the point (β1, q(β1)) on the phase transition curve. In particular, ∂ψ/∂s has jump discontinuities across ∂U_s away from (β1^c, s^c), and ∂⁴ψ/∂s⁴ is discontinuous at (β1^c, s^c).
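To make formula (3.3) concrete, here is a hedged numerical sketch for p = 2 only. It assumes, as a consequence of the symmetry of ℓ about x = 1/2 when p = 2 and β1 = −β2 (an observation not made in the text), that q^{-1}(β2) = −β2 and x1 = 1 − x2. The function names are hypothetical, and the root finder is only intended for 2 < β2 < 30.

```python
import math


def I(x):
    """I(x) = x log x + (1 - x) log(1 - x) + log 2, with 0 log 0 = 0."""
    def xlogx(t):
        return 0.0 if t == 0.0 else t * math.log(t)
    return xlogx(x) + xlogx(1.0 - x) + math.log(2.0)


def maximizers_p2(beta2):
    """x1 < x2 for p = 2 on the curve beta1 = -beta2 (assumes 2 < beta2 < 30)."""
    lo, hi = 0.5 + 1e-9, 1.0 - 1e-15
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if beta2 * (2.0 * mid - 1.0) - math.log(mid / (1.0 - mid)) > 0.0:
            lo = mid
        else:
            hi = mid
    x2 = 0.5 * (lo + hi)
    return 1.0 - x2, x2


def psi_e_beta2_p2(e, beta2):
    """Formula (3.3) for p = 2: uniform branch outside U_e, bipodal branch inside."""
    uniform = beta2 * e**2 - I(e)
    if beta2 <= 2.0:  # at or below beta2^c = 2 the region U_e is empty
        return uniform
    x1, x2 = maximizers_p2(beta2)
    if not x1 < e < x2:
        return uniform
    lam = (x2 - e) / (x2 - x1)
    return beta2 * (lam * x1**2 + (1 - lam) * x2**2) - (lam * I(x1) + (1 - lam) * I(x2))


# Inside U_e the bipodal value exceeds the uniform value beta2 * e^2 - I(e).
print(psi_e_beta2_p2(0.5, 3.0), 3.0 * 0.25 - I(0.5))
```

Inside U_e the formula is linear in e for fixed β2, which is the source of the vanishing higher derivatives used later in the regularity argument.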
Figure 1. Contour plot of ψ(e, s) for p = 2 (horizontal axis e, vertical axis s). The non-shaded portion represents the complement of D̄ (where by convention ψ equals −∞). Here, (e^c, s^c) = (1/2, 1/4).
The sharp change from uniform to bipodal behavior of the models in Theorem 2 is called a first order phase transition (because of the singularity in the first order derivative of ψ), and the bipodal structure in the regions Ue and Us is sometimes called a replica symmetry breaking phase. In contrast with Theorem 2, the x1 and x2 in Theorem 3 are functions of β1 or β2 only and do not depend on e or s. See Figure 1 for the region D from Theorem 2, and Figure 2 for the U-shaped regions Ue and Us in Theorem 3.
4. Large Deviations

We will need the following terminology before proceeding. A sequence $(Q_n)_{n\in\mathbb{N}}$ of probability measures on a topological space X is said to satisfy a large deviation principle with speed $a_n$ and rate function J : X → R if J is non-negative and lower semicontinuous, and for any measurable set A,

$$-\inf_{x\in A^o} J(x) \le \liminf_{n\to\infty}\frac{1}{a_n}\log Q_n(A) \le \limsup_{n\to\infty}\frac{1}{a_n}\log Q_n(A) \le -\inf_{x\in\bar A} J(x).$$

Here, $A^o$ is the interior of A and $\bar A$ is its closure. See e.g. Dembo and Zeitouni [5] or Varadhan [31].
Figure 2. Contour plots of ψ(β1, s) (axes β1 and s) and ψ(e, β2) (axes e and β2) for p = 2. The boundaries of the U-shaped regions, ∂U_s and ∂U_e, are outlined. Here, (β1^c, s^c) = (−2, 1/4) and (e^c, β2^c) = (1/2, 2).

We will equip the set G of measurable functions [0, 1] → [0, 1] with the cut norm, written || · || and defined by

$$\|g\| = \sup_A \left|\int_A g(x)\,dx\right|, \tag{4.1}$$

where the supremum is taken over measurable subsets A of [0, 1].
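The supremum in (4.1) is attained on the set where g is positive or on the set where g is negative, so for a step function (for instance a difference of two elements of G) it reduces to a finite computation. The sketch below illustrates this; the representation of a step function by `values` and `widths` is an assumption made for the example.

```python
def cut_norm_step(values, widths):
    """Cut norm (4.1) of the step function equal to values[k] on an interval
    of length widths[k]; the widths are assumed to sum to 1."""
    positive_part = sum(v * w for v, w in zip(values, widths) if v > 0)
    negative_part = sum(-v * w for v, w in zip(values, widths) if v < 0)
    # sup over A of |integral of g over A| is reached at A = {g > 0} or A = {g < 0}
    return max(positive_part, negative_part)


# g = +0.5 on [0, 1/2) and -0.5 on [1/2, 1): the integral of g is 0,
# but the cut norm is not, since A may select only the positive half.
print(cut_norm_step([0.5, -0.5], [0.5, 0.5]))
```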
Theorem 4. The sequence of probability measures $P_n(e(X)\in\cdot,\ s(X)\in\cdot)$ satisfies a large deviation principle on the space [0, 1]² with speed n² and rate function

$$\inf_{g\in G_{e,s}}\int_0^1 I(g(x))\,dx,$$

where $G_{e,s}$ is the set of measurable functions g : [0, 1] → [0, 1] such that

$$\int_0^1 g(x)\,dx = e, \qquad \int_0^1 g(x)^p\,dx = s.$$

By convention, I(x) = ∞ if x ∉ [0, 1], and the infimum is ∞ if $G_{e,s}$ is empty.

The following is an immediate consequence of Theorem 4.

Theorem 5. For any e, s ∈ [0, 1],

$$\psi(e,s) = \sup_{g\in G_{e,s}}\left(-\int_0^1 I(g(x))\,dx\right).$$
The next theorem is an easy consequence of Theorem 4 and Varadhan's lemma (see e.g. [5]).

Theorem 6. For any e, s ∈ [0, 1] and β1, β2 ∈ R,

$$\psi(e,\beta_2) = \sup_{g\in G_{e,\cdot}}\left[\beta_2\int_0^1 g(x)^p\,dx - \int_0^1 I(g(x))\,dx\right],$$

$$\psi(\beta_1,s) = \sup_{g\in G_{\cdot,s}}\left[\beta_1\int_0^1 g(x)\,dx - \int_0^1 I(g(x))\,dx\right],$$

where $G_{e,\cdot}$ (resp. $G_{\cdot,s}$) is the set of measurable functions g : [0, 1] → [0, 1] satisfying

$$\int_0^1 g(x)\,dx = e \qquad \left(\text{resp. } \int_0^1 g(x)^p\,dx = s\right).$$
From Theorem 4, the proofs of Theorem 5 and Theorem 6 are standard, so we omit them. Theorem 2 and Theorem 4 together imply that for a large number of nodes, the graph has the following behavior. Approximately n(x2 − e)/(x2 − x1) of the nodes each have on average nx1 outward pointing edges, while the other approximately n(e − x1)/(x2 − x1) nodes each have on average nx2 outward pointing edges. When x1 ≠ x2 we call this structure bipodal; otherwise we call it uniform.

Chatterjee and Varadhan [4] established large deviations for undirected random graphs on the space of graphons (see also Lovász and Szegedy [18]). Szemerédi's lemma was needed in order to establish the compactness needed for large deviations. Since our model consists of directed graphs, the results in [4] do not apply directly. Theorem 4 avoids these technical difficulties: it is a large deviations principle only on the space [0, 1]² of edge and star densities, instead of on the (quotient) function space of graphons. Our proof relies on the simplicity of our edge/directed p-star model; it cannot be easily extended to the case where edges and directed triangles (or other more complicated directed subgraphs) are constrained. We expect that these models can be handled by adapting the results of [4] to the directed case.

In Aristoff and Zhu [1], it was proved that

$$\lim_{n\to\infty}\frac{1}{n^2}\log E\left[e^{n^2(\beta_1 e(X)+\beta_2 s(X))}\right] = \sup_{0\le x\le 1}\left(\beta_1 x + \beta_2 x^p - I(x)\right). \tag{4.2}$$

Observe that the Gärtner-Ellis theorem cannot be used to obtain the large deviations principle in Theorem 4, due to the fact, first observed in Radin and Yin [23], that the right hand side of (4.2) is not differentiable. On the other hand, once we have established the large deviations principle in Theorem 4, we can use Varadhan's lemma to obtain an alternative expression for the limiting free energy,

$$\lim_{n\to\infty}\frac{1}{n^2}\log E\left[e^{n^2(\beta_1 e(X)+\beta_2 s(X))}\right] = \sup_{(e,s)\in[0,1]^2}\left(\beta_1 e + \beta_2 s + \psi(e,s)\right).$$

The limiting free energies in the directed and undirected models differ by only a constant factor of 1/2 (see Chatterjee and Diaconis [3] and Radin and Yin [23] for the undirected model, and Aristoff and Zhu [1] for the directed model). On the other hand, the limiting entropy density ψ(e, s) we obtain here differs nontrivially from the one recently obtained in Kenyon et al. [14] for undirected graphs.
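The bipodal structure described above can be illustrated by simulation. The following is a sketch under a simplifying assumption: it does not sample from the constrained uniform measure studied in the paper, but from a product measure in which a fraction λ of the nodes send each possible edge independently with probability x1 and the remaining nodes with probability x2. The function names, the seed, and the parameter values are all illustrative.

```python
import random


def sample_bipodal(n, lam, x1, x2, seed=0):
    """Adjacency matrix in which the first round(lam * n) nodes have out-edge
    probability x1 and the remaining nodes have out-edge probability x2."""
    rng = random.Random(seed)
    k = round(lam * n)
    return [
        [1 if rng.random() < (x1 if i < k else x2) else 0 for _ in range(n)]
        for i in range(n)
    ]


def densities(X, p):
    """Edge density e(X) and outward p-star density s(X) from (2.1)."""
    n = len(X)
    out_degrees = [sum(row) for row in X]
    return sum(out_degrees) / n**2, sum(d**p for d in out_degrees) / n ** (p + 1)


# lam = 1/2, x1 + x2 = 1: targets e = 0.5 and s = 0.5 * (x1**2 + x2**2), about 0.3
e_hat, s_hat = densities(sample_bipodal(300, 0.5, 0.2764, 0.7236), p=2)
print(e_hat, s_hat)
```

For large n the empirical densities concentrate near λx1 + (1 − λ)x2 and λx1^p + (1 − λ)x2^p, the two constraint values that appear in the proofs of Section 5.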
5. Proofs

We start with the proof of the large deviations principle of Section 4. Then we turn to the proofs of our main results in Section 3.

Proof of Theorem 4. Observe that $(X_{ij})_{1\le i,j\le n}$ are i.i.d. Bernoulli random variables that take the value 1 with probability 1/2 and 0 with probability 1/2. Therefore, the logarithm of the moment generating function of $X_{ij}$ is, for θ ∈ R,

$$\log E[e^{\theta X_{ij}}] = \log\left(\tfrac{1}{2}e^\theta + \tfrac{1}{2}\right) = \log(e^\theta + 1) - \log 2. \tag{5.1}$$

Its Legendre transform is

$$\sup_{\theta\in\mathbb{R}}\{\theta x - \log E[e^{\theta X_{ij}}]\} = x\log x + (1-x)\log(1-x) + \log 2 = I(x),$$

where by convention I(x) = ∞ for x ∉ [0, 1]. Let $Y_i$ be the ith entry of the vector

$$(X_{11}, X_{12}, \ldots, X_{1n}, X_{21}, X_{22}, \ldots, X_{2n}, \ldots, X_{n1}, X_{n2}, \ldots, X_{nn}).$$

Then, the $Y_i$ are i.i.d. Bernoulli random variables. Mogulskii's theorem (see e.g. Dembo and Zeitouni [5]) shows that

$$P\left(\frac{1}{n^2}\sum_{i=1}^{\lfloor n^2 x\rfloor} Y_i \in\cdot,\ 0\le x\le 1\right)$$

satisfies a sample path large deviation principle on the space $L^\infty[0,1]$ consisting of functions on [0, 1] equipped with the supremum norm; the rate function is given by

$$I(G) = \begin{cases}\displaystyle\int_0^1 I(G'(x))\,dx & \text{if } G\in AC_0[0,1], \\ +\infty & \text{otherwise,}\end{cases}$$

where $AC_0[0,1]$ is the set of absolutely continuous functions defined on [0, 1] such that G(0) = 0 and 0 ≤ G′(x) ≤ 1. The restriction 0 ≤ G′(x) ≤ 1 comes from the fact that 0 ≤ $Y_i$ ≤ 1. On the other hand, for any ε > 0,

$$\begin{aligned}
&\limsup_{n\to\infty}\frac{1}{n^2}\log P\left(\sup_{0\le x\le 1}\left|\frac{1}{n^2}\sum_{i=1}^{\lfloor nx\rfloor}\sum_{j=1}^{n} X_{ij} - \frac{1}{n^2}\sum_{i=1}^{\lfloor n^2 x\rfloor} Y_i\right| \ge \epsilon\right) \\
&\qquad\le \limsup_{n\to\infty}\frac{1}{n^2}\log P\left(\sup_{0\le i\le n^2-n}\{Y_{i+1}+Y_{i+2}+\cdots+Y_{i+n}\} \ge n^2\epsilon\right) = -\infty,
\end{aligned}$$

since

$$\sup_{0\le i\le n^2-n}\{Y_{i+1}+Y_{i+2}+\cdots+Y_{i+n}\} \le n.$$

Therefore,

$$P\left(\frac{1}{n^2}\sum_{i=1}^{\lfloor nx\rfloor}\sum_{j=1}^{n} X_{ij}\in\cdot,\ 0\le x\le 1\right)$$

satisfies a large deviation principle with the same space and rate function as

$$P\left(\frac{1}{n^2}\sum_{i=1}^{\lfloor n^2 x\rfloor} Y_i\in\cdot,\ 0\le x\le 1\right).$$
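The Legendre transform identity used at the start of this proof can be checked numerically. The sketch below maximizes the concave function θ ↦ θx − log E[e^{θX}] by ternary search and compares the result with I(x). The search interval [−40, 40] and the iteration count are arbitrary choices made for the example; the maximizing θ is log(x/(1 − x)), which lies well inside that interval for the values tested.

```python
import math


def log_mgf(theta):
    """Log moment generating function (5.1) of a Bernoulli(1/2) variable."""
    return math.log((math.exp(theta) + 1.0) / 2.0)


def I(x):
    """Closed form of the rate function for 0 < x < 1."""
    return x * math.log(x) + (1.0 - x) * math.log(1.0 - x) + math.log(2.0)


def legendre_numeric(x, lo=-40.0, hi=40.0, iters=200):
    """Approximate sup over theta of theta * x - log_mgf(theta) by ternary search."""
    def objective(theta):
        return theta * x - log_mgf(theta)

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))


for x in (0.1, 0.3, 0.5, 0.9):
    print(x, legendre_numeric(x), I(x))
```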
To complete the proof, we need to use the contraction principle; see e.g. Dembo and Zeitouni [5] or Varadhan [31]. Given $G\in AC_0[0,1]$, we may write $G(x) = \int_0^x g(y)\,dy$ for a measurable function g : [0, 1] → [0, 1]. It is easy to see that if $G_n\in AC_0[0,1]$ and $G_n\to G$ in the supremum norm, then $g_n\to g$ in the cut norm. Hence, if $G_n\to G$ in the supremum norm, then

$$\int_0^1 g_n(x)\,dx \to \int_0^1 g(x)\,dx.$$
Moreover, for any p ≥ 2, Z Z 1 p p gn (x) − g(x) dx = 0
Z gn (x) − g(x) dx − p
gn ≥g
p
gn 0, |An,δ | → 0 as n → ∞.
(5.5)
Now, observe that

$$\begin{aligned}
\left|\int_0^1 F(g(x))\,dx - \int_0^1 F(g_n(x))\,dx\right|
&= \left|\int_0^1 F(g(x))\,dx - \int_0^1 F(g_n\circ\sigma_n(x))\,dx\right| \\
&\le 2|A_{n,\delta}|\sup_{x\in[0,1]}|F(x)| + \left|\int_{[0,1]\setminus A_{n,\delta}}\left[F(g(x)) - F(g_n\circ\sigma_n(x))\right]dx\right|.
\end{aligned} \tag{5.6}$$

Let ε > 0. Using uniform continuity of F, let δ > 0 be such that |x − y| < δ implies |F(x) − F(y)| < ε. Then (5.6) shows that

$$\left|\int_0^1 F(g(x))\,dx - \int_0^1 F(g_n(x))\,dx\right| \le 2|A_{n,\delta}|\sup_{x\in[0,1]}|F(x)| + \epsilon.$$
Now (5.5) establishes (5.4), as desired. These arguments show that there is a global maximizer g̃ of

$$\psi(e,s) = \sup_{\tilde g\in\tilde G_{e,s}}\left(-\int_0^1 I(g(x))\,dx\right). \tag{5.7}$$

If $(e,s)\in D^{up}$, then for any $\tilde g\in\tilde G_{e,s}$ we have

$$|\{x : \tilde g(x)\in\{0,1\}\}| = 1.$$

If $(e,s)\in\bar D\setminus D^{up}$, it is not hard to see that, since I′(x) → −∞ as x → 0 and I′(x) → ∞ as x → 1, any optimizer g̃ will satisfy

$$|\{x : \tilde g(x)\in\{0,1\}\}| = 0.$$

Now for $(e,s)\in\bar D\setminus D^{up}$, it is standard in the calculus of variations that maximizers of (5.7) exist and are stationary points of the functional

$$\tilde g\mapsto -\int_0^1 I(g(x))\,dx + \beta_1\int_0^1 g(x)\,dx + \beta_2\int_0^1 g(x)^p\,dx, \tag{5.8}$$

where β1 and β2 are Lagrange multipliers. It is straightforward to compute that the first variation of (5.8) is

$$-\int_0^1 I'(g(x))\,\delta g(x)\,dx + \beta_1\int_0^1\delta g(x)\,dx + p\beta_2\int_0^1 g(x)^{p-1}\,\delta g(x)\,dx = \int_0^1\delta g(x)\,\ell'(g(x))\,dx.$$

Thus the stationary points of (5.8) satisfy, for some β1, β2,

$$\ell'(g(x)) = 0 \quad\text{for a.e. } x\in[0,1].$$
Bipodal structure of the optimizers. By Theorem 1, for each pair β1, β2, either ℓ′(x) = 0 at a unique x ∈ (0, 1), or ℓ′(x) = 0 at two values $0 < y_1 < y_2 < 1$. Since we have seen that the global maximizer g̃ of (5.2) either satisfies g(x) ∈ {0, 1} for a.e. x ∈ [0, 1] or ℓ′(g(x)) = 0 for a.e. x ∈ [0, 1], it suffices to maximize

$$-\int_0^1 I(g(x))\,dx$$

over the set of functions of the form

$$g(x) = \begin{cases} y_1, & x\in A, \\ y_2, & x\notin A,\end{cases}$$

where |A| = λ ∈ [0, 1] and 0 ≤ y1 ≤ y2 ≤ 1. Observe that for such g,

$$-\int_0^1 I(g(x))\,dx = -\lambda I(y_1) - (1-\lambda)I(y_2),$$

$$\int_0^1 g(x)\,dx = \lambda y_1 + (1-\lambda)y_2, \qquad \int_0^1 g(x)^p\,dx = \lambda y_1^p + (1-\lambda)y_2^p.$$

It is therefore enough to maximize

$$-\lambda I(y_1) - (1-\lambda)I(y_2) \tag{5.9}$$

subject to the constraints

$$\lambda y_1 + (1-\lambda)y_2 = e, \qquad \lambda y_1^p + (1-\lambda)y_2^p = s. \tag{5.10}$$

Introducing Lagrange multipliers β1, β2, we see that the stationary points satisfy

$$\begin{aligned}
-\lambda I'(y_1) + \lambda\beta_1 + \lambda p\beta_2 y_1^{p-1} &= 0, \\
-(1-\lambda)I'(y_2) + (1-\lambda)\beta_1 + (1-\lambda)p\beta_2 y_2^{p-1} &= 0, \\
-\left(I(y_1) - I(y_2)\right) + \beta_1(y_1-y_2) + \beta_2(y_1^p-y_2^p) &= 0.
\end{aligned}$$

Suppose λ ∉ {0, 1}. Then the last display implies that

$$\ell'(y_1) = \ell'(y_2) = 0, \qquad \ell(y_1) = \ell(y_2).$$
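For p = 2 the constraints (5.10) together with the stationarity conditions above can be solved in closed form, which gives a convenient test case. The sketch below relies on an assumption derived here rather than quoted from the text: for p = 2 the two equal-height critical points of ℓ satisfy y1 + y2 = 1 (by symmetry of ℓ about x = 1/2 when β1 = −β2), so eliminating λ from (5.10) gives y1 y2 = e − s. The function name and return convention are hypothetical.

```python
import math


def bipodal_p2(e, s):
    """Solve (5.10) for p = 2 assuming y1 + y2 = 1; returns (y1, y2, lam).

    Requires e**2 <= s <= e. When y1 == y2 the structure is uniform and
    lam is returned as 1.0 by convention.
    """
    if not (e * e <= s <= e):
        raise ValueError("need e**2 <= s <= e")
    product = e - s  # equals y1 * y2 = y1 * (1 - y1)
    y1 = 0.5 * (1.0 - math.sqrt(max(0.0, 1.0 - 4.0 * product)))
    y2 = 1.0 - y1
    if y2 == y1:
        return y1, y2, 1.0
    lam = (y2 - e) / (y2 - y1)
    return y1, y2, lam


y1, y2, lam = bipodal_p2(0.5, 0.3)
print(y1, y2, lam)
print(lam * y1 + (1 - lam) * y2, lam * y1**2 + (1 - lam) * y2**2)  # recovers (e, s)
```

The boundary case s = e gives y1 = 0 and y2 = 1, matching the case g(x) ∈ {0, 1} treated separately in the proof, while s = e² gives λ ∈ {0, 1} and a constant g.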
Thus, either y1 = x1, y2 = x2 are global maximizers along the phase transition curve β2 = q(β1), or y1 = y2 is a local maximum of ℓ off the phase transition curve. Actually we don't need to consider the case where g is constant separately: Theorem 1 shows that along the phase transition curve, x1 and x2 span the entire interval (0, 1), so by taking λ = 0 or 1 we can obtain any constant value besides 1 or 0. When λ = 0 or λ = 1, g is again constant, possibly with the values 0 or 1. We have established that the optimizers of (5.2) have the form (5.9), where either y1 = x1 and y2 = x2 are the global maximizers of ℓ along the phase transition curve, or x1 = 0, x2 = 1.

Uniqueness of the optimizer (geometric proof). We claim that for each $(e,s)\in\bar D$, there is a unique optimizer of the variational problem, modulo choice of the set A. Consider the class of lines

$$\lambda(y_1,y_1^p) + (1-\lambda)(y_2,y_2^p), \qquad 0\le\lambda\le 1, \tag{5.11}$$

where either y1 = x1 and y2 = x2 are the global maximizers of ℓ on the phase transition curve, or y1 = 0, y2 = 1. Since x1 (resp. x2) is strictly increasing (resp. strictly decreasing) in β1 with x1 < x2, no two distinct lines of this class can intersect. Continuity of x1, x2 in β1 along with (3.1) show that the union of all the lines equals $\bar D$. Thus, given $(e,s)\in\bar D$, there is a unique optimizer g of the form (5.9), obtained by locating the unique line from (5.11) which contains the point (e, s) and choosing λ satisfying the constraint (5.10). Note that we can now establish the formula (3.2), by solving for λ in (5.10).

Uniqueness of the optimizer (algebraic proof). Uniqueness is easy to see for $(e,s)\in D^{up}$, so we consider here $(e,s)\in\bar D\setminus D^{up}$. From (5.10),

$$\lambda = \frac{e-x_2}{x_1-x_2} = \frac{s-x_2^p}{x_1^p-x_2^p}$$

and so

$$e - x_2 = -q'(\beta_1)(s - x_2^p). \tag{5.12}$$

(This equation holds even when x1 = x2 = (p − 1)/p.) We must show that (5.12) has a unique solution. Let us define

$$F(\beta_1) := e - x_2 + q'(\beta_1)(s - x_2^p).$$

Note that

$$\lim_{\beta_1\to-\infty} F(\beta_1) = e - 1 - (s - 1) = e - s \ge 0,$$
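and

$$\begin{aligned}
\lim_{\beta_1\to\beta_1^c} F(\beta_1) &= e - \frac{p-1}{p} - \frac{p^{p-2}}{(p-1)^{p-1}}\left(s - \left(\frac{p-1}{p}\right)^p\right) \\
&= e - \left(\frac{p-1}{p}\right)^2 - s\,\frac{p^{p-2}}{(p-1)^{p-1}} \\
&\le e - \left(\frac{p-1}{p}\right)^2 - e^p\,\frac{p^{p-2}}{(p-1)^{p-1}}.
\end{aligned}$$

Also define

$$G(x) := x - \left(\frac{p-1}{p}\right)^2 - x^p\,\frac{p^{p-2}}{(p-1)^{p-1}}, \qquad 0\le x\le 1.$$

Then G(0) < 0 and G(1) < 0. Moreover,

$$G'(x) = 1 - x^{p-1}\left(\frac{p}{p-1}\right)^{p-1}$$

is positive if 0 < x < (p − 1)/p and it is negative if (p − 1)/p < x < 1. Finally,

$$G\left(\frac{p-1}{p}\right) = \frac{p-1}{p} - \frac{(p-1)^2}{p^2} - \frac{p-1}{p^2} = 0.$$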
0. Moreover, since q ′ (β1 ) = − and 0 < x1
β2c . Then the
solutions to (5.22) are precisely y1 = x1, y2 = x2, where x1 < x2 are the global maximizers of ℓ at the point $(q^{-1}(\beta_2),\beta_2)$ along q. In this case, (5.21) implies x1 ≤ e ≤ x2. In particular, if e ∉ [x1, x2] then we must again have g(x) ≡ e. If e = x1 or e = x2, then the constraint λx1 + (1 − λ)x2 = e implies that g(x) ≡ e. Since for $(e,\beta_2)\in U_e^c$ we have (by definition) e ∉ (x1, x2), we have established that (3.3) is valid in $U_e^c$. Next, for (e, β2) ∈ U_e define

$$\lambda = \frac{x_2-e}{x_2-x_1}$$

and

$$H(e) = \beta_2 e^p - I(e) - \left(\beta_2\left[\lambda x_1^p + (1-\lambda)x_2^p\right] - \left[\lambda I(x_1) + (1-\lambda)I(x_2)\right]\right).$$

To establish that (3.3) is valid in U_e, it suffices to show that H(e) < 0 for all e ∈ (x1, x2). It is easy to check that H(x1) = H(x2) = 0 and

$$\begin{aligned}
H'(e) &= p\beta_2 e^{p-1} - I'(e) - \frac{1}{x_2-x_1}\left(\beta_2(x_2^p-x_1^p) - (I(x_2)-I(x_1))\right) \\
&= \ell'(e) - \beta_1 - \frac{1}{x_2-x_1}\left(\ell(x_2) - \ell(x_1) - \beta_1(x_2-x_1)\right) \\
&= \ell'(e),
\end{aligned}$$

where the last step uses ℓ(x1) = ℓ(x2). Thus, H′(x1) = H′(x2) = 0. Finally,

$$H''(e) = p(p-1)\beta_2 e^{p-2} - I''(e) = \ell''(e).$$

From the proof of Proposition 11 in [1], we know ℓ′′(x1) < 0, ℓ′′(x2) < 0, and moreover there exists x1 < u1 < u2 < x2 such that ℓ′′(e) < 0 for e ∈ (x1, u1) ∪ (u2, x2) while ℓ′′(e) > 0 for e ∈ (u1, u2). The result follows.

We note that the star density can be computed easily as

$$s(e,\beta_2) := \frac{\partial}{\partial\beta_2}\psi(e,\beta_2) = \begin{cases} e^p, & (e,\beta_2)\in U_e^c, \\[4pt] \dfrac{e-x_2}{x_1-x_2}\,x_1^p + \dfrac{x_1-e}{x_1-x_2}\,x_2^p, & (e,\beta_2)\in U_e.\end{cases}$$

It is easy to see that the star density s(e, β2) is continuous everywhere.

Regularity of ψ(e, β2). From the formula it is easy to see that ψ(e, β2) is analytic away from ∂U_e. Write ψ = ψ(e, β2) and fix β2 > β2^c. In the interior of $U_e^c$,

$$\frac{\partial\psi}{\partial e} = p\beta_2 e^{p-1} - I'(e), \tag{5.23}$$

and in the interior of U_e,

$$\frac{\partial\psi}{\partial e} = \frac{\beta_2 x_1^p}{x_1-x_2} - \frac{\beta_2 x_2^p}{x_1-x_2} - \frac{I(x_1)}{x_1-x_2} + \frac{I(x_2)}{x_1-x_2}. \tag{5.24}$$

Observe that

$$\lim_{(e,\beta_2)\in U_e^c,\ e\to x_1}\frac{\partial\psi}{\partial e} = p\beta_2 x_1^{p-1} - I'(x_1) = \ell'(x_1) - \beta_1.$$

Inside U_e, by the mean value theorem,

$$\frac{\partial\psi}{\partial e} = \frac{\beta_2(x_1^p-x_2^p)}{x_1-x_2} - \frac{I(x_1)-I(x_2)}{x_1-x_2} = p\beta_2 z^{p-1} - I'(z) = \ell'(z) - \beta_1 \tag{5.25}$$
for some x1 < z < x2 where z does not depend on e. Thus, ∂ψ(e, β2)/∂e has a jump discontinuity across ∂U_e when e < e^c. Similar arguments give the same result when e > e^c. Next we consider the situation at the point (e^c, β2^c). It is clear that

$$\frac{\partial^j\psi}{\partial e^j} = 0, \qquad (e,\beta_2)\in U_e, \quad j\ge 2. \tag{5.26}$$

However, Proposition 11 of [1] gives

$$\lim_{(e,\beta_2)\in U_e^c,\ e\to e^c}\frac{\partial^4\psi}{\partial e^4} = \ell^{(4)}(e^c) < 0. \tag{5.27}$$

Proof of part (ii). By Theorem 6,

$$\psi(\beta_1,s) = \sup_{g\in G_{\cdot,s}}\left[\beta_1\int_0^1 g(x)\,dx - \int_0^1 I(g(x))\,dx\right] \tag{5.28}$$
where we recall $G_{\cdot,s}$ is the set of measurable functions g : [0, 1] → [0, 1] satisfying

$$\int_0^1 g(x)^p\,dx = s. \tag{5.29}$$

Arguments essentially identical to those in the proof of Theorem 2 show that optimizers of (5.18) must have the form

$$g(x) = \begin{cases} y_1, & x\in A, \\ y_2, & x\notin A,\end{cases}$$

where |A| = λ ∈ [0, 1] and 0 ≤ y1 ≤ y2 ≤ 1. For such functions g, we have

$$\beta_1\int_0^1 g(x)\,dx - \int_0^1 I(g(x))\,dx = \beta_1\left[\lambda y_1 + (1-\lambda)y_2\right] - \left[\lambda I(y_1) + (1-\lambda)I(y_2)\right],$$

$$\int_0^1 g(x)^p\,dx = \lambda y_1^p + (1-\lambda)y_2^p.$$

It is therefore enough to maximize

$$\beta_1\left[\lambda y_1 + (1-\lambda)y_2\right] - \lambda I(y_1) - (1-\lambda)I(y_2)$$

subject to the constraint

$$\lambda y_1^p + (1-\lambda)y_2^p = s.$$

Introducing the Lagrange multiplier β2, we see that the stationary points satisfy

$$\begin{aligned}
-\lambda I'(y_1) + \lambda\beta_1 + \lambda p\beta_2 y_1^{p-1} &= 0, \\
-(1-\lambda)I'(y_2) + (1-\lambda)\beta_1 + (1-\lambda)p\beta_2 y_2^{p-1} &= 0, \\
-\left(I(y_1) - I(y_2)\right) + \beta_1(y_1-y_2) + \beta_2(y_1^p-y_2^p) &= 0.
\end{aligned}$$

Now arguments analogous to those in the proof of part (i) show that ψ has the formula (3.4). We note that the limiting edge density has the formula

$$e(\beta_1,s) := \frac{\partial}{\partial\beta_1}\psi(\beta_1,s) = \begin{cases} s^{1/p}, & (\beta_1,s)\in U_s^c, \\[4pt] \dfrac{s-x_2^p}{x_1^p-x_2^p}\,x_1 + \dfrac{x_1^p-s}{x_1^p-x_2^p}\,x_2, & (\beta_1,s)\in U_s.\end{cases}$$

It is easy to see that e(β1, s) is continuous everywhere.
Regularity of ψ(β1, s). From the formula it is easy to see that ψ(β1, s) is analytic away from ∂U_s. Write ψ = ψ(β1, s) and fix β1 < β1^c. In the interior of $U_s^c$,

$$\frac{\partial\psi}{\partial s} = \frac{1}{p}\beta_1 s^{\frac{1}{p}-1} - \frac{1}{p}s^{\frac{1}{p}-1}I'(s^{\frac{1}{p}}), \tag{5.30}$$

and in the interior of U_s,

$$\frac{\partial\psi}{\partial s} = \frac{\beta_1 x_1}{x_1^p-x_2^p} - \frac{\beta_1 x_2}{x_1^p-x_2^p} - \frac{I(x_1)}{x_1^p-x_2^p} + \frac{I(x_2)}{x_1^p-x_2^p}. \tag{5.31}$$

Note that

$$\lim_{(\beta_1,s)\in U_s^c,\ s\to x_1^p}\frac{\partial\psi}{\partial s} = \frac{1}{p}\beta_1 x_1^{1-p} - \frac{1}{p}x_1^{1-p}I'(x_1). \tag{5.32}$$

In the interior of U_s, by the mean value theorem,

$$\frac{\partial\psi}{\partial s} = \frac{\beta_1(x_1-x_2)}{x_1^p-x_2^p} - \frac{I(x_1)-I(x_2)}{x_1^p-x_2^p} = \frac{1}{p}\beta_1 z^{1-p} - \frac{1}{p}z^{1-p}I'(z)$$

for some x1 < z < x2, where z does not depend on s. Thus, ∂ψ/∂s has a jump discontinuity across ∂U_s for s < s^c. Similar arguments show the same result for s > s^c. Next we consider the situation at the point (β1^c, s^c). It is clear that

$$\frac{\partial^j\psi}{\partial s^j} = 0, \qquad (\beta_1,s)\in U_s, \quad j\ge 2. \tag{5.33}$$
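In the computations below, we let $(\beta_1,s)\in U_s^c$ and $t = s^{1/p}$. Note that

$$\frac{\partial\psi}{\partial s} = \left[\beta_1 - I'(t)\right]\frac{\partial t}{\partial s} = \left[\ell'(t) - \beta_2 p t^{p-1}\right]\frac{\partial t}{\partial s}, \tag{5.34}$$

and

$$\begin{aligned}
\frac{\partial^2\psi}{\partial s^2} &= \left[\ell'(t) - \beta_2 p t^{p-1}\right]\frac{\partial^2 t}{\partial s^2} + \left[\ell''(t) - \beta_2 p(p-1)t^{p-2}\right]\left(\frac{\partial t}{\partial s}\right)^2 \\
&= \ell'(t)\frac{\partial^2 t}{\partial s^2} + \ell''(t)\left(\frac{\partial t}{\partial s}\right)^2.
\end{aligned}$$

Thus,

$$\frac{\partial^3\psi}{\partial s^3} = \ell'(t)\frac{\partial^3 t}{\partial s^3} + \ell'''(t)\left(\frac{\partial t}{\partial s}\right)^3 + 3\ell''(t)\frac{\partial t}{\partial s}\frac{\partial^2 t}{\partial s^2},$$

and

$$\frac{\partial^4\psi}{\partial s^4} = 4\ell''(t)\frac{\partial t}{\partial s}\frac{\partial^3 t}{\partial s^3} + \ell'(t)\frac{\partial^4 t}{\partial s^4} + \ell^{(4)}(t)\left(\frac{\partial t}{\partial s}\right)^4 + 6\ell'''(t)\left(\frac{\partial t}{\partial s}\right)^2\frac{\partial^2 t}{\partial s^2} + 3\ell''(t)\left(\frac{\partial^2 t}{\partial s^2}\right)^2.$$

As s → s^c, t → e^c and since ℓ′(e^c) = ℓ′′(e^c) = ℓ′′′(e^c) = 0, ℓ^{(4)}(e^c) < 0, we have

$$\lim_{s\to s^c}\frac{\partial^2\psi}{\partial s^2} = \lim_{s\to s^c}\frac{\partial^3\psi}{\partial s^3} = 0, \tag{5.35}$$

while Proposition 11 of [1] gives

$$\lim_{s\to s^c}\frac{\partial^4\psi}{\partial s^4} = \lim_{s\to s^c}\ell^{(4)}(t)\left(\frac{\partial t}{\partial s}\right)^4 = \ell^{(4)}(e^c)\left(\frac{(e^c)^{1-p}}{p}\right)^4 < 0. \tag{5.36}$$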
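The next result concerns the curve β2 = q(β1) and the shapes of U_e and U_s.

Proposition 7. (i) The curve β2 = q(β1) is analytic in β1 < β1^c.

(ii) For any β1 < β1^c, ∂x1/∂β1 > 0 and ∂x2/∂β1 < 0. Moreover,

$$\lim_{\beta_1\to\beta_1^c}\frac{\partial x_1}{\partial\beta_1} = +\infty, \qquad \lim_{\beta_1\to-\infty}\frac{\partial x_1}{\partial\beta_1} = 0,$$

$$\lim_{\beta_1\to\beta_1^c}\frac{\partial x_2}{\partial\beta_1} = -\infty, \qquad \lim_{\beta_1\to-\infty}\frac{\partial x_2}{\partial\beta_1} = 0.$$

For any β2 > β2^c, ∂x1/∂β2 < 0 and ∂x2/∂β2 > 0. Moreover,

$$\lim_{\beta_2\to\beta_2^c}\frac{\partial x_1}{\partial\beta_2} = -\infty, \qquad \lim_{\beta_2\to+\infty}\frac{\partial x_1}{\partial\beta_2} = 0,$$

$$\lim_{\beta_2\to\beta_2^c}\frac{\partial x_2}{\partial\beta_2} = +\infty, \qquad \lim_{\beta_2\to+\infty}\frac{\partial x_2}{\partial\beta_2} = 0.$$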
Proof. First we show that q is analytic. There is an open V-shaped set containing the curve q except for the critical point (β1^c, β2^c), inside which ℓ has exactly two local maximizers, y1 < y2. (See [23] and [1].) It can be seen from the proof of Proposition 11 of [1] that ℓ′′(y1) < 0 and ℓ′′(y2) < 0 inside the V-shaped region. The analytic implicit function theorem [15] then shows that y1 and y2 are analytic functions of β1 and β2 inside this region. Note that q is defined implicitly by the equation

$$\beta_1 y_1 + \beta_2 y_1^p - I(y_1) - \left(\beta_1 y_2 + \beta_2 y_2^p - I(y_2)\right) = 0.$$

Implicitly differentiating the left hand side of this equation w.r.t. β2 gives

$$\begin{aligned}
&\beta_1\frac{\partial y_1}{\partial\beta_2} + y_1^p + p\beta_2 y_1^{p-1}\frac{\partial y_1}{\partial\beta_2} - I'(y_1)\frac{\partial y_1}{\partial\beta_2} \\
&\qquad - \left(\beta_1\frac{\partial y_2}{\partial\beta_2} + y_2^p + p\beta_2 y_2^{p-1}\frac{\partial y_2}{\partial\beta_2} - I'(y_2)\frac{\partial y_2}{\partial\beta_2}\right) = y_1^p - y_2^p < 0.
\end{aligned}$$

Another application of the analytic implicit function theorem implies β2 = q(β1) is analytic for β1 < β1^c.

Now we turn to the statements involving x1 and x2. Along β2 = q(β1), we have

$$\beta_1 + p\,q(\beta_1)x_1^{p-1} - \log\frac{x_1}{1-x_1} = 0.$$

Differentiating with respect to β1, we get

$$1 + p\,q'(\beta_1)x_1^{p-1} + \left[p(p-1)q(\beta_1)x_1^{p-2} - \frac{1}{x_1(1-x_1)}\right]\frac{\partial x_1}{\partial\beta_1} = 0.$$

Therefore,

$$\frac{\partial x_1}{\partial\beta_1} = \frac{1 + p\,q'(\beta_1)x_1^{p-1}}{\dfrac{1}{x_1(1-x_1)} - p(p-1)q(\beta_1)x_1^{p-2}} = \frac{1 - p\,x_1^{p-1}\dfrac{x_1-x_2}{x_1^p-x_2^p}}{-\ell''(x_1)} > 0.$$
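The closed-form expression for ∂x1/∂β1 just derived can be compared with a finite-difference estimate in the case p = 2. The sketch below assumes, as a consequence of the symmetry of ℓ about x = 1/2 for p = 2 (not stated in the text), that the curve is q(β1) = −β1 and that x1 = 1 − x2. The function names and the step size h are illustrative choices, and the root finder is only intended for −30 < β1 < −2.

```python
import math

P = 2  # this sketch is restricted to p = 2


def x1_on_curve_p2(beta1):
    """Smaller global maximizer x1 of ell on the curve beta2 = -beta1 (p = 2)."""
    beta2 = -beta1
    lo, hi = 0.5 + 1e-9, 1.0 - 1e-15
    for _ in range(200):  # bisection for the larger root x2 of ell'(x) = 0
        mid = 0.5 * (lo + hi)
        if beta2 * (2.0 * mid - 1.0) - math.log(mid / (1.0 - mid)) > 0.0:
            lo = mid
        else:
            hi = mid
    return 1.0 - 0.5 * (lo + hi)


def dx1_dbeta1_formula(beta1):
    """Right-hand side of the displayed formula for dx1/dbeta1, with p = 2."""
    x1 = x1_on_curve_p2(beta1)
    x2 = 1.0 - x1
    beta2 = -beta1
    numerator = 1.0 - P * x1 ** (P - 1) * (x1 - x2) / (x1**P - x2**P)
    minus_ell_second = 1.0 / (x1 * (1.0 - x1)) - P * (P - 1) * beta2 * x1 ** (P - 2)
    return numerator / minus_ell_second


def dx1_dbeta1_finite_difference(beta1, h=1e-5):
    """Central difference approximation of dx1/dbeta1."""
    return (x1_on_curve_p2(beta1 + h) - x1_on_curve_p2(beta1 - h)) / (2.0 * h)


print(dx1_dbeta1_formula(-3.0), dx1_dbeta1_finite_difference(-3.0))
```

Both values are positive, in agreement with part (ii) of Proposition 7.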
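Since $\lim_{\beta_1\to\beta_1^c}\{1 + p\,q'(\beta_1)x_1^{p-1}\} = \lim_{\beta_1\to\beta_1^c}\{-\ell''(x_1)\} = 0$, by L'Hôpital's rule,

$$\lim_{\beta_1\to\beta_1^c}\frac{\partial x_1}{\partial\beta_1} = \lim_{\beta_1\to\beta_1^c}\frac{p\,q''(\beta_1)x_1^{p-1} + p(p-1)q'(\beta_1)x_1^{p-2}\dfrac{\partial x_1}{\partial\beta_1}}{-p(p-1)q'(\beta_1)x_1^{p-2} - \ell'''(x_1)\dfrac{\partial x_1}{\partial\beta_1}},$$

which implies that

$$\lim_{\beta_1\to\beta_1^c}\frac{\partial x_1}{\partial\beta_1} = +\infty.$$

Since x1 → 0 as β1 → −∞, it is easy to see that

$$\lim_{\beta_1\to-\infty}\frac{\partial x_1}{\partial\beta_1} = 0.$$

The results for x2 can be proved using similar methods. Finally, notice that

$$\frac{\partial x_i}{\partial\beta_2} = \frac{\partial x_i}{\partial\beta_1}\,\frac{\partial q^{-1}(\beta_2)}{\partial\beta_2} = \frac{\partial x_i}{\partial\beta_1}\,\frac{1}{q'(\beta_1)}, \qquad i = 1, 2.$$

Therefore the results involving β2 also hold.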
Acknowledgements

David Aristoff was supported in part by AFOSR Award FA9550-12-1-0187. Lingjiong Zhu is grateful to his colleague Tzu-Wei Yang for helpful discussions.

References

[1] Aristoff, D. and L. Zhu. (2014). On the phase transition curve in a directed exponential random graph model. Preprint.
[2] Besag, J. (1975). Statistical analysis of non-lattice data. J. R. Stat. Soc., Ser. D. Stat. 24, 179-195.
[3] Chatterjee, S. and P. Diaconis. (2013). Estimating and understanding exponential random graph models. Annals of Statistics. 41, 2428-2461.
[4] Chatterjee, S. and S. R. S. Varadhan. (2011). The large deviation principle for the Erdős-Rényi random graph. European J. Combin. 32, 1000-1017.
[5] Dembo, A. and O. Zeitouni. Large Deviations Techniques and Applications, 2nd Edition, Springer, New York, 1998.
[6] Fienberg, S. E. (2010). Introduction to papers on the modeling and analysis of network data. Ann. Appl. Statist. 4, 1-4.
[7] Fienberg, S. E. (2010). Introduction to papers on the modeling and analysis of network data II. Ann. Appl. Statist. 4, 533-534.
[8] Fisher, M. E. and Radin, C. (2006). Definition of thermodynamic phases and phase transitions, AIM workshop on phase transitions. http://www.aimath.org/pastworkshops/phasetransition.html.
[9] Frank, O. and D. Strauss. (1986). Markov graphs. Journal of the American Statistical Association 81, 832-842.
[10] Gallavotti, G. (1999). Statistical Mechanics: A Short Treatise. Springer-Verlag, Berlin.
[11] Häggström, O. and Jonasson, J. (1999). Phase transition in the random triangle model. J. Appl. Probab. 36, 1101-1115.
[12] Holland, P. and S. Leinhardt. (1981). An exponential family of probability distributions for directed graphs. J. Am. Stat. Assoc. 76, 33-50.
[13] Janson, S. Graphons, cut norm and distance, couplings and rearrangements. arXiv:1009.2376
[14] Kenyon, R., Radin, C., Ren, K. and L. Sadun. Multipodal structure and phase transitions in large constrained graphs. Preprint, 2014.
[15] Krantz, S. G. and Parks, H. R. (2002). The Implicit Function Theorem: History, Theory, and Applications. Birkhäuser, Boston, MA.
[16] Lebowitz, J. L., Mazel, A. E. and Presutti, E. (1998). Rigorous Proof of a Liquid-Vapor Phase Transition in a Continuum Particle System. Phys. Rev. Lett. 80, 4701.
[17] Lennard-Jones, J. E. (1924). On the Determination of Molecular Fields. Proc. R. Soc. Lond. A 106 (738), 463-477.
[18] Lovász, L. and B. Szegedy. (2006). Limits of dense graph sequences. J. Combin. Theory Ser. B 96, 933-957.
[19] Lovász, L. (2009). Very large graphs. Current Develop. Math. 2008, 67-128.
[20] Newman, M. E. J. (2010). Networks: An Introduction. Oxford University Press, Oxford.
[21] Park, J. and M. E. J. Newman. (2004). Solution of the two-star model of a network. Phys. Rev. E 70, 066146.
[22] Park, J. and M. E. J. Newman. (2005). Solution for the properties of a clustered network. Phys. Rev. E 72, 026136.
[23] Radin, C. and M. Yin. (2013). Phase transitions in exponential random graphs. Annals of Applied Probability. 23, 2458-2471.
[24] Radin, C. and L. Sadun. (2013). Phase transitions in a complex network. J. Phys. A: Math. Theor. 46, 305002.
[25] Radin, C., Ren, K. and L. Sadun. (2014). The asymptotics of large constrained graphs. J. Phys. A: Math. Theor. 47, 175001.
[26] Radin, C. and L. Sadun. Singularities in the entropy of asymptotically large simple graphs. Preprint, 2013.
[27] Rinaldo, A., Fienberg, S. and Y. Zhou. (2009). On the geometry of discrete exponential families with application to exponential random graph models. Electron. J. Stat. 3, 446-484.
[28] Robins, G., Snijders, T., Wang, P., Handcock, M. and P. Pattison. (2007). Recent developments in exponential random graph (p*) models for social networks. Social Networks. 29, 192-215.
[29] Snijders, T. A. B., Pattison, P., Robins, G. L. and M. Handcock. (2006). New specifications for exponential random graph models. Sociological Methodology. 36, 99-153.
[30] Strauss, D. (1986). On a general class of models for interaction. SIAM Rev. 28, 513-527.
[31] Varadhan, S. R. S. Large Deviations and Applications, SIAM, Philadelphia, 1984.
[32] Wasserman, S. and K. Faust. (2010). Social Network Analysis: Methods and Applications. Structural Analysis in the Social Sciences, 2nd ed. Cambridge Univ. Press, Cambridge.
[33] Yin, M. (2013). Critical phenomena in exponential random graphs. Journal of Statistical Physics. 153, 1008-1021.
[34] Yin, M., Rinaldo, A. and S. Fadnavis. Asymptotic quantization of exponential random graphs. Preprint, 2013.

School of Mathematics, University of Minnesota-Twin Cities, 206 Church Street S.E., Minneapolis, MN-55455, United States of America