SIMULTANEOUS RATIONAL APPROXIMATION TO BINOMIAL FUNCTIONS Michael A. Bennett1

Abstract. We apply Padé approximation techniques to deduce lower bounds for simultaneous rational approximation to one or more algebraic numbers. In particular, we strengthen work of Osgood, Fel'dman and Rickert, proving, for example, that
$$\max\left\{ \left|\sqrt{2} - \frac{p_1}{q}\right|,\ \left|\sqrt{3} - \frac{p_2}{q}\right| \right\} > q^{-1.79155}$$
for $q > q_0$ (where the latter is an effective constant). Some of the Diophantine consequences of such bounds will be discussed, specifically in the direction of solving simultaneous Pell's equations and norm form equations.

0. Introduction

In 1964, Baker [1,2] utilized the method of Padé approximation to hypergeometric functions to obtain explicit improvements upon Liouville's theorem on rational approximation to algebraic numbers. By way of example, he showed that
$$\left|\sqrt[3]{2} - \frac{p}{q}\right| > 10^{-6}\, q^{-2.955} \tag{0.1}$$
for all positive integers $p$ and $q$ and used such bounds to solve related Diophantine equations. Chudnovsky [6] subsequently refined Baker's results,

1991 Mathematics Subject Classification. Primary 11J68, 11J82; Secondary 11D57. Key words and phrases. Simultaneous approximation to algebraic numbers, irrationality and linear independence measures, Pad´e approximants, Pell-type equations, Norm form equations. 1 Research supported by an NSERC Postdoctoral Fellowship


primarily through a detailed analysis of the arithmetical properties of certain Padé approximants. Analogous to (0.1), he proved that
$$\left|\sqrt[3]{2} - \frac{p}{q}\right| > q^{-2.42971} \tag{0.2}$$
for all integers $p$ and $q$ with $q$ greater than some effectively computable constant $q_0$. By working out the implicit constants in (0.2), Easton [8] deduced
$$\left|\sqrt[3]{2} - \frac{p}{q}\right| > 6.6 \times 10^{-6}\, q^{-2.795}$$

for positive integers $p$ and $q$ (as well as related bounds for other cubic irrationalities).

Similar results exist for simultaneous approximation to several algebraic numbers. In particular, Baker [3] derived bounds of the form
$$\max_{1 \le u \le m} \left| \theta_u - \frac{p_u}{q} \right| > q^{-\lambda} \tag{0.3}$$
for certain algebraic numbers $\theta_1, \theta_2, \ldots, \theta_m$ with $1, \theta_1, \theta_2, \ldots, \theta_m$ linearly independent over the rationals, $\lambda = \lambda(\theta_1, \ldots, \theta_m)$ an explicit real number and $p_1, \ldots, p_m, q$ positive integers with $q$ greater than an effective $q_0(\lambda, \theta_1, \ldots, \theta_m)$. To be precise, he considered
$$(\theta_1, \theta_2, \ldots, \theta_m) = (r^{\nu_1}, r^{\nu_2}, \ldots, r^{\nu_m}) \tag{0.4}$$
with $r, \nu_1, \nu_2, \ldots, \nu_m$ rational, via approximation to the system of binomial functions $1, (1+x)^{\nu_1}, \ldots, (1+x)^{\nu_m}$. Chudnovsky [6], generalizing his approach from the case of a single approximation, sharpened these inequalities. Along somewhat different lines, Osgood [13], Fel'dman [9] and Rickert [15] obtained results like (0.3) with
$$(\theta_1, \theta_2, \ldots, \theta_m) = (r_1^{\nu}, r_2^{\nu}, \ldots, r_m^{\nu}) \tag{0.5}$$
for $r_1, r_2, \ldots, r_m$ and $\nu$ rational. These also utilized Padé approximation, this time to the functions
$$1,\ (1 + a_1 x)^{\nu},\ \ldots,\ (1 + a_m x)^{\nu}$$

where $a_1, \ldots, a_m$ are distinct integers. Through use of an elegant contour integral representation for the desired Padé approximants, Rickert proved the inequality
$$\max\left\{ \left|\sqrt{2} - \frac{p_1}{q}\right|,\ \left|\sqrt{3} - \frac{p_2}{q}\right| \right\} > 10^{-7}\, q^{-1.913} \tag{0.6}$$
for $p_1$, $p_2$ and $q$ integral.

In this paper, we will strengthen the work of Osgood, Fel'dman and Rickert on simultaneous approximation to algebraic numbers satisfying (0.5), in analogy to Chudnovsky's results for those with (0.4). This is primarily accomplished through careful estimation of both "analytic" and "arithmetic" asymptotics (in the same sense as Chudnovsky [6]) of Padé approximants to binomial functions. A particularly striking result along these lines (with rather different approximating forms) is due to Hata [11] who proved that
$$\left|\pi - \frac{p}{q}\right| \ge q^{-8.0161}$$
for sufficiently large positive integers $p$ and $q$.

In the special case $m = 1$, we obtain Chudnovsky's Theorem 5.3 of [6] on approximation to a single algebraic number (see §7). For larger values of $m$, we can prove, for example, that
$$\max\left\{ \left|\sqrt{2} - \frac{p_1}{q}\right|,\ \left|\sqrt{3} - \frac{p_2}{q}\right| \right\} > q^{-1.79155}$$
for $p_1$ and $p_2$ integral and $q \ge q_0$ effectively computable (compare to (0.6)). Similarly, we have
$$\max\left\{ \left|\sqrt{3} - \frac{p_1}{q}\right|,\ \left|\sqrt{5} - \frac{p_2}{q}\right| \right\} > q^{-1.82227}$$

for $q \ge q_1$ effective.

Optimally, one would like to derive (0.3) for any $\lambda > 1 + 1/m$. Theorems of Roth [16] ($m = 1$) and Schmidt [17] ($m > 1$) assert that such bounds exist for any independent algebraic $\theta_1, \ldots, \theta_m$, but are ineffective in that they do not permit the explicit calculation of $q_0$. For specific classes of algebraic numbers, however, we will be able to obtain effective bounds with $\lambda$ arbitrarily close to $1 + 1/m$. These correspond to the situations described by previous authors where the rationals $r$ or $r_1, r_2, \ldots, r_m$ in (0.4) or (0.5), respectively, are suitably close to 1.

We will also prove a theorem on linear forms, of the type
$$|x_0 + x_1 \theta_1 + \cdots + x_m \theta_m| > X^{-\lambda_1} \tag{0.7}$$
for $x_0, \ldots, x_m$ integers, $\theta_1, \ldots, \theta_m$ as in (0.5) and $X = \max_{0 \le i \le m} |x_i|$ satisfying $X \ge X_0(\lambda_1, \theta_1, \ldots, \theta_m)$. Standard transference arguments (see e.g. Cassels [5]) ensure that (0.3) implies (0.7) with exponent
$$\lambda_1 = \frac{m(\lambda - 1)}{m(-\lambda + 1) + \lambda}$$

provided $\lambda < 1 + 1/(m-1)$. Our result, however, is somewhat stronger.

These results have direct applications to Diophantine equations which we will address in §8 and §9. For example, they permit solution of the norm form equation
$$N_{K/\mathbb{Q}}\left( x + y \sqrt[4]{M^4 - 1} + z \sqrt[4]{M^4 + 1} \right) = u$$
(where $K = \mathbb{Q}\bigl(\sqrt[4]{M^4 - 1}, \sqrt[4]{M^4 + 1}\bigr)$, $x$, $y$ and $z$ are integers and $u$ is constant) for $M \ge 6$.

1. A Pair of Theorems on Rational Approximation

Henceforth, we will suppose that $a_0, a_1, \ldots, a_m$ are distinct integers ($m \ge 1$) with one of them equal to zero, satisfying $a_0 < a_1 < \cdots < a_m$. Let us also assume that $N$ is a positive integer with
$$N > \max_{0 \le u \le m} |a_u|$$
and that $s$ and $n$ are integers with $1 \le s < n$ and $(s, n) = 1$. Define
$$c_1 = \operatorname{lcm}\Bigl\{ \prod_{\substack{l = 0 \\ l \ne v}}^{m} |a_l - a_v| \ :\ 0 \le v \le m \Bigr\} \tag{1.1}$$
$$c_2 = \operatorname{lcm}\bigl\{ |a_l - a_v| \ :\ 0 \le v < l \le m \bigr\} \tag{1.2}$$
$$c_3 = \prod_{p \mid n} p^{\max\{\operatorname{ord}_p(n/c_2) + \frac{1}{p-1},\, 0\}} \tag{1.3}$$
and $c_4 = c_1 \cdot c_2 \cdot c_3$. If, following Rickert [15], we set $A(z) = \prod_{u=0}^{m} (z - a_u)$, then the polynomial
$$A(z) - (z + N) A'(z) \tag{1.4}$$

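(where we write $A'(z)$ for $dA(z)/dz$) is readily seen to have $m + 1$ real zeros, one of them, say $z_0$, satisfying $z_0 < -N$ and the remaining $m$, say $z_1, z_2, \ldots, z_m$, lying between successive values of the $a_i$'s. Without loss of generality, we suppose that $a_{u-1} < z_u < a_u$ ($1 \le u \le m$) and define
$$c_5 = |A'(z_0)|,$$
$$c_6(v) = \begin{cases} |A'(z_1)| & \text{if } v = 0 \\ \min\{ |A'(z_v)|,\ |A'(z_{v+1})| \} & \text{if } 1 \le v < m \\ |A'(z_v)| & \text{if } v = m \end{cases}$$
$$c_7 = \min_{1 \le u \le m} |A'(z_u)|$$
and
$$c_8 = \exp\Bigl( -\gamma - \frac{1}{\phi(n)} \sum_{1 \le r\, [\ldots]} \psi\bigl( \max\{ [\ldots] \} \bigr) \Bigr)$$
[…] $N > \max_{0 \le u \le m} |a_u|$ an integer. If, further, $s$ and $n$ are relatively prime with $1 \le s < n$, $\varepsilon > 0$ and $c_7 \cdot c_8 < c_4 < c_5 \cdot c_8$, then
$$\max_{\substack{0 \le u \le m \\ a_u \ne 0}} \left| \left( 1 + \frac{a_u}{N} \right)^{s/n} - \frac{p_u}{q} \right| > q^{-\lambda - \varepsilon}$$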

for all integers $p_0, \ldots, p_m$ and $q$ with $q \ge q_0(\varepsilon, s, n, a_0, \ldots, a_m, N)$, where
$$\lambda = 1 + \frac{\log\bigl(c_4 / (c_7 \cdot c_8)\bigr)}{\log\bigl(c_5 \cdot c_8 / c_4\bigr)}$$
and $q_0$ is effectively computable.

As mentioned previously, in §7 we will show that, in the case $m = 1$, the above theorem implies Chudnovsky's result [6, Theorem 5.3] (see also Heimonen et al. [12]). For linear forms, we will prove

Theorem 1.2 If $a_0, \ldots, a_m$, $N$, $s$ and $n$ are integers satisfying the hypotheses of the previous theorem, $x_0, \ldots, x_m$ integers, $X = \max_{0 \le u \le m} |x_u|$, $\varepsilon > 0$ and
$$\prod_{v=1}^{m} c_6(v) < (c_4/c_8)^m < c_5 \cdot \min_{1 \le l \le m} \prod_{\substack{v = 1 \\ v \ne l}}^{m} c_6(v)$$
then
$$\left| \sum_{u=0}^{m} x_u \cdot \left( 1 + \frac{a_u}{N} \right)^{s/n} \right| > X^{-\lambda_1 - \varepsilon}$$
for all $X \ge X_0(\varepsilon, s, n, a_0, \ldots, a_m, N)$, where
$$\lambda_1 = \frac{ m \log(c_4/c_8) - \sum_{1 \le v \le m} \log(c_6(v)) }{ m \log(c_8/c_4) + \log(c_5) + \min_{1 \le l \le m} \sum_{\substack{1 \le v \le m \\ v \ne l}} \log(c_6(v)) }$$
and $X_0$ is effectively computable.

Examples and applications of this result will be briefly described in §8 and §9.

2. The Nature of the Approximating Forms

To construct our approximants to the system of binomial functions (0.3), we consider the contour integral
$$I_u(x) = \frac{1}{2\pi i} \int_{\gamma} \frac{(1 + zx)^k (1 + zx)^{\nu}}{(z - a_u)(A(z))^k}\, dz \qquad (0 \le u \le m). \tag{2.1}$$

Here, $k$ is some fixed positive integer, $\nu$ a nonintegral positive rational, $\gamma$ a closed, counter-clockwise contour enclosing the poles of the integrand and $x$ a real satisfying
$$|x|^{-1} > \max_{0 \le u \le m} |a_u|. \tag{2.2}$$
By application of Cauchy's residue theorem, we can write
$$I_u(x) = \sum_{v=0}^{m} P_{uv}(x) (1 + a_v x)^{\nu} \qquad (0 \le u \le m) \tag{2.3}$$
where the $P_{uv}(x)$ are polynomials with rational coefficients and degree at most $k$ in $x$. Explicitly, from Rickert [15, Lemma 3.3], we have
$$P_{uv}(x) = \sum \binom{k + \nu}{h_v} (1 + a_v x)^{k - h_v} x^{h_v} \prod_{\substack{0 \le l \le m \\ l \ne v}} \binom{-k_{ul}}{h_l} (a_v - a_l)^{-k_{ul} - h_l} \tag{2.4}$$
where $\sum$ denotes summation over all nonnegative integers $h_0, \ldots, h_m$ with sum $k_{uv} - 1$, for $k_{ab} = k + \delta_{ab}$ and $\delta_{ab}$ the Kronecker delta.

To guarantee the "independence" of the approximants, we require that $\det_{0 \le u, v \le m} (P_{uv}(x))$ does not vanish for nonzero $x$, a consequence of Rickert's Lemma 3.4. To be precise, one may write
$$\det_{0 \le u, v \le m} \bigl( P_{uv}(x) \bigr) = \left[ \prod_{v=-1}^{m-1} \binom{\nu - vk}{k} \right] \cdot \left[ \prod_{l=0}^{m} \prod_{\substack{s = 0 \\ s \ne l}}^{m} (a_s - a_l)^{-k} \right] x^{(m+1)k}.$$
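As a numerical sanity check on the residue expansion (2.3) with the explicit coefficients (2.4), the following sketch compares a trapezoidal evaluation of the contour integral (2.1) on a circle against the sum over $v$ of $P_{uv}(x)(1 + a_v x)^{\nu}$. The parameter values ($a = (-1, 0, 1)$, $x = 1/5$, $k = 2$, $\nu = 1/2$, circle radius 2) and all function names are illustrative assumptions, not taken from the text; the radius is chosen between $\max |a_u|$ and $|x|^{-1}$ so that the circle encloses every pole and avoids the branch point of $(1 + zx)^{\nu}$.

```python
import cmath
from math import pi


def gen_binom(alpha, h):
    """Generalized binomial coefficient C(alpha, h), h a nonnegative integer."""
    out = 1.0
    for i in range(h):
        out *= (alpha - i) / (i + 1)
    return out


def compositions(total, parts):
    """Yield every tuple of `parts` nonnegative integers with sum `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def P(u, v, x, a, k, nu):
    """P_uv(x) evaluated from the explicit sum (2.4)."""
    m = len(a) - 1

    def kk(l):
        # k_{ul} = k + delta_{ul}
        return k + (1 if l == u else 0)

    total = 0.0
    for h in compositions(kk(v) - 1, m + 1):
        term = gen_binom(k + nu, h[v]) * (1 + a[v] * x) ** (k - h[v]) * x ** h[v]
        for l in range(m + 1):
            if l != v:
                term *= gen_binom(-kk(l), h[l]) * float(a[v] - a[l]) ** (-kk(l) - h[l])
        total += term
    return total


def I_formula(u, x, a, k, nu):
    """Right-hand side of (2.3)."""
    return sum(P(u, v, x, a, k, nu) * (1 + a[v] * x) ** nu for v in range(len(a)))


def I_contour(u, x, a, k, nu, radius, samples=4000):
    """Trapezoidal rule for (2.1) on |z| = radius, using dz = i z dtheta."""
    total = 0j
    for j in range(samples):
        z = radius * cmath.exp(2j * pi * j / samples)
        A = 1
        for al in a:
            A *= z - al
        total += (1 + z * x) ** (k + nu) / ((z - a[u]) * A ** k) * z
    return total / samples


if __name__ == "__main__":
    a, x, k, nu = (-1, 0, 1), 0.2, 2, 0.5  # hypothetical test parameters
    for u in range(len(a)):
        print(u, I_contour(u, x, a, k, nu, 2.0), I_formula(u, x, a, k, nu))
```

The two columns should agree to roughly machine precision, since the trapezoidal rule converges geometrically for an integrand analytic in an annulus around the circle.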

In the sections that follow, we will find asymptotics for $|P_{uv}(1/N)|$ and $|I_u(1/N)|$ and then study the arithmetic properties of the coefficients of $P_{uv}(x)$.

3. Contour Integral Estimates

To begin, we note that the value $P_{uv}(x)(1 + a_v x)^{\nu}$ ($0 \le u \le m$) is obtained from the integral (2.1), only with the contour $\gamma$ changed so as to enclose the integer $a_v$ and no other $a_l$'s (for $l \ne v$). Setting $x = 1/N$, one sees that (2.2) is satisfied and it follows that the integrand of (2.1) is analytic in a suitable deleted neighbourhood of $a_v$. Following Hata [11], we may apply the saddle-point method as described in Dieudonné [7, Chapter IX] to estimate the principal part of $P_{uv}(1/N)(1 + a_v/N)^{\nu}$ for large values of $k$. Explicitly, we set
$$F(z) = \log\left( 1 + \frac{z}{N} \right) - \log\left( |A(z)| \right)$$
and
$$G(z) = \left( 1 + \frac{z}{N} \right)^{\nu} (z - a_u)^{-1}$$
so that
$$P_{uv}\left( \frac{1}{N} \right) \left( 1 + \frac{a_v}{N} \right)^{\nu} = \int_{\gamma} G(z) e^{kF(z)}\, dz. \tag{3.1}$$

The saddles of the surface $|F(z)|$ are given by the zeros of the derivative of $F(z)$ which, since $x = 1/N$, are the zeros of the polynomial (1.4) (say $z_0, z_1, \ldots, z_m$ as in §1). Since $G(z) e^{kF(z)}$ vanishes as $z$ tends to $-N$ or $\infty$ (avoiding the real branch cut from $-N$ to $-\infty$), the saddle-point method yields

Lemma 3.1 As $k \to \infty$, the principal part of (3.1) is given by
$$P_{uv}\left( \frac{1}{N} \right) \left( 1 + \frac{a_v}{N} \right)^{\nu} \sim \sum_{v \le t \le v+1} e^{kF(z_t)} G(z_t) \sqrt{\frac{-2\pi}{k F''(z_t)}}$$
where the summation is restricted to $t \in [1, m]$.

In particular, since the roots of (1.4) satisfy $e^{F(z_l)} = |N A'(z_l)|^{-1}$ ($0 \le l \le m$) we may conclude that
$$\lim_{k \to \infty} \frac{1}{k} \log \left| P_{uv}\left( \frac{1}{N} \right) \right| = -\log(c_6(v) \cdot N) \le -\log(c_7 \cdot N)$$

for all $0 \le u, v \le m$.

To find asymptotics for $|I_u(1/N)|$ requires a more delicate analysis. Since the integrand of (2.1) has a branch point at $z = -N$, we cannot simply apply the saddle-point method for the saddle $z_0$ without justification (recall that $z_0 < -N$ is real). If, however, we make the change of variables $1 + \frac{z}{N} \to -w$, then we may write
$$I_u\left( \frac{1}{N} \right) = \frac{(-1)^{mk} e^{\pi i \nu}}{2\pi i\, N^{(m+1)k}} \int_{\gamma'} \frac{w^k w^{\nu}\, dw}{\left( w + 1 + \frac{a_u}{N} \right) (B(w))^k} \tag{3.2}$$
where $B(w) = \prod_{l=0}^{m} \left( w + 1 + \frac{a_l}{N} \right)$ and $\gamma' = \gamma_1 + \gamma_2 + \gamma_3 + \gamma_4$ is a contour containing the poles of the integrand of (3.2) while avoiding a branch cut along the nonnegative real axis (see Figure 1).

Figure 1: [the contour $\gamma' = \gamma_1 + \gamma_2 + \gamma_3 + \gamma_4$, with the points $0$, $r$ and $R$ marked]

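Since
$$\left| \int_{\gamma_l} \frac{w^k w^{\nu}\, dw}{\left( w + 1 + \frac{a_u}{N} \right) (B(w))^k} \right| \le \int_0^{2\pi} \left| \frac{w^k w^{\nu}}{\left( w + 1 + \frac{a_u}{N} \right) (B(w))^k} \right| d\theta, \qquad l = 2 \text{ or } 4$$
(where $w = Re^{i\theta}$ or $re^{i\theta}$ respectively) we have that the contribution to (3.2) associated with the arcs $\gamma_2$ and $\gamma_4$ becomes negligible as $r \to 0$ and $R \to \infty$.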

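Therefore, from
$$\frac{w^k w^{\nu}}{\left( w + 1 + \frac{a_u}{N} \right) (B(w))^k} = \begin{cases} \dfrac{x^k x^{\nu}}{\left( x + 1 + \frac{a_u}{N} \right) (B(x))^k} & \text{on } \gamma_1 \\[2ex] \dfrac{e^{2\pi i \nu} x^k x^{\nu}}{\left( x + 1 + \frac{a_u}{N} \right) (B(x))^k} & \text{on } \gamma_3 \end{cases}$$
we may conclude, letting $r \to 0$ and $R \to \infty$, that
$$I_u\left( \frac{1}{N} \right) = \frac{(-1)^{mk} e^{\pi i \nu} (1 - e^{2\pi i \nu})}{2\pi i\, N^{(m+1)k}} \int_0^{\infty} \frac{x^k x^{\nu}\, dx}{\left( x + 1 + \frac{a_u}{N} \right) (B(x))^k}.$$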

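This is readily evaluated for large $k$ via Laplace's method (see e.g. Dieudonné [7, Chapter IV §2]). Since the function $x/B(x)$ has only one critical point on the positive real axis, say $x_0$, and vanishes as $x \to 0^+$ or $x \to \infty$, we conclude that
$$\lim_{k \to \infty} \frac{1}{k} \log \left| I_u\left( \frac{1}{N} \right) \right| = \log\left( \frac{1}{N^{m+1}} \cdot \frac{x_0}{B(x_0)} \right). \tag{3.3}$$
Changing variables back to our original $z$, we find that
$$\frac{1}{N^{m+1}} \cdot \frac{x_0}{B(x_0)} = \left| \frac{1 + z_0/N}{A(z_0)} \right| = \left| \frac{1}{N A'(z_0)} \right|$$
for $z_0$ as in §1. Combining this with (3.3) yields

Lemma 3.2
$$\lim_{k \to \infty} \frac{1}{k} \log \left| I_u\left( \frac{1}{N} \right) \right| = -\log(c_5 \cdot N) \qquad (0 \le u \le m).$$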
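The constants entering Lemmas 3.1 and 3.2 are straightforward to compute for a concrete tuple. The sketch below locates the zeros $z_0, \ldots, z_m$ of the polynomial (1.4) by bisection, using the sign pattern described in §1 (one zero below $-N$, one in each interval $(a_{u-1}, a_u)$), evaluates $c_5$, $c_6(v)$ and $c_7$, and checks the identity $|1 + z_l/N| / |A(z_l)| = 1/|N A'(z_l)|$ at each zero. The sample values $a = (-1, 0, 1)$, $N = 5$ and all function names are assumptions made purely for illustration.

```python
from math import isclose


def A(z, a):
    """A(z) = prod (z - a_u)."""
    out = 1.0
    for au in a:
        out *= z - au
    return out


def dA(z, a):
    """A'(z), computed with the product rule."""
    total = 0.0
    for u in range(len(a)):
        term = 1.0
        for l, al in enumerate(a):
            if l != u:
                term *= z - al
        total += term
    return total


def g(z, a, N):
    """The polynomial (1.4): A(z) - (z + N) A'(z)."""
    return A(z, a) - (z + N) * dA(z, a)


def bisect(f, lo, hi, iters=200):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if fmid == 0.0:
            return mid
        if (flo < 0) == (fmid < 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def saddle_points(a, N):
    """Zeros z_0 < -N and a_{u-1} < z_u < a_u (1 <= u <= m) of (1.4)."""
    f = lambda z: g(z, a, N)
    step = 1.0
    while (f(-N - step) < 0) == (f(-N) < 0):  # widen until the sign changes
        step *= 2.0
    roots = [bisect(f, -N - step, -N)]
    for u in range(1, len(a)):
        roots.append(bisect(f, a[u - 1], a[u]))
    return roots


def constants(a, N):
    """Return (z, c5, c6, c7) with c6 a list indexed by v = 0..m."""
    z = saddle_points(a, N)
    m = len(a) - 1
    d = [abs(dA(zu, a)) for zu in z]
    c5 = d[0]
    c6 = [d[1]] + [min(d[v], d[v + 1]) for v in range(1, m)] + [d[m]]
    c7 = min(d[1:])
    return z, c5, c6, c7


if __name__ == "__main__":
    a, N = (-1, 0, 1), 5  # hypothetical example values
    z, c5, c6, c7 = constants(a, N)
    print("zeros of (1.4):", z)
    print("c5 =", c5, " c6 =", c6, " c7 =", c7)
```

Because $c_7 \le c_6(v)$ for every $v$ by construction, the inequality $-\log(c_6(v) \cdot N) \le -\log(c_7 \cdot N)$ following Lemma 3.1 can be read off directly from the printed values.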

4. Coefficients of the Approximating Polynomials

We wish to determine certain arithmetic properties of the coefficients of the polynomials $P_{uv}(x)$ defined in §2. Let us write
$$P_v = \prod_{\substack{0 \le l \le m \\ l \ne v}} \binom{k + h_l - 1}{h_l} (a_l - a_v)^{-k - h_l}.$$
From (2.4), if $u = v$, then
$$P_{uv}(x) = (-1)^{mk} \sum \binom{k + \nu}{h_v} (1 + a_v x)^{k - h_v} x^{h_v} P_v \tag{4.1}$$
where $\sum$ implies the sum over nonnegative $h_0, \ldots, h_m$ satisfying $\sum_{l=0}^{m} h_l = k$. If, however, $u \ne v$, we have
$$P_{uv}(x) = (-1)^{mk} \sum \binom{k + \nu}{h_v} (1 + a_v x)^{k - h_v} x^{h_v} \left( \frac{k + h_u}{k (a_u - a_v)} \right) P_v \tag{4.2}$$
where in this latter case, the summation is over nonnegative $h_0, \ldots, h_m$ with $\sum_{l=0}^{m} h_l = k - 1$.

From here on we will fix $\nu = s/n$. The following elementary lemma concerning primes dividing binomial coefficients will be the chief tool in determining our "arithmetic" asymptotics. It enables us to identify certain classes of prime numbers which are guaranteed to divide the numerators of the coefficients of the polynomials $P_{uv}(x)$ defined in (2.4). Similar results have been utilized by Chudnovsky [6], Hata [11] and Heimonen et al. [12], amongst others, in the pursuit of irrationality and linear independence measures. Suppressing dependence on $m$, $n$, $s$ and $k$, let
$$S(r) = \Bigl\{ p \text{ prime} : p > \sqrt{nk + s},\ (p, nk) = 1\ (\text{and, if } m = 1,\ (p, nk - s - n) = 1),\ \Bigl\{ \frac{k-1}{p} \Bigr\} > \max\Bigl\{ \frac{nm - r}{nm}, \frac{r}{n} \Bigr\} \text{ and } pr \equiv s \ (\mathrm{mod}\ n) \Bigr\} \tag{4.3}$$
for $r$ with $1 \le r < n$, where we adopt the notation $\{x\} = x - [x]$ for the fractional part of a real number $x$.

Lemma 4.1 If $p \in S(r)$ then
$$\operatorname{ord}_p \left( \binom{k + s/n}{h_0} \binom{k + h_1 - 1}{h_1} \cdots \binom{k + h_m - 1}{h_m} \right) \ge 1$$
for all nonnegative integers $h_0, \ldots, h_m$ with $\sum_{l=0}^{m} h_l = k$ or $k - 1$.

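Proof. Suppose that $p \in S(r)$ does not divide the product
$$\binom{k + h_1 - 1}{h_1} \cdots \binom{k + h_m - 1}{h_m}.$$
Then it follows that […] yields
$$\left\{ \frac{k - h_0}{p} \right\} \;[\ldots]\; \sum_{l=1}^{m} \left\{ \frac{h_l}{p} \right\} + \frac{1}{p}. \tag{4.6}$$
We wish to show that $\operatorname{ord}_p \binom{k + s/n}{h_0} \ge 1$. Since $p \nmid n$, we have
$$\operatorname{ord}_p \binom{k + s/n}{h_0} = \operatorname{ord}_p \left( \frac{(nk + s)(n(k-1) + s) \cdots (n(k - h_0) + n + s)}{h_0!} \right)$$
and so by a result of Chudnovsky [6, Lemma 4.5] (recalling that $p > \sqrt{nk + s}$)
$$\operatorname{ord}_p \binom{k + s/n}{h_0} = \left\{ \frac{k - \theta - h_0}{p} \right\} + \left\{ \frac{h_0}{p} \right\} - \left\{ \frac{k - \theta}{p} \right\} \tag{4.7}$$
where $\theta = \frac{pr - s}{n}$. We thus have that $\operatorname{ord}_p \binom{k + s/n}{h_0} \ge 1$ exactly when $\left\{ \frac{h_0}{p} \right\} > \left\{ \frac{k - \theta}{p} \right\}$ or, equivalently, when
$$\left\{ \frac{k - h_0}{p} \right\} \;[\ldots] \tag{4.8}$$
[…]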
contradicting (4.9). To complete the proof of the lemma, we have only to show that the simultaneous equalities
$$\left\{ \frac{h_1}{p} \right\} = 1 - \left\{ \frac{k}{p} \right\}, \qquad \left\{ \frac{h_0}{p} \right\} = \left\{ \frac{k - \theta}{p} \right\}$$
and $h_0 + h_1 = k - 1$ produce a contradiction. The first of these implies that $p \mid k + h_1$, while the second yields $p \mid n(k - h_0) + s$. But these together with $h_0 + h_1 = k - 1$ imply that $p$ divides $nk - s - n$, contradicting our initial assumptions.
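For small parameters, Lemma 4.1 can be tested by exhaustive search. The sketch below builds $S(r)$ from the conditions in (4.3) and then checks, in exact rational arithmetic, that each such prime has positive order in the product of binomial coefficients for every admissible $(h_0, \ldots, h_m)$. The parameter choice ($n = 2$, $s = 1$, $r = 1$, $k = 12$, $m = 2$), the search bound $p < 2k$ (sufficient because the fractional-part condition forces $\{(k-1)/p\} > 1/2$) and the function names are assumptions of this illustration, not part of the text.

```python
from fractions import Fraction
from math import comb, gcd


def is_prime(q):
    if q < 2:
        return False
    d = 2
    while d * d <= q:
        if q % d == 0:
            return False
        d += 1
    return True


def ord_p(value, p):
    """p-adic order of a nonzero rational number."""
    num, den, e = value.numerator, value.denominator, 0
    while num % p == 0:
        num //= p
        e += 1
    while den % p == 0:
        den //= p
        e -= 1
    return e


def gen_binom(alpha, h):
    """Generalized binomial coefficient C(alpha, h) as an exact Fraction."""
    out = Fraction(1)
    for i in range(h):
        out *= (alpha - i) / (i + 1)
    return out


def compositions(total, parts):
    """Yield every tuple of `parts` nonnegative integers with sum `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def S(r, n, s, k, m):
    """Primes satisfying the conditions of (4.3), searched below 2k."""
    bound = max(Fraction(n * m - r, n * m), Fraction(r, n))
    found = []
    for p in range(2, 2 * k):
        if not is_prime(p) or p * p <= n * k + s or gcd(p, n * k) != 1:
            continue
        if m == 1 and gcd(p, n * k - s - n) != 1:
            continue
        if (p * r - s) % n != 0:
            continue
        if Fraction((k - 1) % p, p) > bound:  # fractional part of (k-1)/p
            found.append(p)
    return found


def lemma_holds(p, n, s, k, m):
    """Check ord_p(product) >= 1 for all h with sum k or k - 1."""
    for total in (k, k - 1):
        for h in compositions(total, m + 1):
            value = gen_binom(Fraction(n * k + s, n), h[0])
            for l in range(1, m + 1):
                value *= comb(k + h[l] - 1, h[l])
            if ord_p(value, p) < 1:
                return False
    return True


if __name__ == "__main__":
    n, s, r, k, m = 2, 1, 1, 12, 2  # hypothetical small example
    for p in S(r, n, s, k, m):
        print(p, lemma_holds(p, n, s, k, m))
```

For these parameters the only prime found is 13, and every one of the 169 admissible tuples gives a product divisible by 13.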

Now if by $\pi(x, n, s)$ we denote the number of primes $p \le x$ in the arithmetic progression $bn + s$, then, analogous to the standard prime number theorem, we have
$$\pi(x, n, s) = \frac{1}{\phi(n)} \frac{x}{\log x} \left( 1 + O\left( \frac{1}{\log x} \right) \right).$$
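The size of the main term can be compared with actual prime counts in a few lines. This sketch sieves the primes up to $x$, counts those congruent to $s$ modulo $n$, and divides by $x / (\phi(n) \log x)$; the sample values $x = 10^5$, $n = 4$ and the function names are illustrative assumptions.

```python
from math import gcd, log


def primes_up_to(x):
    """Sieve of Eratosthenes returning all primes <= x."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    p = 2
    while p * p <= x:
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
        p += 1
    return [q for q in range(x + 1) if sieve[q]]


def phi(n):
    """Euler's totient function by direct count."""
    return sum(1 for j in range(1, n + 1) if gcd(j, n) == 1)


def count_in_progression(x, n, s):
    """pi(x, n, s): number of primes p <= x with p congruent to s mod n."""
    return sum(1 for p in primes_up_to(x) if p % n == s % n)


def main_term(x, n):
    """Leading term x / (phi(n) log x) of the asymptotic formula."""
    return x / (phi(n) * log(x))


if __name__ == "__main__":
    x, n = 10**5, 4  # hypothetical sample values
    for s in (1, 3):
        count = count_in_progression(x, n, s)
        print(s, count, count / main_term(x, n))
```

At this range the ratio is still noticeably above 1, in line with an error factor of relative size $O(1/\log x)$.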

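It follows that if $\beta > \alpha > 0$, then
$$\lim_{k \to \infty} \frac{1}{k} \log \prod p = \frac{\beta - \alpha}{\phi(n)} \tag{4.10}$$
where the product is over all primes $p$ in the interval $\alpha k < p < \beta k$ and satisfying $p \equiv a \pmod{n}$. The inequality $\left\{ \frac{k-1}{p} \right\} > \max\left\{ \frac{nm - r}{nm}, \frac{r}{n} \right\}$ defines a collection of open intervals for primes $p$ in $S(r)$, of the form
$$\left( \frac{k}{l+1},\ \min\left\{ \frac{nmk}{(l+1)nm - r}, \frac{nk}{ln + r} \right\} \right) + O\left( \sqrt{k} \right), \qquad l = 0, 1, 2, \ldots$$
where the shape of the error term follows from the assumption that $p > \sqrt{nk + s}$. From (4.10), we may therefore write
$$\lim_{k \to \infty} \frac{1}{k} \log \prod_{p \in S(r)} p = \frac{1}{\phi(n)} \sum_{l=0}^{\infty} \left( \min\left\{ \frac{nm}{(l+1)nm - r}, \frac{n}{ln + r} \right\} - \frac{1}{l+1} \right).$$
Since, from Bateman and Erdélyi [4], the function $\psi(z) = \frac{d}{dz} \log \Gamma(z)$ satisfies
$$\psi(\beta) - \psi(\alpha) = \sum_{l=0}^{\infty} \left( \frac{1}{l + \alpha} - \frac{1}{l + \beta} \right)$$
we may conclude that
$$\lim_{k \to \infty} \frac{1}{k} \log \prod_{p \in S(r)} p = \frac{1}{\phi(n)} \left( \psi(1) - \psi\left( \max\left\{ \frac{nm - r}{nm}, \frac{r}{n} \right\} \right) \right)$$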

whence, recalling the equality ψ(1) = −γ, Y 1 (4.11) lim log k→∞ k 1≤r