Algebra of linear recurrence relations in arbitrary characteristic

Nikolai V. Ivanov

The goal of this paper is to present an algebraic approach to the basic results of the theory of linear recurrence relations. This approach is based on ideas from the theory of representations of one endomorphism (a special case of which may be better known to the reader as the theory of the Jordan normal form of matrices). The notion of divided derivatives, an analogue of divided powers, turned out to be crucial for proving the results in a natural way and in their natural generality. The final form of our methods was influenced by some ideas of the umbral calculus of G.-C. Rota.

Neither the theory of representations of one endomorphism, nor the theory of divided powers, nor the umbral calculus apply directly to our situation. For each of these theories we need only a modified version of a fragment of it. This is one of the reasons for presenting all proofs from scratch. Both these fragments and our modifications of them are completely elementary and beautiful by themselves. This is another reason for presenting proofs independent of any advanced sources. Finally, the theory of linear recurrence relations is an essentially elementary theory (despite being rarely presented with complete proofs), and as such it deserves a self-contained exposition.

The prerequisites for reading this paper are rather modest. Only familiarity with the most basic notions of abstract algebra, such as the notions of a commutative ring, of a module over a commutative ring, and of endomorphisms and homomorphisms, is needed. No substantial results from abstract algebra are used. A taste for abstract algebra and a superficial familiarity with it should be sufficient for reading this paper.

The standard expositions of the theory of linear recurrence relations present this theory over algebraically closed fields of characteristic 0, or even only over the field of complex numbers. In contrast, such restrictions are very unnatural from our point of view. The methods of this paper apply equally well to all commutative rings with unit and without zero divisors; no assumptions about the characteristic are needed. Of course, a form of the condition of being algebraically closed is needed.

© Nikolai V. Ivanov, 2014. Neither the work reported in this paper, nor its preparation were supported by any governmental or non-governmental agency, foundation, or institution.


We assume only that all roots of the characteristic polynomial of the linear recurrence relation in question are contained in the ring under consideration (the same assumption can be made in any of the standard approaches to the theory).

There are no references in this paper. The reasons are the same as N. Bourbaki's reasons for not including references except in his Notes historiques.

Contents

1. Divided derivatives of polynomials
2. Sequences and duality
3. Adjoints of the left shift and of divided derivatives
4. Endomorphisms and their eigenvalues
5. Torsion modules and a property of free modules
6. Polynomials and their roots
7. The main theorems

The dependency of the sections.

The main results are stated and proved in Section 7, which depends on all previous sections. Sections 1 - 6 are independent with only one exception: Section 3 depends on both Section 1 and Section 2.

We denote by Z the ring of integers and by N the set of non-negative integers. We denote by k a fixed entire ring, i.e. a commutative ring with a unit without zero divisors and such that its unit is not equal to its zero. There is a canonical ring homomorphism Z → k taking 0, 1 ∈ Z to the zero and the unit of k respectively, making k into a Z-algebra, and every k-module into a Z-module. We identify 0, 1 ∈ Z with their images in k.

1. Divided derivatives of polynomials

Polynomials in two variables. Let x be a variable, and let k[x] be the k-algebra of polynomials in x with coefficients in k. Let y be some other variable, and let k[x, y] be the k-algebra of polynomials in the two variables x, y with coefficients in k. As is well known, k[x, y] is canonically isomorphic to the k-algebra k[x][y] of polynomials in y with coefficients in k[x]. We will identify these two algebras. This allows us to write any polynomial f(x, y) in the two variables x, y in the form

(1)    f(x, y) = ∑_{n=0}^{∞} g_n(x) y^n .

In fact, this sum is obviously finite. Equivalently, the polynomials g_n(x) are equal to 0 for all sufficiently big n ∈ N. The polynomials g_n(x) are uniquely determined by f(x, y).

The definition of the divided derivatives. Let p(x) ∈ k[x]. Then p(x + y) ∈ k[x, y], and hence p(x + y) has the form

(2)    p(x + y) = ∑_{n=0}^{∞} (δ^n p)(x) y^n ,

for some polynomials (δ^n p)(x) uniquely determined by p(x). The sum in (2) is actually finite. Equivalently, (δ^n p)(x) = 0 for all sufficiently big n ∈ N. The coefficient (δ^n p)(x) in front of y^n in the sum in (2) is called the n-th divided derivative of the polynomial p(x). We will also denote (δ^n p)(x) by δ^n p(x) or δ^n (p(x)).
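The definition is easy to experiment with over k = Z. The following sketch (an illustration only, not part of the paper; the coefficient-list representation of polynomials and the helper names are assumptions made for this example) computes the divided derivatives of an integer polynomial by expanding p(x + y) and collecting the coefficients of the powers of y, exactly as in (2).

```python
# Illustrative sketch: divided derivatives over k = Z.
# A polynomial is a list of coefficients, p[i] being the coefficient of x^i.

def binom(a, b):
    """Binomial coefficient, computed by the multiplicative formula."""
    result = 1
    for j in range(b):
        result = result * (a - j) // (j + 1)
    return result

def trim(p):
    """Drop trailing zero coefficients."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def divided_derivatives(p):
    """Return [delta^0 p, delta^1 p, ...], obtained by expanding p(x + y)
    and collecting the coefficient of y^n, as in formula (2)."""
    deg = len(p) - 1
    table = [[0] * (deg + 1) for _ in range(deg + 1)]   # table[n][m]: coeff of x^m y^n
    for i, c in enumerate(p):
        for n in range(i + 1):                          # (x+y)^i = sum binom(i,n) x^(i-n) y^n
            table[n][i - n] += c * binom(i, n)
    return [trim(row) for row in table]

# p(x) = x^3 + 2x:  delta^1 p = 3x^2 + 2,  delta^2 p = 3x,  delta^3 p = 1
print(divided_derivatives([0, 2, 0, 1]))
```

Over Z the n-th divided derivative coincides with the n-th formal derivative divided by n!, which explains the name; in positive characteristic the divided derivative remains defined even though the division by n! need not be possible.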

Operators δ^n. Let n ∈ N. By assigning δ^n p(x) ∈ k[x] to p(x) ∈ k[x] we get a map δ^n : p(x) ↦ δ^n p(x). Clearly, δ^n is a k-linear operator k[x] → k[x]. After the substitution y = 0 the equation (2) reduces to p(x) = δ^0 p(x). Therefore, δ^0 = id = id_{k[x]}.

1.1. Theorem (Leibniz formula). Let f(x), g(x) ∈ k[x], and let n ∈ N. Then

(3)    δ^n ( f(x) g(x) ) = ∑_{i+j=n} δ^i f(x) δ^j g(x) .

Proof. By applying (2) to p(x) = f(x) and to p(x) = g(x), we get

    f(x + y) = ∑_{i=0}^{∞} δ^i f(x) y^i ,    g(x + y) = ∑_{j=0}^{∞} δ^j g(x) y^j .

By multiplying these two identities, we get

    f(x + y) g(x + y) = ∑_{i,j=0}^{∞} δ^i f(x) y^i δ^j g(x) y^j ,

and hence

    f(x + y) g(x + y) = ∑_{i,j=0}^{∞} δ^i f(x) δ^j g(x) y^{i+j} = ∑_{n=0}^{∞} ( ∑_{i+j=n} δ^i f(x) δ^j g(x) ) y^n .

The theorem follows. ∎
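For monomials f = x^a and g = x^b the Leibniz formula reduces, in view of Lemma 1.5 below, to the Vandermonde identity for binomial coefficients. The following standalone check (an illustration under that reading, not part of the paper) verifies it numerically for small exponents.

```python
# Check of the Leibniz formula (3) on monomials f = x^a, g = x^b over k = Z:
# delta^n(x^a x^b) = binom(a+b, n) x^(a+b-n), while the right hand side of (3)
# gives sum over i+j=n of binom(a, i) binom(b, j) x^(a+b-n), so (3) amounts to
# the Vandermonde identity for binomial coefficients.
from math import comb

for a in range(6):
    for b in range(6):
        for n in range(a + b + 1):
            assert comb(a + b, n) == sum(comb(a, i) * comb(b, n - i) for i in range(n + 1))
print("Leibniz formula verified on monomials x^a x^b for a, b < 6")
```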

1.2. Corollary (Leibniz formula for δ^1). Let f(x), g(x) ∈ k[x]. Then

(4)    δ^1 ( f(x) g(x) ) = δ^1 f(x) g(x) + f(x) δ^1 g(x) .

In other terms, δ^1 is a derivation of the ring k[x]. ∎

1.3. Lemma. (i) δ^0 (1) = 1 and δ^n (1) = 0 for all n ≥ 1.
(ii) δ^0 x = x, δ^1 x = 1, and δ^n x = 0 for all n ≥ 2.

Proof. As we noted above, δ^0 = id. In particular, δ^0 (1) = 1. For p(x) = 1, the formula (2) takes the form

(5)    1 = δ^0 (1) y^0 + ∑_{n=1}^{∞} δ^n (1) y^n .

Since δ^0 (1) y^0 = 1 · y^0 = 1, the formula (5) implies that

    0 = ∑_{n=1}^{∞} δ^n (1) y^n .

It follows that δ^n (1) = 0 for n ≥ 1. This proves the part (i) of the lemma.

For p(x) = x, the formula (2) takes the form

    x + y = δ^0 (x) y^0 + δ^1 (x) y^1 + ∑_{n=2}^{∞} δ^n (x) y^n .

It follows that δ^0 (x) = x, δ^1 (x) = 1, and δ^n (x) = 0 for all n ≥ 2. This proves the part (ii) of the lemma. ∎

1.4. Lemma. δ^1 (x^n) = n x^{n−1} for all n ∈ N, n ≥ 1.

Proof. The case n = 1 was proved in Lemma 1.3. Suppose that n ∈ N, n ≥ 1, and we already know that δ^1 (x^n) = n x^{n−1}. By Corollary 1.2,

    δ^1 (x^{n+1}) = δ^1 (x · x^n) = δ^1 (x) x^n + x δ^1 (x^n) = 1 · x^n + x (n x^{n−1}) = x^n + n x^n = (n + 1) x^n .

An application of induction completes the proof. ∎

Remark. By Lemma 1.4, the operator δ^1 : k[x] → k[x] agrees on the powers x^n ∈ k[x] with the operator d : f(x) ↦ f′(x) of taking the usual formal derivative. Since both these operators are k-linear, δ^1 = d. But if i ∈ N, i ≥ 2, then the operator δ^i is not equal to the operator of taking the i-th derivative. This immediately follows either from Lemma 1.5 or from Theorem 1.7 below.

Binomial coefficients. Let i ∈ N. For n ∈ N, n ≤ i, we define the binomial coefficients ( i − n | n ) ∈ N by the binomial formula

(6)    (x + y)^i = ∑_{n=0}^{i} ( i − n | n ) x^{i−n} y^n .

Given arbitrary numbers a, b ∈ N, we define ( a | b ) as ( n − b | b ), where n = a + b. Given arbitrary integers a, b ∈ Z, we set ( a | b ) = 0 if at least one of the numbers a, b is not in N. We prefer the notation ( a | b ) to the classical one for typographical reasons, and because the new notation helps to bring to light the fact that we will not use any properties of ( a | b ) except the above definition.

1.5. Lemma. δ^n (x^i) = ( i − n | n ) x^{i−n}.

Proof. It is sufficient to compare (6) with the definition (2) of the divided derivatives. ∎

The left shift operator. The left shift operator λ : k[x] → k[x] is just the operator of multiplication by the polynomial x: λ p(x) = λ(p)(x) = x p(x). The reasons for calling λ the left shift operator will become clear later. For us the main property of the left shift operator and the divided derivatives is the following commutation relation.

1.6. Theorem. Let us set δ^{−1} = 0. Then

(7)    δ^n ◦ λ − λ ◦ δ^n = δ^{n−1}

for all n ∈ N.

Proof. Let p(x) ∈ k[x], and let y be a variable different from x. By the Leibniz formula from Theorem 1.1,

    δ^n ( x p(x) ) = ∑_{i+j=n} δ^i (x) δ^j ( p(x) ) .

But by Lemma 1.3, δ^0 x = x, δ^1 x = 1, and δ^i x = 0 for i ≥ 2. Therefore

    δ^n ( x p(x) ) = x δ^n ( p(x) ) + δ^{n−1} ( p(x) ) ,

or, what is the same,

    δ^n ( x p(x) ) − x δ^n ( p(x) ) = δ^{n−1} ( p(x) ) .

Rewriting the last identity in terms of λ, we get

    δ^n ( λ(p)(x) ) − λ ( δ^n ( p(x) ) ) = δ^{n−1} ( p(x) ) ,

i.e.

    ( δ^n ◦ λ ) ( p(x) ) − ( λ ◦ δ^n ) ( p(x) ) = δ^{n−1} ( p(x) ) .

Since p(x) ∈ k[x] was arbitrary, this proves the theorem. ∎

The following theorem is not used in the rest of the paper.

1.7. Theorem (Composition of divided derivatives). Let n, m ∈ N. Then δ^n ◦ δ^m = ( m | n ) δ^{n+m}.

Proof. Let p(x) ∈ k[x]. Let u, z be two new variables different from both x and y. If we apply (2) to u, z in the role of x, y respectively (and use m instead of n), we get

(8)    p(u + z) = ∑_{m=0}^{∞} (δ^m p)(u) z^m .

Let us set u = x + y and apply (2) to each polynomial (δ^m p)(x + y):

(9)    p((x + y) + z) = ∑_{m=0}^{∞} (δ^m p)(x + y) z^m
        = ∑_{m=0}^{∞} ( ∑_{n=0}^{∞} δ^n ( (δ^m p)(x) ) y^n ) z^m
        = ∑_{m,n=0}^{∞} ( δ^n ◦ δ^m )(p)(x) y^n z^m .

Alternatively, we can apply (2) to y + z in the role of y and then apply (6):

(10)   p(x + (y + z)) = ∑_{k=0}^{∞} δ^k p(x) (y + z)^k
        = ∑_{k=0}^{∞} δ^k p(x) ( ∑_{m+n=k} ( m | n ) y^m z^n )
        = ∑_{m,n=0}^{∞} δ^{m+n} ( p(x) ) ( m | n ) y^m z^n
        = ∑_{m,n=0}^{∞} ( m | n ) δ^{m+n} ( p(x) ) y^m z^n .

By the associativity of the addition, (x + y) + z = x + (y + z) and hence p((x + y) + z) = p(x + (y + z)). By combining this equality with (9) and (10) we conclude that

    ( δ^n ◦ δ^m ) ( p(x) ) = ( m | n ) ( δ^{m+n} p(x) )

for all p(x) ∈ k[x] and n, m ∈ N. The theorem follows. ∎
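On monomials, Theorem 1.7 likewise becomes a classical binomial identity: by Lemma 1.5, δ^n (δ^m (x^i)) = ( i − m | m )( i − m − n | n ) x^{i−m−n}, while ( m | n ) δ^{n+m} (x^i) = ( m | n )( i − m − n | m + n ) x^{i−m−n}. A quick numerical confirmation of the resulting identity (an illustration only; the ranges are arbitrary) is below.

```python
# Check of Theorem 1.7 on monomials x^i: in classical notation (a|b) = binom(a+b, b),
# so the identity delta^n o delta^m = (m|n) delta^(n+m) amounts to
# binom(i, m) * binom(i - m, n) == binom(m + n, n) * binom(i, m + n).
from math import comb

for i in range(12):
    for m in range(i + 1):
        for n in range(i - m + 1):
            assert comb(i, m) * comb(i - m, n) == comb(m + n, n) * comb(i, m + n)
print("delta^n o delta^m = (m|n) delta^(n+m) verified on monomials of degree < 12")
```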

2. Sequences and duality

Sequences. A sequence of elements of a set X is defined as a map N → X. For a sequence s we usually denote the value s(i), i ∈ N, by s_i and often call it the i-th term of s. The set of all sequences of elements of X will be denoted by S_X. We are, first of all, interested in the case when X is a k-module, and especially in the case when X is equal to k considered as a k-module. When it is clear from the context to what set X the terms of the considered sequences belong, we call the sequences of elements of X simply sequences.

Let M be a k-module. Then the set S_M has a canonical structure of a k-module. The k-module operations on S_M are the term-wise addition of sequences and the term-wise multiplication of sequences by elements of k, defined in the following obvious way. The term-wise sum r + s of sequences r, s ∈ S_M is defined by (r + s)_i = r_i + s_i, and the term-wise product c s of c ∈ k and s ∈ S_M is defined by (c s)_i = c s_i.

Modules of homomorphisms. Let M′, M″ be k-modules. Then the set Hom(M′, M″) of k-homomorphisms M′ → M″ has a canonical structure of a k-module. The addition is defined as the addition of k-homomorphisms, and the product a F of an element a ∈ k and a k-homomorphism F : M′ → M″ is defined by (a F)(m) = a F(m), where m ∈ M′. Note that the (obvious) verification of the fact that a F is a k-homomorphism uses the commutativity of k.

We are mostly interested in the case of M′ = k[x], and especially in the case of M′ = k[x] and M″ = k, where k[x] is considered as a k-module by forgetting about the multiplication of elements of k[x], and the ring k is considered as a module over itself.

For the rest of this section M denotes a fixed k-module.

A pairing between S_M and k[x]. Consider a sequence s ∈ S_M and a polynomial

    p(x) = ∑_{i=0}^{∞} c_i x^i ∈ k[x] .

Of course, the sum here is actually finite, i.e. c_i = 0 for all sufficiently large i. Let

    ⟨ s , p(x) ⟩ = ∑_{i=0}^{∞} c_i s_i ∈ M .

Since c_i = 0 for all sufficiently large i, the sum in the right hand side of this formula is well defined.
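Concretely, with k = Z and M = Z the pairing is an ordinary finite sum, as in the following sketch (an illustration only; representing a sequence as a Python function and a polynomial as a coefficient list are assumptions made for this example).

```python
# Illustrative sketch of the pairing <s, p> = sum_i c_i s_i with k = Z and M = Z.

def pair(s, p):
    """<s, p> for a sequence s : N -> Z and a polynomial given by its coefficient list p."""
    return sum(c * s(i) for i, c in enumerate(p))

s = lambda i: i * i                      # the sequence with s_i = i^2
print(pair(s, [3, 0, 0, 0, 1]))          # <s, 3 + x^4> = 3*s_0 + s_4 = 16
print(pair(s, [0, 0, 0, 1]) == s(3))     # pairing with x^3 recovers s_3, cf. (11) below
```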

The map

    ⟨ • , • ⟩ : S_M × k[x] → M

defined by ⟨ • , • ⟩ : (s , p(x)) ↦ ⟨ s , p(x) ⟩ is our pairing between k[x] and S_M. Obviously, it is a k-bilinear map (and hence indeed deserves to be called a pairing). The pairing ⟨ • , • ⟩ defines a k-linear map

    D_M : S_M → Hom( k[x] , M )

by the usual rule D_M (s) : p(x) ↦ ⟨ s , p(x) ⟩. Note that, obviously, ⟨ s , x^i ⟩ = s_i for every s ∈ S_M, i ∈ N. Therefore

(11)    s_i = D_M (s) ( x^i )

for every i ∈ N and every s ∈ S_M.

2.1. Theorem (Duality). The pairing ⟨ • , • ⟩ is non-degenerate in the sense that the map D_M : S_M → Hom( k[x] , M ) is an isomorphism of k-modules.

Proof. Note that a k-homomorphism F : k[x] → M is determined by its values F(x^i) on the monomials x^i, i ∈ N (because every polynomial p(x) ∈ k[x] is a finite sum of powers x^i, i ∈ N, with coefficients in k). By (11) the terms s_i of a sequence s ∈ S_M are equal to the values D_M (s)(x^i). It follows that all terms of s, and hence the sequence s itself, are determined by the homomorphism D_M (s). Therefore, the map D_M is injective.

In order to prove that D_M is surjective, let us consider an arbitrary k-homomorphism F : k[x] → M. Let s ∈ S_M be the sequence with s_i = F(x^i). Then the k-homomorphisms F and D_M (s) take the same values at all powers x^i. It follows that F = D_M (s) (cf. the previous paragraph). Therefore, the map D_M is surjective.

By combining the results of the two previous paragraphs with the fact that D_M is k-linear, we see that D_M is a k-isomorphism. This proves the theorem. ∎

Dual endomorphisms. Each k-endomorphism E : k[x] → k[x] defines its dual endomorphism

    E^* : Hom( k[x] , M ) → Hom( k[x] , M )

by the usual formula E^*(h) = h ◦ E for all k-homomorphisms h : k[x] → M. Obviously, if E, F are two k-endomorphisms k[x] → k[x], then (E ◦ F)^* = F^* ◦ E^*.

Adjoint endomorphisms. Let M be a k-module. Since D_M is an isomorphism by Theorem 2.1, we can use D_M to turn the dual map E^* : Hom(k[x], M) → Hom(k[x], M) of an endomorphism E : k[x] → k[x] into a map S_M → S_M. Namely, let

    E^⊥ = (D_M)^{−1} ◦ E^* ◦ D_M : S_M → S_M ,

or, equivalently, define E^⊥ by requiring that D_M ◦ E^⊥ = E^* ◦ D_M. We will call E^⊥ the adjoint endomorphism of E. For every pair E, F : k[x] → k[x] of k-endomorphisms

    (E ◦ F)^⊥ = F^⊥ ◦ E^⊥ .

This immediately follows from the corresponding property (E ◦ F)^* = F^* ◦ E^* of dual endomorphisms.

2.2. Lemma. Let M be a k-module, and let E : k[x] → k[x] be a k-endomorphism. The adjoint map S_M → S_M is the unique map E^⊥ such that

(12)    ⟨ p , E^⊥ (s) ⟩ = ⟨ E(p) , s ⟩

for all p = p(x) ∈ k[x], s ∈ S_M.

Proof. Let p = p(x) ∈ k[x] and let s ∈ S_M. By the definition of D_M we have:

    ⟨ p , E^⊥ (s) ⟩ = D_M ( E^⊥ (s) ) ( p ) = ( D_M ◦ E^⊥ ) (s) ( p ) ;

    ⟨ E(p) , s ⟩ = D_M (s) ( E(p) ) = E^* ( D_M (s) ) ( p ) = ( E^* ◦ D_M ) (s) ( p ) .

Therefore, (12) is equivalent to

(13)    ( D_M ◦ E^⊥ ) (s) ( p ) = ( E^* ◦ D_M ) (s) ( p ) .

It follows that (12) holds for all p = p(x) ∈ k[x], s ∈ S_M if and only if D_M ◦ E^⊥ = E^* ◦ D_M. The lemma follows. ∎

3. Adjoints of the left shift and of divided derivatives

As in the previous section, M denotes a fixed k-module.

The adjoint of the left shift operator. Let L = λ^⊥, where λ is the left shift operator from Section 1. In view of the following lemma we call L also the left shift operator.

3.1. Lemma. For every sequence s ∈ S_M the terms of the sequence L(s) are ( L(s) )_i = s_{i+1}.

Proof. Recall that s_i = ⟨ s , x^i ⟩ for any sequence s ∈ S_M. Together with Lemma 2.2 this fact implies that

    ( L(s) )_i = ⟨ L(s) , x^i ⟩ = ⟨ λ^⊥ (s) , x^i ⟩ = ⟨ s , λ(x^i) ⟩ = ⟨ s , x^{i+1} ⟩ = s_{i+1} .

The lemma follows. ∎

3.2. Corollary. For every n ∈ N and every sequence s ∈ S_M the terms of the sequence L^n (s) are ( L^n (s) )_i = s_{i+n}.

Proof. For n = 0 the corollary is trivial, because L^0 = id. For n ≥ 1 the corollary follows from Lemma 3.1, if we use induction on n. ∎

The adjoints of the divided derivatives. Let D^n = (δ^n)^⊥, where n ∈ N and δ^n is the n-th divided derivative operator from Section 1. Recall (see Section 1) that δ^0 : k[x] → k[x] is the identity of k[x]. Therefore D^0 : S_M → S_M is also the identity of S_M. Recall that in Theorem 1.6 we also introduced the operator δ^{−1}. Let D^{−1} = (δ^{−1})^⊥. Since δ^{−1} = 0 by the definition, we have D^{−1} = 0.
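The next results describe how L and the operators D^n interact. Before that, a tiny sketch (an illustration only; k = M = Z is an assumption made for this example) of the left shift in action: shifting the geometric sequence s(α) and observing that L − α annihilates it, which is the content of Lemma 3.4 below.

```python
# Illustrative sketch of the left shift on sequences, (L s)_i = s_{i+1}, with k = M = Z.

def L(s):
    return lambda i: s(i + 1)

s = lambda i: 2 ** i                                   # the sequence s(2), with s_i = 2^i
print([L(s)(i) for i in range(5)])                     # [2, 4, 8, 16, 32]
print([L(s)(i) - 2 * s(i) for i in range(5)])          # (L - 2) s(2) = 0, cf. Lemma 3.4 below
```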

The following commutation relations between the adjoint operators L = λ^⊥ and D^n = (δ^n)^⊥ are the most important for our purposes.

3.3. Theorem. For every α ∈ k and every n ∈ N

    L ◦ D^n − D^n ◦ L = D^{n−1}

and

    (L − α) ◦ D^n − D^n ◦ (L − α) = D^{n−1} ,

where we interpret α as the operator S_M → S_M of multiplication by α ∈ k.

Proof. By taking the adjoint identity of the identity (7) from Theorem 1.6, we get

    ( δ^n ◦ λ )^⊥ − ( λ ◦ δ^n )^⊥ = ( δ^{n−1} )^⊥ ,

and hence

    λ^⊥ ◦ ( δ^n )^⊥ − ( δ^n )^⊥ ◦ λ^⊥ = ( δ^{n−1} )^⊥ .

In view of the definitions of L and D^n, this implies the first identity of the theorem.

Since D^n is a k-linear operator, we have α ◦ D^n = D^n ◦ α, where α is interpreted as the multiplication operator. Clearly, the first identity of the theorem together with α ◦ D^n = D^n ◦ α implies the second one. ∎

The sequences s(α) and s(α, n). Let α ∈ k and n ∈ N. Let us define sequences s(α) and s(α, n) by

    s(α)_i = α^i    and    s(α, n) = D^n ( s(α) ) .

Obviously, s(α, 0) = s(α). Note that s(α) ≠ 0 even if α = 0, because s(α)_0 = α^0 = 1 by the definition for all α ∈ k. For explicit formulas for the sequences s(α, n) with n ≥ 1 the reader is referred to Theorem 3.6 below. No such formulas are used in this paper.

3.4. Lemma. Let α ∈ k and s ∈ S_M. Then ( L − α )(s) = 0 if and only if s has the form s = β s(α), where β ∈ M.

Proof. The condition ( L − α )(s) = 0 is equivalent to L(s) = α s. The latter condition holds if and only if ( L(s) )_n = α s_n for all n ∈ N. By Lemma 3.1, ( L(s) )_n = s_{n+1}. Therefore, ( L − α )(s) = 0 if and only if s_{n+1} = α s_n for all n ∈ N. An application of induction completes the proof. ∎

3.5. Lemma. Suppose that a, n ∈ N and α ∈ k. If n ≥ a, then

    ( L − α )^a ( s(α, n) ) = s(α, n − a) .

Proof. The lemma is trivial if a = 0. Let us prove the lemma for a = 1. In view of the definition of the sequences s(α, n), we need to prove that

    ( L − α ) ( D^n ( s(α) ) ) = D^{n−1} ( s(α) ) .

By applying the second identity of Theorem 3.3 to s(α), we get

    ( (L − α) ◦ D^n ) ( s(α) ) − ( D^n ◦ (L − α) ) ( s(α) ) = D^{n−1} ( s(α) ) ,

which is equivalent to

    ( L − α ) ( D^n ( s(α) ) ) − D^n ( ( L − α ) ( s(α) ) ) = D^{n−1} ( s(α) ) .

Since ( L − α ) ( s(α) ) = 0 by Lemma 3.4, we see that

    ( L − α ) ( D^n ( s(α) ) ) = D^{n−1} ( s(α) ) ,

i.e. ( L − α ) ( s(α, n) ) = s(α, n − 1). This proves the lemma for a = 1. The general case follows from this one by induction. ∎

The following theorem is not used in the rest of the paper.

3.6. Theorem. Let n ∈ N. For every sequence s ∈ S_M the terms of the sequence D^n (s) are

    ( D^n (s) )_i = ( i − n | n ) s_{i−n} .

In addition, for every α ∈ k the terms of the sequence s(α, n) are

    s(α, n)_i = ( i − n | n ) α^{i−n} .

Proof. Recall that s_i = ⟨ s , x^i ⟩ for any sequence s ∈ S_M. Together with Lemma 2.2 this fact implies that

    ( D^n (s) )_i = ⟨ D^n (s) , x^i ⟩ = ⟨ (δ^n)^⊥ (s) , x^i ⟩ = ⟨ s , δ^n (x^i) ⟩ .

Since δ^n (x^i) = ( i − n | n ) x^{i−n} by Lemma 1.5, we have

    ⟨ s , δ^n (x^i) ⟩ = ⟨ s , ( i − n | n ) x^{i−n} ⟩ = ( i − n | n ) ⟨ s , x^{i−n} ⟩ = ( i − n | n ) s_{i−n} .

The first part of the theorem follows. Let us apply the first part to s = s(α). We get

    s(α, n)_i = ( D^n ( s(α) ) )_i = ( i − n | n ) s(α)_{i−n} = ( i − n | n ) α^{i−n} .

This proves the second part of the theorem. ∎
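The explicit formula of Theorem 3.6 can be checked against Lemma 3.5 numerically. The sketch below (an illustration only; k = M = Z and the chosen values of α and n are assumptions made for this example) verifies that applying L − α to s(α, n) indeed produces s(α, n − 1).

```python
# Check of Theorem 3.6 against Lemma 3.5 over k = M = Z:
# s(alpha, n)_i = binom(i, n) * alpha^(i - n), and (L - alpha) s(alpha, n) = s(alpha, n - 1).
from math import comb

def s(alpha, n):
    return lambda i: comb(i, n) * alpha ** (i - n) if i >= n else 0

alpha, n = 3, 2
shifted = lambda i: s(alpha, n)(i + 1) - alpha * s(alpha, n)(i)   # ((L - alpha) s(alpha, n))_i
assert all(shifted(i) == s(alpha, n - 1)(i) for i in range(20))
print("(L - alpha) s(alpha, n) == s(alpha, n - 1) checked for i < 20")
```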

4. Endomorphisms and their eigenvalues

Representation of the polynomial algebra defined by an endomorphism. Let x be a variable, and let k[x] be the k-algebra of polynomials in x with coefficients in k. Let M be a k-module. The k-endomorphisms M → M form a k-algebra End M with the composition as the multiplication. For every k-endomorphism E : M → M and every a ∈ N we will denote by E^a the a-fold composition E ◦ E ◦ … ◦ E. As usual, we interpret the 0-fold composition E^0 as the identity endomorphism id ∈ End M.

For a k-module endomorphism E : M → M and a polynomial

(14)    f(x) = c_0 x^n + c_1 x^{n−1} + c_2 x^{n−2} + … + c_n ∈ k[x]

one can define an endomorphism f(E) : M → M by the formula

(15)    f(E) = c_0 E^n + c_1 E^{n−1} + c_2 E^{n−2} + … + c_n .

The map f(x) ↦ f(E) is a homomorphism k[x] → End M of k-algebras. This follows from the obvious identities x^a x^b = x^{a+b} and E^a ◦ E^b = E^{a+b}. This homomorphism defines a structure of a k[x]-module on M. Of course, this structure depends on E. We will denote by k[E] the image of the homomorphism f(x) ↦ f(E). Since k[x] is commutative, the image k[E] is a commutative subalgebra of End M.
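For a concrete feeling for (15), one can take M = Z² and let E be given by an integer matrix. The sketch below (an illustration only; the 2 × 2 matrix representation and the increasing-power coefficient list, which differs from the bookkeeping of (14), are assumptions made for this example) evaluates a polynomial at such an endomorphism.

```python
# Illustrative sketch of f(E) as in (15) for an endomorphism E of M = Z^2 given by
# a 2 x 2 integer matrix; here f is a list with f[i] the coefficient of x^i.

def mat_mul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(2)) for j in range(2)] for i in range(2)]

def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def evaluate(f, E):
    """f(E) = f[0] * id + f[1] * E + f[2] * E^2 + ..."""
    result = [[0, 0], [0, 0]]
    power = [[1, 0], [0, 1]]              # E^0 = id
    for c in f:
        result = mat_add(result, scale(c, power))
        power = mat_mul(power, E)
    return result

E = [[2, 1], [0, 3]]                      # an endomorphism with eigenvalues 2 and 3
print(evaluate([6, -5, 1], E))            # (E - 2)(E - 3) = E^2 - 5E + 6: the zero matrix
```

The zero matrix for f(x) = (x − 2)(x − 3) reflects the fact that Z² decomposes as the sum of the eigenmodules Ker(E − 2) and Ker(E − 3), the kind of decomposition studied in the rest of this section.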

Eigenvalues. Suppose that a k-module endomorphism E : M → M is fixed. Let α ∈ k. The kernel Ker(E − α) is called the eigenmodule of E corresponding to α and is denoted also by E_α. Clearly, E_α is a k-submodule of M. An element α ∈ k is called an eigenvalue of E if the kernel E_α = Ker(E − α) ≠ 0.

The set of elements v ∈ M such that (E − α)^i (v) = 0 for some i ∈ N is called the extended eigenmodule of E corresponding to α and is denoted by Nil(α). Clearly, Nil(α) is a k-submodule of M.

4.1. Lemma. Let α ∈ k. Then the following statements hold.
(i) The submodules E_α and Nil(α) are E-invariant.
(ii) E_α and Nil(α) are k[x]-submodules of M.
(iii) The submodule Nil(α) is non-zero if and only if α is an eigenvalue.

Proof. Let us prove (i), (ii) first. Note that (E − α)^i ◦ E = E ◦ (E − α)^i for every i ∈ N, because k[E] is a commutative subalgebra of End M. Therefore, if (E − α)^i (v) = 0, then

    (E − α)^i ( E(v) ) = ( (E − α)^i ◦ E ) (v) = ( E ◦ (E − α)^i ) (v) = E ( (E − α)^i (v) ) = 0 .

In the case i = 1 this implies that E(E_α) ⊂ E_α. In general, this implies that E( Ker (E − α)^i ) ⊂ Ker (E − α)^i, and hence E( Nil(α) ) ⊂ Nil(α). This proves (i), and (ii) immediately follows.

Finally, let us prove (iii). Suppose that v ≠ 0 and (E − α)^i (v) = 0. Let i be the smallest integer such that (E − α)^i (v) = 0. Note that i > 0 because v ≠ 0. Let w = (E − α)^{i−1} (v). Then w ≠ 0 and (E − α)(w) = 0. Therefore E_α = Ker(E − α) ≠ 0. This proves (iii). ∎

Torsion free modules. A k-module M is called torsion-free if α m = 0 implies that either α = 0 or m = 0, where α ∈ k and m ∈ M. Since k is assumed to be a ring without zero divisors, k^n is a torsion-free module for any non-zero n ∈ N. For the rest of this section we will assume that M is a torsion-free module.

4.2. Lemma. Let α_1, α_2, …, α_n be distinct eigenvalues of an endomorphism E : M → M. Let Ker_1, Ker_2, …, Ker_n be the corresponding eigenmodules, i.e. Ker_i = Ker(E − α_i) for each i = 1, 2, …, n. Then the sum of these eigenmodules is a direct sum, i.e. an element

    v ∈ Ker_1 + Ker_2 + … + Ker_n

admits only one presentation v = v_1 + v_2 + … + v_n with v_i ∈ Ker_i for all i = 1, 2, …, n.

Proof. It is sufficient to prove that if v_1 + v_2 + … + v_n = 0 and v_i ∈ Ker_i for all i, then v_1 = v_2 = … = v_n = 0. Suppose that v_1 + v_2 + … + v_n = 0, v_i ∈ Ker_i for all i, and not all v_i are equal to 0. Consider the minimal integer m such that

(16)    v_1 + v_2 + … + v_m = 0

for some elements v_i ∈ Ker_i such that v_m ≠ 0. Note that in this case v_i ≠ 0 also for some i ≤ m − 1, in view of (16). By applying E − α_m to (16), we get

    (E − α_m)(v_1) + … + (E − α_m)(v_{m−1}) + (E − α_m)(v_m) = 0 .

Since v_i ∈ Ker_i = Ker(E − α_i) and therefore E(v_i) = α_i v_i for all i, we see that

(17)    (α_1 − α_m) v_1 + … + (α_{m−1} − α_m) v_{m−1} + (α_m − α_m) v_m = 0 ,

(18)    (α_1 − α_m) v_1 + … + (α_{m−1} − α_m) v_{m−1} = 0 .

Since the eigenvalues α_i are distinct, α_i − α_m ≠ 0 for i ≤ m − 1. Since our module M is assumed to be torsion-free, this implies that (α_i − α_m) v_i ≠ 0 if i ≤ m − 1 and v_i ≠ 0. As we noted above, v_i ≠ 0 for some i ≤ m − 1. Therefore, the equality (18) contradicts the choice of m. This contradiction proves the lemma. ∎

4.3. Lemma. Let α_1, α_2, …, α_n be distinct eigenvalues of an endomorphism E : M → M. Let Nil_1, Nil_2, …, Nil_n be the corresponding extended eigenmodules, i.e. Nil_i = Nil(α_i) for i = 1, 2, …, n. Then the sum of these extended eigenmodules is a direct sum, i.e. an element

    v ∈ Nil_1 + Nil_2 + … + Nil_n

admits only one presentation v = v_1 + v_2 + … + v_n with v_i ∈ Nil_i for all i = 1, 2, …, n.

Proof. It is sufficient to prove that if v_1 + v_2 + … + v_n = 0 and v_i ∈ Nil_i for all i, then v_1 = v_2 = … = v_n = 0. Suppose that v_1 + v_2 + … + v_n = 0, v_i ∈ Nil_i for all i, and not all v_i are equal to 0.

The proof proceeds by replacing, in several steps (no more than n), the original elements v_i by new ones in such a way that eventually not only v_i ∈ Nil_i, but, moreover, v_i ∈ Ker_i = Ker(E − α_i), and still not all v_i are equal to 0. Obviously, this will contradict Lemma 4.2.

Let E_i = E − α_i for all i = 1, 2, …, n. For each i = 1, 2, …, n we have E_i^0 (v_i) = v_i and E_i^a (v_i) = 0 for some integer a ≥ 1. If v_i ≠ 0, then we define a_i as the largest integer a ≥ 0 such that E_i^a (v_i) ≠ 0. Then E_i^{a_i} (v_i) ≠ 0 and E_i^{a_i + 1} (v_i) = 0. In particular,

    (E − α_i) ( E_i^{a_i} (v_i) ) = E_i ( E_i^{a_i} (v_i) ) = E_i^{a_i + 1} (v_i) = 0 ,

and hence E_i^{a_i} (v_i) ∈ Ker_i. If v_i = 0, then we set a_i = 0 and E_i^{a_i} (v_i) ∈ Ker_i is still true.

Let us fix an integer k between 1 and n. Let w_i = E_k^{a_k} (v_i), where i = 1, 2, …, n. By applying E_k^{a_k} to v_1 + v_2 + … + v_n = 0, we conclude that w_1 + w_2 + … + w_n = 0. Note that since the submodules Nil_i are E-invariant by Lemma 4.1, w_i ∈ Nil_i for every i.

Claim 1. If v_i ≠ 0, then w_i ≠ 0.

Proof of Claim 1. If i = k and v_i = v_k ≠ 0, then w_i = w_k = E_k^{a_k} (v_k) ≠ 0 by the choice of a_k. Suppose that i ≠ k and v_i ≠ 0. Then

    E_i^{a_i} (w_i) = E_i^{a_i} ( E_k^{a_k} (v_i) ) = E_k^{a_k} ( E_i^{a_i} (v_i) ) .

But E_i^{a_i} (v_i) ∈ Ker_i and E_i^{a_i} (v_i) ≠ 0 by the choice of a_i. Since E acts on Ker_i as the multiplication by α_i, we have

    E_k^{a_k} ( E_i^{a_i} (v_i) ) = (E − α_k)^{a_k} ( E_i^{a_i} (v_i) ) = (α_i − α_k)^{a_k} ( E_i^{a_i} (v_i) ) .

Since α_i ≠ α_k and k is a ring without zero divisors, (α_i − α_k)^{a_k} ≠ 0. Since M is a torsion-free k-module and E_i^{a_i} (v_i) ≠ 0, this implies that (α_i − α_k)^{a_k} ( E_i^{a_i} (v_i) ) ≠ 0. It follows that

    E_i^{a_i} (w_i) = (α_i − α_k)^{a_k} ( E_i^{a_i} (v_i) ) ≠ 0 ,

and hence w_i ≠ 0. This completes the proof of the claim. ∎

Claim 2. If v_i ∈ Ker_i, then w_i ∈ Ker_i.

Proof of Claim 2. Suppose that v_i ∈ Ker_i, i.e. E_i (v_i) = 0. Since E_i = E − α_i and E_k = E − α_k obviously commute, it follows that E_i (w_i) = E_i ( E_k^{a_k} (v_i) ) = E_k^{a_k} ( E_i (v_i) ) = 0. This proves the claim. ∎

To sum up, we see that by applying E_k^{a_k} to the equality v_1 + v_2 + … + v_n = 0 with v_i ∈ Nil_i for all i we get another equality w_1 + w_2 + … + w_n = 0 such that for all i:

(i) w_i ∈ Nil_i ;
(ii) if v_i ≠ 0, then w_i ≠ 0;
(iii) if v_i ∈ Ker_i, then w_i ∈ Ker_i.

In addition, w_k = E_k^{a_k} (v_k) ∈ Ker_k even if v_k did not belong to the eigenmodule Ker_k. Therefore, we can take w_1, w_2, …, w_n as the new elements v_1, v_2, …, v_n, increasing the number of elements belonging to the corresponding eigenmodules by an appropriate choice of k (if some v_i did not belong to eigenmodules yet). It follows that by starting with the equality v_1 + v_2 + … + v_n = 0 and consecutively applying the endomorphisms E_k^{a_k} for k = 1, 2, …, n, we will eventually obtain the equality v_1 + v_2 + … + v_n = 0 for some new vectors v_i such that v_i ∈ Ker_i for all i, and still not all v_i are equal to 0. The contradiction with Lemma 4.2 completes the proof. ∎

4.4. Lemma. Let E : M → M be an endomorphism of M and let α be an eigenvalue of E. Suppose that v ∈ Nil(α), v ≠ 0. Let a ≥ 0 be the largest integer such that (E − α)^a (v) ≠ 0, and let v_i = (E − α)^i (v) for i = 0, 1, …, a. Then the homomorphism k^{a+1} → M defined by

    (x_0 , x_1 , …, x_a) ↦ x_0 v_0 + x_1 v_1 + … + x_a v_a

is an isomorphism onto its image. In particular, v_0, v_1, …, v_a are free generators of a free k-submodule of M.

Proof. It is sufficient to prove that our homomorphism is injective. In other terms, it is sufficient to prove that if

(19)    x_0 v_0 + x_1 v_1 + … + x_a v_a = 0

for some x_0, x_1, …, x_a ∈ k, then x_0 = x_1 = … = x_a = 0. Suppose that (19) holds and x_i ≠ 0 for some i. Let b ∈ N be the minimal integer with the property x_b ≠ 0. Let us apply (E − α)^{a−b} to (19). Note that if i > b, then

    (E − α)^{a−b} (v_i) = (E − α)^{a−b} ( (E − α)^i (v) ) = (E − α)^{a−b+i} (v) = 0

because a − b + i > a and (E − α)^n (v) = 0 for n > a by the choice of a. Therefore, the operator (E − α)^{a−b} takes the left hand side of (19) to

    x_b (E − α)^{a−b} (v_b) = x_b (E − α)^{a−b} ( (E − α)^b (v) ) = x_b (E − α)^a (v) ,

and hence the result of the application of (E − α)^{a−b} to (19) is

(20)    x_b (E − α)^a (v) = 0 .

But (E − α)^a (v) ≠ 0 by the choice of a, and x_b ≠ 0 by the choice of b. Since the module M is assumed to be torsion-free, these facts together with (20) lead to a contradiction. This contradiction shows that (19) may be true only if x_i = 0 for all i. ∎

5. Torsion modules and a property of free modules

Torsion modules. An element m ∈ M of a k-module M is called a torsion element if x m = 0 for some non-zero x ∈ k. A k-module M is called a torsion module if every element of M is a torsion element.

5.1. Lemma. Let n ∈ N. If M is a k-submodule of a k-module N and both M and N are isomorphic to k^n, then the quotient N/M is a torsion module.

Proof. Suppose that N/M is not a torsion module. Then there is an element v ∈ N/M such that α v ≠ 0 if α ≠ 0. For such a v the map α ↦ α v is an injective homomorphism of k-modules k → N/M. Let us lift v ∈ N/M to an element v_0 ∈ N, so that v is the image of v_0 under the canonical surjection N → N/M. Then the map α ↦ α v_0 is an injective homomorphism of k-modules k → N.

Clearly, if v_1, v_2, …, v_n is a basis of M (which exists because M is isomorphic to k^n), then v_0, v_1, …, v_n is a basis of k v_0 + M. Therefore, k v_0 + M is a submodule isomorphic to k^{n+1} of the module N isomorphic to k^n. In particular, there exists an injective k-homomorphism J : k^{n+1} → k^n.

Since k has no zero divisors, it can be embedded into its field of fractions, which we will denote by F. Moreover, the k-homomorphism J : k^{n+1} → k^n extends to an F-linear map F^{n+1} → F^n, which we will denote by J_F.

Claim. J_F is injective.

Proof of the claim. Suppose (y_0, y_1, …, y_n) ∈ F^{n+1} is non-zero and belongs to the kernel of J_F. Since F is the field of fractions of k, there is a non-zero element z ∈ k such that z y_0, z y_1, …, z y_n ∈ k. For such an element z ∈ k the (n + 1)-tuple (z y_0, z y_1, …, z y_n) belongs to k^{n+1}, and

    J(z y_0, z y_1, …, z y_n) = J_F (z y_0, z y_1, …, z y_n) = z J_F (y_0, y_1, …, y_n) = z · 0 = 0 .

Since F is the field of fractions of k, (y_0, y_1, …, y_n) ≠ 0 implies that the (n + 1)-tuple (z y_0, z y_1, …, z y_n) ≠ 0. At the same time this (n + 1)-tuple belongs to the kernel of J, in contradiction with the injectivity of J. The claim follows. ∎

As is well known, for a field F there are no injective F-linear maps F^{n+1} → F^n. The contradiction with the above claim proves that N/M is indeed a torsion module. ∎
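A concrete instance of Lemma 5.1 (an illustration only, with k = Z; the particular submodule is an assumption made for this example): take N = Z² and let M be the submodule generated by (2, 0) and (0, 3). Both are free of rank 2, and every element of N/M is killed by 6, so N/M is a torsion module.

```python
# Illustration of Lemma 5.1 with k = Z, N = Z^2 and M generated by (2, 0) and (0, 3):
# 6 * (a, b) = (6a, 6b) lies in M for every (a, b), so the quotient N/M is torsion.

def in_M(a, b):
    return a % 2 == 0 and b % 3 == 0

for a in range(-5, 6):
    for b in range(-5, 6):
        assert in_M(6 * a, 6 * b)
print("6 * v lies in M for every tested v, so N/M is a torsion module")
```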


6. Polynomials and their roots

6.1. Lemma. Let p(x) ∈ k[x] be a polynomial with leading coefficient 1, and let α ∈ k. Then α ∈ k is a root of p(x) if and only if

(21)    p(x) = (x − α) q(x)

for some polynomial q(x) ∈ k[x] with leading coefficient 1. If α is a root, then q(x) is uniquely determined by (21).

Proof. Suppose that p(x) = (x − α) q(x) and both p(x), q(x) have the leading coefficient 1. Then deg q(x) = deg p(x) − 1 and the polynomials p(x), q(x) have the form

    p(x) = x^n + c_1 x^{n−1} + … + c_{n−1} x + c_n ,
    q(x) = x^{n−1} + d_1 x^{n−2} + … + d_{n−2} x + d_{n−1} ,

where c_1, …, c_n, d_1, …, d_{n−1} ∈ k. Then

    (x − α) q(x) = (x − α) ( x^{n−1} + d_1 x^{n−2} + … + d_{n−2} x + d_{n−1} )
        = x^n + d_1 x^{n−1} + … + d_{n−2} x^2 + d_{n−1} x
          − α x^{n−1} − α d_1 x^{n−2} − … − α d_{n−2} x − α d_{n−1} .

It follows that p(x) = (x − α) q(x) if and only if

    c_1 = d_1 − α ,
    c_2 = d_2 − α d_1 ,
    … ,
    c_{n−1} = d_{n−1} − α d_{n−2} ,
    c_n = − α d_{n−1} ,

or, equivalently,

    d_1 = α + c_1 ,
    d_2 = α d_1 + c_2 ,
    … ,
    d_{n−1} = α d_{n−2} + c_{n−1} ,
    α d_{n−1} + c_n = 0 .

These equalities allow one to compute the coefficients d_1, d_2, …, d_{n−1} in terms of the coefficients c_1, c_2, …, c_{n−1}. Namely, d_1 = α + c_1 and

    d_i = α^i + c_1 α^{i−1} + c_2 α^{i−2} + … + c_{i−1} α + c_i

for 2 ≤ i ≤ n − 1. Therefore, the last equality α d_{n−1} + c_n = 0 holds if and only if

    α ( α^{n−1} + c_1 α^{n−2} + … + c_{n−2} α + c_{n−1} ) + c_n = 0 ,

i.e. if and only if p(α) = 0. The lemma follows. ∎
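The recursion for the d_i in the proof is just synthetic division, and it is easy to run over k = Z. The sketch below (an illustration only; the coefficient-list convention and the function name are assumptions made for this example) divides a monic polynomial by x − α and returns the leftover value α d_{n−1} + c_n, which by the proof equals p(α).

```python
# Illustrative sketch of the division in Lemma 6.1 over k = Z: for a monic
# p(x) = x^n + c_1 x^(n-1) + ... + c_n, compute d_1 = alpha + c_1, d_i = alpha*d_(i-1) + c_i;
# the final value alpha*d_(n-1) + c_n equals p(alpha).

def divide_by_linear(c, alpha):
    """c = [c_1, ..., c_n]; return ([d_1, ..., d_(n-1)], p(alpha))."""
    d = []
    prev = 1                              # the leading coefficient of p
    for coeff in c:
        d.append(alpha * prev + coeff)
        prev = d[-1]
    return d[:-1], d[-1]

# p(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
q, value = divide_by_linear([-6, 11, -6], 2)
print(q, value)    # [-4, 3] 0, i.e. q(x) = x^2 - 4x + 3 and p(2) = 0
```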

6.2. Corollary. Let p(x) ∈ k[x] be a polynomial with leading coefficient 1, and let α ∈ k. Then there is a number m ∈ N and a polynomial r(x) ∈ k[x] such that p(x) = (x − α)^m r(x) and α is not a root of r(x). The number m and the polynomial r(x) are uniquely determined by p(x) and α.

Proof. If p(α) ≠ 0, then, obviously, m = 0 and r(x) = p(x). If p(α) = 0, we can apply Lemma 6.1. If q(α) ≠ 0, then m = 1, r(x) = q(x) and we are done. If q(α) = 0, then we can apply Lemma 6.1 again. Eventually we will get a presentation p(x) = (x − α)^m r(x) such that r(α) ≠ 0. By consecutively applying the uniqueness part of Lemma 6.1, we see that (x − α)^{m−1} r(x), (x − α)^{m−2} r(x), …, and, eventually, m and r(x) are uniquely determined by p(x) and α. ∎

The multiplicity of a root. We will denote by deg p(x) the degree of the polynomial p(x). If α is a root of p(x), then the number m from Corollary 6.2 is called the multiplicity of the root α.

6.3. Corollary. Let p(x) ∈ k[x] be a polynomial with leading coefficient 1. The number k of distinct roots of p(x) is finite and k ≤ deg p(x). If α_1, α_2, …, α_k is the list of all distinct roots of p(x), and if µ_1, µ_2, …, µ_k are the respective multiplicities of these roots, then

(22)    p(x) = ( x − α_1 )^{µ_1} ( x − α_2 )^{µ_2} ⋯ ( x − α_k )^{µ_k} r(x) ,

where r(x) ∈ k[x] has no roots in k. The polynomial r(x) is uniquely determined by p(x).

Proof. By consecutively applying the existence part of Corollary 6.2, we see that a factorization of the form (22) exists. Similarly, the uniqueness of r(x) follows from the uniqueness part of Corollary 6.2. ∎

Polynomials with all roots in k. Again, let p(x) ∈ k[x] be a polynomial with leading coefficient 1. We say that p(x) has all roots in k if in the factorization (22) the polynomial r(x) = 1. In other words, p(x) has all roots in k if p(x) has the form

    p(x) = ( x − α_1 )^{µ_1} ( x − α_2 )^{µ_2} ⋯ ( x − α_k )^{µ_k}

for some α_1, α_2, …, α_k ∈ k and some non-zero µ_1, µ_2, …, µ_k ∈ N. Obviously, then deg p(x) = µ_1 + µ_2 + … + µ_k.

7. The main theorems

Let us fix for the rest of this section a polynomial

    p(x) = x^n + c_1 x^{n−1} + … + c_{n−1} x + c_n ∈ k[x]

with the leading coefficient 1. Consider the left shift operator L : S_k → S_k from Section 3. As it was explained in Section 4, the operator L defines a homomorphism of k-algebras k[x] → End S_k by the rule f(x) ↦ f(L). We are interested in the kernel Ker p(L).

7.1. Lemma. A sequence s ∈ S_k belongs to Ker p(L) if and only if

(23)    s_i + c_1 s_{i−1} + … + c_{n−1} s_{i−n+1} + c_n s_{i−n} = 0

for all i ∈ N, i ≥ n.

Proof. Let us compute the terms of p(L)(s), using Corollary 3.2 at the last step:

    ( p(L)(s) )_i = ( ( L^n + c_1 L^{n−1} + … + c_{n−1} L + c_n )(s) )_i
        = ( L^n (s) )_i + c_1 ( L^{n−1} (s) )_i + … + c_{n−1} ( L(s) )_i + c_n s_i
        = s_{i+n} + c_1 s_{i+n−1} + … + c_{n−1} s_{i+1} + c_n s_i .

This calculation shows that s ∈ Ker p(L) if and only if

(24)    s_{i+n} + c_1 s_{i+n−1} + … + c_{n−1} s_{i+1} + c_n s_i = 0

for all integers i ≥ 0. Clearly, (24) holds for all integers i ≥ 0 if and only if (23) holds for all integers i ≥ n. The lemma follows. ∎

Remark. Classically, a sequence s ∈ S_k is called recurrent if its terms satisfy (23) for all i ∈ N, i ≥ n, and the equation (23) is called a linear recurrence relation. This explains the title of the paper.
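To see the kernel Ker p(L) concretely, take k = Z and p(x) = (x − 2)²(x − 3) = x³ − 7x² + 16x − 12. The sketch below (an illustration only; the particular polynomial and the helper names are assumptions made for this example) generates an element of Ker p(L) from prescribed initial terms, in the spirit of Lemma 7.2 below, and checks that the sequences s(2, 0), s(2, 1) and s(3, 0) of Section 3 satisfy the recurrence (23), as Theorem 7.3 below predicts.

```python
# The recurrence (23) for p(x) = x^3 - 7x^2 + 16x - 12 = (x - 2)^2 (x - 3) over k = Z:
# s_i = 7 s_{i-1} - 16 s_{i-2} + 12 s_{i-3} for i >= 3.
from math import comb

c = [-7, 16, -12]                          # c_1, c_2, c_3

def extend(initial, length):
    """Generate a sequence in Ker p(L) from its first three terms (cf. Lemma 7.2)."""
    s = list(initial)
    while len(s) < length:
        s.append(-sum(cj * s[-1 - j] for j, cj in enumerate(c)))
    return s

def satisfies_recurrence(s):
    return all(s[i] + sum(cj * s[i - 1 - j] for j, cj in enumerate(c)) == 0
               for i in range(3, len(s)))

print(extend([0, 0, 1], 8))                # an element of Ker p(L) with prescribed start
# By Theorem 3.6: s(2,0)_i = 2^i, s(2,1)_i = i * 2^(i-1), s(3,0)_i = 3^i.
for seq in ([2 ** i for i in range(12)],
            [comb(i, 1) * 2 ** (i - 1) if i >= 1 else 0 for i in range(12)],
            [3 ** i for i in range(12)]):
    assert satisfies_recurrence(seq)
print("s(2,0), s(2,1) and s(3,0) satisfy the recurrence, as Theorem 7.3 predicts")
```

By Theorem 7.4 below these three sequences are free generators of a free submodule of Ker p(L); over a field (Corollary 7.6) such sequences generate the whole kernel.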

7.2. Lemma. The map F : Ker p(L) → k^n defined by

    F : s ↦ ( s_0 , s_1 , …, s_{n−1} ) ∈ k^n

is an isomorphism. In particular, Ker p(L) is a free k-module of rank n.

Proof. By Lemma 7.1 the kernel Ker p(L) is equal to the k-submodule of S_k consisting of sequences s satisfying the relation (23) for all i ∈ N, i ≥ n. Clearly, (23) allows one to compute each term s_i, i ≥ n, of s as a linear combination of the n immediately preceding terms s_{i−1}, s_{i−2}, …, s_{i−n} of s with coefficients −c_1, −c_2, …, −c_n independent of i. Therefore, such a sequence s is determined by its first n terms s_0, s_1, …, s_{n−1}. Moreover, these n terms can be prescribed arbitrarily. The lemma follows. ∎

7.3. Theorem. Suppose that α ∈ k is a root of p(x) of multiplicity µ. Then p(L)( s(α, a) ) = 0 for each a ∈ N, 0 ≤ a ≤ µ − 1, where s(α, a) are the sequences defined in Section 3, in the paragraph immediately preceding Lemma 3.4.

Proof. By Corollary 6.2, p(x) has the form p(x) = (x − α)^µ q(x). Therefore

(25)    p(L) = ( L − α )^µ q(L) = q(L) ( L − α )^µ .

If a ≤ µ − 1, then µ − 1 − a ≥ 0, and we can present ( L − α )^µ as the following product:

(26)    ( L − α )^µ = ( L − α )^{µ−1−a} ( L − α ) ( L − α )^a .

By Lemma 3.5,

(27)    ( L − α )^a ( s(α, a) ) = s(α, a − a) = s(α, 0) = s(α) ,

and by Lemma 3.4,

(28)    ( L − α ) ( s(α) ) = 0 .

By combining (26), (27), and (28), we see that

    ( L − α )^µ ( s(α, a) ) = 0 .

By combining the last equality with (25), we get

    p(L) ( s(α, a) ) = q(L) ( ( L − α )^µ ( s(α, a) ) ) = q(L)(0) = 0 .

The theorem follows. ∎

7.4. Theorem. Suppose that p(x) has all roots in k. Let k be the number of distinct roots of p(x), let α_1, α_2, …, α_k be these roots, and let µ_1, µ_2, …, µ_k be, respectively, the multiplicities of these roots. Then the sequences s(α_u, a), where 1 ≤ u ≤ k and 0 ≤ a ≤ µ_u − 1, are free generators of a free k-submodule of Ker p(L).


Proof. By Theorem 7.3, all these sequences belong to Ker p(L). In particular, they are generators of a k-submodule of Ker p(L). Let us prove that they are free generators.

Let 1 ≤ u ≤ k. By Lemma 3.5,

(29)    ( L − α_u )^{µ_u − 1} ( s(α_u, µ_u − 1) ) = s(α_u, (µ_u − 1) − (µ_u − 1)) = s(α_u, 0) = s(α_u) .

By Lemma 3.4, ( L − α_u ) ( s(α_u) ) = 0 and hence

    ( L − α_u )^{µ_u} ( s(α_u, µ_u − 1) ) = ( L − α_u ) ( s(α_u) ) = 0 .

In particular, s(α_u, µ_u − 1) belongs to the extended eigenmodule Nil(α_u) of the left shift L : S_k → S_k. In other terms, s(α_u, µ_u − 1) ∈ Nil(α_u). In addition, (29) together with the fact that s(α_u) ≠ 0 implies that µ_u − 1 is the largest integer a such that

    ( L − α_u )^a ( s(α_u, µ_u − 1) ) ≠ 0 .

Lemma 4.4 implies that the sequences ( L − α_u )^a ( s(α_u, µ_u − 1) ) for a = 0, 1, …, µ_u − 1 form a basis of a free submodule of Nil(α_u) ⊂ Ker p(L). Since

    ( L − α_u )^a ( s(α_u, µ_u − 1) ) = s(α_u, µ_u − 1 − a)

by Lemma 3.5, this implies that the sequences s(α_u, a) for a = 0, 1, …, µ_u − 1 form a basis of a free submodule of Nil(α_u) ⊂ Ker p(L). By combining this result with Lemma 4.3, we see that the sequences s(α_u, a) from the theorem form a basis of a free submodule of Ker p(L). This completes the proof. ∎

7.5. Theorem. Let S ⊂ Ker p(L) be the free k-module generated by the sequences s(α_u, a) from Theorem 7.4. Then the quotient k-module ( Ker p(L) ) / S is a torsion module.

Proof. Let n = deg p(x). By Lemma 7.2, Ker p(L) is a free module of rank n, i.e. it is isomorphic to k^n. Since n = µ_1 + … + µ_k, we have exactly n sequences s(α_u, a). By Theorem 7.4, they are free generators of S. In particular, S is also isomorphic to k^n. It remains to apply Lemma 5.1. ∎

7.6. Corollary. If k is a field, then S = Ker p(L).

Proof. A torsion module over a field is equal to 0. ∎

December 30, 2014
May 2, 2015 (minor edits)

http://nikolaivivanov.com
