Fixed Argument Pairings

Craig Costello and Douglas Stebila

Information Security Institute, Queensland University of Technology, GPO Box 2434, Brisbane QLD 4001, Australia
{craig.costello,stebila}@qut.edu.au
Abstract. A common scenario in many pairing-based cryptographic protocols is that one argument in the pairing is fixed as a long-term secret key or a constant parameter in the system. In these situations, the runtime of Miller's algorithm can be significantly reduced by storing precomputed values that depend on the fixed argument, prior to the input or existence of the second argument. In light of recent developments in pairing computation, we show that the computation of the Miller loop can be sped up by up to 37% if precomputation is employed, with our method being up to 18% faster than previous precomputation techniques.

Keywords: Pairings, Miller's algorithm, Tate pairing, ate pairing, precomputation.
1 Introduction
Boneh and Franklin were among the first to present a pairing-based cryptosystem in 2001, when they exploited the powerful bilinearity property of pairings to construct the first efficient and provably secure identity-based encryption (IBE) scheme [9], answering an open question first posed by Shamir in 1984 [42]. Boneh and Franklin's discovery, alongside Joux's one-round tripartite key agreement protocol [28], were major breakthroughs that gave rise to the now thriving field of pairing-based cryptography and a vast array of new pairing-based primitives that followed, including short signatures [11], hierarchical encryption [23], group signatures [8], ring signatures [10], identity-based key agreement schemes [43, 13, 32], an IBE scheme secure in the standard model [45], as well as generalizations of IBE such as attribute-based encryption [24, 30].

The computation of pairings via Miller's algorithm [34] was initially too slow for them to be considered practically viable, but this did not remain the case for long. Much research has concentrated on achieving pairings at high security levels that are efficient enough for industrial applications. Although use of the Weil pairing was initially proposed, it quickly became evident that the computation of the Tate pairing was much faster in practice [9, 21, 22, 3]. A decade after their introduction into cryptosystems, the computation of a pairing has accelerated from older implementations that took a few minutes [33] to current state-of-the-art implementations that take only a few milliseconds [25]. Many of the avenues for improving the computation speed of cryptographic pairings on elliptic curves have been somewhat exhausted:

– the Miller loop has been truncated to optimal lengths [2, 27, 31, 44, 26];
– improvements inside the Miller iterations, such as denominator elimination, have been thoroughly explored [3, 4];
– the choice of the groups G₁ and G₂ as the two eigenspaces of the Frobenius endomorphism is now standard [27, 18];
– there is now a flexible array of curve constructions that facilitate fast implementations [20];
– different elliptic curve representations and coordinate systems have been explored in efforts to reduce the field operations encountered when computing the Miller functions [1, 16, 17]; and
– many other improvements have become standard in optimal implementations, such as compressed pairings [37, 35], optimized field constructions [29, 6], fast hashing [38], and fast final exponentiation [39] techniques.

While these techniques have allowed for relatively fast implementations of pairings, there are still some possible improvements that are yet to be explored to their full potential. In this paper we explore one such technique: the implementation of pairings with one argument fixed, which allows for improvements in runtime performance based on precomputation. In the pairing e(R, S) of two points R and S on an elliptic curve, we call the argument R fixed if it is constant in terms of the corresponding protocol that employs the pairing; for example, R may be a long-term private key that is "paired" with ephemeral or dynamic values (different S values) at runtime.

Scott [36] was the first to comment on the possibility of exploiting the nature of a fixed argument R, suggesting to precompute all multiples of the point R required in the Miller loop. Scott et al. [40] took this idea further by also precomputing the gradients, λ_i, of each of the line functions that arise in the Miller iterations, since these are also values depending solely on the first argument R. Essentially, this means that the coefficients of the affine Miller functions have been computed in advance and are just "waiting" as indeterminate functions until S is known, so that they can be evaluated and used to update the Miller function. We adopt this same approach by performing all computations that are solely dependent on the fixed argument R, and storing these before the second argument S is available. Since R-dependent computations can be performed once only and stored in the long term, it is only the subsequent S-dependent computations that affect the runtime of the pairing algorithm.

The question that we address in this paper arises naturally: are there further R-dependent (pre)computations that can be done to reduce the workload encountered when the dynamic second argument S is known? In light of recent developments in pairing computation [14, 15], we address this question by considering another: is it computationally cheaper to evaluate a function before operating on its result, or can it be advantageous to operate on the indeterminate function before evaluating it? When neither of the pairing arguments is fixed, the results in [14] and [15] show that many cases of pairing implementations favour the latter. In the case of fixed argument pairings, this becomes even more evident when we observe that any computations on the indeterminate Miller functions are entirely R-dependent and can be done in advance. We show that for both the Tate and ate pairings, the Miller loop can be computed with between 25 and 37% fewer field multiplications (or squarings) if an optimal amount of precomputation is performed in the fixed argument, compared to the case when no precomputation is performed.

The rest of this paper is organized as follows. In Section 2, we present background information on computing pairings using Miller's algorithm. In Section 3, we separate the R-dependent precomputations from the S-dependent dynamic computations in preparation for Section 4, where we present our main technical innovation: we show how to merge multiple iterations of the loop in Miller's algorithm in the precomputation stage to reduce the cost of the S-dependent computations.
Section 5 presents the optimal number of iterations to merge for various security levels, and Section 6 lists a number of cryptosystems which can benefit from our techniques. We conclude in Section 7. In Appendix A we give example MAGMA code that implements our technique.
2 Preliminaries
Let E be an elliptic curve defined over a large prime field F_p, given by the short Weierstrass equation E : y² = x³ + ax + b, whose identity element is denoted by O. Let r be a large prime
and let E[r] be the group of r-torsion points on E. Let k > 1 be the smallest integer such that r | p^k − 1; we call k the embedding degree. Let π_p be the p-power Frobenius endomorphism on E. The two eigenspaces of π_p, restricted to the r-torsion on E, form two linearly independent groups of order r, written as G₁ = E[r] ∩ ker(π_p − [1]) and G₂ = E[r] ∩ ker(π_p − [p]). The group G₁ is defined over the base field F_p, whilst the group G₂ is defined over the full extension field F_{p^k}. A bilinear pairing e of two points R, S ∈ E can be computed as

    e(R, S) = f_{m,R}(S)^((p^k − 1)/r),
where m ∈ Z, and f_{m,R} is a function with divisor div(f_{m,R}) = m(R) − ([m]R) − (m − 1)(O). For our purposes, k will always be even, which allows us to write the function f_{m,R} as a polynomial in x and y, so that f_{m,R}(x, y) is evaluated at the coordinates of the point S when the pairing is computed. For the Tate pairing, the first argument R comes from G₁ and the second argument S comes from G₂, whilst the ate pairing takes R ∈ G₂ and S ∈ G₁. In either case, the function f_{m,R}(S) evaluates as an element of the finite field F_{p^k}, and this value is raised to the power (p^k − 1)/r in the "final exponentiation" stage. The function f_{m,R} is called the Miller function, since it is constructed in a double-and-add like fashion using Miller's algorithm, which is described in Algorithm 1.

Algorithm 1 Miller's affine double-and-add algorithm with denominator elimination
Input: R = (x_R, y_R), S = (x_S, y_S), m = (m_{l−1}, ..., m_1, m_0)_2.
Output: f_{m,R}(S) ← f.
 1: T ← R, f ← 1.
 2: for i from l − 2 to 0 do
 3:   Compute g(x, y) = y − y_T + λ(x_T − x), where λ is the gradient of the tangent line to T.
 4:   T ← [2]T = [2](x_T, y_T).
 5:   g ← g(x_S, y_S).
 6:   f ← f² · g.
 7:   if m_i ≠ 0 then
 8:     Compute g(x, y) = y − y_T + λ(x_T − x), where λ is the gradient of the line joining T and R.
 9:     T ← T + R.
10:     g ← g(x_S, y_S).
11:     f ← f · g.
12:   end if
13: end for
14: return f.
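For concreteness, the following minimal MAGMA sketch is a direct transcription of Algorithm 1 (the function name MillerLoop is ours, for illustration only, and is not part of the optimized code of Appendix A; it assumes the coordinates of R and S lie in a common field and that m is small enough relative to the order of R that no doubling or addition step involves the point at infinity):

function MillerLoop(R, S, m, a)
    // A minimal sketch of Algorithm 1: computes f_{m,R}(S) with denominator elimination.
    xR := R[1]; yR := R[2]; xS := S[1]; yS := S[2];
    xT := xR; yT := yR;
    f := Parent(xS) ! 1;
    B := IntegerToSequence(m, 2);            // B[#B] is the most significant bit
    for i := #B - 1 to 1 by -1 do
        lambda := (3*xT^2 + a)/(2*yT);       // gradient of the tangent at T
        g := yS - yT + lambda*(xT - xS);     // doubling line evaluated at S
        x3 := lambda^2 - 2*xT; y3 := lambda*(xT - x3) - yT;
        xT := x3; yT := y3;                  // T <- [2]T
        f := f^2 * g;
        if B[i] eq 1 then
            lambda := (yR - yT)/(xR - xT);   // gradient of the line through T and R
            g := yS - yT + lambda*(xT - xS); // addition line evaluated at S
            x3 := lambda^2 - xT - xR; y3 := lambda*(xT - x3) - yT;
            xT := x3; yT := y3;              // T <- T + R
            f := f * g;
        end if;
    end for;
    return f;
end function;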
If we were to discard lines 5 and 10 of Algorithm 1, the output would be an indeterminate function f_{m,R}(x, y) of degree m, rather than a field element f_{m,R}(x_S, y_S) ∈ F_{p^k}. The reason we include lines 5 and 10 and evaluate the intermediate functions in each iteration is that m is usually quite large (for example, no smaller than 2^20), and storing so many coefficients is infeasible; memory constraints force us to evaluate the g functions as we go. Modern pairing implementations make use of a twisted curve E′ of E to define a group G′₂ ⊂ E′ that is isomorphic to G₂ ⊂ E, but is defined over a smaller subfield F_{p^e} of F_{p^k}. We let ψ : E′ → E be the twisting isomorphism from E′ to E, so that ψ(G′₂) = G₂. A point P ∈ G₂ is usually written as P = P′ · α, where P′ ∈ G′₂ is defined over F_{p^e} and α is an algebraic element used in the twisting isomorphism (cf. [27, 17]). In both the Tate and ate pairings, such points in G₂ are often multiplied by elements in the base field F_p, and because of the representation of G₂ over the twisted curve, these multiplications are counted as e multiplications in the base field.
To count field multiplications and squarings across the different fields employed in pairing computations, we use m_i and s_i to denote a multiplication and a squaring, respectively, in the field F_{p^i}. We maintain generality (across both Tate- and ate-like pairings) by assuming that the first argument R is written as an element of F_{p^u} and that the second argument S is written as an element of F_{p^v}, where it is understood that the Tate pairing has (u, v) = (1, e) and that the ate pairing has (u, v) = (e, 1). In both cases, the argument defined over F_{p^k} is actually treated as an element of F_{p^e} using G′₂.
3 R-dependent versus S-dependent Computations
We begin by separating the R-dependent (fixed) computations from the S-dependent (dynamic) computations. Our goal is to minimize the computational complexity of the S-dependent computations.

When the first argument in the pairing, R, is a fixed or constant parameter in a protocol, Scott et al. [40] propose precomputing and storing all the multiples of R (the T values in Algorithm 1) and all of the gradients (the λ values) in each of the Miller lines (the functions g). This is essentially the same as writing each of the Miller lines as g(x, y) = y − λx − c, where c = y_T − λx_T is precomputed and stored alongside λ (see Algorithm 1). We prefer to precompute and store (λ, c) at each iteration, rather than (x_T, y_T, λ), since this saves an extra multiplication (λ by x_T) at runtime and only requires the storage of two values for each iteration, as well as somewhat simplifying the description of what the precomputation achieves. Namely, we do not store the multiples of the point R, since they are not necessarily required once S is input. Instead, we compute all of the R-dependent coefficients of the Miller line functions that are required, in complete preparation for the "arrival" of the argument S.

In this light, fixing one argument in the pairing allows us to split Miller's algorithm into two parts. The first part involves all of the R-dependent (pre)computations that can be performed in advance: computing a set of indeterminate Miller lines defined by (λ, c). The second part involves all of the S-dependent computations, namely those which cannot be performed until the argument S is known. We describe the R-dependent precomputations in Algorithm 2 and the S-dependent dynamic computations in Algorithm 3. For ease of exposition, we assume from here on that the Miller lines are of the form g(x, y) = y + λ̃x + c̃, instead of the usual g(x, y) = y − λx − c, by taking λ̃ = −λ and c̃ = −c, and make an abuse of notation by relabeling and writing g(x, y) = y + λx + c from now on. We use #DBL and #ADD in both algorithms to denote the number of doublings and additions, respectively, that occur in the run of Miller's algorithm. Clearly, #DBL = l − 1 (from Algorithm 1) and #ADD is equal to the number of non-zero bits in the binary representation of the loop parameter m (excluding the most significant bit). We also rewrite the binary representation of m from m = (m_{l−1}, ..., m_1, m_0)_2 to m = (m̃_0, m̃_1, ..., m̃_{#DBL−1}, m̃_{#DBL})_2, so that m̃_0 is now the most significant bit, and Miller's algorithm proceeds from m̃_1 to m̃_{#DBL}; we relabel so that m = (m_0, m_1, ..., m_{#DBL−1}, m_{#DBL})_2 from now on.

It is important to note that we are solely focussed on minimizing the computational complexity of the part of the algorithm that is S-dependent. We are assuming that the R-dependent precomputations are carried out well in advance on a platform that is not too restricted (within reason) in computational time. For example, in pairings where both arguments are dynamic, one would never compute the Miller point operations and the Miller line functions in affine coordinates, as this involves costly field inversions. Such pairings always resort to avoiding these inversions by using projective coordinates, but in these cases the Miller lines that arise are almost always (cf. [17]) of the form g(x, y) = g_x · x + g_y · y + g_0.
Employing projective coordinates would certainly reduce the computational time spent performing the R-dependent precomputations, but this would produce slightly more complicated Miller lines (the extra coefficient g_y in front of y), and would inevitably slow down the dynamic computations involving S. In the theme of this paper then, we opt for affine coordinates throughout, with the ultimate goal of minimizing the S-dependent runtime.
Algorithm 2 R-dependent precomputations
Input: R = (x_R, y_R), m = (m_0, m_1, ..., m_{#DBL−1}, m_{#DBL})_2.
Output: G_DBL = {(λ_1, c_1), (λ_2, c_2), ..., (λ_{#DBL}, c_{#DBL})} and G_ADD = {(λ′_1, c′_1), (λ′_2, c′_2), ..., (λ′_{#ADD}, c′_{#ADD})}.
 1: T ← R, G_DBL ← {∅}, G_ADD ← {∅}.
 2: for i from 1 to #DBL do
 3:   Compute λ_i and c_i, such that y + λ_i x + c_i is the line tangent to T.
 4:   T ← [2]T.
 5:   Append (λ_i, c_i) to G_DBL.
 6:   if m_i ≠ 0 then
 7:     Compute λ′_i and c′_i, such that y + λ′_i x + c′_i is the line joining T and R.
 8:     T ← T + R.
 9:     Append (λ′_i, c′_i) to G_ADD.
10:   end if
11: end for
12: return G_DBL, G_ADD.
Algorithm 3 S-dependent computations
Input: S = (x_S, y_S), m = (m_0, m_1, ..., m_{#DBL−1}, m_{#DBL})_2, G_DBL and G_ADD (from Algorithm 2).
Output: f_{m,R}(S) ← f.
 1: f ← 1, count_ADD ← 1.
 2: for i from 1 to #DBL do
 3:   Compute g ← (y_S + λ_i x_S + c_i).
 4:   f ← f² · g.
 5:   if m_i ≠ 0 then
 6:     Compute g ← (y_S + λ′_{count_ADD} x_S + c′_{count_ADD}).
 7:     count_ADD ← count_ADD + 1.
 8:     f ← f · g.
 9:   end if
10: end for
11: return f.
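To make the split concrete, the following MAGMA sketch (the function name EvaluateStoredLines is ours, hypothetical) carries out the S-dependent stage of Algorithm 3 using the [λ, c] pairs produced by the PrecomputationsR function of Appendix A, following that function's bit-ordering conventions (in particular, no addition line is evaluated on the final iteration):

function EvaluateStoredLines(S, r, doublingLines, additionLines)
    // A sketch of Algorithm 3: evaluates the stored (lambda, c) pairs from
    // Algorithm 2 at the dynamic argument S. Assumes doublingLines and
    // additionLines are sequences of [lambda, c] pairs, e.g. as produced by
    // the PrecomputationsR function in Appendix A with loop parameter r.
    xS := S[1]; yS := S[2];
    f := Parent(xS) ! 1;
    countADD := 1;
    B := IntegerToSequence(r, 2);                 // B[#B] is the most significant bit
    for i := 1 to #B - 1 do
        g := yS + doublingLines[i][1]*xS + doublingLines[i][2];
        f := f^2 * g;
        if (i lt #B - 1) and (B[#B - i] eq 1) then
            g := yS + additionLines[countADD][1]*xS + additionLines[countADD][2];
            countADD +:= 1;
            f := f * g;
        end if;
    end for;
    return f;
end function;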
We do point out, however, that the methods in this paper are entirely compatible with an implementation where the precomputation complexity might still be somewhat crucial, in which case the precomputation could be performed in projective coordinates. In such cases, one would split the algorithm and the analogous computational cost analysis as described in [15]. In Table 1, we present the S-dependent dynamic computational complexity in a typical iteration of Algorithm 3, ignoring the cost of the precomputations in Algorithm 2.

| Iteration | R-dependent values | Storage required | S-dependent computations | Dynamic costs |
|---|---|---|---|---|
| 1 | λ_1, c_1 | 2 × F_{p^u} elements | λ_1·x_S, f ← f², f ← f·g | em₁ + s_k + m̃_k |
| 2 | λ_2, c_2 | 2 × F_{p^u} elements | λ_2·x_S, f ← f², f ← f·g | em₁ + s_k + m̃_k |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| i | λ_i, c_i | 2 × F_{p^u} elements | λ_i·x_S, f ← f², f ← f·g | em₁ + s_k + m̃_k |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| #DBL | λ_{#DBL}, c_{#DBL} | 2 × F_{p^u} elements | λ_{#DBL}·x_S, f ← f², f ← f·g | em₁ + s_k + m̃_k |

Table 1. The complexity of S-dependent computations and storage requirements for Miller's double-and-add routine.
Table 1 also includes the storage requirements for the Miller lines in each iteration. We do not include the extra storage and S-dependent computations required for the addition steps in our analysis, since additions only occur a small number of times in state-of-the-art implementations that make use of well-chosen, low Hamming-weight loop parameters. In each iteration, the multiplication of the Miller function f by the update g is counted as m̃_k, where m̃_k is actually less than a general m_k, since the updates g are sparse. The complexity of m̃_k compared with m_k depends on the degree of the twist and the nature of the field construction (see [15] for full details). The R-dependent precomputations described above are somewhat natural in the context of fixed argument pairings. In the next section, we investigate whether more precomputation can be done to reduce the overall S-dependent complexity (the total of all the entries in the "Dynamic costs" column).
4 Further Precomputations: Merging n Iterations at a Time
It was shown very recently [14, 15] that speedups in pairing computations can be achieved if the Miller lines for consecutive iterations are combined before they are evaluated at the second argument of the pairing. In [15], speedups were achieved by developing a general algorithm for merging n consecutive iterations. This involves multiplying n Miller lines together, each of which is raised to a different exponent depending on how many field squarings it would otherwise encounter in the standard Miller routine, given as

    G_n(x, y) = ∏_{i=1}^{n} g_i(x, y)^(2^(n−i)).    (1)
The technique in [14] is much the same, except the formulas for the above line product were presented as (slightly faster) explicit formulas, depending on the shape of the curve employed. Our approach is analogous to that of [15], in that we derive an algorithm to give G_n(x, y) in its general form. We note two important differences between the results herein and those in [15]. Firstly, the Miller lines in this paper are given in affine form, and so the general product in (1) will have a slightly different form. Secondly, the only computational costs we are concerned with are the dynamic S-dependent costs. This means that the (potentially very expensive) computational costs associated with building up the combined products in (1) can be ignored when determining the optimal number of lines to combine.

We start the discussion by determining the nature of the function G_n(x, y), since the G_n(x, y) products will be evaluated at S, and this is the first S-dependent cost incurred in each of the n-at-a-time iterations. We reiterate that the cost associated with building the indeterminate G_n(x, y) functions is ignored, since the coefficients of these functions are solely R-dependent. We assume that G_n(x, y) is reduced modulo the curve equation y² = h(x), so that the result will always be (at most) linear in y, given as

    G_n(x, y) = f_n(x) + g_n(x)·y,    (2)
(where f_n(x) and g_n(x) are not to be confused with the f and g functions described in the algorithms in the previous sections). The following lemma can be used to determine the exact degrees of f_n(x) and g_n(x) for all n.

Lemma 1. Let y² = h(x) define an elliptic curve, where h(x) is a monic polynomial with deg(h) = 3. Let n consecutive affine Miller doubling updates be given by g_1(x, y) = y + λ_1 x + c_1, ..., g_n(x, y) = y + λ_n x + c_n. If G_n is defined as in Equation (1), then G_n(x, y) takes the form G_n(x, y) = f_n(x) + g_n(x)·y,
where g_n(x) is a monic polynomial such that deg(g_n) = deg(f_n) − 1.

Proof. When n = 1, we have G_1(x, y) = g_1(x, y) = y + λ_1 x + c_1 = f_1(x) + g_1(x)·y, where f_1(x) = λ_1 x + c_1 and g_1(x) = 1, so that deg(g_1) = deg(f_1) − 1 and g_1(x) is trivially monic. For induction, assume G_k = f_k(x) + g_k(x)·y, where deg(g_k) = deg(f_k) − 1 and g_k is monic. G_{k+1} is computed as

    G_{k+1} = G_k² · g_{k+1}
            = (f_k(x) + g_k(x)·y)² · (y + λ_{k+1} x + c_{k+1})
            = (f_k(x)² + 2·f_k(x)·g_k(x)·y + g_k(x)²·h(x)) · (y + λ_{k+1} x + c_{k+1})
            = f_{k+1}(x) + g_{k+1}(x)·y,

where f_{k+1}(x) = 2·f_k(x)·g_k(x)·h(x) + (f_k(x)² + g_k(x)²·h(x))·(λ_{k+1} x + c_{k+1}) and g_{k+1}(x) = f_k(x)² + g_k(x)²·h(x) + 2·f_k(x)·g_k(x)·(λ_{k+1} x + c_{k+1}). The degree of f_{k+1}(x) is easily seen to be deg(f_{k+1}) = deg(f_k) + deg(g_k) + deg(h) = deg(f_k) + (deg(f_k) − 1) + 3 = 2·deg(f_k) + 2, and the degree of g_{k+1}(x) is 2·deg(g_k) + deg(h) = 2·deg(f_k) + 1, so that deg(g_{k+1}) = deg(f_{k+1}) − 1. Lastly, g_k(x)²·h(x) is the only expression contributing to g_{k+1}(x) whose degree is the same as the degree of g_{k+1}(x), and g_k(x)²·h(x) is clearly monic (since both g_k(x) and h(x) are monic), so g_{k+1}(x) is also monic. □

To determine the degrees of f_n and g_n, we couple the initial values deg(f_1) = 1 and deg(g_1) = 0 with the conditions deg(f_{k+1}) = 2·deg(f_k) + 2 and deg(g_{k+1}) = 2·deg(f_k) + 1; hence

    deg(f_n) = 3·(2^(n−1) − 1) + 1   and   deg(g_n) = 3·(2^(n−1) − 1),    (3)
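To illustrate the lemma concretely, the following MAGMA sketch (with hypothetical toy values for p, a, b and the (λ, c) pairs, chosen only for illustration) builds G_n symbolically as a pair (f(x), g(x)) representing f(x) + g(x)·y, using the squaring and line-multiplication rules from the proof; for the two lines chosen it finishes with deg(f_2) = 4 and deg(g_2) = 3, agreeing with (3):

p := 103; Fp := GF(p); a := Fp!1; b := Fp!3;   // toy curve parameters (illustrative only)
Px<x> := PolynomialRing(Fp);
h := x^3 + a*x + b;
lines := [ [Fp!2, Fp!5], [Fp!7, Fp!11] ];      // hypothetical (lambda, c) pairs for n = 2
f := Px ! 1; g := Px ! 0;                      // start from the constant function 1
for lc in lines do
    f, g := Explode([f^2 + g^2*h, 2*f*g]);     // square: (f + g*y)^2 reduced via y^2 = h(x)
    ell := lc[1]*x + lc[2];
    f, g := Explode([g*h + f*ell, f + g*ell]); // multiply by the line y + lambda*x + c
end for;
assert Degree(f) eq 4 and Degree(g) eq 3;      // matches Equation (3) for n = 2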
agreeing with the analogous result for projective coordinates in [15, Eq. (10)]. Since we are combining n iterations into one, the R-dependent precomputation stage of the pairing algorithm will now involve two sub-algorithms. Firstly, Algorithm 2 will precompute the (λ, c) and (λ′, c′) pairs as usual. Then, another sub-algorithm will be called to combine these pairs (which define lines g(x, y) = y + λx + c), combining n at a time into products of the form in (2), the degrees of which are described in (3). Since this sub-algorithm deals with n of the ⌊log₂(m)⌋ standard Miller iterations at a time, it will involve ⌊log_{2^n}(m)⌋ iterations. For exactly the same reason, the modified S-dependent evaluation stage of the pairing algorithm will now also involve ⌊log_{2^n}(m)⌋ iterations.

We now turn to determining the cost of each of these dynamic iterations. Each of the G_n^(i) takes the form G_n^(i)(x, y) = f_n^(i)(x) + g_n^(i)(x)·y, containing deg(f_n) + 1 + deg(g_n) = 2·deg(f_n) = 6·(2^(n−1) − 1) + 2 non-trivial coefficients, since g_n is monic. Thus, for each 1 ≤ i ≤ ⌊log_{2^n}(m)⌋, we must store 6·(2^(n−1) − 1) + 2 elements in F_{p^u}. Apart from the constant term of f_n^(i)(x), every one of these 6·(2^(n−1) − 1) + 2 non-trivial coefficients is multiplied by a product of the coordinates of S, each multiplication of which costs em₁. Thus, the cost of evaluating G_n^(i) at S is [6·(2^(n−1) − 1) + 1]em₁. We summarize the S-dependent "per-iteration" costs in Table 2.
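As a quick sanity check of these counts, take n = 3:

    deg(f_3) = 3·(2² − 1) + 1 = 10,   deg(g_3) = 9,

so each merged iteration stores (10 + 1) + 9 − 1 = 20 = 6·(2² − 1) + 2 coefficients in F_{p^u} (the −1 because g_3 is monic), and evaluating G_3^(i) at S costs 19·em₁ = [6·(2² − 1) + 1]em₁.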
| Iterations | R-dependent values | Storage required | S-dependent computations | Dynamic costs |
|---|---|---|---|---|
| 1 to n | coefficients of f_n^(1) and g_n^(1) | 6·(2^(n−1) − 1) + 2 F_{p^u} elements | G(S) (i.e. f_n^(1)(S) and g_n^(1)(S)), f ← f^(2^n), f ← f·G(S) | [6·(2^(n−1) − 1) + 1]em₁ + n·s_k + m_k |
| n+1 to 2n | coefficients of f_n^(2) and g_n^(2) | 6·(2^(n−1) − 1) + 2 F_{p^u} elements | G(S) (i.e. f_n^(2)(S) and g_n^(2)(S)), f ← f^(2^n), f ← f·G(S) | [6·(2^(n−1) − 1) + 1]em₁ + n·s_k + m_k |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| in+1 to (i+1)n | coefficients of f_n^(i) and g_n^(i) | 6·(2^(n−1) − 1) + 2 F_{p^u} elements | G(S) (i.e. f_n^(i)(S) and g_n^(i)(S)), f ← f^(2^n), f ← f·G(S) | [6·(2^(n−1) − 1) + 1]em₁ + n·s_k + m_k |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |

Table 2. The complexity of S-dependent computations and storage requirements for n combined iterations.
Importantly, we note that each of the n-at-a-time iterations will also involve n squarings of the Miller function f, as well as one (rather than n) multiplication of f by G(S) (see [14, 15]). For n > 1, this multiplication becomes a general full extension field multiplication m_k, rather than the m̃_k that is reported for n = 1, since G(S) is no longer sparse.

Each of the ⌊log_{2^n}(m)⌋ iterations requires the computation of f_n^(i)(x_S) and g_n^(i)(x_S)·y_S. Thus, each iterate requires the same products x_S^i and x_S^j·y_S, where 2 ≤ i ≤ deg(f_n) and 1 ≤ j ≤ deg(g_n), and these products, which are dependent on the dynamic input, are precomputed before the first iteration. We use the term precomputed here carefully, since these S-dependent precomputations are not to be confused with the R-dependent precomputations that do not contribute to the complexity at runtime. Where possible, we would like to use field squarings to determine the x_S^i and x_S^j·y_S terms, rather than the slightly more expensive field multiplications. We can compute the x_S^i terms (which range from x_S² to x_S^deg(f_n)) with 2 ≤ i ≤ 2^⌊log₂(deg(f_n))⌋ using one field squaring each (cf. [14, §5.2]). The remaining x_S^i terms, where i > 2^⌊log₂(deg(f_n))⌋, are computed using (deg(f_n) − 2^⌊log₂(deg(f_n))⌋) field multiplications. These multiplications and squarings occur in the field F_{p^v}, so that the total dynamic cost of obtaining x_S^i for 2 ≤ i ≤ deg(f_n) is (2^⌊log₂(deg(f_n))⌋ − 1)s_v + (deg(f_n) − 2^⌊log₂(deg(f_n))⌋)m_v. To compute the x_S^j·y_S products, where 1 ≤ j ≤ deg(g_n), we require deg(g_n) multiplications¹, since we already have all of the x_S^j. Each of these multiplications occurs in F_{p^v}, so that the total cost of the S-dependent precomputation is given by adding deg(g_n)·m_v = (deg(f_n) − 1)·m_v to the previous cost, giving

    (2^⌊log₂(deg(f_n))⌋ − 1)s_v + (2·deg(f_n) − 2^⌊log₂(deg(f_n))⌋ − 1)m_v.

Substituting deg(f_n) = 3·(2^(n−1) − 1) + 1 (from (3)) into the above expression requires the evaluation of 2^⌊log₂(3·(2^(n−1)−1)+1)⌋. To simplify this expression we rewrite the index as ⌊log₂(2^n + 2^(n−1) − 2)⌋ = n, so that the total number of S-dependent computations simplifies to

    (2^n − 1)s_v + (2^(n+1) − 5)m_v.    (4)

¹ This is the cheapest way to compute these products, as trying to use field squarings to form the products would require us to compute x_S^j·y_S = ((x_S^j + y_S)² − x_S^(2j) − y_S²)/2; even though we have y_S² via the curve equation, the squaring (x_S^j + y_S)² would require a full extension field squaring.
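As a quick check of (4), take n = 3 again: deg(f_3) = 10 and 2^⌊log₂(10)⌋ = 2³ = 8, so the cost above becomes

    (8 − 1)s_v + (2·10 − 8 − 1)m_v = 7s_v + 11m_v = (2³ − 1)s_v + (2⁴ − 5)m_v.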
From Table 2, each of the iterations costs [6·(2^(n−1) − 1) + 1]em₁ + n·s_k + m_k. Summing this cost over the ⌊log_{2^n}(m)⌋ iterations, and adding the S-dependent precomputation in (4), gives the total S-dependent computational complexity as

    ⌊log_{2^n}(m)⌋ · ([6·(2^(n−1) − 1) + 1]em₁ + n·s_k + m_k) + (2^n − 1)s_v + (2^(n+1) − 5)m_v.    (5)

We summarize the total storage and the total costs for all n in Table 3. Once more, we reiterate that the total complexity is independent of the field containing the first argument, F_{p^u}, and is only dependent on the field containing the second argument, F_{p^v}. Interestingly, this means that even if the ate and Tate pairings had equal loop lengths, a fixed argument ate pairing (v = 1) will still perform faster than a fixed argument Tate pairing (v = e) in general. As far as storage goes, the ate pairing will require storing values in the field F_{p^e}, whilst the Tate pairing will store values in the base field F_p. However, because of the shorter loop lengths achieved in the ate pairing, the total storage needed in the ate pairing may still end up being less than for the Tate pairing. Unlike the speedups achieved in [14] and [15], which mainly benefit the Tate pairing, it is clear that our techniques will also speed up the ate pairing when a fixed argument can be exploited. In the next section, we use the complexities in Table 3 to determine the optimal amount of precomputation for implementations over various embedding degrees.

| n | Total storage required | Total complexity (field multiplications) |
|---|---|---|
| n = 1 | ⌊log₂(m)⌋ · 2 elements in F_{p^u} | ⌊log₂(m)⌋ · [em₁ + s_k + m̃_k] |
| n ≥ 2 | ⌊log_{2^n}(m)⌋ · [6·(2^(n−1) − 1) + 2] elements in F_{p^u} | ⌊log_{2^n}(m)⌋ · ([6·(2^(n−1) − 1) + 1]em₁ + n·s_k + m_k) + (2^n − 1)s_v + (2^(n+1) − 5)m_v |

Table 3. The total storage requirements and S-dependent complexity for a fixed argument pairing.
5 Optimising n
With the ultimate goal of minimizing the dynamic computations incurred in a pairing with a fixed argument, we use the total complexities for the Miller loop in Table 3 to determine the best value of n for both the Tate and ate pairings at different security levels. In Table 4, we summarize our results for a variety of common security levels (taken from [20]) and embedding degrees, for both the Tate and ate pairings. At each security level, we list:

– the optimal² length of the Miller loop (m_Tate and m_ate, respectively) for both pairings (taken from [17]);
– the optimal value of n, the number of iterations to merge, based on the analysis in Section 4;
– the overall cost of the S-dependent dynamic Miller loop doubling operations in terms of base field (F_p) multiplications, assuming that s₁ = 0.8m₁, and counting multiplications in fields with extension degrees of the form k = 2^i·3^j as m_k = 3^i·5^j·m₁ (cf. [29]) (a sketch of this cost evaluation is given below, after the footnote);
– the percentage reductions in Miller loop operations between an optimal-n implementation (this paper) and (i) the previous precomputation methods [40] (which correspond to n = 1 herein), and (ii) no precomputation, but with optimal delayed multiplications [14].

² These loop lengths are commonly m_Tate = log(r)/2 and m_ate = log(r)/φ(k), corresponding to the twisted ate [27] and optimal ate [44] pairings respectively.
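The following MAGMA sketch (the helper names ExtCost and OptimalN are ours, hypothetical) evaluates the Table 3 totals in units of base field multiplications and returns the cheapest n; it is an illustrative model only, since it approximates the sparse m̃_k by a full m_k and takes s_k = m_k and s_v = 0.8·m_v as simplifying assumptions, so its outputs approximate, but will not exactly reproduce, the Table 4 entries:

function ExtCost(d)
    // cost of one multiplication in F_{p^d} in units of m1, assuming d = 2^i*3^j
    // and m_d = 3^i*5^j*m1 (the Section 5 assumption, cf. [29])
    if d eq 1 then return 1; end if;
    return &*[ (q[1] eq 2 select 3 else 5)^q[2] : q in Factorization(d) ];
end function;

function OptimalN(logm, k, e, v : maxn := 6)
    // logm: Miller loop length; e: degree of the G2' subfield;
    // v: field degree of the second argument (v = e for Tate, v = 1 for ate)
    mk := ExtCost(k); sk := mk;      // assumption: s_k = m_k, mtilde_k ~ m_k
    mv := ExtCost(v); sv := 0.8*mv;  // assumption: squaring = 0.8 multiplication
    best := 0; bestn := 0;
    for n := 1 to maxn do
        if n eq 1 then
            cost := logm*(e + sk + mk);                // n = 1 row of Table 3
        else
            cost := Floor(logm/n)*((6*(2^(n-1) - 1) + 1)*e + n*sk + mk)
                    + (2^n - 1)*sv + (2^(n+1) - 5)*mv; // n >= 2 row of Table 3
        end if;
        if bestn eq 0 or cost lt best then best := cost; bestn := n; end if;
    end for;
    return bestn, best;
end function;

For instance, OptimalN(128, 12, 2, 2) would model a k = 12 Tate pairing with a sextic twist (e = 2, v = e), while OptimalN(64, 12, 2, 1) would model the corresponding ate pairing (v = 1).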
We point out that these percentage speedups are based on the computation of the Miller loop only, and do not take into account the fixed cost of the final exponentiation, so that the relative speedups for the entire pairing computation will be less than those reported in Table 4.

| Security | r (bits) | k | Best ρ | F_p (bits) | F_{p^e} (bits) | F_{p^k} (bits) | m_Tate | Tate (G₁ × G₂): n, #m₁ | Tate: % speedup | m_ate | ate (G₂ × G₁): n, #m₁ | ate: % speedup |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 80 | 160 | 6 | 2.000 | 320 | 320 | 1920 | 160 | 2, 3686 | [7.8, 37.1]% | 80 | 2, 1846 | [7.7, 37.0]% |
| 80 | 160 | 8 | 1.500 | 240 | 480 | 1920 | 120 | 2, 5069 | [11.2, 30.8]% | 120 | 2, 5058 | [11.4, 30.9]% |
| 112 | 224 | 12 | 1.000 | 224 | 448 | 2688 | 112 | 3, 7308 | [11.8, 29.5]% | 56 | 3, 3646 | [12.0, 29.7]% |
| 112 | 224 | 16 | 1.250 | 280 | 1120 | 4480 | 56 | 2, 6730 | [14.6, 25.9]% | 28 | 2, 3346 | [15.1, 26.3]% |
| 128 | 256 | 12 | 1.000 | 256 | 512 | 3072 | 128 | 3, 8263 | [12.7, 30.3]% | 64 | 2, 4198 | [11.3, 29.2]% |
| 128 | 256 | 16 | 1.250 | 320 | 1280 | 4096 | 64 | 2, 7684 | [14.7, 26.0]% | 32 | 2, 3823 | [15.1, 26.3]% |
| 128 | 256 | 18 | 1.333 | 342 | 1026 | 4608 | 86 | 3, 9131 | [13.6, 28.5]% | 43 | 3, 4697 | [11.1, 26.5]% |
| 192 | 384 | 18 | 1.333 | 512 | 1536 | 6912 | 128 | 3, 13449 | [14.2, 29.3]% | 64 | 3, 6881 | [12.5, 27.6]% |
| 192 | 384 | 24 | 1.250 | 478 | 1912 | 9216 | 96 | 3, 17270 | [18.2, 30.4]% | 48 | 3, 8577 | [18.7, 30.9]% |
| 256 | 512 | 32 | 1.125 | 576 | 4608 | 16384 | 64 | 3, 21969 | [17.9, 25.7]% | 32 | 3, 10777 | [19.5, 27.1]% |
| 256 | 512 | 36 | 1.167 | 598 | 3588 | 18432 | 86 | 3, 25740 | [18.2, 29.5]% | 43 | 3, 13202 | [16.1, 27.7]% |

Table 4. The optimal number of iterations to merge and the resulting S-dependent base field operation counts for a variety of security levels and embedding degrees.
Fig. 1. S-dependent costs for various n values for the Tate pairing. (a) Tate pairing multiplication vs. storage costs with r = 256 and k = 12 for various n. (b) Tate pairing multiplication costs with different r and k for various n.
In Figure 1(a), we show the operation count (in terms of number of base field multiplications) for the entire S-dependent dynamic Miller loop doubling operations in a Tate pairing based on the analysis in Section 4, as well as the precomputation storage costs in terms of number of base field elements, for various amounts of precomputation; n = 0 corresponds to no precomputation, while n ≥ 1 corresponds to the complexity from Table 3. The optimal amount of precomputation for this case occurs at n = 3, at which point the precomputation storage costs are 860 base field elements, which is not too prohibitive for many applications. Figure 1(b) shows the base field multiplication costs for the S-dependent dynamic Miller loop doubling operations in a Tate pairing for a variety of subgroup sizes and embedding degrees based
on the amount (n) of precomputation. With the exception of the r = 160 case, the optimal n is 3, meaning that for those cases we should use precomputation to merge 3 iterations of the Miller loop at a time, as described in Section 4.
6 Applications
There are many pairing-based cryptosystems that can benefit from precomputation when one of the arguments is fixed. In Table 5, we list some pairing-based cryptosystems that have fixed arguments and hence can benefit from the improvements in this paper. In some cases, such as the Boneh-Franklin identity-based encryption scheme [9] or the Chen-Kudla identity-based key agreement protocol [13], one party computes a pairing where the first argument is fixed while the other party computes a pairing where the second argument is fixed; here, our technique can only be applied to speed up one party's pairing computation. In other cases, such as the McCullagh-Barreto identity-based key agreement protocol [32], the two parties employing the cryptosystem can both benefit, because they each compute pairings where the fixed value appears in the same argument of the pairing. Our speed-up is also applicable to attribute-based encryption schemes such as those of Goyal et al. [24, §4] and Lewko et al. [30, §2], which perform a large number of pairings (one or two for each attribute used in decryption), as in these schemes the second arguments in all these pairing computations are long-term values (fixed across multiple runs of the protocol, though not all identical within a single run of the protocol).

| Scheme | | |
|---|---|---|
| Public key encryption | Encryption | Decryption |
| Boyen-Mei-Waters [12] | no pairings | 1 pairing; fixed in 2nd arg. |
| Identity-based encryption | Encryption | Decryption |
| Boneh-Franklin [9] | 1 pairing; fixed in 2nd arg. | 1 pairing; fixed in 1st arg. |
| Boneh-Boyen [7] | no pairings | 1 pairing; fixed in 2nd arg. |
| Waters [45] | no pairings | 2 pairings; both fixed in 2nd arg. |
| Attribute-based encryption | Encryption | Decryption |
| GPSW [24, §4] | no pairings | ≤ #(attributes) pairings; all fixed in 1st arg. |
| LOSTW [30, §2] | no pairings | ≤ 2 × #(attributes) pairings; all fixed in 2nd arg. |
| Identity-based signatures | Signing | Verification |
| Waters [45] | no pairings | 2 pairings; 1 fixed in 2nd arg. |
| Identity-based key agreement | Initiator | Responder |
| Smart-1 [43] | 2 pairings; 1 fixed in 1st arg., 1 fixed in 2nd arg. | 2 pairings; 1 fixed in 1st arg., 1 fixed in 2nd arg. |
| Chen-Kudla [13] | 1 pairing; fixed in 1st arg. | 1 pairing; fixed in 2nd arg. |
| McCullagh-Barreto [32] | 1 pairing; fixed in 2nd arg. | 1 pairing; fixed in 2nd arg. |

Table 5. Fixed arguments in various pairing-based cryptosystems.

7 Conclusions
We have shown how using precomputation to merge iterations of the Miller loop in Tate and ate pairings can reduce the cost of the dynamic running time when computing a pairing with one fixed argument
and one dynamic argument. This reduces the runtime by 20 to 37% compared to a pairing computation with no precomputation, and by up to 18% compared to previous precomputation techniques. While the precomputation stage is somewhat expensive compared to the cost of a single pairing computation, it can still be run quite quickly (in a few seconds or less) on modern computers, and the amount of precomputation storage required is not prohibitive for many settings. Given the wide variety of pairing-based cryptosystems in which one argument (say, a long-term private key or system parameter) is fixed across many protocol runs, we believe our techniques have wide applicability.
References

1. Christophe Arene, Tanja Lange, Michael Naehrig, and Christophe Ritzenthaler. Faster pairing computation. Cryptology ePrint Archive, Report 2009/155, 2009. http://eprint.iacr.org/.
2. Paulo S. L. M. Barreto, Steven D. Galbraith, Colm O'Eigeartaigh, and Michael Scott. Efficient pairing computation on supersingular abelian varieties. Des. Codes Cryptography, 42(3):239–271, 2007.
3. Paulo S. L. M. Barreto, Hae Yong Kim, Ben Lynn, and Michael Scott. Efficient algorithms for pairing-based cryptosystems. In Moti Yung, editor, CRYPTO, volume 2442 of Lecture Notes in Computer Science, pages 354–368. Springer, 2002.
4. Paulo S. L. M. Barreto, Ben Lynn, and Michael Scott. Efficient implementation of pairing-based cryptosystems. J. Cryptology, 17(4):321–334, 2004.
5. Paulo S. L. M. Barreto and Michael Naehrig. Pairing-friendly elliptic curves of prime order. In Bart Preneel and Stafford E. Tavares, editors, Selected Areas in Cryptography, volume 3897 of Lecture Notes in Computer Science, pages 319–331. Springer, 2005.
6. Naomi Benger and Michael Scott. Constructing tower extensions for the implementation of pairing-based cryptography. In WAIFI 2010, Lecture Notes in Computer Science. Springer, 2010. To appear.
7. Dan Boneh and Xavier Boyen. Efficient selective-ID secure identity-based encryption without random oracles. In Christian Cachin and Jan Camenisch, editors, Advances in Cryptology – Proc. EUROCRYPT 2004, volume 3027 of LNCS, pages 223–238. Springer, 2004.
8. Dan Boneh, Xavier Boyen, and Hovav Shacham. Short group signatures. In Franklin [19], pages 41–55.
9. Dan Boneh and Matt Franklin. Identity-based encryption from the Weil pairing. In Joe Kilian, editor, Advances in Cryptology – Proc. CRYPTO 2001, volume 2139 of LNCS, pages 213–229. Springer, 2001.
10. Dan Boneh, Craig Gentry, Ben Lynn, and Hovav Shacham. Aggregate and verifiably encrypted signatures from bilinear maps. In Eli Biham, editor, EUROCRYPT, volume 2656 of Lecture Notes in Computer Science, pages 416–432. Springer, 2003.
11. Dan Boneh, Ben Lynn, and Hovav Shacham. Short signatures from the Weil pairing. J. Cryptology, 17(4):297–319, 2004.
12. Xavier Boyen, Qixiang Mei, and Brent Waters. Direct chosen ciphertext security from identity-based techniques. In Catherine Meadows and Paul Syverson, editors, Proc. 12th ACM Conference on Computer and Communications Security (CCS), pages 320–329. ACM, 2005.
13. Liqun Chen and Caroline Kudla. Identity based authenticated key agreement protocols from pairings. In Proceedings 16th IEEE Computer Security Foundations Workshop (CSFW-16), pages 219–233. IEEE, 2003.
14. Craig Costello, Colin Boyd, Juan Manuel González Nieto, and Kenneth Koon-Ho Wong. Avoiding full extension field arithmetic in pairing computations. In AFRICACRYPT 2010, Lecture Notes in Computer Science. Springer, 2010. To appear.
15. Craig Costello, Colin Boyd, Juan Manuel González Nieto, and Kenneth Koon-Ho Wong. Delaying mismatched field multiplications in pairing computations. In WAIFI 2010, Lecture Notes in Computer Science. Springer, 2010. To appear.
16. Craig Costello, Huseyin Hisil, Colin Boyd, Juan Manuel González Nieto, and Kenneth Koon-Ho Wong. Faster pairings on special Weierstrass curves. In Shacham and Waters [41], pages 89–101.
17. Craig Costello, Tanja Lange, and Michael Naehrig. Faster pairing computations on curves with high-degree twists. In PKC 2010, Lecture Notes in Computer Science. Springer, 2010. To appear.
18. Augusto Jun Devegili, Michael Scott, and Ricardo Dahab. Implementing cryptographic pairings over Barreto-Naehrig curves. In Tsuyoshi Takagi, Tatsuaki Okamoto, Eiji Okamoto, and Takeshi Okamoto, editors, Pairing, volume 4575 of Lecture Notes in Computer Science, pages 197–207. Springer, 2007.
19. Matthew K. Franklin, editor. Advances in Cryptology – CRYPTO 2004, 24th Annual International Cryptology Conference, Santa Barbara, California, USA, August 15–19, 2004, Proceedings, volume 3152 of Lecture Notes in Computer Science. Springer, 2004.
20. David Freeman, Michael Scott, and Edlyn Teske. A taxonomy of pairing-friendly elliptic curves. J. Cryptology, 23(2):224–280, 2010.
21. Steven D. Galbraith. Supersingular curves in cryptography. In Colin Boyd, editor, ASIACRYPT, volume 2248 of Lecture Notes in Computer Science, pages 495–513. Springer, 2001.
22. Steven D. Galbraith, Keith Harrison, and David Soldera. Implementing the Tate pairing. In Claus Fieker and David R. Kohel, editors, ANTS, volume 2369 of Lecture Notes in Computer Science, pages 324–337. Springer, 2002.
23. Craig Gentry and Alice Silverberg. Hierarchical ID-based cryptography. In Yuliang Zheng, editor, ASIACRYPT, volume 2501 of Lecture Notes in Computer Science, pages 548–566. Springer, 2002.
24. Vipul Goyal, Omkant Pandey, Amit Sahai, and Brent Waters. Attribute-based encryption for fine-grained access control of encrypted data. In Rebecca Wright, Sabrina De Capitani de Vimercati, and Vitaly Shmatikov, editors, Proc. 13th ACM Conference on Computer and Communications Security (CCS), pages 89–98. ACM, 2006.
25. Darrel Hankerson, Alfred J. Menezes, and Michael Scott. Software implementation of pairings. In Identity-Based Cryptography, pages 188–206, 2009.
26. Florian Hess. Pairing lattices. In Steven D. Galbraith and Kenneth G. Paterson, editors, Pairing, volume 5209 of Lecture Notes in Computer Science, pages 18–38. Springer, 2008.
27. Florian Hess, Nigel P. Smart, and Frederik Vercauteren. The eta pairing revisited. IEEE Transactions on Information Theory, 52(10):4595–4602, 2006.
28. Antoine Joux. A one round protocol for tripartite Diffie-Hellman. In Wieb Bosma, editor, ANTS, volume 1838 of Lecture Notes in Computer Science, pages 385–394. Springer, 2000.
29. Neal Koblitz and Alfred Menezes. Pairing-based cryptography at high security levels. In Nigel P. Smart, editor, IMA Int. Conf., volume 3796 of Lecture Notes in Computer Science, pages 13–36. Springer, 2005.
30. Allison Lewko, Tatsuaki Okamoto, Amit Sahai, Katsuyuki Takashima, and Brent Waters. Fully secure functional encryption: Attribute-based encryption and (hierarchical) inner product encryption, 2010. To appear at EUROCRYPT 2010.
31. Seiichi Matsuda, Naoki Kanayama, Florian Hess, and Eiji Okamoto. Optimised versions of the ate and twisted ate pairings. In Steven D. Galbraith, editor, IMA Int. Conf., volume 4887 of Lecture Notes in Computer Science, pages 302–312. Springer, 2007.
32. Noel McCullagh and Paulo S. L. M. Barreto. A new two-party identity-based authenticated key agreement. In Alfred J. Menezes, editor, Topics in Cryptology – CT-RSA 2005, volume 3376 of LNCS, pages 262–274. Springer, 2005.
33. Alfred J. Menezes. Elliptic Curve Public Key Cryptosystems. Kluwer Academic Publishers, 1993.
34. Victor S. Miller. The Weil pairing, and its efficient calculation. Journal of Cryptology, 17:235–261, 2004.
35. Michael Naehrig, Paulo S. L. M. Barreto, and Peter Schwabe. On compressible pairings and their computation. In Serge Vaudenay, editor, AFRICACRYPT, volume 5023 of Lecture Notes in Computer Science, pages 371–388. Springer, 2008.
36. Michael Scott. Implementing cryptographic pairings. In Tsuyoshi Takagi, Tatsuaki Okamoto, and Eiji Okamoto, editors, Pairing, Lecture Notes in Computer Science, pages 177–196. Springer, 2007.
37. Michael Scott and Paulo S. L. M. Barreto. Compressed pairings. In Franklin [19], pages 140–156.
38. Michael Scott, Naomi Benger, Manuel Charlemagne, Luis J. Dominguez Perez, and Ezekiel J. Kachisa. Fast hashing to G2 on pairing-friendly curves. In Shacham and Waters [41], pages 102–113.
39. Michael Scott, Naomi Benger, Manuel Charlemagne, Luis J. Dominguez Perez, and Ezekiel J. Kachisa. On the final exponentiation for calculating pairings on ordinary elliptic curves. In Shacham and Waters [41], pages 78–88.
40. Michael Scott, Neil Costigan, and Wesam Abdulwahab. Implementing cryptographic pairings on smartcards. In Louis Goubin and Mitsuru Matsui, editors, CHES, volume 4249 of Lecture Notes in Computer Science, pages 134–147. Springer, 2006.
41. Hovav Shacham and Brent Waters, editors. Pairing-Based Cryptography – Pairing 2009, Third International Conference, Palo Alto, CA, USA, August 12–14, 2009, Proceedings, volume 5671 of Lecture Notes in Computer Science. Springer, 2009.
42. Adi Shamir. Identity-based cryptosystems and signature schemes. In CRYPTO, pages 47–53, 1984.
43. Nigel P. Smart. Identity-based authenticated key agreement protocol based on Weil pairing. Electronics Letters, 38(13):630–632, June 2002.
44. Frederik Vercauteren. Optimal pairings. IEEE Transactions on Information Theory, 56(1):455–461, 2010.
45. Brent Waters. Efficient identity-based encryption without random oracles. In Ronald Cramer, editor, Advances in Cryptology – Proc. EUROCRYPT 2005, volume 3494 of LNCS, pages 114–127. Springer, 2005.
A Magma
The following MAGMA code is included for illustrative purposes. The code can be copied (from top to bottom, ignoring the LaTeX comments in between) directly into MAGMA. We provide an example run of the Tate pairing using a 224-bit BN curve
(taken from [5]), although the code will work for both the Tate and ate pairings using any parameters following the descriptions in Section 2. The code can serve as a platform to build proper implementations that employ optimum precomputation for fixed argument pairings. We give brief descriptions of each subalgorithm and condense the code due to space considerations. The operation counts given by the cumulative counters herein will not exactly match up with those given in Table 4, since that table ignored additions and did not take into account the remaining iterations. The functions MillerDoubling and MillerAddition use the schoolbook affine formulas for point doubling/addition and Miller line computations, and output lines of the form g = y + lambda*x + c.

function MillerDoubling(xT, yT, a)
    //Output is (x3,y3)=[2](xT,yT) and (lambda,c) such that g=y+lambda*x+c is the affine Miller doubling line
    lambda:=(3*xT^2+a)/(2*yT); //Gradient of tangent
    c:=lambda*xT-yT;
    x3:=lambda^2-2*xT;
    y3:=c-lambda*x3;
    lambda:=-lambda;
    return x3, y3, lambda, c;
end function;

function MillerAddition(xT, yT, x2, y2)
    //Output is (x3,y3)=(xT,yT)+(x2,y2) and (lambda,c) such that g=y+lambda*x+c is the affine Miller addition line
    lambda:=(y2-yT)/(x2-xT); //Gradient of line intersecting the two points
    c:=lambda*xT-yT;
    x3:=lambda^2-xT-x2;
    y3:=c-lambda*x3;
    lambda:=-lambda;
    return x3, y3, lambda, c;
end function;
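As a quick sanity check of these two functions (a hypothetical usage example with toy parameters, not part of the original code), one can verify in MAGMA that MillerDoubling returns the coordinates of [2]R:

p:=103; Fp:=GF(p); a:=Fp!1; b:=Fp!3; // toy curve y^2 = x^3 + x + 3 (illustrative only)
E:=EllipticCurve([a,b]);
R:=Random(E);
if R[3] ne 0 and 2*R ne E!0 then     // skip the identity and 2-torsion points
    x3,y3,lambda,c:=MillerDoubling(R[1],R[2],a);
    assert E![x3,y3] eq 2*R;         // the doubling formulas agree with the group law
end if;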
The functions SquareFunctionMonic and SquareFunction square a function of the form f(x) + g(x)y, depending on whether g is monic or not. These functions are optimized in terms of minimal field multiplications and squarings. The output is a function of the form f'(x) + g'(x)y (i.e. two arrays representing the coefficients of f' and g').

function SquareFunctionMonic(fArray, gArray, a, b)
    // Function assumes input f(x)+g(x)y and computes f'(x)+g'(x)y=(f(x)+g(x)y)^2 where g is monic
    fSize:=#fArray; gSize:=#gArray; //Get sizes
    fSquaresArray:=[0: i in [1..fSize]]; //Arrays containing squares of f and g coefficients
    gSquaresArray:=[0: i in [1..gSize-1]];
    for i:=1 to fSize do
        fSquaresArray[i]:=fArray[i]^2; //Compute f squares
    end for;
    for i:=1 to (gSize-1) do
        gSquaresArray[i]:=gArray[i]^2; //Compute g squares
    end for;
    //Initialize arrays with entries becoming coefficients of f(x)^2, g(x)^2 and f(x)g(x)
    fSquaredSize:=2*fSize-1;
    gSquaredSize:=2*gSize-1;
    fgProductSize:=fSize+gSize-1;
    fSquaredArray:=[0: i in [1..fSquaredSize]];
    gSquaredArray:=[0: i in [1..gSquaredSize]];
    fgProductArray:=[0: i in [1..fgProductSize]];
    for i:=1 to fSize do //Compute 2*f[i]*f[j] for i not j
        for j:=1 to (i-1) by 1 do
            fSquaredArray[i+j-1] +:= (fArray[i]+fArray[j])^2-fSquaresArray[i]-fSquaresArray[j];
        end for;
    end for;
    for i:=1 to fSize do //Insert squared terms (i.e. f[i]^2)
        fSquaredArray[2*i-1]+:=fSquaresArray[i];
    end for;
    for i:=1 to (gSize-1) do //Compute 2*g[i]*g[j] for i not j where g is monic
        for j:=1 to (i-1) by 1 do
            gSquaredArray[i+j-1] +:= (gArray[i]+gArray[j])^2-gSquaresArray[i]-gSquaresArray[j];
        end for;
    end for;
    for j:=1 to (gSize-1) by 1 do //When i=gSize then gArray[i]=1 since g is monic
        gSquaredArray[gSize+j-1] +:= 2*gArray[j];
    end for;
    for i:=1 to (gSize-1) do //Insert squared terms (i.e. g[i]^2)
        gSquaredArray[2*i-1]+:=gSquaresArray[i];
    end for;
    gSquaredArray[2*gSize-1]+:=1; //Since g is monic
    for i:=1 to fSize do //Compute 2*f[i]*g[j] for i not j where g is monic (fgProduct always has coefficient 2 in updates)
        for j:=1 to (gSize-1) do
            fgProductArray[i+j-1]+:= (fArray[i]+gArray[j])^2-fSquaresArray[i]-gSquaresArray[j];
        end for;
    end for;
    for i:=1 to fSize do //Since g is monic
        fgProductArray[i+gSize-1]+:=2*fArray[i];
    end for;
    fNewSize:= Max(2*fSize-1, 2+2*gSize); //Set output sizes
    gNewSize:= fSize+gSize-1;
    fNewArray:=[0: i in [1..fNewSize]]; //Initialize
    gNewArray:=[0: i in [1..gNewSize]];
    for i:=1 to fSquaredSize do //fNew takes each of the f(x)^2 terms
        fNewArray[i]+:=fSquaredArray[i];
    end for;
    for i:=1 to gSquaredSize do //Multiply h(x) by g^2 (where h(x)=x^3+ax+b)
        fNewArray[i]+:=b*gSquaredArray[i];
        fNewArray[i+1]+:=a*gSquaredArray[i];
        fNewArray[i+3]+:=gSquaredArray[i];
    end for;
    for i:=1 to fgProductSize do //gNew takes each of the f(x)*g(x) terms
        gNewArray[i]+:=fgProductArray[i];
    end for;
    return fNewArray, gNewArray;
end function;

function SquareFunction(fArray, gArray, a, b)
    // Function assumes input f(x)+g(x)y and computes f'(x)+g'(x)y=(f(x)+g(x)y)^2 where g is NOT monic
    fSize:=#fArray; gSize:=#gArray; //Get sizes
    fSquaresArray:=[0: i in [1..fSize]]; //Arrays containing squares of f and g coefficients
    gSquaresArray:=[0: i in [1..gSize]];
    for i:=1 to fSize do
        fSquaresArray[i]:=fArray[i]^2; //Compute f squares
    end for;
    for i:=1 to gSize do
        gSquaresArray[i]:=gArray[i]^2; //Compute g squares
    end for;
    //Initialize arrays with entries becoming coefficients of f(x)^2, g(x)^2 and f(x)g(x)
    fSquaredSize:=2*fSize-1;
    gSquaredSize:=2*gSize-1;
    fgProductSize:=fSize+gSize-1;
    fSquaredArray:=[0: i in [1..fSquaredSize]];
    gSquaredArray:=[0: i in [1..gSquaredSize]];
    fgProductArray:=[0: i in [1..fgProductSize]];
    for i:=1 to fSize do //Compute 2*f[i]*f[j] for i not j
        for j:=1 to (i-1) by 1 do
            fSquaredArray[i+j-1] +:= (fArray[i]+fArray[j])^2-fSquaresArray[i]-fSquaresArray[j];
        end for;
    end for;
    for i:=1 to fSize do //Insert squared terms (i.e. f[i]^2)
        fSquaredArray[2*i-1]+:=fSquaresArray[i];
    end for;
    for i:=1 to gSize do //Compute 2*g[i]*g[j] for i not j
        for j:=1 to (i-1) by 1 do
            gSquaredArray[i+j-1] +:= (gArray[i]+gArray[j])^2-gSquaresArray[i]-gSquaresArray[j];
        end for;
    end for;
    for i:=1 to gSize do //Insert squared terms (i.e. g[i]^2)
        gSquaredArray[2*i-1]+:=gSquaresArray[i];
    end for;
    for i:=1 to fSize do //Compute 2*f[i]*g[j] (fgProduct always has coefficient 2 in updates)
        for j:=1 to gSize do
            fgProductArray[i+j-1]+:= (fArray[i]+gArray[j])^2-fSquaresArray[i]-gSquaresArray[j];
        end for;
    end for;
    fNewSize:= Max(2*fSize-1, 2+2*gSize); //Set output sizes
    gNewSize:= fSize+gSize-1;
    fNewArray:=[0: i in [1..fNewSize]]; //Initialize
    gNewArray:=[0: i in [1..gNewSize]];
    for i:=1 to fSquaredSize do //fNew takes each of the f(x)^2 terms
        fNewArray[i]+:=fSquaredArray[i];
    end for;
    for i:=1 to gSquaredSize do //Multiply h(x) by g^2 (where h(x)=x^3+ax+b)
        fNewArray[i]+:=b*gSquaredArray[i];
        fNewArray[i+1]+:=a*gSquaredArray[i];
        fNewArray[i+3]+:=gSquaredArray[i];
    end for;
    for i:=1 to fgProductSize do //gNew takes each of the f(x)*g(x) terms
        gNewArray[i]+:=fgProductArray[i];
    end for;
    return fNewArray, gNewArray;
end function;
The functions MultiplyByLineMonic and MultiplyByLine multiply a function of the form f(x) + g(x)y by a Miller line g = y + λx + c. The output is a function of the form f'(x) + g'(x)y (arrays of the coefficients of f' and g').

function MultiplyByLineMonic(fArray, gArray, a, b, lambda, c)
    // Function assumes input f(x)+g(x)y and line = (y+lambda*x+c) and computes
    // f'(x)+g'(x)y = (f(x)+g(x)y)*(y+lambda*x+c) where g is monic
    fSize:=#fArray; gSize:=#gArray; //Get sizes
    fNewSize:=Max(fSize+1, gSize+3); //Set output sizes
    gNewSize:=Max(gSize+1, fSize);
    fNewArray:=[0: i in [1..fNewSize]]; //Initialize
    gNewArray:=[0: i in [1..gNewSize]];
    for i:=1 to fSize do
        fNewArray[i]+:=c*fArray[i]; // Multiply f(x) by (lambda*x+c) and insert into f'(x)
        fNewArray[i+1]+:=lambda*fArray[i];
        gNewArray[i]+:=fArray[i]; // Put f(x) into g'(x)
    end for;
    for i:=1 to (gSize-1) do
        fNewArray[i]+:=b*gArray[i]; // Multiply g(x) by h(x) and insert into f'(x)
        fNewArray[i+1]+:=a*gArray[i];
        fNewArray[i+3]+:=gArray[i];
        gNewArray[i]+:=c*gArray[i]; // Multiply g(x) by (lambda*x+c) and insert into g'(x)
        gNewArray[i+1]+:=lambda*gArray[i];
    end for;
    fNewArray[gSize]+:=b*gArray[gSize]; //computations saved because g is monic
    fNewArray[gSize+1]+:=a*gArray[gSize];
    fNewArray[gSize+3]+:=gArray[gSize];
    gNewArray[gSize]+:=c*gArray[gSize];
    gNewArray[gSize+1]+:=lambda*gArray[gSize];
    return fNewArray, gNewArray;
end function;

function MultiplyByLine(fArray, gArray, a, b, lambda, c)
    // Function assumes input f(x)+g(x)y and line = (y+lambda*x+c) and computes
    // f'(x)+g'(x)y = (f(x)+g(x)y)*(y+lambda*x+c) where g is NOT monic
    fSize:=#fArray; //Get sizes
    gSize:=#gArray;
    fNewSize:=Max(fSize+1, gSize+3); //Set output sizes
    gNewSize:=Max(gSize+1, fSize);
    fNewArray:=[0: i in [1..fNewSize]]; //Initialize
    gNewArray:=[0: i in [1..gNewSize]];
    for i:=1 to fSize do
        fNewArray[i]+:=c*fArray[i]; // Multiply f(x) by (lambda*x+c) and insert into f'(x)
        fNewArray[i+1]+:=lambda*fArray[i];
        gNewArray[i]+:=fArray[i]; // Put f(x) into g'(x)
    end for;
    for i:=1 to gSize do
        fNewArray[i]+:=b*gArray[i]; // Multiply g(x) by h(x) and insert into f'(x)
        fNewArray[i+1]+:=a*gArray[i];
        fNewArray[i+3]+:=gArray[i];
        gNewArray[i]+:=c*gArray[i]; // Multiply g(x) by (lambda*x+c) and insert into g'(x)
        gNewArray[i+1]+:=lambda*gArray[i];
    end for;
    return fNewArray, gNewArray;
end function;
The function PrecomputationsR corresponds to Algorithm 2. It takes R as input and returns two arrays containing the Miller lines (defined by λ, c) at each iteration, for both doubling and addition.

function PrecomputationsR(R, r, n, a)
    // Standard Miller's algorithm in affine form (stores (lambda,c) for each Doubling and Addition step)
    // Only dependent on first argument R.
    // Outputs arrays of [lambda,c]'s for doublings and [lambda,c]'s for additions
    x1 := R[1]; y1 := R[2];
    x2 := R[1]; y2 := R[2];
    B := IntegerToSequence(r, 2);
    doublingLines:=[];
    additionLines:=[];
    for i:=#B-1 to 2 by -1 do
        x1,y1,lambda,c := MillerDoubling(x1,y1,a); // Miller doubling
        doublingLines:=Append(doublingLines, [lambda,c]);
        if B[i] eq 1 then
            x1,y1,lambda,c := MillerAddition(x1, y1, x2, y2); // Miller addition
            additionLines:=Append(additionLines, [lambda,c]);
        end if;
    end for;
    x1,y1,lambda,c := MillerDoubling(x1,y1,a); //Final iteration only requires doubling since the addition line is = 1
    doublingLines:=Append(doublingLines,[lambda,c]);
    return doublingLines, additionLines;
end function;
The function PrecomputationProducts is the sub-algorithm that takes n iterations' worth of Miller lines at a time and combines them into one polynomial product of the form f(x) + g(x)y. The coefficients at each stage form two arrays (f and g), and each of these arrays is stored in a larger array containing the values from all iterations.

function PrecomputationProducts(doublingLines, additionLines, r, n, a, b)
    // Function combines n lots of consecutive [lambda, c] values into functions of the form f(x)+g(x)y
    // Main outputs are doublingLineProducts and additionLineProducts which are arrays of these combined functions
    // additionIndexes keeps track of which "n-at-a-time" iterations incurred an addition
    B := IntegerToSequence(r, 2); // B as a bitstring
    numberCombinedIterations:=Integers()!Floor((#B-2)/n); // First bit of B doesn't count (always 1), and last bit of B doesn't count (no add)
    outsideIterations:=[0..(numberCombinedIterations-1)]; // Number of combined iterations
    insideIterations:=[2..n]; // First iteration is assumed
    remainingIterations:=[(numberCombinedIterations*n+2)..#B-2]; //First iteration of remaining iterations is assumed
    doublingLineProducts:=[]; // Initialize
    additionLineProducts:=[];
    additionIndexes:=[];
    additionCount:=1; // Addition count keeps track of the number of additions to call the correct index in the additionLines array.
    for i in outsideIterations do
        additionPerformed:=false; // Only square once an addition has been performed
        lambda:=doublingLines[i*n+1][1]; c:=doublingLines[i*n+1][2]; // First [lambda, c] contributing to each product
        fArray:=[c, lambda]; gArray:=[1]; // Initialize f(x) and g(x)
        if B[#B-(i*n+1)] eq 1 then // First addition check
            lambdaAdd:=additionLines[additionCount][1]; cAdd:=additionLines[additionCount][2];
            fArrayAdd:=[cAdd, lambdaAdd]; gArrayAdd:=[1];
            additionCount+:=1;
            additionPerformed:=true;
        end if;
        for j in insideIterations do
            fArray, gArray:= SquareFunctionMonic(fArray, gArray, a, b); //Square f(x)+g(x)y
            lambda:=doublingLines[i*n+j][1]; c:=doublingLines[i*n+j][2]; // Compute line = y+lambda*x+c
            fArray, gArray:= MultiplyByLineMonic(fArray, gArray, a, b, lambda, c); // Use monic multiplier
            if additionPerformed then //Previous addition line products must be squared
                fArrayAdd, gArrayAdd:=SquareFunction(fArrayAdd, gArrayAdd, a, b);
                if B[#B-(i*n+j)] eq 1 then
                    lambdaAdd:=additionLines[additionCount][1]; cAdd:=additionLines[additionCount][2];
                    fArrayAdd, gArrayAdd:=MultiplyByLine(fArrayAdd, gArrayAdd, a, b, lambdaAdd, cAdd);
Fixed Argument Pairings additionCount+:=1; end if; else //Addition line product initialized as first addition line encountered if B[#B-(i*n+j)] eq 1 then lambdaAdd:=additionLines[additionCount][1]; cAdd:=additionLines[additionCount][2]; fArrayAdd:=[cAdd, lambdaAdd]; gArrayAdd:=[1]; additionPerformed:=true; additionCount+:=1; end if; end if; end for; doublingLineProducts:=Append(doublingLineProducts, [fArray, gArray]); // Update doubling line product array (of arrays) additionIndexes:=Append(additionIndexes, additionPerformed); // Update additionIndexes if additionPerformed then additionLineProducts:=Append(additionLineProducts,[fArrayAdd, gArrayAdd]); // Update addition line product array (of arrays) end if; end for; // The remaining iterations (residue when loop length modulo n is not zero) being here! additionPerformed:=false; if (#B-2) mod n ne 0 then // If (#B-2) mod n is zero then we still have to use the final iteration here. (i.e. (#B-1)-th iteration) lambda:=doublingLines[numberCombinedIterations*n+1][1]; c:=doublingLines[numberCombinedIterations*n+1][2]; fArray:=[c,lambda]; gArray:=[1]; //First doubling if B[#B-(numberCombinedIterations*n+1)] eq 1 then //Check for addition on the 1st of the remaining iterations lambdaAdd:=additionLines[additionCount][1]; cAdd:=additionLines[additionCount][2]; fArrayAdd:=[cAdd, lambdaAdd]; gArrayAdd:=[1]; additionCount+:=1; additionPerformed:=true; end if; for i in remainingIterations do fArray, gArray:= SquareFunctionMonic(fArray, gArray,a,b); // Square and multiply lambda:=doublingLines[i][1]; c:=doublingLines[i][2]; fArray, gArray:= MultiplyByLineMonic(fArray, gArray, a,b, lambda, c); if additionPerformed then // Previous addition lines must be squared fArrayAdd, gArrayAdd:=SquareFunction(fArrayAdd, gArrayAdd, a, b); if B[#B-i] eq 1 then lambdaAdd:=additionLines[additionCount][1]; cAdd:=additionLines[additionCount][2]; fArrayAdd, gArrayAdd:=MultiplyByLine(fArrayAdd, gArrayAdd, a,b, lambdaAdd, cAdd); additionCount+:=1; end if; else if B[#B-i] eq 1 then // The first addition (if not occured already) lambdaAdd:=additionLines[additionCount][1]; cAdd:=additionLines[additionCount][2]; fArrayAdd:=[cAdd, lambdaAdd]; gArrayAdd:=[1]; additionPerformed:=true; additionCount+:=1; end if; end if; end for; // The final iteration: no addition check needed fArray, gArray:= SquareFunctionMonic(fArray, gArray,a,b); lambda:=doublingLines[#B-1][1]; c:=doublingLines[#B-1][2]; fArray, gArray:= MultiplyByLineMonic(fArray, gArray, a,b, lambda, c); doublingLineProducts:=Append(doublingLineProducts, [fArray, gArray]); else // The last iteration: no addition check needed lambda:=doublingLines[#B-1][1]; c:=doublingLines[#B-1][2]; fArray:=[c,lambda]; gArray:=[1]; doublingLineProducts:=Append(doublingLineProducts, [fArray, gArray]); end if; additionIndexes:=Append(additionIndexes, additionPerformed); if additionPerformed then // Although no addition on the final iteration, the addition product must still be squared fArrayAdd, gArrayAdd:=SquareFunction(fArrayAdd, gArrayAdd, a, b); additionLineProducts:=Append(additionLineProducts,[fArrayAdd, gArrayAdd]); end if; return doublingLineProducts, additionLineProducts, additionIndexes, outsideIterations, remainingIterations, n; end function;
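The sub-routines SquareFunction and MultiplyByLine (and their Monic counterparts, which we assume are the same routines specialised to save work when g is monic) are likewise not listed. A minimal sketch of the generic versions follows, under the convention that the two coefficient arrays hold f(x) and g(x) (lowest degree first) for a function f(x) + g(x)y, with y^2 reduced via the curve equation y^2 = x^3 + a*x + b.

// Sketch: square f(x) + g(x)*y modulo the curve equation.
// (f + g*y)^2 = (f^2 + g^2*(x^3 + a*x + b)) + (2*f*g)*y.
function SquareFunction(fArray, gArray, a, b)
  P<x> := PolynomialRing(Parent(fArray[1]));
  f := P!fArray; g := P!gArray;
  curve := x^3 + a*x + b;       // Substitute y^2 = x^3 + a*x + b
  fNew := f^2 + g^2*curve;
  gNew := 2*f*g;
  return Coefficients(fNew), Coefficients(gNew);
end function;

// Sketch: multiply f(x) + g(x)*y by the line y + lambda*x + c.
// (f + g*y)*(y + lambda*x + c) = (f*(lambda*x+c) + g*(x^3+a*x+b)) + (f + g*(lambda*x+c))*y.
function MultiplyByLine(fArray, gArray, a, b, lambda, c)
  P<x> := PolynomialRing(Parent(fArray[1]));
  f := P!fArray; g := P!gArray;
  line := lambda*x + c;
  curve := x^3 + a*x + b;
  fNew := f*line + g*curve;
  gNew := f + g*line;
  return Coefficients(fNew), Coefficients(gNew);
end function;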
The function DynamicPreparationS outputs two arrays containing all the required terms of the form x_S^i and x_S^j · y_S.

function DynamicPreparationS(S, n)
  // Prepares powers of the coordinates of S (Sx and Sy) to be multiplied through each "n-at-a-time" iteration
  // Only dependent on the second (dynamic) argument S.
  // Outputs fS = [Sx, Sx^2, ...] and gS = [Sy, Sx*Sy, Sx^2*Sy, ...]
  Sx := S[1]; Sy := S[2]; // Get coordinates
  fS := [Parent(Sx)! 0 : i in [1..(3*(2^(n-1)-1)+1)]]; // Prescribe the sizes of fS and gS and initialize the first elements
  gS := [Parent(Sx)! 0 : i in [1..(3*(2^(n-1)-1)+1)]];
  fS[1] := Sx; gS[1] := Sy;
  squareCountFpv := 0; // Initialize operation counts
  multiplicationCountFpv := 0;
  for i := 1 to n do // Compute all Sx^(2^i) required (up to Sx^(2^n))
    fS[2^i] := (fS[2^(i-1)])^2;
    squareCountFpv +:= 1;
  end for;
  lastPowerOfTwo := 2;
  for i := 3 to (2^n-1) do // Use squarings to compute powers of Sx where possible
    if not IsPowerOf(i, 2) then
      diffe := i - lastPowerOfTwo;
      fS[i] := ((fS[lastPowerOfTwo] + fS[diffe])^2 - fS[2*lastPowerOfTwo] - fS[2*diffe])/2;
      squareCountFpv +:= 1;
    else
      lastPowerOfTwo := i;
    end if;
  end for;
  lastPowerOfTwo := 2^n;
  for i := (2^n+1) to (3*(2^(n-1)-1)+1) do // Use multiplications to compute the remaining powers of Sx
    fS[i] := fS[lastPowerOfTwo]*fS[i-lastPowerOfTwo];
    multiplicationCountFpv +:= 1;
  end for;
  for i := 2 to (3*(2^(n-1)-1)+1) do // Use multiplications to compute all remaining elements of the form Sx^i*Sy
    gS[i] := Sy*fS[i-1];
    multiplicationCountFpv +:= 1;
  end for;
  return fS, gS, squareCountFpv, multiplicationCountFpv;
end function;
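Note that the second loop above obtains each non-power-of-two exponent with a squaring rather than a multiplication, via the identity x^(a+b) = ((x^a + x^b)^2 − x^(2a) − x^(2b))/2, which is exact since the field characteristic is odd; this is why the code divides by 2 after subtracting the two even powers already stored in fS.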
The function DynamicComputations evaluates the precomputed products at S, whilst updating the Miller function f with the products G. Essentially, this function is the n-at-a-time version of Miller's algorithm, combining the precomputed R-dependent products with the powers of S obtained in the previous function. Since the original loop length ⌊log₂(m)⌋ may not be divisible by n, we must account for a "less-than-n-at-a-time" iteration at the end. This includes the final iteration of the regular Miller loop, which does not involve an addition.

function DynamicComputations(doublingLineProducts, additionLineProducts, fS, gS, additionIndexes, outsideIterations, remainingIterations, n)
  // Computes the pairing of R and S:
  // - combines the R-dependent precomputed polynomials (arrays) with the S-dependent f(S) and g(S)
  //   (i.e. multiplies each of the R-dependent "n-at-a-time" polynomials by the powers of the coordinates of S);
  // - updates the Miller function accordingly.
  // Outputs the Miller function f, together with operation counts.
  doublingMultiplicationCountFpe := 0; // Initialize operation counts; each doublingMultiplicationCountFpe costs e*m1
  doublingSquareCountFpk := 0;
  doublingMultiplicationCountFpk := 0;
  additionMultiplicationCountFpe := 0; // Additions don't incur Fpk squarings; each additionMultiplicationCountFpe costs e*m1
  additionMultiplicationCountFpk := 0;
  fDoublingSize := #doublingLineProducts[1][1];
  gDoublingSize := #doublingLineProducts[1][2];
  f := 1; // The Miller function (initialized)
  additionCount := 1; // Allows access to the correct index in additionLineProducts
  fLastPower := #remainingIterations + 2; // The exponentiation (at most 2^n) required for the remaining (at most n) iterations
  for i in outsideIterations do
    doublingProduct := doublingLineProducts[i+1][1][1]; // The constant term of f(x) is not multiplied by any power of Sx
    for j in [2..fDoublingSize] do
      doublingProduct +:= doublingLineProducts[i+1][1][j]*fS[j-1]; // Each coefficient of f(x) multiplies the corresponding power of Sx
      doublingMultiplicationCountFpe +:= 1;
    end for;
    for j in [1..(gDoublingSize-1)] do
      doublingProduct +:= doublingLineProducts[i+1][2][j]*gS[j]; // Each coefficient of g(x) multiplies the corresponding Sx^(j-1)*Sy
      doublingMultiplicationCountFpe +:= 1;
    end for;
    doublingProduct +:= gS[gDoublingSize]; // The leading coefficient of g is 1 (g is monic), saving a multiplication
    for k in [1..n] do // f := f^(2^n)
      f := f^2;
      doublingSquareCountFpk +:= 1;
    end for;
    f := f*doublingProduct; // Multiply the Miller function by the doubling product
    doublingMultiplicationCountFpk +:= 1;
    if additionIndexes[i+1] then // If an addition is required, evaluate the addition product at S and multiply
      fAdditionSize := #additionLineProducts[additionCount][1];
      gAdditionSize := #additionLineProducts[additionCount][2];
      additionProduct := additionLineProducts[additionCount][1][1];
      for j in [2..fAdditionSize] do
        additionProduct +:= additionLineProducts[additionCount][1][j]*fS[j-1];
        additionMultiplicationCountFpe +:= 1;
      end for;
      for j in [1..gAdditionSize] do
        additionProduct +:= additionLineProducts[additionCount][2][j]*gS[j];
        additionMultiplicationCountFpe +:= 1;
      end for;
      f := f*additionProduct;
      additionCount +:= 1;
      additionMultiplicationCountFpk +:= 1;
    end if;
  end for;
  // The final evaluation
  fDoublingSize := #doublingLineProducts[#doublingLineProducts][1]; // The final product (most likely) has a different size
  gDoublingSize := #doublingLineProducts[#doublingLineProducts][2];
  doublingProduct := doublingLineProducts[#doublingLineProducts][1][1]; // Constant term of f(x)
  for j in [2..fDoublingSize] do
    doublingProduct +:= doublingLineProducts[#doublingLineProducts][1][j]*fS[j-1]; // Each coefficient of f(x) multiplies the corresponding power of Sx
    doublingMultiplicationCountFpe +:= 1;
  end for;
  for j in [1..(gDoublingSize-1)] do
    doublingProduct +:= doublingLineProducts[#doublingLineProducts][2][j]*gS[j]; // Each coefficient of g(x) multiplies the corresponding Sx^(j-1)*Sy
    doublingMultiplicationCountFpe +:= 1;
  end for;
  doublingProduct +:= gS[gDoublingSize]; // The leading coefficient of g is 1 (g is monic), saving a multiplication
  for k in [1..fLastPower] do // f := f^(2^fLastPower)
    f := f^2;
    doublingSquareCountFpk +:= 1;
  end for;
  f := f*doublingProduct; // Multiply the Miller function by the doubling product
  doublingMultiplicationCountFpk +:= 1;
  if additionIndexes[#additionIndexes] then // If an addition occurred in the remaining iterations, evaluate the product at S and multiply
    fAdditionSize := #additionLineProducts[additionCount][1];
    gAdditionSize := #additionLineProducts[additionCount][2];
    additionProduct := additionLineProducts[additionCount][1][1];
    for j in [2..fAdditionSize] do
      additionProduct +:= additionLineProducts[additionCount][1][j]*fS[j-1];
      additionMultiplicationCountFpe +:= 1;
    end for;
    for j in [1..gAdditionSize] do
      additionProduct +:= additionLineProducts[additionCount][2][j]*gS[j];
      additionMultiplicationCountFpe +:= 1;
    end for;
    f := f*additionProduct;
    additionMultiplicationCountFpk +:= 1;
  end if;
  return f, doublingMultiplicationCountFpe, doublingSquareCountFpk, doublingMultiplicationCountFpk, additionMultiplicationCountFpe, additionMultiplicationCountFpk;
end function;
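To summarise the loop body: if f_i(x) + g_i(x)y denotes the i-th combined doubling product output by PrecomputationProducts, then each pass through the outer loop performs the update f ← f^(2^n) · (f_i(x_S) + g_i(x_S) · y_S), followed, whenever additionIndexes flags it, by a further F_p^k multiplication with the evaluated addition product; the tail of the function repeats this with exponent 2^fLastPower to absorb the leftover iterations.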
The function OperationCounts obtains an exact operation count (in terms of field multiplications and squarings) for the entire Miller loop. These counts are accumulated, where applicable, throughout the other function calls. Four counts are returned each time, although the only meaningful counts will be those that apply to the pairing (ate or Tate) employed. In each case, we give a "doublings only" count and a "total" count (which includes counts incurred during addition stages), to check against Table 4, which assumed doublings only.

function OperationCounts(k, e, squareCountFpv, multiplicationCountFpv, doublingMultiplicationCountFpe, doublingSquareCountFpk, doublingMultiplicationCountFpk, additionMultiplicationCountFpe, additionMultiplicationCountFpk, omega)
  kfact := Factorization(k); // Assume k is 2^ki*3^kj (pairing-friendly field)
  efact := Factorization(e); // Assume e is 2^ei*3^ej (e divides k)
  // Get ki and kj
  if #kfact gt 1 then // ki > 0 and kj > 0
    ki := kfact[1][2]; kj := kfact[2][2];
  elif kfact[1][1] eq 2 then // k = 2^ki
    ki := kfact[1][2]; kj := 0;
  else // k = 3^kj
    ki := 0; kj := kfact[1][2];
  end if;
  // Get ei and ej
  if #efact eq 0 then // e could be 1 if d = k
    ei := 0; ej := 0;
  elif #efact eq 2 then // ei > 0 and ej > 0
    ei := efact[1][2]; ej := efact[2][2];
  elif #efact eq 1 then
    if efact[1][1] eq 2 then // e = 2^ei
      ei := efact[1][2]; ej := 0;
    else // e = 3^ej
      ei := 0; ej := efact[1][2];
    end if;
  end if;
  mule := 3^ei*5^ej; // The complexity of multiplications in Fp^e
  mulk := 3^ki*5^kj; // The complexity of multiplications in Fp^k
  tateDynamicCountAll := (squareCountFpv*omega + multiplicationCountFpv)*mule + (doublingSquareCountFpk*omega + doublingMultiplicationCountFpk + additionMultiplicationCountFpk)*mulk + (doublingMultiplicationCountFpe + additionMultiplicationCountFpe)*e;
  tateDynamicCountDoublingsOnly := (squareCountFpv*omega + multiplicationCountFpv)*mule + (doublingSquareCountFpk*omega + doublingMultiplicationCountFpk)*mulk + (doublingMultiplicationCountFpe)*e;
  ateDynamicCountAll := (squareCountFpv*omega + multiplicationCountFpv) + (doublingSquareCountFpk*omega + doublingMultiplicationCountFpk + additionMultiplicationCountFpk)*mulk + (doublingMultiplicationCountFpe + additionMultiplicationCountFpe)*e;
  ateDynamicCountDoublingsOnly := (squareCountFpv*omega + multiplicationCountFpv) + (doublingSquareCountFpk*omega + doublingMultiplicationCountFpk)*mulk + (doublingMultiplicationCountFpe)*e;
  return tateDynamicCountAll, tateDynamicCountDoublingsOnly, ateDynamicCountAll, ateDynamicCountDoublingsOnly;
end function;
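As a concrete instance of these estimates: for the BN parameters used below we have k = 12 = 2^2 · 3 and e = 2, so that mulk = 3^2 · 5 = 45 and mule = 3; that is, under the Karatsuba- and Toom-Cook-style assumptions built into this function, one F_p^12 multiplication is counted as 45 base field multiplications and one F_p^2 multiplication as 3.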
We use parameters from the 224-bit BN curve given in [5]. As a pseudo-check for bilinearity, we multiply the two random points R and S by two random factors, and include these factors in the appropriate exponent. We use the subgroup order r as the loop parameter, as in the standard Tate pairing, rather than the twisted ate pairing. For the Tate pairing, S need not lie in the trace-zero subgroup, whereas this would be necessary if we wished to illustrate the ate pairing.

p := 26959946667149205758383469736921695435015736735261155141423417423923; // Characteristic of the (prime) base field
r := 26959946667149205758383469736921690242718878200571531029749235996909; // Subgroup order (for BN curves, this is the curve order)
t := 5192296858534689624111674181427015; // Trace of Frobenius
k := 12; // Embedding degree
d := 6;  // Degree of the twist
e := 2;  // Subfield degree (e = k/d)
Fp := GF(p);
A := 0; B := 3;
E := EllipticCurve([Fp|A,B]); // The BN curve has a = 0, b = 3
Fp2<i> := ExtensionField<Fp, x | x^2 + 1>; // Quadratic extension
n2 := #E(Fp2);
lambda := 2; mu := 1 + i; // Extension parameters
Et := EllipticCurve([Fp2|0, 3*lambda^2*mu^3]); // Twisted curve
Fp4<alpha> := ExtensionField<Fp2, x | x^2 - mu^3>; // Fp4 = Fp2(alpha), where alpha = sqrt(mu^3)
Fp12<beta> := ExtensionField<Fp4, x | x^3 - lambda^2>; // Fp12 = Fp4(beta), where beta = (lambda^2)^(1/3)
R := Random(E); // Random point over the base field (so keep the curve over Fp for this step)
E := EllipticCurve([Fp12|A, B]); // Redefine the curve over Fp12
R := E!R; // Coerce R
St := Random(Et); // Random point on the twisted curve
S := E![1/(mu*beta)*St[1], 1/(lambda*alpha)*St[2]]; // The twisting isomorphism takes St to S
fact1 := Random(1, r); // Use random scalar factors for the bilinearity "check"
fact2 := Random(1, r);
Rf := fact1*R; Sf := fact2*S;
exp := Integers()!(#Fp12 - 1) div r; // Final exponentiation
n := 5; // Number of combined iterations
omega := 0.8; // Squaring:multiplication complexity ratio for operation counting

// Compute the first pairing, of R and S
doublingLines, additionLines := PrecomputationsR(R, r, n, A);
doublingLineProducts, additionLineProducts, additionIndexes, outsideIterations, remainingIterations, n := PrecomputationProducts(doublingLines, additionLines, r, n, A, B);
fS, gS, squareCountFpv, multiplicationCountFpv := DynamicPreparationS(S, n);
f1, doublingMultiplicationCountFpe, doublingSquareCountFpk, doublingMultiplicationCountFpk, additionMultiplicationCountFpe, additionMultiplicationCountFpk := DynamicComputations(doublingLineProducts, additionLineProducts, fS, gS, additionIndexes, outsideIterations, remainingIterations, n);

// Compute the second pairing, of Rf and Sf
doublingLines, additionLines := PrecomputationsR(Rf, r, n, A);
doublingLineProducts, additionLineProducts, additionIndexes, outsideIterations, remainingIterations, n := PrecomputationProducts(doublingLines, additionLines, r, n, A, B);
fS, gS, squareCountFpv, multiplicationCountFpv := DynamicPreparationS(Sf, n);
f2, doublingMultiplicationCountFpe, doublingSquareCountFpk, doublingMultiplicationCountFpk, additionMultiplicationCountFpe, additionMultiplicationCountFpk := DynamicComputations(doublingLineProducts, additionLineProducts, fS, gS, additionIndexes, outsideIterations, remainingIterations, n);

// Count operations
tateDynamicCountAll, tateDynamicCountDoublingsOnly, ateDynamicCountAll, ateDynamicCountDoublingsOnly := OperationCounts(k, e, squareCountFpv, multiplicationCountFpv, doublingMultiplicationCountFpe, doublingSquareCountFpk, doublingMultiplicationCountFpk, additionMultiplicationCountFpe, additionMultiplicationCountFpk, omega);

// "Check" non-degeneracy and bilinearity
if (f1-f2) ne 0 then print "GOOD: Pairing is non-degenerate"; else print "BAD: Pairing is degenerate"; end if;
if (f1^(fact1*fact2*exp)-f2^exp) eq 0 then print "GOOD: Pairing is bilinear"; else print "BAD: Pairing is not bilinear"; end if;
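The final test is the standard bilinearity argument: f1 and f2 are unreduced Miller values, so f1^exp = e(R, S) and f2^exp = e(fact1·R, fact2·S); if the pairing is bilinear, then the latter equals e(R, S)^(fact1·fact2), and hence f1^(fact1·fact2·exp) = f2^exp, which is exactly the condition checked above.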