Some new results on binary polynomial multiplication

Murat Cenk
Institute of Applied Mathematics, Middle East Technical University, Ankara, Turkey

Joint work with Anwar Hasan

April 10, 2015

Murat Cenk

New results on binary polynomial multiplication

1 / 36

Outline

1. Why do we need efficient multiplication algorithms in $\mathbb{F}_{2^n}$?
2. Known methods
3. Complexity tables for high speed cryptography
4. Improving complexities further
5. New results


Motivation: Why do we need efficient multiplication algorithms?

Cryptographic systems must be efficient, and $\mathbb{F}_{2^n}$ is well suited to implementation. In elliptic curve cryptography as used in practice, $n$ ranges from 163 to 571, and one scalar multiplication requires several hundred field multiplications, so it is efficient only with careful design and efficient algorithms.


Example: Bernstein [Crypto 2009]

A binary Edwards curve over $\mathbb{F}_{2^{251}} = \mathbb{F}_2[t]/(t^{251} + t^7 + t^4 + t^2 + 1)$ is used. A single scalar multiplication requires 1266 field multiplications. Each multiplication needs 33974 bit operations: 33096 for the 251-bit polynomial multiplication and 878 for reducing the 501-bit product modulo the defining polynomial. In total, this gives 1266 × 33974 = 43011084 bit operations. The other operations (additions, squarings, multiplications by a fixed field element, and conditional swaps) require 1668531 bit operations in total, which is small compared to the cost of multiplication.
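The arithmetic in this example can be rechecked with a few lines of Python. This is only a sanity check of the figures quoted above; the variable names are illustrative and not taken from Bernstein's paper.

```python
# Figures quoted in the example above (Bernstein, Crypto 2009).
FIELD_MULTS = 1266        # field multiplications per scalar multiplication
POLY_MULT_BITOPS = 33096  # 251-bit polynomial multiplication
REDUCTION_BITOPS = 878    # reduction of the 501-bit product
OTHER_BITOPS = 1668531    # additions, squarings, fixed multiplications, swaps

per_mult = POLY_MULT_BITOPS + REDUCTION_BITOPS
mult_total = FIELD_MULTS * per_mult
# Fraction of all bit operations spent in field multiplications.
mult_share = mult_total / (mult_total + OTHER_BITOPS)

print(per_mult, mult_total, f"{100 * mult_share:.1f}%")
```

Under these figures, field multiplications account for well over nine tenths of the total, which is why the rest of the talk concentrates on the polynomial multiplication step.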


Notation and model of computation

$\mathbb{F}_{q^n}$ denotes the finite field with $q^n$ elements (where $q$ is a prime power), and $\mathbb{F}_q[X]$ denotes the ring of polynomials over $\mathbb{F}_q$.

$M_q(n)$ denotes the minimum number of bit operations required to compute the product of two polynomials of degree less than $n$ over $\mathbb{F}_q$.

$D_A$ and $D_X$ denote the delay of a bit-level multiplication and a bit-level addition, respectively.

The cost of a polynomial multiplication is measured as the number of bit operations (bit additions and bit multiplications) required to multiply polynomials over $\mathbb{F}_2$ or $\mathbb{F}_4$. Since all computations are over fields of characteristic two, addition and subtraction coincide.


Two measures of the complexity of an algorithm:

Arithmetic complexity: the total number of operations required to multiply two polynomials, denoted by $M(n)$.

Delay complexity: the depth of the corresponding arithmetic circuit, i.e., the length of its longest path, denoted by $D(n)$.


The computational complexity of multiplication

Polynomial multiplication: Consider two polynomials of degree $n - 1$,

$$A(x) = \sum_{i=0}^{n-1} a_i x^i, \qquad B(x) = \sum_{i=0}^{n-1} b_i x^i.$$

Schoolbook multiplication gives the product $C(x)$ of $A(x)$ and $B(x)$ as

$$C(x) = \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} a_i b_j x^{i+j}.$$

This algorithm requires $n^2$ multiplications and $(n-1)^2$ additions.

Reduction: This step is generally easy, and its cost is less than $5n$.
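The operation counts for schoolbook multiplication can be checked with a minimal Python sketch. It assumes polynomials over $\mathbb{F}_2$ stored as lists of bits (index $i$ holds the coefficient of $x^i$); the function name `schoolbook_mul_gf2` is illustrative.

```python
def schoolbook_mul_gf2(a, b):
    """Multiply two n-term polynomials over F_2 given as bit lists.

    Returns (product coefficients, number of ANDs, number of XORs).
    """
    n = len(a)
    assert len(b) == n
    c = [None] * (2 * n - 1)  # None marks "no partial product stored yet"
    ands = xors = 0
    for i in range(n):
        for j in range(n):
            term = a[i] & b[j]  # bit multiplication
            ands += 1
            if c[i + j] is None:
                c[i + j] = term  # first term of this coefficient: no addition
            else:
                c[i + j] ^= term  # bit addition
                xors += 1
    return c, ands, xors


# (1 + x^2 + x^3) * (1 + x + x^3) with n = 4
product, ands, xors = schoolbook_mul_gf2([1, 0, 1, 1], [1, 1, 0, 1])
print(product, ands, xors)
```

For $n = 4$ the counters come out as $n^2 = 16$ multiplications and $(n-1)^2 = 9$ additions, because each of the $2n - 1$ output coefficients absorbs its first partial product without an addition.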


Karatsuba Algorithm

The Karatsuba algorithm has better complexity. For example, consider two 2-term polynomials,

$$A(x) = a_0 + a_1 x, \qquad B(x) = b_0 + b_1 x.$$

The Karatsuba algorithm computes the product $C(x) = A(x)B(x)$ as

$$C(x) = a_1 b_1 x^2 + [(a_0 + a_1)(b_0 + b_1) - a_0 b_0 - a_1 b_1]x + a_0 b_0.$$

This needs only three multiplications, $a_0 b_0$, $(a_0 + a_1)(b_0 + b_1)$ and $a_1 b_1$, and four additions.


Asymptotic complexity of Karatsuba Algorithm

Now consider polynomials with four terms (degree three):

$$A(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 = \underbrace{a_0 + a_1 x}_{A_0} + \underbrace{x^2}_{y}\,\underbrace{(a_2 + a_3 x)}_{A_1} = A_0 + yA_1,$$

$$B(x) = b_0 + b_1 x + b_2 x^2 + b_3 x^3 = \underbrace{b_0 + b_1 x}_{B_0} + \underbrace{x^2}_{y}\,\underbrace{(b_2 + b_3 x)}_{B_1} = B_0 + yB_1.$$

$$A(x)B(x) = A_1 B_1 y^2 + [(A_0 + A_1)(B_0 + B_1) - A_0 B_0 - A_1 B_1]y + A_0 B_0.$$

For $2n$-term polynomials, we have

$$M(2n) \le 3M(n) + 8n - 4, \quad M(1) = 1,$$

$$M(n) \le 7n^{\log_2 3} + 4n - 4 \approx 7n^{1.585} + 4n - 4.$$
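The recursion can be sketched in Python as follows. The sketch assumes that a polynomial over $\mathbb{F}_2$ is stored as an integer whose bit $i$ is the coefficient of $x^i$, and that $n$ is a power of two; the helper names `clmul`, `karatsuba` and `karatsuba_cost` are illustrative.

```python
def clmul(a, b):
    """Schoolbook carry-less product of two F_2[x] polynomials stored as ints."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def karatsuba(a, b, n):
    """Product of two n-term polynomials over F_2, n a power of two."""
    if n == 1:
        return a & b
    h = n // 2
    mask = (1 << h) - 1
    a0, a1 = a & mask, a >> h
    b0, b1 = b & mask, b >> h
    p0 = karatsuba(a0, b0, h)            # A0 * B0
    p2 = karatsuba(a1, b1, h)            # A1 * B1
    p1 = karatsuba(a0 ^ a1, b0 ^ b1, h)  # (A0 + A1) * (B0 + B1)
    # Subtraction equals addition (XOR) in characteristic two.
    return p0 ^ ((p1 ^ p0 ^ p2) << h) ^ (p2 << n)


def karatsuba_cost(n):
    """Bit operations given by M(2n) = 3M(n) + 8n - 4 with M(1) = 1."""
    if n == 1:
        return 1
    return 3 * karatsuba_cost(n // 2) + 8 * (n // 2) - 4


a, b = 0b10110111, 0b11010011
print(karatsuba(a, b, 8) == clmul(a, b), karatsuba_cost(8))
```

When the recurrence is taken with equality and $n = 2^k$, it solves to $7 \cdot 3^k - 8 \cdot 2^k + 2$, which is consistent with, and somewhat tighter than, the bound stated above.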


Karatsuba algorithm (with Bernstein’s improvement)

$$A(X) = A_0 + X^n A_1, \qquad B(X) = B_0 + X^n B_1,$$

$$A_0 = \sum_{i=0}^{n-1} a_i X^i, \qquad A_1 = \sum_{i=0}^{n-1} a_{i+n} X^i,$$

$$B_0 = \sum_{i=0}^{n-1} b_i X^i, \qquad B_1 = \sum_{i=0}^{n-1} b_{i+n} X^i.$$

$$(A_0 + X^n A_1)(B_0 + X^n B_1) = (1 + X^n)(A_0 B_0 + X^n A_1 B_1) + X^n (A_0 + A_1)(B_0 + B_1)$$


The arithmetic and delay complexities of the algorithm are as follows:

$$M_2(n + k) \le 2M_2(n) + M_2(k) + 3n + 4k - 3, \quad n/2 \le k \le n,$$

$$D_2(2n) \le D_2(n) + 3D_X,$$

$$M_2(n) \le 6.5n^{1.58} - 7n + 1.5,$$

$$D_2(n) \le 3\log_2(n) D_X + D_A.$$
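The refined identity and its cost recurrence can be tried out with the following Python sketch. It assumes polynomials over $\mathbb{F}_2$ stored as integers (bit $i$ is the coefficient of $X^i$) and splits an $n$-term operand into a low part of $\lceil n/2 \rceil$ terms and a high part of $\lfloor n/2 \rfloor$ terms; this choice of split and the function names are assumptions made for illustration.

```python
from functools import lru_cache


def clmul(a, b):
    """Schoolbook carry-less product of two F_2[X] polynomials stored as ints."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def refined_karatsuba(a, b, n):
    """Product of two n-term polynomials over F_2 via
    (1 + X^h)(A0 B0 + X^h A1 B1) + X^h (A0 + A1)(B0 + B1), with h = ceil(n / 2)."""
    if n == 1:
        return a & b
    h = (n + 1) // 2
    mask = (1 << h) - 1
    a0, a1 = a & mask, a >> h
    b0, b1 = b & mask, b >> h
    t = refined_karatsuba(a0, b0, h) ^ (refined_karatsuba(a1, b1, n - h) << h)
    mid = refined_karatsuba(a0 ^ a1, b0 ^ b1, h)
    return t ^ (t << h) ^ (mid << h)


@lru_cache(maxsize=None)
def m2_cost(total):
    """Cost from M2(n + k) = 2 M2(n) + M2(k) + 3n + 4k - 3, taken with equality."""
    if total == 1:
        return 1
    n = (total + 1) // 2
    k = total - n
    return 2 * m2_cost(n) + m2_cost(k) + 3 * n + 4 * k - 3


a, b = 0b10110111001, 0b11010010111  # two 11-term polynomials
print(refined_karatsuba(a, b, 11) == clmul(a, b), m2_cost(8))
```

For $n = 2^j$ the recurrence, taken with equality, solves exactly to $6.5 \cdot 3^j - 7 \cdot 2^j + 1.5$, in agreement with the closed form above.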


Karatsuba-like improved 3-way split algorithm

This algorithm was obtained by C., Negre and Hasan in 2012 using a technique similar to that employed in [Zhou-Michalik].

$$P_0 = A_0 B_0 = P_{0L} + P_{0H} X^n, \qquad P_1 = A_1 B_1 = P_{1L} + P_{1H} X^n,$$

$$P_2 = A_2 B_2 = P_{2L} + P_{2H} X^n, \qquad P_3 = (A_1 + A_2)(B_1 + B_2) = P_{3L} + P_{3H} X^n,$$

$$P_4 = (A_0 + A_1)(B_0 + B_1) = P_{4L} + P_{4H} X^n, \qquad P_5 = (A_0 + A_2)(B_0 + B_2) = P_{5L} + P_{5H} X^n,$$

$$R_0 = P_{0H} + P_{1L}, \quad R_1 = R_0 + P_{0L}, \quad R_2 = R_1 + P_{4L}, \quad R_3 = P_{1H} + P_{2L},$$

$$R_4 = R_1 + R_3, \quad R_5 = P_{4H} + P_{5L}, \quad R_6 = R_4 + R_5, \quad R_7 = R_3 + P_{2H},$$

$$R_8 = R_7 + R_0, \quad R_9 = R_8 + P_{3L}, \quad R_{10} = R_9 + P_{5H}, \quad R_{11} = R_7 + P_{3H},$$

$$C = P_{0L} + R_2 X^n + R_6 X^{2n} + R_{10} X^{3n} + R_{11} X^{4n} + P_{2H} X^{5n}.$$

The complexities are:

$$M_2(3n) \le 6M_2(n) + 18n - 6,$$

$$M_2(2n + k) \le 5M_2(n) + M_2(k) + 12n + 6k - 6, \quad n/2 < k \le n,$$

$$M_2(2n + k) \le 5M_2(n) + M_2(k) + 13n + 4k - 5, \quad k \le n/2,$$

$$D_2(3n) \le D_2(n) + 4D_X,$$

$$M_2(n) \le 5.8n^{1.63} - 6n + 1.2,$$

$$D_2(n) \le 4\log_3(n) D_X + D_A.$$
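One level of this 3-way split can be checked numerically. The sketch below assumes $3n$-term operands over $\mathbb{F}_2$ stored as integers and uses a plain carry-less product for the six sub-products; the function names are illustrative, and the sub-products would be computed recursively in an actual implementation.

```python
def clmul(a, b):
    """Schoolbook carry-less product of two F_2[X] polynomials stored as ints."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def three_way_split_mul(a, b, n):
    """Multiply two 3n-term polynomials with six n-term products."""
    mask = (1 << n) - 1
    a0, a1, a2 = a & mask, (a >> n) & mask, a >> (2 * n)
    b0, b1, b2 = b & mask, (b >> n) & mask, b >> (2 * n)
    p = [
        clmul(a0, b0), clmul(a1, b1), clmul(a2, b2),
        clmul(a1 ^ a2, b1 ^ b2), clmul(a0 ^ a1, b0 ^ b1), clmul(a0 ^ a2, b0 ^ b2),
    ]
    lo = [x & mask for x in p]  # P_iL
    hi = [x >> n for x in p]    # P_iH
    r0 = hi[0] ^ lo[1]
    r1 = r0 ^ lo[0]
    r2 = r1 ^ lo[4]
    r3 = hi[1] ^ lo[2]
    r4 = r1 ^ r3
    r5 = hi[4] ^ lo[5]
    r6 = r4 ^ r5
    r7 = r3 ^ hi[2]
    r8 = r7 ^ r0
    r9 = r8 ^ lo[3]
    r10 = r9 ^ hi[5]
    r11 = r7 ^ hi[3]
    return (lo[0] ^ (r2 << n) ^ (r6 << (2 * n)) ^ (r10 << (3 * n))
            ^ (r11 << (4 * n)) ^ (hi[2] << (5 * n)))


a, b = 0b101101110011, 0b110100101101  # 12-term operands, n = 4
print(three_way_split_mul(a, b, 4) == clmul(a, b))
```

The twelve additions $R_0, \ldots, R_{11}$ reuse intermediate sums such as $R_0$, $R_3$ and $R_7$, which is where the saving in the reconstruction step comes from.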


Bernstein 4-way split algorithm

$$A = A_0 + A_1 X^n + A_2 X^{2n} + A_3 X^{3n}, \qquad B = B_0 + B_1 X^n + B_2 X^{2n} + B_3 X^{3n},$$

where $A_j = \sum_{i=0}^{n-1} a_{i+nj} X^i$ and $B_j = \sum_{i=0}^{n-1} b_{i+nj} X^i$ for $j = 0, 1, 2, 3$.

Bernstein’s 4-way algorithm is the following:

$$AB = (1 + X^{2n})\big((1 + X^n)(A_0 B_0 + X^n A_1 B_1 + X^{2n} A_2 B_2 + X^{3n} A_3 B_3) + X^n (A_0 + A_1)(B_0 + B_1) + X^{3n} (A_2 + A_3)(B_2 + B_3)\big) + X^{2n}(A_0 + A_2 + (A_1 + A_3)X^n)(B_0 + B_2 + (B_1 + B_3)X^n).$$

The complexities are:

$$M_2(4n) \le M_2(2n) + 6M_2(n) + 27n - 8,$$

$$M_2(3n + k) \le M_2(2n) + 5M_2(n) + M_2(k) + 19n + 8k - 8, \quad n/2 \le k \le n,$$

$$D_2(4n) \le D_2(n) + 5D_X,$$

$$M_2(n) \le 6.425n^{1.58} - 6.8n + 1.375,$$

$$D_2(n) \le 5\log_4(n) D_X + D_A.$$
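The 4-way identity can be verified with a short Python sketch. It assumes $4n$-term operands over $\mathbb{F}_2$ stored as integers; `shift_blocks` is a hypothetical helper standing for multiplication by $X^{kn}$, and the sub-products use a plain carry-less product instead of recursion.

```python
def clmul(a, b):
    """Schoolbook carry-less product of two F_2[X] polynomials stored as ints."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def bernstein_four_way(a, b, n):
    """Multiply two 4n-term polynomials using the 4-way split identity."""
    mask = (1 << n) - 1
    a0, a1, a2, a3 = [(a >> (i * n)) & mask for i in range(4)]
    b0, b1, b2, b3 = [(b >> (i * n)) & mask for i in range(4)]

    def shift_blocks(p, k):
        """Multiply p by X^(k n)."""
        return p << (k * n)

    # A0 B0 + X^n A1 B1 + X^2n A2 B2 + X^3n A3 B3
    inner = (clmul(a0, b0) ^ shift_blocks(clmul(a1, b1), 1)
             ^ shift_blocks(clmul(a2, b2), 2) ^ shift_blocks(clmul(a3, b3), 3))
    inner ^= shift_blocks(inner, 1)  # times (1 + X^n)
    inner ^= shift_blocks(clmul(a0 ^ a1, b0 ^ b1), 1)
    inner ^= shift_blocks(clmul(a2 ^ a3, b2 ^ b3), 3)
    outer = inner ^ shift_blocks(inner, 2)  # times (1 + X^2n)
    # One 2n-term product: (A0 + A2 + (A1 + A3) X^n)(B0 + B2 + (B1 + B3) X^n)
    wide = clmul((a0 ^ a2) ^ shift_blocks(a1 ^ a3, 1),
                 (b0 ^ b2) ^ shift_blocks(b1 ^ b3, 1))
    return outer ^ shift_blocks(wide, 2)


a, b = 0b101101110011, 0b110100101101  # 12-term operands, n = 3
print(bernstein_four_way(a, b, 3) == clmul(a, b))
```

The single $2n$-term product and the six $n$-term products in the sketch correspond to the terms $M_2(2n) + 6M_2(n)$ in the first recurrence.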


Interpolation method

Let $C(x) = A(x)B(x)$ in $\mathbb{F}_{q^n}$, and let $d = 2n - 1 \le q$, where $q$ is a prime power.

Step 1 (Selection): Choose $d = 2n - 1$ distinct points $w_0, w_1, \ldots, w_{d-1}$.

Step 2 (Evaluation): For $i = 0, 1, \ldots, d - 1$,
(i) compute $A(w_i)$ and $B(w_i)$;
(ii) compute the product $A(w_i) \cdot B(w_i)$.

Step 3 (Interpolation): Compute the $(2n - 1)$-term product polynomial $C(x)$ of $A(x)$ and $B(x)$ such that $C(w_i) = A(w_i) \cdot B(w_i)$ for all $i$.


Step 3 can be done explicitly by the following matrix equation:

$$\underbrace{\begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_{d-1} \end{pmatrix}}_{C} = V^{-1} \cdot \underbrace{\begin{pmatrix} A(w_0)B(w_0) \\ A(w_1)B(w_1) \\ \vdots \\ A(w_{d-1})B(w_{d-1}) \end{pmatrix}}_{A(w_i)B(w_i)} \qquad (1)$$

where

$$V = \begin{pmatrix} 1 & w_0 & w_0^2 & \cdots & w_0^{d-1} \\ 1 & w_1 & w_1^2 & \cdots & w_1^{d-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & w_{d-1} & w_{d-1}^2 & \cdots & w_{d-1}^{d-1} \end{pmatrix}.$$

The matrix $V$ is called the interpolation matrix. Since $V$ is a Vandermonde matrix built from distinct points, it is invertible.
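The three steps can be illustrated with a small Python sketch. For simplicity it assumes a prime field $\mathbb{F}_p$ with $p \ge 2n - 1$ and the evaluation points $0, 1, \ldots, d - 1$; these choices, and the function names, are assumptions made for the example and are not part of the slides. The modular inverse via `pow(x, -1, p)` requires Python 3.8 or later.

```python
def poly_eval(coeffs, w, p):
    """Horner evaluation of sum(coeffs[i] * w**i) modulo the prime p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * w + c) % p
    return acc


def solve_vandermonde(points, values, p):
    """Solve V c = values over F_p by Gauss-Jordan elimination, V[i][j] = points[i]**j."""
    d = len(points)
    rows = [[pow(w, j, p) for j in range(d)] + [v] for w, v in zip(points, values)]
    for col in range(d):
        pivot = next(r for r in range(col, d) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], -1, p)
        rows[col] = [x * inv % p for x in rows[col]]
        for r in range(d):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % p for x, y in zip(rows[r], rows[col])]
    return [row[d] for row in rows]


def interpolation_mul(a, b, p):
    """Multiply two n-term polynomials over F_p by evaluation and interpolation."""
    n = len(a)
    d = 2 * n - 1
    assert len(b) == n and d <= p
    points = list(range(d))                                      # Step 1
    values = [poly_eval(a, w, p) * poly_eval(b, w, p) % p for w in points]  # Step 2
    return solve_vandermonde(points, values, p)                  # Step 3


# (1 + 2x + 3x^2)(4 + 5x + 6x^2) over F_7
print(interpolation_mul([1, 2, 3], [4, 5, 6], 7))
```

Only the $d = 2n - 1$ pointwise products in Step 2 are multiplications of two unknowns; evaluation and interpolation are linear maps, which is what the split formulas later in the talk exploit.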


Bernstein’s 3-way split formula

$$A = A_0 + A_1 Y + A_2 Y^2, \qquad B = B_0 + B_1 Y + B_2 Y^2, \qquad Y = X^{n/3}.$$

Bernstein uses the five evaluation points $0$, $1$, $X$, $X + 1$ and $\infty$.

Evaluations

$$P_0 = A_0 B_0,$$

$$P_1 = (A_0 + A_1 + A_2)(B_0 + B_1 + B_2),$$

$$P_2 = (A_0 + A_1 X + A_2 X^2)(B_0 + B_1 X + B_2 X^2),$$

$$P_3 = \big((A_0 + A_1 + A_2) + (A_1 X + A_2 X^2)\big)\big((B_0 + B_1 + B_2) + (B_1 X + B_2 X^2)\big),$$

$$P_4 = A_2 B_2.$$

Reconstruction

$$U = P_0 + (P_0 + P_1)X^{n/3}, \qquad V = P_2 + (P_2 + P_3)(X^{n/3} + X),$$

$$W = (U + V + P_4(X^4 + X))(X^{2n/3} + X^{n/3}),$$

$$C = U + P_4(X^{4n/3} + X^{n/3}) + \frac{W}{X^2 + X}.$$
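The evaluation and reconstruction steps can be checked with the following Python sketch. It assumes polynomials over $\mathbb{F}_2$ stored as integers, writes `m` for the block size $n/3$, and takes $U = P_0 + (P_0 + P_1)X^{n/3}$, i.e. the line through the values at $0$ and $1$ evaluated at $Y = X^{n/3}$. The helper names are illustrative, and the five products use a plain carry-less multiplication instead of recursion.

```python
def clmul(a, b):
    """Schoolbook carry-less product of two F_2[X] polynomials stored as ints."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def div_exact(a, b):
    """Quotient a / b in F_2[X]; the division is asserted to be exact."""
    q = 0
    deg_b = b.bit_length()
    while a and a.bit_length() >= deg_b:
        shift = a.bit_length() - deg_b
        q ^= 1 << shift
        a ^= b << shift
    assert a == 0, "division was expected to be exact"
    return q


def bernstein_three_way(a, b, m):
    """Multiply two 3m-term polynomials via evaluation at 0, 1, X, X + 1, infinity."""
    mask = (1 << m) - 1
    a0, a1, a2 = a & mask, (a >> m) & mask, a >> (2 * m)
    b0, b1, b2 = b & mask, (b >> m) & mask, b >> (2 * m)
    ax = a0 ^ (a1 << 1) ^ (a2 << 2)  # A evaluated at Y = X
    bx = b0 ^ (b1 << 1) ^ (b2 << 2)
    p0 = clmul(a0, b0)
    p1 = clmul(a0 ^ a1 ^ a2, b0 ^ b1 ^ b2)
    p2 = clmul(ax, bx)
    p3 = clmul(ax ^ a1 ^ a2, bx ^ b1 ^ b2)  # A(X + 1) = A(X) + A1 + A2
    p4 = clmul(a2, b2)
    u = p0 ^ ((p0 ^ p1) << m)
    v = p2 ^ ((p2 ^ p3) << m) ^ ((p2 ^ p3) << 1)
    s = u ^ v ^ (p4 << 4) ^ (p4 << 1)
    w = (s << (2 * m)) ^ (s << m)
    return u ^ (p4 << (4 * m)) ^ (p4 << m) ^ div_exact(w, 0b110)  # 0b110 is X^2 + X


a, b = 0b101101110011, 0b110100101101  # 12-term operands, m = 4
print(bernstein_three_way(a, b, 4) == clmul(a, b))
```

Two of the five products, $P_2$ and $P_3$, involve operands with $m + 2$ terms, which matches the $2M(n/3 + 2)$ term in the complexity recurrence for this formula.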


Complexities

$$M(n) \le 3M(n/3) + 2M(n/3 + 2) + 35n/3 - 12,$$

$$M(n/3 + 2) \le M(n/3) + 8n/3 + 4,$$

$$M(n) \le 25.5n^{\log_3 5} - 25.5n + 1,$$

$$M(n) = O(n^{1.46}).$$


Multi-evaluation and reconstruction data flow

[Figure: data flow of the multi-evaluation and reconstruction steps, from the evaluations $P_0, \ldots, P_4$ through the division by $X^2 + X$ to the product $C$.]


Delay evaluations

Reconstruction:

$$C = U + P_4(X^{4n/3} + X^{n/3}) + \frac{W}{X^2 + X}.$$

Division by $X^2 + X$:

Divide $W$ by $X$, which is a shift of the coefficients of $W$.

Divide $W/X$ by $X + 1$.

The coefficients of $W/(X^2 + X)$ are

$$w'_{n-j} = w_n + w_{n-1} + \cdots + w_{n-j+2}.$$

The corresponding delay is $(n - 2)D_\oplus$, where $D_\oplus$ is the delay of a bit addition.

Delay complexity:

$$D(n) = \left(\frac{3n}{2} + 8\log_3(n) - \frac{3}{2}\right)D_\oplus + D_\otimes.$$


Three-way formula based on field extension

C., Negre and Hasan proposed a different approach.

$$\mathbb{F}_4 = \mathbb{F}_2[\alpha]/(\alpha^2 + \alpha + 1) = \{0, 1, \alpha, \alpha + 1\}.$$

Evaluate the polynomials at $0$, $1$, $\alpha$, $\alpha + 1$ and $\infty$.

Evaluations

$$P_0 = A_0 B_0,$$

$$P_1 = (A_0 + A_1 + A_2)(B_0 + B_1 + B_2),$$

$$P_2 = (A_0 + A_2 + \alpha(A_1 + A_2))(B_0 + B_2 + \alpha(B_1 + B_2)),$$

$$P_3 = (A_0 + A_1 + \alpha(A_1 + A_2))(B_0 + B_1 + \alpha(B_1 + B_2)),$$

$$P_4 = A_2 B_2.$$

Reconstruction

$$C = (P_0 + X^{n/3} P_4)(1 + X^n) + (P_1 + (1 + \alpha)(P_2 + P_3))(X^{n/3} + X^{2n/3} + X^n) + \alpha(P_2 + P_3)X^n + P_2 X^{2n/3} + P_3 X^{n/3}.$$


Complexities

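$$M_{\mathbb{F}_2}(n) \le 2M_{\mathbb{F}_4}(n/3) + 3M_{\mathbb{F}_2}(n/3) + 29n/3 - 12,$$

$$M_{\mathbb{F}_4}(n) \le 5M_{\mathbb{F}_4}(n/3) + 58n/3 - 21,$$

$$M_{\mathbb{F}_4}(n) \le 30.75n^{\log_3 5} - 29n + 5.25,$$

$$M_{\mathbb{F}_2}(n) \le 30.75n^{\log_3 5} - 9.67n\log_3(n) - 30.5n + 0.75.$$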


Multi-evaluation and reconstruction data flow

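[Figure: data flow of the multi-evaluation and reconstruction steps for the three-way formula based on field extension, including the multiplications by $\alpha$ and $1 + \alpha$.]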


Delay evaluations

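$$D_{\mathbb{F}_2}(n) \le 7D_\oplus + D_{\mathbb{F}_4}(n/3),$$

$$D_{\mathbb{F}_4}(n) \le 9D_\oplus + D_{\mathbb{F}_4}(n/3),$$

$$D_{\mathbb{F}_4}(n) \le 9\log_3(n)D_\oplus + D_\otimes,$$

$$D_{\mathbb{F}_2}(n) \le (9\log_3(n) - 2)D_\oplus + D_\otimes.$$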


Complexity comparisons

CNH complexities

$$M(n) \le 30.75n^{\log_3 5} - 9.67n\log_3(n) - 30.5n + 0.75,$$

$$D(n) \le (9\log_3(n) - 2)D_\oplus + D_\otimes.$$

Bernstein’s complexities

$$M(n) \le 25.5n^{\log_3 5} - 25.5n + 1,$$

$$D(n) \le \left(\frac{3n}{2} + 8\log_3(n) - \frac{3}{2}\right)D_\oplus + D_\otimes.$$


A new split method for Bernstein’s 3-way split algorithm

We compute $(XA(X))(XB(X))$ instead of $A(X)B(X)$ using Bernstein’s 3-way split algorithm:

$$XA(X) = A_0 + A_1 X^{n+1} + A_2 X^{2n+2},$$

$$XB(X) = B_0 + B_1 X^{n+1} + B_2 X^{2n+2}.$$

This method splits $3n$-term polynomials as $(n, n + 1, n - 1)$ rather than $(n, n, n)$.

$$M_2(3n) \le M_2(n) + 2M_2(n + 1) + M_2(n + 2) + M_2(n - 1) + 35n - 12,$$

$$M_2(3n - 2) \le 2M_2(n) + M_2(n + 1) + 2M_2(n - 1) + 35n - 13.$$


Improved 5-way split algorithm

$$A = A_0 + A_1 X^n + A_2 X^{2n} + A_3 X^{3n} + A_4 X^{4n},$$

$$B = B_0 + B_1 X^n + B_2 X^{2n} + B_3 X^{3n} + B_4 X^{4n}.$$

$m_1 = A_0 B_0$,
$m_2 = A_1 B_1$,
$m_3 = A_2 B_2$,
$m_4 = A_3 B_3$,
$m_5 = A_4 B_4$,
$m_6 = (A_0 + A_1)(B_0 + B_1)$,
$m_7 = (A_0 + A_2)(B_0 + B_2)$,
$m_8 = (A_2 + A_4)(B_2 + B_4)$,
$m_9 = (A_3 + A_4)(B_3 + B_4)$,
$m_{10} = (A_0 + A_2 + A_3)(B_0 + B_2 + B_3)$,
$m_{11} = (A_1 + A_2 + A_4)(B_1 + B_2 + B_4)$,
$m_{12} = (A_0 + A_3 + A_1 + A_4)(B_0 + B_3 + B_1 + B_4)$,
$m_{13} = (A_0 + A_1 + A_2 + A_3 + A_4)(B_0 + B_1 + B_2 + B_3 + B_4)$.


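Let $C = \sum_{i=1}^{10} U_i X^{(i-1)n}$. Here $p_{2i-1}$ and $p_{2i}$ appear to denote the low and high halves of $m_i$, i.e. $m_i = p_{2i-1} + p_{2i}X^n$, which is consistent with $U_1 = p_1$ and $U_{10} = p_{10}$ below.

$t_1 = p_1 + p_2$, $t_2 = t_1 + p_3$, $t_3 = t_2 + p_{11}$, $t_4 = p_4 + p_5$, $t_5 = p_{12} + p_{13}$,
$t_6 = t_4 + t_5$, $t_7 = t_2 + t_6$, $t_8 = t_1 + t_4$, $t_9 = p_6 + p_7$, $t_{10} = t_8 + t_9$,
$t_{11} = t_{10} + p_9$, $t_{12} = p_{14} + p_{15}$, $t_{13} = t_{11} + t_{12}$, $t_{14} = p_{19} + p_{23}$, $t_{15} = t_{14} + p_{25}$,
$t_{16} = t_{13} + t_{15}$, $t_{17} = p_8 + p_9$, $t_{18} = t_{17} + p_{10}$, $t_{19} = t_{18} + p_{18}$, $t_{20} = p_6 + p_7$,
$t_{21} = t_{18} + t_{20}$, $t_{22} = p_{16} + p_{17}$, $t_{23} = t_{21} + t_{22}$, $t_{24} = t_{23} + t_3$, $t_{25} = p_{20} + p_{21}$,
$t_{26} = p_{25} + p_{26}$, $t_{27} = p_{19} + p_{24}$, $t_{28} = t_{25} + t_{26}$, $t_{29} = t_{28} + t_{27}$, $t_{30} = t_{29} + t_{24}$,
$t_{31} = t_7 + t_{19}$, $t_{32} = t_{28} + t_{31}$, $t_{33} = p_{22} + p_{23}$, $t_{34} = t_{32} + t_{33}$, $t_{35} = t_{11} + p_1$,
$t_{36} = t_{35} + p_{10}$, $t_{37} = t_{36} + t_{12}$, $t_{38} = t_{37} + p_{22}$, $t_{39} = t_{38} + p_{24}$, $t_{40} = t_{39} + p_{26}$,

$U_1 = p_1$, $U_2 = t_3$, $U_3 = t_7$, $U_4 = t_{16}$, $U_5 = t_{30}$, $U_6 = t_{34}$, $U_7 = t_{40}$, $U_8 = t_{23}$, $U_9 = t_{19}$, $U_{10} = p_{10}$.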


The asymptotic complexities of this algorithm are as follows:

$$M_2(n) \le 13M_2(n/5) + 56n/5 - 18, \quad M_2(1) = 1,$$

$$M_2(n) \le 6.5n^{\log_5 13} - 7n + 1.5 \quad (\log_5 13 \approx 1.59),$$

$$D_2(n) \le D_2(n/5) + 12D_X, \quad D_2(1) = D_A,$$

$$D_2(n) \le 12\log_5(n)D_X + D_A.$$


New improved 3-way algorithm

$$P_0 = A_0 B_0, \qquad P_1 = (A_0 + A_1 + A_2)(B_0 + B_1 + B_2), \qquad P_4 = A_2 B_2,$$

$$P_2 = (A_0 + A_2 + \alpha(A_1 + A_2))(B_0 + B_2 + \alpha(B_1 + B_2)) = P_{2,0} + \alpha P_{2,1},$$

$$C = P_4 X^{4n} + (P_0 + P_1 + P_{2,1})X^{3n} + (P_{2,0} + P_1 + P_{2,1})X^{2n} + (P_4 + P_1 + P_{2,0})X^n + P_0.$$

The asymptotic complexities of this algorithm are as follows:

$$M_2(n) \le 3M_2(n/3) + M_4(n/3) + 20n/3 - 5, \quad M_2(1) = 1,$$

$$M_2(n) \le 15.125n^{1.46} - 14.25n - 2.4274\log_3(n) + 0.125,$$

$$D_2(n) \le D_4(n/3) + 8D_X, \quad D_2(1) = D_A,$$

$$D_2(n) \le 10\log_3(n)D_X + D_A.$$
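The reconstruction above uses only four products, one of them over $\mathbb{F}_4$. The following Python sketch checks it numerically. It assumes polynomials over $\mathbb{F}_2$ stored as integers and represents a polynomial over $\mathbb{F}_4$ as a pair `(u0, u1)` standing for $u_0 + \alpha u_1$; this representation and the function names are illustrative assumptions, and the $\mathbb{F}_4$ product is computed naively with four binary products.

```python
def clmul(a, b):
    """Schoolbook carry-less product of two F_2[X] polynomials stored as ints."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r


def f4_mul(u, v):
    """Product in F_4[X] with F_4 = F_2[alpha]/(alpha^2 + alpha + 1).

    A pair (u0, u1) stands for u0 + alpha * u1; alpha^2 is replaced by alpha + 1.
    """
    u0, u1 = u
    v0, v1 = v
    hh = clmul(u1, v1)
    return clmul(u0, v0) ^ hh, clmul(u0, v1) ^ clmul(u1, v0) ^ hh


def new_three_way(a, b, n):
    """Multiply two 3n-term binary polynomials from P0, P1, P4 and one F_4 product P2."""
    mask = (1 << n) - 1
    a0, a1, a2 = a & mask, (a >> n) & mask, a >> (2 * n)
    b0, b1, b2 = b & mask, (b >> n) & mask, b >> (2 * n)
    p0 = clmul(a0, b0)
    p1 = clmul(a0 ^ a1 ^ a2, b0 ^ b1 ^ b2)
    p4 = clmul(a2, b2)
    p20, p21 = f4_mul((a0 ^ a2, a1 ^ a2), (b0 ^ b2, b1 ^ b2))  # evaluation at alpha
    return ((p4 << (4 * n)) ^ ((p0 ^ p1 ^ p21) << (3 * n))
            ^ ((p20 ^ p1 ^ p21) << (2 * n)) ^ ((p4 ^ p1 ^ p20) << n) ^ p0)


a, b = 0b101101110011, 0b110100101101  # 12-term operands, n = 4
print(new_three_way(a, b, 4) == clmul(a, b))
```

Because the product has coefficients in $\mathbb{F}_2$, the two components $P_{2,0}$ and $P_{2,1}$ of the single evaluation at $\alpha$ carry the information that the earlier formula obtained from separate evaluations at $\alpha$ and $\alpha + 1$.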


Comparison of complexities

Table: Cost of multiplication

| Algorithm | Split | $M(n)$ | Delay |
|---|---|---|---|
| Bernstein | 2 | $6.5n^{1.58} + O(n)$ | $3\log_2(n)D_X$ |
| Bernstein | 3 | $25.5n^{1.46} + O(n)$ | $(1.5n + O(\log_3(n)))D_X$ |
| CNH | 3 | $5.8n^{1.63} + O(n)$ | $4\log_3(n)D_X$ |
| CNH | 3 | $30.75n^{1.46} + O(n)$ | $10\log_3(n)D_X$ |
| CH | 3 | $15.125n^{1.46} + O(n)$ | $10\log_3(n)D_X$ |
| Bernstein | 4 | $6.425n^{1.58} + O(n)$ | $5\log_4(n)D_X$ |
| CH | 5 | $6.5n^{\log_5 13} + O(n)$ | $11\log_5(n)D_X$ |


New results

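| $n$ | Previous | New |
|---|---|---|
| 9 | 132 | 126 |
| 15 | 329 | 317 |
| 17 | 414 | 407 |
| 18 | 456 | 438 |
| 19 | 502 | 498 |
| 21 | 602 | 596 |
| 22 | 641 | 632 |
| 23 | 678 | 676 |
| 24 | 704 | 702 |
| 25 | 800 | 791 |
| 26 | 856 | 853 |
| 27 | 922 | 912 |
| 163 | 16923 | 16828 |
| 233 | 29354 | 29156 |
| 251 | 33096 | 32604 |
| 256 | 34079 | 33397 |
| 283 | 38735 | 38432 |
| 407 | 67374 | 66931 |
| 408 | 67582 | 67137 |
| 409 | 67753 | 67284 |
| 571 | 112569 | 111621 |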


Thank you for your attention.
