Differential Fault Analysis on AES Key Schedule and Some Countermeasures Chien-Ning Chen and Sung-Ming Yen Laboratory of Cryptography and Information Security (LCIS) Dept of Computer Science and Information Engineering National Central University, Chung-Li, Taiwan 320, R.O.C. {ning;yensm}@csie.ncu.edu.tw http://www.csie.ncu.edu.tw/~yensm/
Abstract. This paper describes a DFA attack on the AES key schedule. The fault model assumes that the attacker can induce a single-byte fault on the round key. The attack efficiently finds the key of AES-128 with feasible computation and fewer than thirty pairs of correct and faulty ciphertexts. Several countermeasures are also proposed. The weakness can be resolved without modifying the structure of the AES algorithm and without decreasing its efficiency. Keywords: AES, Differential fault analysis (DFA), Physical cryptanalysis, Rijndael, Smart cards.
1
Introduction
Since physical cryptanalysis [1,2,3,4] was first considered a few years ago, secure implementations of cryptographic systems have received much attention. Conventional cryptanalysis deals only with the mathematical properties of a system, whereas physical cryptanalysis focuses on the physical behavior of an implementation while it executes. Differential fault analysis (DFA) is one category of physical cryptanalysis and was originally proposed by Biham and Shamir in 1997 [5]. It assumes that an attacker can induce faults into a system and collect both the correct and the faulty behaviors. The attacker compares the behaviors in order to retrieve the secret information embedded inside the system (more precisely, inside an implementation). DFA and other kinds of hardware fault attacks were once considered to be more or less theoretical work. However, more and more researchers in this field warn of the danger of hardware fault attacks. Notably, very little attention has been paid to DFA on AES. We obtained the first result in early 2002 [6], motivated by our earlier unpublished work in 2000 on DFA on IDEA [7]. The attack in [6] assumes that a single-bit fault can be induced on the temporary result within the cipher.
Supported in part by the National Science Council of the Republic of China under contract NSC 91-2213-E-008-032.
R. Safavi-Naini and J. Seberry (Eds.): ACISP 2003, LNCS 2727, pp. 118–129, 2003. © Springer-Verlag Berlin Heidelberg 2003
By comparing the correct and the faulty decryption outputs, the attacker can retrieve possible values of one byte of the last round key. The unique value of the byte can be retrieved by intersecting several sets of those possible values. However, the DFA considered in [6] is limited in practice because the operands of most computers are bytes or words, not bits, and it is not easy to induce a fault within only one bit. The result in this paper is more significant than our result in [6] since the new attack needs fewer faulty ciphertexts and makes a more reasonable assumption.
Another DFA attack on the AES algorithm was proposed in 2003 [8]. Its fault model is to induce a fault within one byte of the temporary result before the ninth round's MixColumns. The MixColumns will propagate the fault to four bytes, which will cause four-byte differences in the input of the last round's SubBytes. Possible values of the differences in the input of SubBytes can be derived from the relationships between them. After retrieving the differences in the input and output of SubBytes, the remaining process is similar to [6].
The AES key schedule has a weakness similar to that of the temporary result, because both employ the same non-linear function, SubBytes. Giraud proposed another DFA on AES based on this observation [9]. It was announced that the key of AES-128 can be recovered with about 250 faulty ciphertexts in five days by using a modern personal computer. Thus not only the temporary result but also the key schedule is vulnerable to DFA attacks.
The main contribution of this paper is that we extend Giraud's DFA and develop a novel methodology to recover the encryption key with fewer faulty ciphertexts and with far lower computational complexity. The first two steps of our method are similar to Giraud's, but in the second step of our method only three bytes of the round key are retrieved in order to reduce the number of samples required. The third step of Giraud's method requires a huge amount of computation. In contrast, the third step of ours focuses on the inverse SubBytes and requires only very few samples and computations. The number of samples required is analyzed rigorously in this paper. Finally, three countermeasures are proposed. All of them are compatible with the AES-128 standard.
The remainder of this paper is organized as follows. In Sect. 2, the AES algorithm is briefly reviewed and the necessary notations are defined. The proposed DFA on the AES key schedule is described in Sect. 3 with a rigorous analysis of its correctness and attack performance. Finally, three possible countermeasures are provided in Sect. 4.
2
The AES Specification
The AES algorithm [10,11] was published by NIST in Nov 2001. Figure 1 is the block diagram of the last two rounds. The key schedule is split into the linear part, Li , and the non-linear part, Ni .
[Figure 1: block diagram of the key schedule (K8, L9, N9, K9, L10, N10, K10) and of Round 9 (M8, SB, SR, MC, M9) and Round 10 (SB, SR, C).]
Fig. 1. The last two rounds of AES-128.
2.1
The Key Schedule of the AES Algorithm
The KeyExpansion function in Fig. 2 generates the round keys for AES-128. There are two sub functions and one table in the KeyExpansion. They are SubWord(), RotWord(), and Rcon[].
key[]: The input 16-byte key.
w[]: The resulting round keys, stored in an array of four-byte words.

KeyExpansion(byte key[16], word w[44]){
    word temp;
    for (i = 0; i < 4; i++)
        w[i] = (key[4*i], key[4*i+1], key[4*i+2], key[4*i+3]);
    for (i = 4; i < 44; i++){
        temp = w[i-1];
        if (i mod 4 == 0)
            temp = SubWord(RotWord(temp)) ⊕ Rcon[i/4];
        w[i] = w[i-4] ⊕ temp;
    }
}
Fig. 2. Pseudocode for the key expansion of AES-128.
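The pseudocode of Fig. 2 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the helper names `gmul`, `make_sbox`, and `key_expansion` are assumptions introduced here, and the S-box is generated from its algebraic definition (field inverse followed by the affine map) rather than stored as a table. The test key is the AES-128 example key of FIPS 197 [11].

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def make_sbox():
    """Build the AES S-box: field inverse (0 maps to 0), then the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for r in range(1, 5):
            s ^= ((inv << r) | (inv >> (8 - r))) & 0xFF  # rotate left by r
        sbox.append(s ^ 0x63)
    return sbox


def key_expansion(key):
    """Expand a 16-byte key into the 44 words w[0..43], each a list of 4 bytes."""
    sbox = make_sbox()
    w = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 1  # first byte of Rcon[1]
    for i in range(4, 44):
        temp = list(w[i - 1])
        if i % 4 == 0:
            temp = temp[1:] + temp[:1]          # RotWord
            temp = [sbox[b] for b in temp]      # SubWord
            temp[0] ^= rcon                     # Rcon[i/4] = (2^(i/4-1), 0, 0, 0)
            rcon = gmul(rcon, 2)
        w.append([a ^ b for a, b in zip(w[i - 4], temp)])
    return w


# FIPS 197 example key; the last word of the expanded key is b6630ca6.
w = key_expansion(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
last_word = bytes(w[43]).hex()
```

Generating the S-box by brute-force inversion is slow compared with a lookup table but keeps the sketch self-contained.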
SubWord() is a function that takes a four-byte word as input and applies SubBytes() to each of its four bytes to produce an output word. RotWord() takes a word (a0, a1, a2, a3) as its input, performs a cyclic permutation, and returns the word (a1, a2, a3, a0). Rcon[] is a table of four-byte words, and Rcon[i] contains the values ($2^{i-1}$, 0, 0, 0), where $2^{i-1}$ is the power of 2 (= 1 · x) in the defined field.
2.2
Notations
Necessary notations are defined in the following. $M_i$ is the temporary result of the message after the ith round, and $M'_i$ is the temporary result with fault. $C$ is the ciphertext, and the faulty ciphertext is $D$. (Note: The subscript "Byte" or "Word" means that the size of the elements in the bracket is 8-bit or 32-bit.)
$M_i = (m_i[0], \cdots, m_i[15])_{Byte}$, and $C = M_{N_r} = (c[0], \cdots, c[15])_{Byte}$.
$M'_i = (m'_i[0], \cdots, m'_i[15])_{Byte}$, and $D = M'_{N_r} = (d[0], \cdots, d[15])_{Byte}$.
$K_i$ is the ith round key and $K'_i$ is the ith round key with fault.
$K_i = (k_i[0], \cdots, k_i[15])_{Byte} = (w[4i], \cdots, w[4i+3])_{Word}$, and $K'_i = (k'_i[0], \cdots, k'_i[15])_{Byte}$.
In addition, $L_i$ is the linear part of the key schedule, and $N_i$ is the non-linear part, so that $K_i = L_i \oplus N_i$. In AES-128, $L_i$ and $N_i$ are defined as
\[ L_i = (l_i[0], \cdots, l_i[15])_{Byte} = (w[4i-4],\; w[4i-4] \oplus w[4i-3],\; w[4i-4] \oplus w[4i-3] \oplus w[4i-2],\; w[4i-4] \oplus w[4i-3] \oplus w[4i-2] \oplus w[4i-1])_{Word} \]
and
\[ N_i = (n_i[0], \cdots, n_i[15])_{Byte} = (t_i,\; t_i,\; t_i,\; t_i)_{Word}, \quad \text{where each of the four words is } t_i = \mathrm{SubWord}(\mathrm{RotWord}(w[4i-1])) \oplus \mathrm{Rcon}[i]. \]
$L'_i = (l'_i[0], \cdots, l'_i[15])_{Byte}$ is the faulty $L_i$, and $N'_i = (n'_i[0], \cdots, n'_i[15])_{Byte}$ is the faulty $N_i$.
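As a worked instance of the decomposition $K_i = L_i \oplus N_i$, the four words of $K_{10}$ can be expanded from the recurrence of Fig. 2. The symbol $t_{10}$ is shorthand introduced here for the word $\mathrm{SubWord}(\mathrm{RotWord}(w[39])) \oplus \mathrm{Rcon}[10]$ and is not notation from the paper.

```latex
\begin{align*}
w[40] &= w[36] \oplus t_{10},\\
w[41] &= w[37] \oplus w[40] = w[36] \oplus w[37] \oplus t_{10},\\
w[42] &= w[38] \oplus w[41] = w[36] \oplus w[37] \oplus w[38] \oplus t_{10},\\
w[43] &= w[39] \oplus w[42] = w[36] \oplus w[37] \oplus w[38] \oplus w[39] \oplus t_{10}.
\end{align*}
```

In each line the XOR of words of $K_9$ is the corresponding word of $L_{10}$, and the trailing $t_{10}$ is the corresponding (identical) word of $N_{10}$.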
3
The DFA Attack on the AES-128 Key Schedule
This section describes DFA on AES-128 with round keys generated on the fly. It assumes that the attacker can induce a single-byte fault on the round key and collect the correct ciphertext C as well as the faulty ciphertext D. The idea of this attack also applies to AES-192 and AES-256, but there the attacker can retrieve only a part of the round keys.
3.1
Faults on the Last Four Bytes of K9
In order to retrieve the last four bytes of $K_9$, a fault is induced on only one of the last four bytes of $K_9$. When a single one-byte fault occurs, there are five non-zero bytes in $C \oplus D$. Four of them are equal and lie in the same row. The remaining one is located at the byte position where the fault was induced. If the fault affects more bytes, there will be more non-zero rows in $C \oplus D$. Inducing faults on more bytes does not mean that the attack fails. It may even reduce the number of samples required, but it must be analyzed case by case. When the difference has the specified form, the probability that the fault occurred on more than one byte is extremely low.
[Figure 3: block diagram of the key schedule (K9, L10, N10, K10) and Round 10 (M9, SB, SR, C), with the fault marked on K9.]
Fig. 3. Fault induced on the twelfth byte of K9.
The fault $\delta_9[i]$ ($i = 12 \sim 15$) induced on the ith byte of $K_9$ can be derived from
\[ c[i] \oplus d[i] = k_{10}[i] \oplus k'_{10}[i] = l_{10}[i] \oplus l'_{10}[i] = k_9[i] \oplus k'_9[i] = \delta_9[i]. \]
The possible values of $k_9[i]$ can be deduced from
\[ c[i-1] \oplus d[i-1] = k_{10}[i-1] \oplus k'_{10}[i-1] = n_{10}[i-1] \oplus n'_{10}[i-1] = \mathrm{SubBytes}(k_9[i]) \oplus \mathrm{SubBytes}(k_9[i] \oplus \delta_9[i]). \quad (1) \]
The unique value of $k_9[i]$ can be retrieved by intersecting several sets of the possible values caused by different induced faults. $k_9[12 \sim 15]$ can be retrieved in this step.
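The intersection step based on (1) can be sketched as follows. This is an illustrative simulation at the level of a single byte, not the full cipher: the secret byte and the fault values are hypothetical, and the helper names (`gmul`, `make_sbox`, `candidates`, `recover_byte`) are assumptions introduced here. Each observation is a pair (induced fault, output difference), as the attacker would read them from $c[i] \oplus d[i]$ and $c[i-1] \oplus d[i-1]$.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def make_sbox():
    """Build the AES S-box: field inverse (0 maps to 0), then the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for r in range(1, 5):
            s ^= ((inv << r) | (inv >> (8 - r))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


def candidates(sbox, delta, out_diff):
    """All x with SubBytes(x) ^ SubBytes(x ^ delta) == out_diff, as in Eq. (1)."""
    return {x for x in range(256) if sbox[x] ^ sbox[x ^ delta] == out_diff}


def recover_byte(sbox, observations):
    """Intersect the candidate sets of several (delta, out_diff) observations."""
    result = set(range(256))
    for delta, out_diff in observations:
        result &= candidates(sbox, delta, out_diff)
    return result


sbox = make_sbox()
secret = 0x3C                        # hypothetical value of k9[i]
faults = [0x01, 0x47, 0xA5, 0xF0]    # hypothetical distinct non-zero faults
observations = [(d, sbox[secret] ^ sbox[secret ^ d]) for d in faults]
recovered = recover_byte(sbox, observations)   # set containing only the secret
```

Four distinct non-zero faults are used because, by the analysis in Sect. 3.5, four sets always suffice to isolate the byte, while two are usually enough.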
3.2
Faults on the Last Three Bytes of K8
In the second step, a fault is induced on a single byte of w[35]. When the single-byte fault occurs, there are nine or ten non-zero bytes in $C \oplus D$. Eight of them lie in two rows.
[Figure 4: block diagram of the key schedule (K8, L9, N9, K9, L10, N10, K10) and of Round 9 (M8, SB, SR, MC, M9) and Round 10 (SB, SR, C), with the fault marked on K8.]
Fig. 4. Fault induced on the thirteenth byte of K8.
If the fault occurs on one of the last three bytes of w[35], the induced fault $\delta_8[i]$ ($i = 13 \sim 15$) can be derived from $c[i] \oplus d[i]$. In the following equations, $f_{iSR}(i)$ denotes the location of $k_x[i]$ before ShiftRows(), i.e., the $f_{iSR}(i)$-th byte will be rotated to the ith byte by ShiftRows(). Since the fault does not reach $m_9[f_{iSR}(i)]$, the SubBytes terms cancel and
\[ c[i] \oplus d[i] = (\mathrm{SubBytes}(m_9[f_{iSR}(i)]) \oplus k_{10}[i]) \oplus (\mathrm{SubBytes}(m'_9[f_{iSR}(i)]) \oplus k'_{10}[i]) = k_{10}[i] \oplus k'_{10}[i] = l_{10}[i] \oplus l'_{10}[i] = k_9[i] \oplus k'_9[i] = l_9[i] \oplus l'_9[i] = k_8[i] \oplus k'_8[i] = \delta_8[i]. \quad (2) \]
$k'_9[i-1]$ can be derived from $c[i-2]$, $d[i-2]$, and $k_9[12 \sim 15]$. For each triple $(c[i-2], d[i-2], c[i] \oplus d[i])$, some possible values of $k_8[i]$ can be deduced from the following equation. The unique value of $k_8[i]$ ($i = 13 \sim 15$) can be retrieved by comparing several sets of those possible values.
\[ c[i-2] \oplus d[i-2] = (\mathrm{SubBytes}(m_9[f_{iSR}(i-2)]) \oplus k_{10}[i-2]) \oplus (\mathrm{SubBytes}(m'_9[f_{iSR}(i-2)]) \oplus k'_{10}[i-2]) = n_{10}[i-2] \oplus n'_{10}[i-2] = \mathrm{SubBytes}(k_9[i-1]) \oplus \mathrm{SubBytes}(k'_9[i-1]), \]
so
\[ k'_9[i-1] = \mathrm{SubBytes}^{-1}(c[i-2] \oplus d[i-2] \oplus \mathrm{SubBytes}(k_9[i-1])), \]
\[ k_9[i-1] \oplus k'_9[i-1] = n_9[i-1] \oplus n'_9[i-1] = \mathrm{SubBytes}(k_8[i]) \oplus \mathrm{SubBytes}(k_8[i] \oplus \delta_8[i]). \quad (3) \]
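The two-stage deduction of (3) can be sketched at the byte level as follows. This is a simplified model under stated assumptions: `observe` plays the role of the faulty device and returns only the difference $c[i-2] \oplus d[i-2]$, the byte $k_9[i-1]$ is assumed already known from Sect. 3.1, and all concrete values and helper names are hypothetical.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def make_sbox():
    """Build the AES S-box: field inverse (0 maps to 0), then the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for r in range(1, 5):
            s ^= ((inv << r) | (inv >> (8 - r))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


sbox = make_sbox()
inv_sbox = [0] * 256
for x, s in enumerate(sbox):
    inv_sbox[s] = x


def observe(k8_i, k9_prev, delta):
    """Device model: the difference c[i-2] ^ d[i-2] caused by fault delta on k8[i]."""
    eps = sbox[k8_i] ^ sbox[k8_i ^ delta]          # n9[i-1] ^ n9'[i-1]
    k9_prev_faulty = k9_prev ^ eps                 # k9'[i-1]
    return sbox[k9_prev] ^ sbox[k9_prev_faulty]    # n10[i-2] ^ n10'[i-2]


def candidates_k8(k9_prev, delta, out_diff):
    """Attacker side of Eq. (3): recover k9'[i-1], then the candidates for k8[i]."""
    k9_prev_faulty = inv_sbox[out_diff ^ sbox[k9_prev]]
    eps = k9_prev ^ k9_prev_faulty
    return {x for x in range(256) if sbox[x] ^ sbox[x ^ delta] == eps}


secret_k8 = 0xD2            # hypothetical k8[i], unknown to the attacker
known_k9 = 0x6F             # hypothetical k9[i-1], known from the first step
recovered = set(range(256))
for delta in (0x03, 0x58, 0x9A, 0xE1):   # hypothetical distinct non-zero faults
    recovered &= candidates_k8(known_k9, delta, observe(secret_k8, known_k9, delta))
```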
However, the behavior is different when the fault occurs on $k_8[12]$. Because the rotation amount of ShiftRows in the first row is 0, $\delta_8[12]$ cannot be derived directly from (2). Without the value of $\delta_8[12]$, $k_8[12]$ cannot be retrieved.
3.3
Faults on the Eighth to Eleventh Bytes of K8
In the third step, a fault is induced on a single byte of w[34], i.e., on one of $k_8[8 \sim 11]$. When the fault occurs on only one byte, there are six or seven non-zero bytes in $C \oplus D$. Four of them appear in the same row with equal value.
[Figure 5: block diagram of the key schedule (K8, L9, N9, K9, L10, N10, K10) and of Round 9 (M8, SB, SR, MC, M9) and Round 10 (SB, SR, C), with the fault marked on K8.]
Fig. 5. Fault induced on the eighth byte of K8.
If the fault occurs on the ith byte of $K_8$ ($i = 8 \sim 11$), it raises
\[ c[i-5] \oplus d[i-5] = \mathrm{SubBytes}(k_9[i+4]) \oplus \mathrm{SubBytes}(k'_9[i+4]) = \mathrm{SubBytes}(k_9[i+4]) \oplus \mathrm{SubBytes}(k_9[i+4] \oplus \delta_8[i]) \]
and
\[ \delta_8[i] = \mathrm{SubBytes}^{-1}(c[i-5] \oplus d[i-5] \oplus \mathrm{SubBytes}(k_9[i+4])) \oplus k_9[i+4]. \]
Furthermore,
\[ c[f_{SR}(i)] = \mathrm{SubBytes}(m_9[i]) \oplus k_{10}[f_{SR}(i)] \]
and
\[ d[f_{SR}(i)] = \mathrm{SubBytes}(m'_9[i]) \oplus k'_{10}[f_{SR}(i)] = \mathrm{SubBytes}(m_9[i] \oplus \delta_8[i]) \oplus k_{10}[f_{SR}(i)], \]
so that
\[ \delta_8[i] = \mathrm{SubBytes}^{-1}(c[f_{SR}(i)] \oplus k_{10}[f_{SR}(i)]) \oplus \mathrm{SubBytes}^{-1}(d[f_{SR}(i)] \oplus k_{10}[f_{SR}(i)]). \quad (4) \]
Similarly,
\[ \delta_8[i] = \mathrm{SubBytes}^{-1}(c[f_{SR}(i+4)] \oplus k_{10}[f_{SR}(i+4)]) \oplus \mathrm{SubBytes}^{-1}(d[f_{SR}(i+4)] \oplus k_{10}[f_{SR}(i+4)]). \quad (5) \]
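The use of (4) to filter candidates for a byte of $K_{10}$ can be sketched as follows. This is a byte-level model under the assumption made in (4), namely that the byte of $K_{10}$ at the observed position is itself unaffected by the fault. The sample values and helper names are hypothetical. Each sample uses a different intermediate byte $m_9[i]$ (a different plaintext) and a fault $\delta_8[i]$ that the attacker has already derived.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def make_sbox():
    """Build the AES S-box: field inverse (0 maps to 0), then the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for r in range(1, 5):
            s ^= ((inv << r) | (inv >> (8 - r))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


sbox = make_sbox()
inv_sbox = [0] * 256
for x, s in enumerate(sbox):
    inv_sbox[s] = x


def candidates_k10(c, d, delta):
    """All k with SubBytes^-1(c ^ k) ^ SubBytes^-1(d ^ k) == delta, as in Eq. (4)."""
    return {k for k in range(256) if inv_sbox[c ^ k] ^ inv_sbox[d ^ k] == delta}


secret_k10 = 0x9E                                        # hypothetical k10 byte
samples = [(0x12, 0x01), (0x7B, 0x5C), (0xC4, 0xE3)]     # hypothetical (m9[i], delta8[i])
recovered = set(range(256))
for m9, delta in samples:
    c = sbox[m9] ^ secret_k10              # correct ciphertext byte
    d = sbox[m9 ^ delta] ^ secret_k10      # faulty ciphertext byte
    recovered &= candidates_k10(c, d, delta)
```

The true key byte always survives the intersection, and each individual candidate set has at most four elements, so a few samples normally leave a single value.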
$\delta_8[i]$ can be derived easily from the equations above. The bytes $k_{10}[8]$, $k_{10}[12]$, $k_{10}[5]$, $k_{10}[9]$, $k_{10}[2]$, $k_{10}[6]$, $k_{10}[15]$, and $k_{10}[3]$ can then be retrieved from (4) and (5). These deductions use the same technique that is used to deduce the last four bytes of $K_9$ in Sect. 3.1.
3.4
The Whole of K10
The last four bytes of $K_9$ and the last three bytes of $K_8$ are retrieved in Sects. 3.1 and 3.2. The ninth to eleventh bytes of $K_9$ can be derived from $k_9[i] = k_8[i+4] \oplus k_9[i+4]$ ($i = 9 \sim 11$). In Sect. 3.3, we retrieve eight bytes of $K_{10}$, $k_{10}[2, 3, 5, 6, 8, 9, 12, 15]$. Another five bytes of $K_{10}$, $k_{10}[7, 10, 11, 13, 14]$, can be deduced from the relationship between the ninth and tenth round keys, $k_{10}[j+4] = k_{10}[j] \oplus k_9[j+4]$ for $j \geq 4$; for example, $k_{10}[13] = k_{10}[9] \oplus k_9[13]$. The remaining three bytes of $K_{10}$ can be efficiently retrieved by exhaustive search.
3.5
Performance Evaluation
This proposed DFA attack is based on retrieving possible values of key bytes from the differences at the input and output of SubBytes(). The set of possible values is defined as follows.
Definition 1. SubBytes() is the S-box of AES, and $\delta_x$ and $\delta_y$ are the differences at the input and output of SubBytes(). Let $F(\delta_x, \delta_y)$ be the set associated with $\delta_x$ and $\delta_y$ by $F(\delta_x, \delta_y) = \{x \mid \mathrm{SubBytes}(x) \oplus \mathrm{SubBytes}(x \oplus \delta_x) = \delta_y\}$.
The number of samples required depends on the size of this set and on the size of the intersection of several such sets. The following propositions are needed to evaluate the requirement.
Proposition 1. If $\delta_x, \delta_y \neq 0$, then the size of $F(\delta_x, \delta_y)$ is zero, two, or four.
Proposition 2. $x \in F(\delta_x, \delta_y)$ if and only if $(x \oplus \delta_x) \in F(\delta_x, \delta_y)$.
Proposition 3. When $\delta_x, \delta_y \neq 0$, $|F(\delta_x, \delta_y)| = 4$ iff $\{0, \delta_x\} \subset F(\delta_x, \delta_y)$. Moreover, $F(\delta_x, \delta_y) = \{0, \delta_x, \delta_x \cdot \mathrm{0xBC}, \delta_x \cdot \mathrm{0xBD}\}$ and $\delta_y = \mathrm{SubBytes}(0) \oplus \mathrm{SubBytes}(\delta_x)$.
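Proposition 4. If $|F(\delta_x, \delta_y) \cap F(\delta'_x, \delta'_y)| > 1$ ($\delta_x, \delta'_x \neq 0$ and $\delta_x \neq \delta'_x$), then $F(\delta_x, \delta_y) = F(\delta'_x, \delta'_y) = \{0, \delta_x, \delta'_x, \delta_x \oplus \delta'_x\}$.
Proposition 5. If $F(\delta_x, \delta_y) = \{0, \delta_x, x_2, x_3\}$, then $F(\delta_x, \delta_y) = F(x_2, \delta'_y) = F(x_3, \delta''_y)$ where $\delta'_y = \mathrm{SubBytes}(0) \oplus \mathrm{SubBytes}(x_2)$ and $\delta''_y = \mathrm{SubBytes}(0) \oplus \mathrm{SubBytes}(x_3)$. In addition, $x_2 \oplus x_3 = \delta_x$.
Proposition 6. If $F(x_1, \delta_y) = \{0, x_1, x_2, x_3\}$ and $F(x, \delta'_y) = F(x_1, \delta_y)$, then $x \in \{x_1, x_2, x_3\}$.
Proposition 7. There are 85 distinct sets whose size is 4.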
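Propositions 1, 3, and 7 can be checked by exhaustive enumeration. The sketch below is an illustration rather than part of the proofs: the function name `analyse_sets` and the helpers are assumptions introduced here. For every non-zero $\delta_x$ it groups all $x$ by the output difference they produce, which yields every non-empty set $F(\delta_x, \delta_y)$.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def make_sbox():
    """Build the AES S-box: field inverse (0 maps to 0), then the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for r in range(1, 5):
            s ^= ((inv << r) | (inv >> (8 - r))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


def analyse_sets(sbox):
    """Return (sizes of non-empty F sets, number of distinct 4-sets, Prop. 3 holds)."""
    sizes = set()
    four_sets = set()
    prop3_holds = True
    for dx in range(1, 256):
        by_dy = {}                       # maps dy to the set F(dx, dy)
        for x in range(256):
            by_dy.setdefault(sbox[x] ^ sbox[x ^ dx], set()).add(x)
        for dy, xs in by_dy.items():
            sizes.add(len(xs))
            if len(xs) == 4:
                four_sets.add(frozenset(xs))
                expected = {0, dx, gmul(dx, 0xBC), gmul(dx, 0xBD)}
                prop3_holds = prop3_holds and xs == expected \
                    and dy == (sbox[0] ^ sbox[dx])
    return sizes, len(four_sets), prop3_holds


sizes, distinct_four_sets, prop3_holds = analyse_sets(make_sbox())
```

Empty sets never appear in `by_dy`, so the observed sizes are exactly the non-zero sizes allowed by Proposition 1.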
Definition 2. Similar to the definition of $F(\delta_x, \delta_y)$, $G(\delta_x, \delta_y)$ is defined by $G(\delta_x, \delta_y) = \{y \mid \mathrm{SubBytes}^{-1}(y) \oplus \mathrm{SubBytes}^{-1}(y \oplus \delta_y) = \delta_x\}$.
Proposition 8. The properties of $G(\delta_x, \delta_y)$ are similar to those of $F(\delta_x, \delta_y)$, and $G(\delta_x, \delta_y) = \{y \mid \mathrm{SubBytes}^{-1}(y) \in F(\delta_x, \delta_y)\}$.
Because of the above propositions, the probability that the intersection of two sets has size one is approximately 100%. The probability that the size is larger than one is 1/127 if the corresponding unknown byte of the round key is 0, or 3/(255 · 127) if the byte is not 0. In general, this means that a unique solution can be retrieved by intersecting two sets. In the worst case, intersecting four sets determines the unique byte.
Seven bytes of the round keys can be retrieved by (1) and (3) in Sects. 3.1 and 3.2. Each byte requires two, three, or four faulty samples. In Sect. 3.3, (4) and (5) can share the faulty ciphertexts, so the attacker can retrieve four bytes by (4) and four bytes by (5) in parallel. Therefore, one correct ciphertext and fewer than forty-four faulty ones are sufficient if the fault can be induced accurately. In most cases, twenty-two faulty ciphertexts are enough.
The table $\{\delta_x, \delta_y, F(\delta_x, \delta_y)\}$ ($\delta_x, \delta_y \in [1, 255]$) can be constructed by checking whether $\delta_y = \mathrm{SubBytes}(x) \oplus \mathrm{SubBytes}(x \oplus \delta_x)$ for all values of $\delta_x$, $\delta_y$, and $x$. Because the equation contains only simple exclusive-OR operations and SubBytes() table lookups, this table can be constructed very efficiently. In our simulation, the construction completed within one second on a Pentium 4 computer. The thirteen bytes of the last round key can be obtained easily because the set associated with $\delta_x$ and $\delta_y$ can be found simply by a table lookup. The remaining three bytes require a search over $2^{24}$ possibilities. The time required by this exhaustive search is similar to that of decrypting $2^{24}$ blocks, which can be completed within one minute on a Pentium 4 computer. Thus, the computational cost of this attack is extremely small.
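The two probabilities quoted above can be reproduced by exact counting. The sketch below is an illustration under stated assumptions: it considers ordered pairs of distinct non-zero faults applied to the same unknown byte, and the function name `collision_pairs` and the sample byte 0x3C are hypothetical choices made here.

```python
from fractions import Fraction


def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def make_sbox():
    """Build the AES S-box: field inverse (0 maps to 0), then the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for r in range(1, 5):
            s ^= ((inv << r) | (inv >> (8 - r))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


def collision_pairs(sbox, k):
    """Count ordered pairs of distinct non-zero faults whose candidate sets
    for the unknown byte k share more than one value."""
    sets = {}
    for d in range(1, 256):
        out_diff = sbox[k] ^ sbox[k ^ d]
        sets[d] = frozenset(x for x in range(256)
                            if sbox[x] ^ sbox[x ^ d] == out_diff)
    return sum(1 for d1 in sets for d2 in sets
               if d1 != d2 and len(sets[d1] & sets[d2]) > 1)


sbox = make_sbox()
total_pairs = 255 * 254
prob_zero = Fraction(collision_pairs(sbox, 0x00), total_pairs)      # byte is 0
prob_nonzero = Fraction(collision_pairs(sbox, 0x3C), total_pairs)   # byte is not 0
```

Under these assumptions the counts are 510 and 6 ambiguous pairs out of 255 · 254, which reduce to 1/127 and 3/(255 · 127).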
4
Possible Countermeasures
4.1
The First Approach – Storing the Round Key
The first proposed countermeasure is to avoid generating the round keys on the tamper-proof device. Storing the round keys requires more flash memory but eliminates the code that generates them. Removing the key schedule decreases the code size in ROM and improves performance. However, some systems need to update the key frequently. For those systems, two further countermeasures are suggested and described in the following.
4.2
The Second Approach – Generating the Round Key Once
The proposed attack has three stages, and each stage requires several different induced faults. If the round keys are generated only once when the key is updated, the attacker has only one chance to induce a fault on the key schedule. It is then impossible to collect sufficient pairs to satisfy the requirement. Furthermore, the deduction in Sect. 3.1 depends only on the induced fault, i.e., it is independent of the value of the corresponding plaintext. With only one chance to induce the fault, the attacker can recover only partial information about the last four bytes of $K_9$. Without the accurate values of the last four bytes of $K_9$, the remaining deductions in Sects. 3.2 and 3.3 are infeasible.
4.3
The Third Approach – Parity Checking for Round Keys
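Under some circumstances, the system designer still wishes to generate the round keys on the fly. For example, running the rounds and the key expansion in parallel is a countermeasure that protects the key expansion against SPA attacks [12]. In this case, the third approach can be employed to detect whether faults are induced. Before describing this countermeasure, the linear part and the non-linear part of the key expansion are reorganized as
\[ L_i = \begin{pmatrix} l_i[0] & l_i[4] & l_i[8] & l_i[12] \\ l_i[1] & l_i[5] & l_i[9] & l_i[13] \\ l_i[2] & l_i[6] & l_i[10] & l_i[14] \\ l_i[3] & l_i[7] & l_i[11] & l_i[15] \end{pmatrix} = K_{i-1} \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix} = K_{i-1} T, \]
\[ N_i = \begin{pmatrix} n_i[0] & n_i[1] & n_i[2] & n_i[3] \end{pmatrix}^t \begin{pmatrix} 1 & 1 & 1 & 1 \end{pmatrix}, \quad \text{and} \quad \bar{N}_i = N_i \begin{pmatrix} 1 & 0 & 0 & 0 \end{pmatrix}^t = \begin{pmatrix} n_i[0] & n_i[1] & n_i[2] & n_i[3] \end{pmatrix}^t. \]
$T$ is the transformation matrix and $T^4 = I_4$. Since all columns of $N_i$ are equal, the extra notation $\bar{N}_i$ is defined to simplify the expressions. The round keys are reorganized as
\[ K_1 = K_0 T \oplus N_1, \]
\[ K_2 = K_1 T \oplus N_2 = K_0 T^2 \oplus N_1 T \oplus N_2, \]
\[ \cdots \]
\[ K_{10} = K_0 T^{10} \oplus \bigoplus_{i=1}^{10} (N_i T^{10-i}) = K_0 T^2 \oplus (\bar{N}_2 \oplus \bar{N}_6 \oplus \bar{N}_{10}) \begin{pmatrix} 1 & 1 & 1 & 1 \end{pmatrix} \oplus (\bar{N}_1 \oplus \bar{N}_5 \oplus \bar{N}_9) \begin{pmatrix} 1 & 0 & 1 & 0 \end{pmatrix} \oplus (\bar{N}_4 \oplus \bar{N}_8) \begin{pmatrix} 1 & 1 & 0 & 0 \end{pmatrix} \oplus (\bar{N}_3 \oplus \bar{N}_7) \begin{pmatrix} 1 & 0 & 0 & 0 \end{pmatrix}. \]
Column by column,
\[ w[40] = w[0] \oplus \bigoplus_{i=1}^{10} \bar{N}_i, \]
\[ w[41] = w[1] \oplus \bar{N}_2 \oplus \bar{N}_4 \oplus \bar{N}_6 \oplus \bar{N}_8 \oplus \bar{N}_{10}, \]
\[ w[42] = w[0] \oplus w[2] \oplus \bar{N}_1 \oplus \bar{N}_2 \oplus \bar{N}_5 \oplus \bar{N}_6 \oplus \bar{N}_9 \oplus \bar{N}_{10}, \text{ and} \]
\[ w[43] = w[1] \oplus w[3] \oplus \bar{N}_2 \oplus \bar{N}_6 \oplus \bar{N}_{10}. \]
So, $w[40] \oplus w[41] \oplus w[42] \oplus w[43] = w[2] \oplus w[3] \oplus \bar{N}_3 \oplus \bar{N}_7$, and the parity check for the ith row of the last round key is
\[ k_{10}[i] \oplus k_{10}[4+i] \oplus k_{10}[8+i] \oplus k_{10}[12+i] = k_0[8+i] \oplus k_0[12+i] \oplus n_3[i] \oplus n_7[i]. \]
The value depends only on the master key and the non-linear parts of the third and seventh round keys. The row parity check can detect a fault induced on the eighth to tenth round keys. In a similar way, the column parity check is constructed. For example, the equation for the last column of $K_{10}$ is
\[ \bigoplus_{i=12}^{15} k_{10}[i] = \bigoplus_{i=4}^{7} k_0[i] \oplus \bigoplus_{i=12}^{15} k_0[i] \oplus \bigoplus_{i=0}^{3} (n_2[i] \oplus n_6[i] \oplus n_{10}[i]). \]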
Both of the row and column parity checks can verify the correctness of the round key. This method only requires few additional resources and is suitable for both software and hardware implementations.
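The row parity check can be sketched as follows. This is an illustrative model, not a hardened implementation: the reference parity is assumed to be computed once from a fault-free key schedule, the fault is simulated on byte $k_9[13]$ before $K_{10}$ is derived, and the helper names and the fault value 0x5A are hypothetical. The non-linear words are obtained through the identity $\bar{N}_i = w[4i] \oplus w[4i-4]$, which follows from Fig. 2. The key is the FIPS 197 example key [11].

```python
import copy


def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def make_sbox():
    """Build the AES S-box: field inverse (0 maps to 0), then the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for r in range(1, 5):
            s ^= ((inv << r) | (inv >> (8 - r))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


def next_round_key(prev, rcon, sbox):
    """One key-schedule round: the four words of K_{i-1} give the four words of K_i."""
    t = [sbox[b] for b in prev[3][1:] + prev[3][:1]]   # SubWord(RotWord(.))
    t[0] ^= rcon
    out = []
    for j in range(4):
        t = [a ^ b for a, b in zip(prev[j], t)]
        out.append(t)
    return out


def expand(key, sbox):
    """Return [K_0, ..., K_10], each round key as a list of four 4-byte words."""
    keys = [[list(key[4 * j:4 * j + 4]) for j in range(4)]]
    rcon = 1
    for _ in range(10):
        keys.append(next_round_key(keys[-1], rcon, sbox))
        rcon = gmul(rcon, 2)
    return keys


def reference_parity(keys):
    """Right-hand side k0[8+r] ^ k0[12+r] ^ n3[r] ^ n7[r] for rows r = 0..3."""
    return [keys[0][2][r] ^ keys[0][3][r]
            ^ keys[3][0][r] ^ keys[2][0][r]     # n3 = w[12] ^ w[8]
            ^ keys[7][0][r] ^ keys[6][0][r]     # n7 = w[28] ^ w[24]
            for r in range(4)]


def row_parity(k10):
    """Left-hand side: XOR of the four bytes in each row of K_10."""
    return [k10[0][r] ^ k10[1][r] ^ k10[2][r] ^ k10[3][r] for r in range(4)]


sbox = make_sbox()
keys = expand(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"), sbox)
reference = reference_parity(keys)

k9_faulty = copy.deepcopy(keys[9])
k9_faulty[3][1] ^= 0x5A                           # hypothetical fault on k9[13]
k10_faulty = next_round_key(k9_faulty, 0x36, sbox)
fault_detected = row_parity(k10_faulty) != reference
```

In this model the fault appears unchanged in the parity of the row it was induced on, while the four equal differences it causes in the neighbouring row cancel out.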
5
Conclusions
This paper describes a DFA attack on the AES-128 key schedule. The method retrieves thirteen bytes of the round key efficiently with an acceptable number of samples. The other three bytes are derived by exhaustive search with feasible computation. In most cases, the attack requires only one correct and twenty-two faulty ciphertexts, and forty-four samples are sufficient in the worst case. This paper also recommends three possible countermeasures against the proposed DFA. The first and the second countermeasures avoid generating the round keys on the fly. The last countermeasure is a parity check method, i.e., a method to verify the correctness of the round key. None of these three countermeasures requires modifying the AES algorithm.
In fact, it is hard to say whether any physical attack can serve as a tool or indicator to tell us which cipher is more secure in a smart card. This depends heavily on the actual implementation of the cipher under consideration. Basically, all assumptions for all kinds of physical attacks apply (although perhaps not always equally) to all ciphers when they are implemented. Each attack may differ greatly from the others depending on the actual implementation and perhaps also, to a small degree, on the basic properties of the cipher itself. For example, when the final AES candidate was selected, resistance against every kind of physical attack on an implementation was also considered. Unfortunately, when evaluating Rijndael, most people believed that Rijndael can easily be implemented to be immune to physical attacks. The authors of Rijndael also claimed this in their book about the design of the Rijndael cipher [13] (at the end of the book). However, this is not really true, as shown in this paper and in another paper of these proceedings [14].
Acknowledgments. The authors wish to thank Dr. Greg Rose for kindly providing many useful discussions and assistance in editing the paper, which improved both the presentation and the technical content.
References
1. D. Boneh, R.A. DeMillo, and R.J. Lipton, "On the importance of checking cryptographic protocols for faults," In Advances in Cryptology – EUROCRYPT '97, LNCS 1233, pp. 37–51, Springer-Verlag, 1997.
2. P. Kocher, "Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems," In Advances in Cryptology – CRYPTO '96, LNCS 1109, pp. 104–113, Springer-Verlag, 1996.
3. P. Kocher, J. Jaffe and B. Jun, "Introduction to differential power analysis and related attacks," 1998, available at URL .
4. P. Kocher, J. Jaffe and B. Jun, "Differential power analysis," In Advances in Cryptology – CRYPTO '99, LNCS 1666, pp. 388–397, Springer-Verlag, 1999.
5. E. Biham and A. Shamir, "Differential fault analysis of secret key cryptosystems," In Advances in Cryptology – CRYPTO '97, LNCS 1294, pp. 513–525, Springer-Verlag, 1997.
6. S.M. Yen and J.Z. Chen, "A DFA on Rijndael," In Information Security Conference 2002, Taiwan, May 2002.
7. X. Lai, On the Design and Security of Block Ciphers, Ph.D. thesis, Swiss Federal Institute of Technology, Zurich, 1992.
8. P. Dusart, G. Letourneux and O. Vivolo, "Differential Fault Analysis on A.E.S.," Cryptology ePrint Archive of IACR, No. 010, 2003, available at URL .
9. C. Giraud, "DFA on AES," Cryptology ePrint Archive of IACR, No. 008, 2003, available at URL .
10. J. Daemen and V. Rijmen, "AES Proposal: Rijndael," AES submission, 1998, available at URL .
11. NIST, "Federal Information Processing Standards Publication 197 – Announcing the ADVANCED ENCRYPTION STANDARD (AES)," 2001, available at URL .
12. S. Mangard, "A simple power-analysis (SPA) attack on implementations of the AES key expansion," In Information Security and Cryptology – ICISC 2002, LNCS 2587, pp. 343–358, Springer-Verlag, 2003.
13. J. Daemen and V. Rijmen, The Design of Rijndael, AES – The Advanced Encryption Standard, Springer-Verlag, Berlin, 2002.
14. S.M. Yen, "Amplified differential power cryptanalysis of some enhanced Rijndael implementations," In the Eighth Australasian Conference on Information Security and Privacy – ACISP 2003, 2003.
15. J.B. Fraleigh, A First Course in Abstract Algebra, 5th Edition, Addison-Wesley Publishing Company, 1994. (Corollary 2 of Section 5.6, p.322)
A
The Proofs of the Propositions in Section 3.5
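Proof (of Proposition 1). If $x \in F(\delta_x, \delta_y)$, then $x$ satisfies the equation
\[ \delta_y = \mathrm{SubBytes}(x) \oplus \mathrm{SubBytes}(x \oplus \delta_x) \quad (6) \]
\[ = M(x^{-1}) \oplus \mathrm{0x63} \oplus M((x \oplus \delta_x)^{-1}) \oplus \mathrm{0x63} = M(x^{-1}) \oplus M((x \oplus \delta_x)^{-1}), \]
where $M$ is the transformation matrix and 0x63 is the constant vector. If $\delta_y = M(0^{-1}) \oplus M(\delta_x^{-1}) = M(\delta_x^{-1})$, then 0 and $\delta_x$ are two solutions of Eq. (6). ($0^{-1} = 0$ is defined in AES.) When $x \neq 0, \delta_x$, we also have $M^{-1}(\delta_y) = x^{-1} \oplus (x \oplus \delta_x)^{-1}$, i.e.,
\[ x^2 \oplus \delta_x x = (M^{-1}(\delta_y))^{-1} \delta_x. \quad (7) \]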
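The step from $M^{-1}(\delta_y) = x^{-1} \oplus (x \oplus \delta_x)^{-1}$ to (7) can be written out as follows. This is an expansion added for the reader: it uses only field arithmetic in GF($2^8$), where addition is $\oplus$, and it assumes $x \neq 0, \delta_x$ so that both inverses are ordinary field inverses.

```latex
\begin{align*}
M^{-1}(\delta_y) &= x^{-1} \oplus (x \oplus \delta_x)^{-1}
  = \frac{(x \oplus \delta_x) \oplus x}{x\,(x \oplus \delta_x)}
  = \frac{\delta_x}{x^2 \oplus \delta_x x},\\
x^2 \oplus \delta_x x &= \bigl(M^{-1}(\delta_y)\bigr)^{-1}\,\delta_x .
\end{align*}
```

The right-hand side is a constant once $\delta_x$ and $\delta_y$ are fixed, so (7) is a quadratic equation in $x$.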
Because of the theorem in [15], (7) has at most two solutions. If $x$ is a solution, $x \oplus \delta_x$ is another one. Thus, (6) has zero, two, or four solutions.
Proof (of Proposition 3). The "only if" part follows from the proof of Proposition 1: if $\{0, \delta_x\} \not\subset F(\delta_x, \delta_y)$, then the size of $F(\delta_x, \delta_y)$ is at most two. "If" part: Because $\{0, \delta_x\} \subset F(\delta_x, \delta_y)$, we have $\delta_y = M(\delta_x^{-1})$ and (7) can be reduced to $x^2 \oplus \delta_x x \oplus \delta_x^2 = 0$. Since $x = \mathrm{0xBC}$ and $x = \mathrm{0xBD}$ are the two solutions of $x^2 + x + 1 = 0$, $x = \delta_x \cdot \mathrm{0xBC}$ and $x = \delta_x \cdot \mathrm{0xBD}$ are two solutions of (7).
Proof (of Proposition 4). We assume $\{x_1, x_1 \oplus \delta_x\} \subset F(\delta_x, \delta_y)$ and $\{x_2, x_2 \oplus \delta'_x\} \subset F(\delta'_x, \delta'_y)$. Because $\delta_x \neq \delta'_x$, we have $\{x_1, x_1 \oplus \delta_x\} \neq \{x_2, x_2 \oplus \delta'_x\}$. Thus, the size of one of the two sets is four. Without loss of generality, we assume $F(\delta_x, \delta_y) = \{0, \delta_x, \delta_x \cdot \mathrm{0xBC}, \delta_x \cdot \mathrm{0xBD}\}$. Moreover, one of 0 and $\delta_x$ belongs to $F(\delta'_x, \delta'_y)$, and one of $\delta_x \cdot \mathrm{0xBC}$ and $\delta_x \cdot \mathrm{0xBD}$ belongs to $F(\delta'_x, \delta'_y)$. Therefore, $F(\delta'_x, \delta'_y) = \{0, \delta_x, \delta_x \cdot \mathrm{0xBC}, \delta_x \cdot \mathrm{0xBD}\}$ follows from one of the two equations $\mathrm{SubBytes}(0) \oplus \mathrm{SubBytes}(\delta_x \cdot \mathrm{0xBC}) = \mathrm{SubBytes}(\delta_x) \oplus \mathrm{SubBytes}(\delta_x \cdot \mathrm{0xBD})$ and $\mathrm{SubBytes}(0) \oplus \mathrm{SubBytes}(\delta_x \cdot \mathrm{0xBD}) = \mathrm{SubBytes}(\delta_x) \oplus \mathrm{SubBytes}(\delta_x \cdot \mathrm{0xBC})$.
Proof (of Proposition 5). This proposition can be proved by Proposition 4. Since $\{0, x_2\} \subset F(x_2, \mathrm{SubBytes}(0) \oplus \mathrm{SubBytes}(x_2))$, we have $F(x_2, \mathrm{SubBytes}(0) \oplus \mathrm{SubBytes}(x_2)) = F(\delta_x, \delta_y) = \{0, \delta_x, x_2, x_3\}$.
Proof (of Proposition 6). Since $0 \in F(x, \delta'_y)$, $F(x, \delta'_y) = \{0, x, x', x''\}$. Thus, $\{x, x', x''\} = \{x_1, x_2, x_3\}$ and $x \in \{x_1, x_2, x_3\}$.
Proof (of Proposition 7). Each set of size four contains 0 and three nonzero elements. If 0 and a nonzero $x$ both belong to two sets $F(\delta_x, \delta_y)$ and $F(\delta'_x, \delta'_y)$, then $F(\delta_x, \delta_y) = F(\delta'_x, \delta'_y)$. Therefore, the number of such sets is 255/3 = 85.