Research Report
On String Replacement Exponentiation
Luke O'Connor IBM Research Zurich Research Laboratory 8803 Ruschlikon Switzerland
LIMITED DISTRIBUTION NOTICE This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).
IBM
Abstract
The string replacement (SR) method was recently proposed as a method for computing the exponentiation $a^e$ in a group $G$. The canonical $k$-SR method operates by replacing a run of $i$ ones in a binary exponent, $0 < i \le k$, with $i - 1$ zeroes followed by the single digit $b = 2^i - 1$. After recoding, it was shown in [D. Gollman et al., Designs, Codes and Cryptography, 7:135-151, 1996] that the expected weight of $e$ tends to $n/4$ for $n$-bit exponents. In this paper we show that the canonical $k$-SR recoding process can be described as a regular language, and then use generating functions to derive the exact probability distribution of recoded exponent weights. We also show that the canonical 2-SR recoding produces weight distributions very similar to (optimal) signed-digit recodings, but no group inversions are required.
1 Introduction
One of the fundamental operations in cryptography is the exponentiation $a^e$ over groups such as $Z_p^*$, $Z_n^*$, general finite fields, and the group of points on an elliptic curve [16, 3, 4]. Many algorithms offer complexity improvements over the standard binary method [9], including the sliding-window method ([12, 1, 7] for example), signed-digit representations [15, 13, 10, 8, 18], the signed-window method [11] and the Lempel-Ziv recoding [19] (see [12] for a survey). The String Replacement (SR) method for exponentiation was recently proposed and analysed by Gollman, Han and Mitchell [5]. Let $1^i$ denote a run of $i$ ones, and let $1^{i \le k}$ denote a run $1^i$ where $1 \le i \le k$. The basic approach of the SR method is to select a parameter $k$ and then replace runs in the exponent $e$ of the form $1^{i \le k}$ with the string $0^{i-1} b$ where $b = 2^i - 1$; the result is known as the canonical $k$-SR recoding of the exponent. The possible values of $b$ for $i \ge 2$ are $3, 7, 15, \ldots, 2^k - 1$. The corresponding powers of $a$ are precomputed, and the value of $a^e$ is then determined by using the precomputed values and a variant of the $b$-ary method.

As is customary, the efficiency of the canonical $k$-SR method is measured in terms of the number of squarings and multiplications required to compute $a^e$. The $k - 1$ values $a^{2^{i+1} - 1}$, $1 \le i \le k - 1$, can be precomputed using the binary method at a cost of $k - 1$ squarings and $k - 1$ multiplications, since $a^{2^{i+1} - 1} = (a^{2^i - 1})^2 \cdot a$. The number of squarings to complete $a^e$ is approximately $n - 1$, depending on the length of the most significant run in $e$. On the other hand, the number of multiplications to complete $a^e$ is $w_{k\text{-SR}}(e) - 1$, where $w_{k\text{-SR}}(e)$ is the recoded weight of $e$, or equivalently, the number of nonzero digits in the recoded exponent. In [5] it was shown that
$$E[w_{k\text{-SR}}(e)] \approx \frac{n 2^{k-2}}{2^k - 1} + \frac{(2^k - k - 1) 2^{k-2}}{(2^k - 1)^2}$$

and thus the expected recoded weight tends to $n/4$ as $k$ increases. In this paper we use techniques presented in [14] to precisely analyse the distribution of exponent weights after (canonical) $k$-SR recoding. For a recoding rule $A$, the basis of the analysis in [14] is to define a bivariate generating function (bgf) $G_A(x, z) = \sum_{n, m \ge 0} a_{n,m} z^m x^n$ such that $\Pr(w_A(e) = m \mid \#e = n) = a_{n,m}/2^n$, where $\#e$ is the bit length of $e$. The bgf $G_A(x, z)$ can be determined directly when the steps performed by $A$ can be carried out by a finite state machine, or equivalently, described by a regular language [6]. We will show that $k$-SR recoding is regular, and prove that

$$G_{k\text{-SR}}(x, z) = \sum_{n, m \ge 0} a_{n,m} x^n z^m = \frac{1 - x - zx^k + zx}{1 - 2x + x^2 - zx^k + 2zx^{k+1} - zx^2}. \qquad (1)$$
Given $G_{k\text{-SR}}(x, z)$ it is straightforward to examine the exact weight distribution, compute moments of the distribution and hence determine the variance of $w_{k\text{-SR}}(e)$. The authors of [5] stress the advantages of $k$-SR recoding over signed-digit recoding, where the latter requires a group inversion to be computed. Using previous results we are able to compare the weight distributions of these two methods closely and show, for example, that 2-SR encoding is almost equivalent to signed-digit encoding with respect to weight, but no inversions are required.

The paper is set out as follows. In §2 we derive the bgf $G_{k\text{-SR}}(x, z)$ by first introducing some enumeration theorems for regular languages in §2.1, and then modeling $k$-SR recoding as a regular language in §2.2. In §3 we expand the coefficients of $G_{k\text{-SR}}(x, z)$, and then examine the weight distribution and its moments. We analyse the weight distributions of 512-, 768- and 1024-bit exponents for $k$-SR recoded weight, with $k$ in the range $2 \le k \le 5$. For example, we show that a 4-SR recoding of a 512-bit exponent will have a weight between 128 and 145 with probability greater than a half, and a weight between 125 and 148 with probability at least three quarters. We also show that for $n$-bit exponents the recoded weight for unrestricted $k$ (that is, $k \ge n$) has an expectation of $(n + 1)/4$ with a variance of $(n + 1)/16$.
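The way the recoded digits drive an exponentiation can be illustrated with a minimal Python sketch. It is not the implementation of [5]: the function names `sr_digits` and `sr_pow`, the use of modular integers as the group, and the demonstration modulus are all assumptions made for illustration, and the loop performs one squaring per digit without skipping leading zeroes.

```python
def sr_digits(e: int, k: int) -> list[int]:
    """Digits of the canonical k-SR recoding of e, most significant first."""
    bits = bin(e)[2:]
    digits, i = [], 0
    while i < len(bits):
        if bits[i] == "0":
            digits.append(0)
            i += 1
            continue
        # Take up to k ones of the current run, starting from its high end.
        j = i
        while j < len(bits) and bits[j] == "1" and j - i < k:
            j += 1
        run = j - i
        # 1^run becomes 0^(run-1) followed by the digit b = 2^run - 1.
        digits += [0] * (run - 1) + [(1 << run) - 1]
        i = j
    return digits


def sr_pow(a: int, e: int, k: int, modulus: int) -> int:
    """Sketch of a^e mod modulus driven by the k-SR digits of e."""
    # Precompute a^(2^i - 1) for 1 <= i <= k; each new entry costs one
    # squaring and one multiplication: a^(2^i - 1) = (a^(2^(i-1) - 1))^2 * a.
    power = a % modulus
    table = {1: power}
    for i in range(2, k + 1):
        power = power * power % modulus * a % modulus
        table[(1 << i) - 1] = power
    result = 1 % modulus
    for d in sr_digits(e, k):
        result = result * result % modulus        # squaring for every digit
        if d:
            result = result * table[d] % modulus  # multiplication per nonzero digit
    return result


modulus = 2**61 - 1   # arbitrary modulus, chosen only for the demonstration
exponent = 0b110001101010000111101110011
assert sr_pow(5, exponent, 3, modulus) == pow(5, exponent, modulus)
```

The number of multiplications in the main loop equals the number of nonzero digits returned by `sr_digits`, which is the recoded weight that the rest of the analysis is concerned with.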
Corollary 2.1 Let $R$ and $S$ be unambiguous regular expressions that are enumerated by the bgfs $G_R(x, z)$ and $G_S(x, z)$. If $R + S$, $RS$ and $R^*$ are also unambiguous, and $w(R)$ and $w(S)$ are independent, then $G_R(x, z) + G_S(x, z)$ enumerates $R + S$, $G_R(x, z) G_S(x, z)$ enumerates $RS$, and $1/(1 - G_R(x, z))$ enumerates $R^*$. □

For a binary alphabet, we may then interpret $G_R(x/2, z)$ as a probability generating function from which moments can be directly calculated (see for example [17, p.138]). Thus the expectation and variance of the weight of strings of length $n$ can be determined.
We now describe the $k$-SR canonical recoding process, then show how it can be modeled by a regular expression $R_{k\text{-SR}}$, and lastly derive the rules for forming $G_{k\text{-SR}}(x, z)$ from $R_{k\text{-SR}}$. The $k$-SR recoding of $e$ is in general not unique, in that $e$ may have several such encodings. For the purposes of analysis, the authors of [5] defined the canonical $k$-SR recoding of $e$ as the output of the following string replacement algorithm: for $i$ from $k$ down to 2, starting from the most significant bit of $e$ and scanning towards the least significant bit, replace $1^i$ with $0^{i-1} b_i$, where $b_i = 2^i - 1$. While the canonical $k$-SR recoding is not guaranteed to produce a recoded exponent of minimal weight, the resulting recoded exponents appear to have near-optimal weight [5].
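The string replacement algorithm just described can be transcribed almost literally into Python. The sketch below is illustrative only: the names `canonical_sr`, `weight` and `SYMBOLS` are hypothetical, and the choice of a base-32 symbol alphabet (so that 15 is written `F` and 31 is written `V`) is an assumption that limits the sketch to $2 \le k \le 5$. The sample input is the binary string printed in Example 2.2.

```python
# Digit symbols: value v is written as SYMBOLS[v], so 15 appears as 'F'
# and 31 as 'V'. This alphabet is an assumption of the sketch.
SYMBOLS = "0123456789ABCDEFGHIJKLMNOPQRSTUV"


def canonical_sr(bits: str, k: int) -> str:
    """Canonical k-SR recoding of a binary string, most significant bit first."""
    if not 2 <= k <= 5:
        raise ValueError("this sketch only has digit symbols for 2 <= k <= 5")
    recoded = bits
    for i in range(k, 1, -1):
        # str.replace scans from the left (the most significant end) and
        # replaces non-overlapping occurrences of 1^i with 0^(i-1) b_i.
        recoded = recoded.replace("1" * i, "0" * (i - 1) + SYMBOLS[(1 << i) - 1])
    return recoded


def weight(digits: str) -> int:
    """Number of nonzero digits in a (recoded) exponent."""
    return len(digits) - digits.count("0")


e_bits = "110001101010000111101110011001000111110000000100111000010011010"
for k in (2, 3, 4):
    recoded = canonical_sr(e_bits, k)
    print(k, recoded, weight(recoded))
```

Because none of the replacement digits is the symbol `1`, a pass for a smaller $i$ can never join up ones across a digit written by an earlier pass, which is why the simple loop reproduces the canonical recoding.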
Example 2.2 Consider the 64-bit exponent $e = 14312983206104981813$ whose binary expansion is

110001101010000111101110011001000111110000000100111000010011010

and thus $w(e) = 29$. The 3-SR and 4-SR canonical recodings are given, respectively, as

030000301010000007100070003001000007030000000100007000010003010
030000301010000000F00070003001000000F10000000100007000010003010

where 15 has been recoded as the hexadecimal digit F. The recoded weights of $e$ are 16 and 15 respectively, which is about 55% and 52% of the original weight. □

We begin our analysis by observing that $e$ can be uniquely decomposed into substrings

$$e = s_1 s_2 \cdots s_m \qquad (2)$$

such that $s_i = 0^{j_i}$ or $1^{j_i} 0$ for $1 \le i < m$ and $s_m = 0^{j_m}$ or $1^{j_m}$, where each $j_i$ is maximal, $1 \le i \le m$. We will call this the zero terminated run (ZTR) decomposition of $e$. All runs are terminated with a zero, with the possible exception of the last substring $s_m$. We will call $s_1, s_2, \ldots, s_m$ the ZTR substrings of $e$.
Lemma 2.1 Let $e = s_1 s_2 \cdots s_m$ be the ZTR decomposition of $e$. Then in the canonical $k$-SR recoding of $e$ each $s_i$ is recoded independently.
Proof. The canonical $k$-SR recoding only operates on runs in $e$ of the form $1^t$. The recodings of the $s_i$ will be independent if no run spans two adjacent substrings $s_i$ and $s_{i+1}$. However, this is not possible, since each substring $s_i$ that contains a run terminates the run with a 0 within $s_i$ if $s_{i+1}$ exists. All $s_i$ of the form $0^{j_i}$ are recoded independently since each 0 is recoded independently (to itself). □
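We now consider how ZTR substrings that contain runs are recoded. Let $s_i = 1^{j_i} 0$ be a substring of the ZTR decomposition of $e$, such that $j_i = kq + r$ where $r \equiv j_i \pmod{k}$ and $0 \le r < k$. Recalling that $b_i = 2^i - 1$, the canonical $k$-SR recoding of $e$ will recode $s_i$ to $(0^{k-1} b_k)^q \, 0^{r-1} b_r \, 0$ (the factor $0^{r-1} b_r$ being absent when $r = 0$) since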
(3)
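The ZTR decomposition and the per-substring form $(0^{k-1} b_k)^q \, 0^{r-1} b_r \, 0$ can be checked with a short Python sketch. The function names below are hypothetical, digits are kept as integers, and the regular expression is one possible way (an assumption of this sketch) to extract the ZTR substrings.

```python
import re


def ztr_decompose(bits: str) -> list[str]:
    """Zero terminated run (ZTR) substrings of a binary string."""
    # 1+0 : a run of ones together with its terminating zero
    # 1+  : an unterminated run, which can only occur at the end
    # 0+  : a maximal block of zeros
    return re.findall(r"1+0|1+|0+", bits)


def recode_substring(s: str, k: int) -> list[int]:
    """Recode one ZTR substring as (0^(k-1) b_k)^q 0^(r-1) b_r plus its zeros."""
    j = s.count("1")                  # run length j = k*q + r
    q, r = divmod(j, k)
    digits = ([0] * (k - 1) + [(1 << k) - 1]) * q
    if r:                             # no remainder digit when r = 0
        digits += [0] * (r - 1) + [(1 << r) - 1]
    return digits + [0] * (len(s) - j)


def recode(bits: str, k: int) -> list[int]:
    """Concatenate the independent recodings of the ZTR substrings."""
    return [d for s in ztr_decompose(bits) for d in recode_substring(s, k)]


print(ztr_decompose("110001101"))   # ['110', '00', '110', '1']
print(recode("1111011", 3))         # [0, 0, 7, 1, 0, 0, 3]
```

Recoding each substring separately and concatenating the results is exactly the independence property stated in Lemma 2.1.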
Thus conceptually the canonical $k$-SR recoding of $s_i = 1^{j_i} 0$, where $j_i = kq + r$ and $r < k$, is to parse $1^{j_i}$ into $q$ runs of length $k$ and a single zero-terminated run of length $r$. After this parsing, $s_i = (1^k)^q 1^r 0$; then $1^k$ is recoded as $0^{k-1} b_k$, $1^r$ is recoded as $0^{r-1} b_r$, and these recodings are substituted into $s_i$. Thus we may consider the canonical $k$-SR recoding of an exponent to proceed in two steps: first the parsing step, then the recoding step. Now consider the following regular expression, $R_{k\text{-SR}}$, defined as

$$R_{k\text{-SR}} = R_{k1}^* R_{k2} = \left( 0 + 1^k + \sum_{i=1}^{k-1} 1^i 0 \right)^* \left( \epsilon + \sum_{i=1}^{k-1} 1^i \right) \qquad (4)$$

where $\sum_{i=1}^{k-1} 1^i 0 = 10 + 110 + \cdots + 1^{k-1} 0$ and $\epsilon$ is the empty string. We will relate $R_{k\text{-SR}}$ to the canonical $k$-SR parse of an exponent, which will be evident from the proof of the next theorem.
Corollary 2.2 The bgf $G_{k\text{-SR}}$ for $R_{k\text{-SR}}$ can be derived via the rules of Theorem 2.1.

Proof. The corollary follows from the fact that $R_{k\text{-SR}}$ is unambiguous and that each $s_i$ is recoded independently. □
Thus in $R_{k\text{-SR}}$ we have a regular expression that generates all exponents (binary strings) $e$ unambiguously, such that the manner in which the substrings of $R_{k\text{-SR}}$ are selected to form $e$ corresponds exactly to the parse of $e$ produced by its $k$-SR canonical recoding. We now derive $G_{k\text{-SR}}$ using Theorem 2.1.

Theorem 2.3 Let $a_{n,m}$ be the number of binary strings of length $n$ for which the canonical $k$-SR recoding has Hamming weight $m$, $0 \le m < n$. Then

$$G_{k\text{-SR}}(x, z) = \sum_{n, m \ge 0} a_{n,m} x^n z^m = \frac{1 - x - zx^k + zx}{1 - 2x + x^2 - zx^k + 2zx^{k+1} - zx^2}. \qquad (5)$$
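Proof. We use Theorem 2.1 to transform $R_{k\text{-SR}}$ to $G_{k\text{-SR}}(x, z)$. $R_{k1}$ and $R_{k2}$ are transformed as follows:

$$R_{k1} = 0 + 1^k + \sum_{i=1}^{k-1} 1^i 0 \;\Longrightarrow\; G_{R_{k1}}(x, z) = x + zx^k + z \sum_{i=1}^{k-1} x^{i+1}$$

$$R_{k2} = \epsilon + \sum_{i=1}^{k-1} 1^i \;\Longrightarrow\; G_{R_{k2}}(x, z) = 1 + z \sum_{i=1}^{k-1} x^i$$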
For example, in $G_{R_{k1}}(x, z)$ the term $x$ corresponds to 0 (length 1 and no weight in the recoded exponent), and the term $zx^k$ corresponds to $1^k$ (length $k$ and one nonzero digit in the recoded exponent). The theorem follows from simplifying $G_{k\text{-SR}}(x, z) = G_{R_{k2}}(x, z) / (1 - G_{R_{k1}}(x, z))$. □

As a check on the correctness of our expression for $G_{k\text{-SR}}(x, z)$, it can be verified that $G_{k\text{-SR}}(x, 1) = 1/(1 - 2x)$, as expected. In practice we expect $k$ to be small, say less than 10, but we may also consider the weight distribution when $k$ is large, or at least $k \ge n$, where $n$ is the exponent length. Let $G_{SR}(x, z) = \sum_{n, m \ge 0} a_{n,m} z^m x^n$ be the bgf such that $a_{n,m}$ is the number of $n$-bit exponents recoded to weight $m$ by a $k$-SR canonical recoding where $k$ is unrestricted.

Theorem 2.4 Let $a_{n,m}$ be the number of binary strings of length $n$ for which an unrestricted SR recoding has Hamming weight $m$, $0 \le m < n$. Then

$$G_{SR}(x, z) = \sum_{n, m \ge 0} a_{n,m} x^n z^m = \frac{1 - x + xz}{1 - 2x + x^2 - zx^2}. \qquad (6)$$
Proof. In the unrestricted case it is directly observed that the parsing of the unrestricted canonical recoding is described by the regular expression

$$R_{SR} = \left( 0 + \sum_{i \ge 1} 1^i 0 \right)^* \left( \epsilon + \sum_{i \ge 1} 1^i \right). \qquad (7)$$

$R_{SR}$ unambiguously generates all binary strings, and can be enumerated using Theorem 2.1. The proof of this is very similar to Theorem 2.2 and Corollary 2.2 for the same properties with respect to $R_{k\text{-SR}}$, and is thus omitted. □
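The coefficients $a_{n,m}$ can be extracted from the bgf without a computer algebra system. The sketch below assumes the form of $G_{k\text{-SR}}(x, z)$ with numerator $1 + (z - 1)x - zx^k$ and denominator $1 - 2x + (1 - z)x^2 - zx^k + 2zx^{k+1}$ (the form that reduces to $1/(1 - 2x)$ at $z = 1$), turns the denominator into a linear recurrence on the polynomials $g_n(z) = \sum_m a_{n,m} z^m$, and compares the result with exhaustive enumeration for small $n$. The function names are hypothetical, and the brute-force weight uses the fact that a run of $j$ ones contributes $\lceil j/k \rceil$ nonzero digits.

```python
import re
from itertools import product


def bgf_rows(k: int, n_max: int) -> list[list[int]]:
    """rows[n][m] = coefficient of x^n z^m in the series of G_{k-SR}(x, z)."""
    size = n_max + 2
    zero = [0] * size
    rows: list[list[int]] = []

    def times_z(p):
        return [0] + p[:-1]

    def g(n):
        return rows[n] if n >= 0 else zero

    for n in range(n_max + 1):
        # Contribution of the numerator 1 + (z - 1) x - z x^k.
        cur = zero[:]
        if n == 0:
            cur[0] += 1
        if n == 1:
            cur[0] -= 1
            cur[1] += 1
        if n == k:
            cur[1] -= 1
        # Denominator terms moved to the right-hand side:
        # g_n = 2 g_{n-1} - (1 - z) g_{n-2} + z g_{n-k} - 2 z g_{n-k-1} + ...
        for scale, p in (
            (2, g(n - 1)),
            (-1, g(n - 2)),
            (1, times_z(g(n - 2))),
            (1, times_z(g(n - k))),
            (-2, times_z(g(n - k - 1))),
        ):
            for m, c in enumerate(p):
                cur[m] += scale * c
        rows.append(cur)
    return rows


def brute_force_row(k: int, n: int, size: int) -> list[int]:
    """Weight counts over all 2^n strings; a run of j ones costs ceil(j / k)."""
    counts = [0] * size
    for bits in product("01", repeat=n):
        runs = re.findall("1+", "".join(bits))
        counts[sum(-(-len(r) // k) for r in runs)] += 1
    return counts


for k in (2, 3, 4):
    rows = bgf_rows(k, 10)
    assert all(rows[n] == brute_force_row(k, n, 12) for n in range(11))
print(bgf_rows(2, 3)[3])
```

Dividing row $n$ by $2^n$ gives the exact weight distribution for $n$-bit exponents; choosing $k$ larger than `n_max` yields the coefficients of the unrestricted bgf $G_{SR}(x, z)$.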
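Theorem 3.1 Let $w_{SR}(e)$ be the weight of an exponent $e$ after canonical recoding. Then for a uniformly distributed $n$-bit exponent $e$

$$E[w_{SR}(e)] = \frac{n + 1}{4}, \qquad \mathrm{Var}[w_{SR}(e)] = \frac{n + 1}{16}. \qquad (8)$$

Proof. The expectation is given by $G'_{SR}(x/2, 1)$ and the variance by $G''_{SR}(x/2, 1) + G'_{SR}(x/2, 1) - (G'_{SR}(x/2, 1))^2$, where the primes denote derivatives with respect to $z$ and each term stands for its coefficient of $x^n$. □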
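The closed forms of Theorem 3.1 can be confirmed by exhaustive enumeration for small exponent lengths. The sketch below assumes that the unrestricted recoding leaves exactly one nonzero digit per maximal run of ones, and uses exact rational arithmetic; the function names are hypothetical and the check covers $2 \le n \le 8$ only.

```python
from fractions import Fraction
from itertools import product


def run_count(bits) -> int:
    """Unrestricted SR weight: one nonzero digit per maximal run of ones."""
    return sum(
        1 for i, b in enumerate(bits)
        if b == "1" and (i == 0 or bits[i - 1] == "0")
    )


def moments(n: int):
    """Exact mean and variance of the weight over all 2^n strings of length n."""
    weights = [run_count(bits) for bits in product("01", repeat=n)]
    total = Fraction(len(weights))
    mean = sum(weights) / total
    variance = sum(w * w for w in weights) / total - mean ** 2
    return mean, variance


for n in range(2, 9):
    mean, variance = moments(n)
    assert mean == Fraction(n + 1, 4)
    assert variance == Fraction(n + 1, 16)
    print(n, mean, variance)
```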
Recall that Chebyshev's inequality bounds the deviation of a random variable $X$ from its mean $\mu$ in terms of its variance $\sigma^2$: $\Pr(|X - \mu| \ge d) \le \sigma^2 / d^2$. Then define $\delta(X, p)$ as

$$\delta(X, p) = \min \left\{ d : \frac{\sigma^2}{d^2} < 1 - p \right\} \qquad (9)$$

which states that $\delta(X, p)$ is the smallest integer $d$ for which $\Pr(|X - \mu| < d) > p$ according to the bound derived from Chebyshev's inequality. Using $\delta(X, p)$ from (9) we may bound the weight distribution for unrestricted $k$, as shown in Table 1 for 512-, 768- and 1024-bit exponents. For example, the table states that for a 1024-bit exponent, the weight deviates by less than 12 from its mean value of 256.25 for more than half the exponents. On the other hand, 99% of exponents deviate by less than 81 from the mean value.

Table 1: The weight distribution of unrestricted $k$-SR canonical recodings for 512-, 768- and 1024-bit exponents. The columns show the value of $\delta(w_{SR}(e), p)$, $p \in \{0.50, 0.60, 0.75, 0.90, 0.95, 0.99\}$.

| n | E[w_SR(e)] | Var[w_SR(e)] | 0.50 | 0.60 | 0.75 |
|---|---|---|---|---|---|
| 512 | 128.25 | 32.0625 | 9 | 9 | 12 |
| 768 | 192.25 | 48.0625 | 10 | 11 | 14 |
| 1024 | 256.25 | 64.0625 | 12 | 13 | 17 |
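The deviation bounds follow mechanically from the variance. As a minimal sketch, assuming that $d$ ranges over positive integers and taking the variance $(n + 1)/16$ from Theorem 3.1, the following Python function (the name `chebyshev_radius` is hypothetical) returns the smallest $d$ with $\sigma^2 / d^2 < 1 - p$, using exact rationals to avoid rounding at the boundary.

```python
from fractions import Fraction
from math import floor, isqrt


def chebyshev_radius(variance: Fraction, p: Fraction) -> int:
    """Smallest positive integer d with variance / d^2 < 1 - p."""
    threshold = variance / (1 - p)        # the condition is d^2 > threshold
    return isqrt(floor(threshold)) + 1    # smallest d with d^2 > floor(threshold)


for n in (512, 768, 1024):
    variance = Fraction(n + 1, 16)        # Var[w_SR(e)] from Theorem 3.1
    row = [chebyshev_radius(variance, Fraction(p)) for p in ("0.50", "0.60", "0.75")]
    print(n, float(variance), row)
```

The same function can be called with other values of $p$, for example `Fraction("0.99")`, or with the variances of the restricted $k$-SR recodings.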
Footnote 1: We use the word optimal here since there are several schemes for signed-digit recoding. By optimal we refer to the schemes that produce sparse forms [5], where adjacent digits have a product of zero.
The simple expressions for the expectation and variance given in Theorem 3.1 can be found since $G_{SR}(x, z)$ does not depend on a parameter $k$. It is more difficult to analyse $G_{k\text{-SR}}(x, z)$, which does depend on $k$. We have calculated $E[w_{k\text{-SR}}(e)]$ and $\mathrm{Var}[w_{k\text{-SR}}(e)]$ for $k$ in the range $2 \le k \le 5$, which appears to cover those values of practical interest. We have verified that $E[w_{k\text{-SR}}(e)]$ in this range agrees with the expression from [5] quoted in the Introduction, and the variances are given in Table 2. For example, we can now conclude that a 4-SR recoding of a 512-bit exponent will have a weight between 128 and 145 with probability greater than a half, and a weight between 125 and 148 with probability at least three quarters. Considering the last row of Table 2 we see that for 1024-bit exponents and $k = 5$, $E[w_{SR}(e)] / E[w_{k\text{-SR}}(e)] \approx 0.97$, and that the deviations up to $p = 0.75$ are quite similar. Thus a large amount of the potential weight reduction from canonical recoding can be achieved by $k = 5$ for 1024-bit exponents. Table 2 complements the computational results presented in Table 2 of [5] by bounding the deviation from the expectations. We remind the reader that the deviation bounds in Tables 1 and 2 are based on Chebyshev's inequality, and more precise information for a given exponent length $n$ (say $n = 160$ as in Figure 1 below) can be obtained by expanding $G_{k\text{-SR}}(x, z)$ as a power series.

We now consider the case where $k = 2$, since the expected canonical 2-SR recoded weight and the optimal (see footnote 1) signed-digit weight are both approximately equal to $n/3$. It is clear from the original SR method paper [5] that one of the motivations for the method was to propose a run-based recoding technique that did not require group inversions. Let $w_{OSD}(e)$ be the optimal signed-digit (OSD) weight of an exponent. Then it is known [5, 14] that

$$E[w_{OSD}(e)] = \frac{n}{3} + \frac{4}{9} + o(1), \qquad \mathrm{Var}[w_{OSD}(e)] = \frac{2n}{27} + \frac{14}{81} + o(1).$$
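Table 2: Canonical $k$-SR recoding distributions for 512-, 768- and 1024-bit exponents. The columns show the value of the Chebyshev deviation bound $\delta(w_{k\text{-SR}}(e), p)$ of (9), $p \in \{0.50, 0.60, 0.75, 0.90, 0.95, 0.99\}$. For $k = 2, 3, 4, 5$, $\mathrm{Var}[w_{k\text{-SR}}(e)]$ is asymptotic to $2n/27$, $18n/343$, $172n/3375$ and $1592n/29791$ respectively, and $E[w_{k\text{-SR}}(e)] \approx \frac{n 2^{k-2}}{2^k - 1} + \frac{(2^k - k - 1) 2^{k-2}}{(2^k - 1)^2}$.

| n | k | E[w_k-SR(e)] | Var[w_k-SR(e)] | 0.50 | 0.60 | 0.75 | 0.90 | 0.95 | 0.99 |
|---|---|---|---|---|---|---|---|---|---|
| 512 | 2 | 170.7 | 38.0 | 9 | 10 | 13 | 20 | 28 | 62 |
| 512 | 3 | 146.4 | 26.9 | 8 | 9 | 11 | 17 | 24 | 52 |
| 512 | 4 | 136.7 | 26.2 | 8 | 9 | 11 | 17 | 23 | 52 |
| 512 | 5 | 132.3 | 27.4 | 8 | 9 | 11 | 17 | 24 | 53 |
| 768 | 2 | 256.1 | 56.9 | 11 | 12 | 16 | 24 | 34 | 76 |
| 768 | 3 | 219.6 | 40.4 | 9 | 11 | 13 | 21 | 29 | 64 |
| 768 | 4 | 204.9 | 39.2 | 9 | 10 | 13 | 20 | 29 | 63 |
| 768 | 5 | 198.4 | 41.1 | 10 | 11 | 13 | 21 | 29 | 65 |
| 1024 | 2 | 341.4 | 75.9 | 13 | 14 | 18 | 28 | 39 | 88 |
| 1024 | 3 | 292.7 | 53.8 | 11 | 12 | 15 | 24 | 33 | 74 |
| 1024 | 4 | 273.3 | 52.3 | 11 | 12 | 15 | 23 | 33 | 73 |
| 1024 | 5 | 264.5 | 54.8 | 11 | 12 | 15 | 24 | 34 | 75 |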
Figure 1: Weight distribution for optimal signed-digit (OSD) and $k$-SR canonical recoding of 160-bit exponents, $2 \le k \le 5$ (probability plotted against weight).

Using $G_{2\text{-SR}}(x, z)$ we can directly prove that

$$E[w_{2\text{-SR}}(e)] = \frac{n}{3} + \frac{1}{9} + o(1), \qquad \mathrm{Var}[w_{2\text{-SR}}(e)] = \frac{2n}{27} + \frac{8}{81} + o(1).$$

Thus we see that the OSD and 2-SR weight distributions agree quite closely in expectation and variance. The similarity is highlighted in Figure 1, where the recoded weight of 160-bit exponents is plotted for the OSD method and for the $k$-SR method with $2 \le k \le 5$. Here the OSD and 2-SR distributions are essentially identical, and the 5-SR distribution is already clustering around the expected weight for an unrestricted recoding. We consider 160-bit exponents since fields of this size are suitable for use in elliptic curve cryptosystems, where OSD recodings could be used because group inversion is a cheap operation.
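The closeness of the two distributions can also be observed numerically for a small exponent length. The sketch below makes two assumptions: it takes the non-adjacent form (NAF) as the optimal signed-digit recoding, and it uses the fact that under canonical 2-SR recoding a run of $j$ ones contributes $\lceil j/2 \rceil$ nonzero digits. The function names and the choice $n = 12$ are illustrative only; at such a small $n$ the $o(1)$ terms are still visible in the last few decimal places.

```python
def naf_weight(e: int) -> int:
    """Nonzero digits in the non-adjacent form (NAF) of e."""
    weight = 0
    while e:
        if e & 1:
            e -= 2 - (e & 3)   # signed digit: +1 if e = 1 (mod 4), -1 if e = 3 (mod 4)
            weight += 1
        e >>= 1
    return weight


def sr2_weight(e: int, n: int) -> int:
    """Canonical 2-SR weight of e written with n bits."""
    runs = format(e, f"0{n}b").split("0")
    return sum((len(run) + 1) // 2 for run in runs)   # ceil(j / 2) per run


def mean_and_variance(weights):
    mean = sum(weights) / len(weights)
    return mean, sum(w * w for w in weights) / len(weights) - mean * mean


n = 12
naf = mean_and_variance([naf_weight(e) for e in range(2 ** n)])
sr2 = mean_and_variance([sr2_weight(e, n) for e in range(2 ** n)])
print("NAF :", naf, "asymptotic:", (n / 3 + 4 / 9, 2 * n / 27 + 14 / 81))
print("2-SR:", sr2, "asymptotic:", (n / 3 + 1 / 9, 2 * n / 27 + 8 / 81))
```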
4 Conclusion
Our main result has been to derive $G_{k\text{-SR}}(x, z)$, the bgf describing the probability distribution of the weights of canonical $k$-SR recodings, which is the parameter of interest in the analysis of SR exponentiation. Extending the method presented in [14], we have used regular languages to characterize the canonical $k$-SR parsing of the exponent, which leads directly to $G_{k\text{-SR}}(x, z)$. We were also able to show that 2-SR recodings and optimal signed-digit recodings produce weights that are distributed very similarly, but the 2-SR recoding has the advantage of requiring no group inversions.
References
[1] J. Bos and M. Coster. Addition chain heuristics. Advances in Cryptology, CRYPTO 89, Lecture Notes in Computer Science, vol. 218, G. Brassard ed., Springer-Verlag, pages 400-407, 1990.
[2] N. Chomsky and P. Schützenberger. The algebraic theory of context-free languages. In P. Braffort and D. Hirschberg, editors, Computer programming and formal languages, North Holland, pages 118-161, 1963.
[3] W. Diffie and M. Hellman. New directions in cryptography. IEEE Transactions on Information Theory, 22(6):472-492, 1976.
[4] T. ElGamal. A public key cryptosystem and signature system based on discrete logarithms. IEEE Transactions on Information Theory, 31(4):473-481, 1985.
[5] D. Gollman, Y. Han, and C. Mitchell. Redundant integer representations and fast exponentiation. Designs, Codes and Cryptography, 7:135-151, 1996.
[6] J. Hopcroft and J. Ullman. An Introduction to Automata, Languages and Computation. Reading, MA: Addison Wesley, 1979.
[7] L. Hui and K.-Y. Lam. Fast square-and-multiply exponentiation for RSA. Electronics Letters, 30(17):1396-1397, 1994.
[8] Ç. K. Koç. High-radix and bit encoding techniques for modular exponentiation. International Journal of Computer Mathematics, 40:139-156, 1991.
[9] D. E. Knuth. The Art of Computer Programming: Volume 2, Seminumerical Algorithms. Addison Wesley, 1981.
[10] N. Koblitz. CM curves with good cryptographic properties. Advances in Cryptology, CRYPTO 91, Lecture Notes in Computer Science, vol. 576, J. Feigenbaum ed., Springer-Verlag, pages 279-287, 1992.
[11] K. Koyama and T. Tsuruoka. Speeding up elliptic curve cryptosystems using a signed binary window method. In Advances in Cryptology, CRYPTO 92, Lecture Notes in Computer Science, vol. 740, E. F. Brickell ed., Springer-Verlag, pages 345-357, 1992.
[12] A. Menezes, P. van Oorschot, and S. Vanstone. Handbook of Applied Cryptography. CRC Press, 1996.
[13] F. Morain and J. Olivos. Speeding up the computations on an elliptic curve using addition-subtraction chains. Theoretical Informatics and Applications, 24(6):531-544, 1990.
[14] L. J. O'Connor. An analysis of exponentiation based on formal languages. Advances in Cryptology, EUROCRYPT 99, Lecture Notes in Computer Science, vol. 1592, J. Stern ed., Springer-Verlag, pages 375-388, 1999.
[15] G. Reitwiesner. Binary arithmetic. In F. L. Alt, editor, Advances in Computers, pages 232-308, 1960.
[16] R. L. Rivest, A. Shamir, and L. Adleman. A method for obtaining digital signatures and public key cryptosystems. Communications of the ACM, 21(2):120-126, 1978.
[17] R. Sedgewick and P. Flajolet. An Introduction to the Analysis of Algorithms. Addison-Wesley Publishing Company, 1996.
[18] J. A. Solinas. An improved algorithm for arithmetic on a family of elliptic curves. Advances in Cryptology, CRYPTO 97, Lecture Notes in Computer Science, vol. 1294, B. S. Kaliski ed., Springer-Verlag, pages 357-371, 1997.
[19] Y. Yacobi. Exponentiating faster with addition chains. Advances in Cryptology, EUROCRYPT 90, Lecture Notes in Computer Science, vol. 473, I. B. Damgard ed., Springer-Verlag, pages 222-229, 1991.