Proof that factoring algorithm in figure 4.10 works: Some review of basic concepts: Definition of phi(n): the size of the set of positive integers m less than n such that gcd(m,n) = 1. Computation of phi(n): a. If gcd(a,b) = 1, then phi(a)*phi(b) = phi(a*b). Proof of part a: We show that each pair of numbers A < a and B < b with gcd(a,A) = 1 and gcd(b,B) = 1 determines and is determined by a unique number c (which is in fact AB) < ab with gcd(a,b) = 1. In one direction this is easy -- any c < ab with gcd(c,ab) = 1 determines A = c mod a and B = c mod b: certainly gcd(A,a) = 1 and gcd(B,b) = 1. And A and B determine c uniquely by the Chinese Remainder Theorem: if gcd(a,b) = 1, there is only one number < ab with any given combination of remainders mod a and mod b. Now there are phi(a) numbers A less than a and relatively prime to a, and there are phi(b) numbers B less than b and relatively prime to b, so there are phi(a)*phi(b) numbers less than ab and relatively prime to ab, which was to be proved. b. phi(p^n) can be computed easily when p is prime. There are p^n-1 positive integers less than p^n, and of these only the multiples of p are not relatively prime to p^n: there are p^(n-1) - 1 positive multiples of p less than p^n , so phi(p^n) = (p^n-1)-(p^(n-1)-1) = p^n-p^(n-1). A special case: it is easy to see that phi(p) = p-1 when p is prime. c. Now we can compute phi(n) for any n for which we know the prime factorization. For example, when N is an RSA modulus pq, phi(N) = (p-1)(q-1). Review the theorem a^phi(n) = 1 mod n (if gcd(a,n) = 1). The important idea here is "order". The order of a residue class a mod n with gcd(a,n) = 1 is the _smallest_ exponent b such that a^b = 1 mod n. Things to observe: a. Every element of a residue class relatively prime to n has an order: if we write a, a^2, a^3,... we must eventually have a repetition, and if a^i = a^j for i < j, modular division tells us that a^j/a^i = a^(j-i) = 1. (It is important to note here that if a is relatively prime to n, every power of a will be relatively prime to n). b. The order of a will be the same as the order of a^(-1) mod n. Observe that a^i * (a^(-1))^i will be 1 for any i: this implies that a^i = 1 iff (a^(-1))^i = 1. c. If a^j = 1, then j is a multiple of the order of a. Suppose otherwise: let k = j mod i, where i is the order of a. k = j - q*i for some q (division algorithm) from which it follows that a^k = a^(j-q*i) = a^j/a*(q*i) = 1, but k < i, so k must be 0 or else i is not the order of a. d. If the order of a is m and p is a divisor of m, then there is a power of a which is of order m: m = p*q for some p, and (a^q)^p = a^(p*q) = 1, and further (a^p)^r for r < p cannot be 1 because (a^q)^r = a^(q*r) and q*r < q*p = the order of a. e. if b is a power of a, then the order of b is a divisor of the order of a: let the order of a be i, and we see that b^i = 1 mod n, so i must be a multiple of the order of b. e. If a is of order p and b is of order q, and gcd(p,q) = 1 then ab is of order pq: certainly ab to the power of any number which is a multiple of both p and q will be 1 (by definition of order). If ab^r = 1 then a^rb^r = 1 and so a^r and b^r have the same order, which must be a divisor of both p and q -- but 1 is the only possibility, so a^r = 1 and b^r = 1 and r must be a multiple of pq, from which the order of ab must be pq. f. Now we claim that for any a of order i and b of order j we can find c of order lcm(i,j): let k = j/gcd(i,j). k and i are relatively prime. We can find d a power of b of order k (by part d). Then ad will be of order ik = lcm(i,j). g. Repeated application of f gives us a residue class mod n which has order equal to the least common multiple of all orders of elements mod n. Observe that mod p (a prime) there can be no more than i elements of orders dividing i, because there cannot be more than i solutions to the ith order equation a^i-1 = 0 (by the usual zero divisor property). This means that the lcm of all orders mod p must be p-1 (because there are that many nonzero residue classes), so there must be an element of order p-1 (a primitive root). This is a slightly different proof than the one I gave before. Since the lcm of all orders of residue classes mod p must be p-1, it follows that a^(p-1) = 1 mod p for all nonzero residue classes. h. relative to any modulus, the number of elements which are powers of some order i element must be a multiple of i: because each such element g determines a set of i elements g, g^2, ... ,g^(i-1) and any two such sets are either the same set or disjoint (if g^b = h and g and h are both of order i, then b is relatively prime to the order i of g and h, and h^(b^(-1) mod i) = g, so the powers of g and the powers of h make up the same set. Suppose c = b^(-1) mod i. This means that bc = 1+qi for some q. h^c = (g^b)^c = g^(1+qi) = g. i. Prove that the order of any residue class mod n which is relatively prime to n must be a divisor of phi(n): take all the powers of a relatively prime a (there will be "order of a" of these). Now pick any relatively prime residue class k which is not a power of a: the set of elements ka^i will not contain any element of the first set of a^j's , because if k(a^i) = a^j then k = a^(j-i) is in the set (negative powers are also in the set, because the inverse of a is always a power of a as well). Choose an m relatively prime to n and not in either of these sets. The set of residue classes ma^i will not intersect the set of a^i's by the argument already given, and it won't intersect the second class either -- if ma^i = ka^j then m = ka^(j-i), but m was chosen not to be in the second set. Continue to choose such sets until all relatively prime residue classes are used up. Each set will be of size "order of a": if ka^i = ka^j then a^i = a^j by modular division. Since the set of relatively prime residue classes can be partitioned into disjoint sets of size "order of a", phi(n) (the number of relatively prime residue classes mod n) is divisible by the order of a. Finally, this implies that a^phi(n) = 1 mod n. Now, back to the proof that the algorithm in Figure 4.10 works: Vitally important bit of notation: a | b means "a is a divisor of b", that is, "b = ka for some integer k" or b=0 mod a. Let N be our RSA modulus. N = pq, where p and q are primes. phi(N), which we usually just call phi = (p-1)(q-1). 1. Square roots of 1 mod N: A square root b of 1 mod N will satisfy b^2 = 1 mod p and b^2 = 1 mod q. Mod p or q, there are only two square roots of 1 by the usual argument: x^2 = 1 implies (x-1)(x+1) = 0 so either x-1 or x+1 = 0 (all mod p); works mod p a prime because we have ab = 0 implies a or b = 0. So b = 1 or -1 mod p and b = 1 or -1 mod q. Using the Chinese Remainder Theorem, we can see that there are exactly four square roots of 1 mod N. Two of these are the usual 1 and -1 mod N: the others are solutions of {b = 1 mod p,b =-1 mod q} and {b = -1 mod p, b = 1 mod q}. We can see that these weird roots are additive inverses of one another. Observe that if we find a weird square root of 1, we can factor N: b+1 will solve either {b = 2 mod p, b = 0 mod q} or {b = 0 mod p, b = 2 mod q}, so gcd(b+1,N) will be either p or q, and of course once we have one we have the other! 2. How the algorithm works when it works: We are given the modulus N, an encryption modulus a and the corresponding decryption modulus b. We know that ab = 1 mod phi(N) (this is how a and b are chosen!). It follows that ab-1 is a multiple of phi(N). Note that since phi(N) = (p-1)(q-1) is even, ab-1 must also be even. We express ab-1 in the form (2^s)r, where r is odd. r and s are uniquely determined by a and b. We choose a random w, a positive integer less than N. w^phi(N) = 1 mod N by theorem, so w^(ab-1) = 1 mod N because ab-1 is a multiple of phi(N). It follows that w^((2^s)r) = 1 mod N (because (2^s)r = ab-1). Define v as w^r. w^((2^s)r) is the number obtained by squaring v s times, and it is equal to 1. So there must be a first number w^((2^i)r) which is equal to 1. We are interested in cases where the number w^((2^(i-1)r)) coming just before w^((2^i)r) will be a weird square root of 1 mod N. Clearly this won't happen if i = 0 -- so the algorithm fails if w^r = w^(2^0r) = 1 mod N. If w^r =/= 1 mod N, then certainly w^((2^(i-1)r)) will be a square root of 1. But it might be -1 mod N, and in that case we also fail. In any other case, w^((2^(i-1)r)) will be a weird square root of 1 mod N, and we can use it to factor N. 3. Why the algorithm works at least half the time: There are two ways the algorithm can fail: either w^r = 1 mod N or w^((2^i)r) = -1 for some non-negative integer i < s. We count these bad values of w and find that there aren't enough of them! A. How many solutions of w^r = 1 are there? It is sufficient to find how many solutions of w^r = 1 mod p and mod q, and multiply these numbers together. So how many solutions are there mod p? Express p-1 in the form 2^ip_1 and q-1 in the form 2^jq_1 where p_1 and q_1 are odd. Since p-1 and q-1 both go evenly into phi(N) = (p-1)(q-1) = (2^s)r and p_1 and q_1 are odd factors of p-1 and q-1, both p_1 and q_1 must go evenly into r. Let g be a primitive root mod p. Express w in the form g^u. We then have g^(ur) = 1 mod p. This means that ur is a multiple of p-1 (the order of the primitive root g). Thus, ur is a multiple of 2^ip_1. p_1 goes evenly into r, so it is both necessary and sufficient for 2^i to go into u. u was chosen less than p-1: there are exactly p_1 = p-1/2^i appropriate values of u, so there are p_1 values of w which solve this equation. By the same argument, there are q_1 solutions to w^r = 1 mod q, so there are p_1q_1 solutions mod N (by the Chinese Remainder Theorem). B. How many solutions of w^((2^t)r) = -1 are there? Once again, count solutions mod p and mod q, and multiply. Write w = g^u with g a primitive root, as before. If w^((2^t)r) = g^(ur2^t) = -1 mod p, we have ur2^t = (p-1)/2 mod p-1 so p-1 | ur2^t -(p-1)/2 so 2(p-1) | ur2^(t+1) - (p-1) so 2^(i+1)p_1 | ur2^(t+1)r - 2^ip_1 so 2^(i+1) | ur2^(t+1)/p_1 - 2^i. There can be no solutions to this if t >= i, because 2^(i+1) will go evenly into ur2^(t+1)/p_1 (notice that r/p_1 is an integer!) and not into 2^i. If t <= i-1, then u is a solution just in case u is an odd multiple of 2^(i-t-1). To see this, observe that 2^(i+1) will go evenly into X - 2^i just in case X is an odd multiple of 2^i. r/p_1 is odd and can be left out of consideration: u2^(t+1) will be an odd multiple of 2^i just in case u is an odd multiple of 2^(i-(t+1)). There are thus (p-1)/2^(i-t-1) x 1/2 solutions = 2^tp_1 solutions. Applying the same argument to q, we find that there are (2^t)p_1 x 2^tq_1 = 2^(2t)p_1q_1 solutions for t < min(i,j) and none for larger t. t can range from 0 to s-1. Suppose i>=j (without loss of generality): there are at most p_1q_1 + p_1q_1(1+2^2+...+2^(2i-2)) solutions. This is at most p_1q_1(1+ (4^i-1/3)) solutions, which is equal to p_1q_1(2/3 + 2/3(2^(2i))) Note that p_1q_1 < (p-1)/2.(q-1)/2 < phi/4 < n/4. Also, 2^2ip_1q_1 <= 2^(i+j)p_1q_1 = (p-1)(q-1) < N Thus p_1q_1(2/3 + 2/3(2^(2i))) < N/6 + N/3 = N/2 so the chance of failure of the algorithm is less than 1/2!