Is factorization a hard problem?

Is factorization a hard problem? There is plenty of empirical evidence that it is so. Take the following 309-digit number that is known as RSA-1024, an example of an RSA number.


RSA-1024, a 1024-bit number, is a product of two prime numbers p and q. No one has been able to factor this number, despite the advances in factoring algorithms and computing technology in recent decades. RSA-1024 is part of the RSA Factoring Challenge that was created in 1991. Even though the challenge was withdrawn in 2007, it is believed that people are still taking up the challenge to factor this and other unfactored RSA numbers. In fact, the successful factoring of RSA-1024 or similarly sized numbers would have huge security implication for the RSA algorithm. The RSA cryptosystem is built on the difficulty (if not the impossibility) of factoring large numbers such as RSA-1024.

Yet it is very easy to demonstrate that RSA-1024 is not a prime number. The fact that it is composite can be settled by performing one modular exponentiation. Denote RSA-1024 by N. We compute 2^{N-1} \ (\text{mod} \ N).

We find that 2^{N-1} \equiv T \ (\text{mod} \ N) where T is the following 309-digit number.


Obviously T is not 1. This fact is enough to prove that the modulus N is not a prime number. This is because the number N lacks a property possessed by all prime numbers. According to Fermat’s little theorem, if N were prime, then that a^{N-1} \equiv 1 \ (\text{mod} \ N) for all integers a that are relatively prime to N. In particular, if N were prime, then we would have 2^{N-1} \equiv 1 \ (\text{mod} \ N), the opposite of our result.

The modular exponentiation a^{N-1} \ (\text{mod} \ N) discussed here can be performed using the fast powering algorithm, which runs in polynomial time. In the fast powering algorithm, the binary expansion of the exponent is used to convert the modular exponentiation into a series of squarings and multiplications. If the exponent N-1 is a k-bit number, then it takes k-1 squarings and at most k-1 multiplications. For RSA-1024, it takes 1023 squarings and at most 1023 multiplications (in this instance exactly 507 multiplications). This calculation, implemented in a modern computer, can be done in seconds.

The above calculation is a vivid demonstration that factoring is hard while detecting the primality or compositeness of a number is a much simpler problem.

The minimum RSA key length prior to the end of 2013 is 1024. After 2013, The minimum RSA key length is 2048. In fact, the largest RSA number is RSA-2048 (has 2048 bits and 617 decimal digits), which is expected to stay unfactored for years to come barring dramatic advances in factoring algorithms or computing capabilities.



Using a software package that can handle modular exponentiation involving large numbers, it is easy to check for “prime versus composite” status of a large number. Find a number n whose prime factorization is not known. Either use known numbers such as RSA numbers or randomly generate a large number. Then calculate the modular exponentiation a^{n-1} \ (\text{mod} \ a) for several values of a (it is a good practice to start with a=2). If the answer is not congruent to 1 for one value of a, then we know n is composite. If the exponentiation is all congruent to 1 for the several values of a, then n is a likely a prime number.

\copyright \ \ 2014 \ \text{Dan Ma}

The magic words are squeamish ossifrage

The title of the blog post is the answer to a decryption challenge problem in connection to a factoring problem of a 129-digit number that is known as RSA-129. The challenge problem was posed in 1977. Our goal here is to use this challenge problem to demonstrate how to encrypt and decrypt in the RSA algorithm. The author of this blog recently built a calculator that can handle modular exponentiation involving large numbers. The challenge problem of RSA-129 provides an opportunity to verify the calculator.

In the August 1977, Martin Gardner wrote an article called “A new kind of cipher that would take millions of years to break” in his Mathematical Games column in Scientific American. This article posed a challenge problem created by the inventors of the RSA cryptosystem. Here’s the gist of the problem.

    An English sentence is converted into a number according to the rule that maps A to 01, B to 02, C to 03, . . . ,Y to 25, Z to 26, with 00 representing a space between words. Call the resulting number m. Then raise m to e= 9007 and then reduce the exponentiation result modulo the following 129-digit number N.


    In modular arithmetic, the notation for this calculation is m^{9007} \equiv c \ (\text{mod} \ N), where c is the following number:


    The number c is the ciphertext. What is the plaintext m? What is English phrase that is behind m?

During the 17 years after the publication of the challenge problem, no one other than the creators of the problem knew the English phrase. In 1994, a team led by Atkins, Graff, Lenstra and Leyland [1] cracked the code and found that the answer is the title of this blog post. The effort that led to the answer was a massive computing project that was distributed over the Internet. Their goal was to factor the 129-digit number N (the modulus) into the two prime factors p and q. A more detailed description of the problem can also be found in a piece from MAA (see here).

Once the factors of the modulus are known, the decryption exponent d was calculated using the extended Euclidean algorithm. Then raise the ciphertext c to d and reduce modulo N. The result is the plaintext m. Using the predetermined rule described earlier, the plaintext m is translated back to the original English phrase. In the remainder of this post, we demonstrate how this process works with the prime factors p and q as a given.

The modulus N is a 129-digit number that is known as RSA-129. It is the product of two prime numbers with 64 and 65 digits. Instead of taking millions of years to factor this modulus as suggested in the title of Martin Gardner’s article, it took a mere 17 years! In 1977, Rivest, one of the creators of the RSA algorithm, estimated that factoring a 125-digit number that is a product of two 63-digit prime numbers would require at least 40 quadrillion years using the best factoring algorithm available. This goes to show that what was considered hard or impossible at one point in time may one day become doable or even easy due to advances in computer power and factoring algorithms. What is considered a secure RSA modulus today (say 1024-bit) may be breakable in the future.


Decrypting the challenge message

The modulus N as indicated above is a 129-digit number that is known as RSA-129. The number pair N and e= 9007 is an RSA public key. With only the knowledge of public key represented by N and e and the ciphertext c, how hard is it to recover the plaintext m? An answer emerged only after 17 years of the publication of the challenge. In this post, we demonstrate how to retrieve the plaintext from the ciphertext by using the prime factors p and q of the modulus N. Here’s are the factors p and q (found here):



As mentioned earlier, we take the prime factors p and q of N as a given. Then compute the product of p-1 and q-1:

    \phi(N)=(p-1) \times (q-1)

The function \phi is important one in number theory and goes by a special name, Euler’s phi function. Then the decryption exponent d is the number satisfying the following congruence relationship:

    ed \equiv 1 \ (\text{mod} \ \phi(N))

where e=9007. In other words, the number d is the multiplicative inverse of e modulo \phi(N). The typical approach in finding d is to use the extended Euclidean algorithm. Once d is found, the following modular exponentiation is computed:

    c^d \equiv m \ (\text{mod} \ N)

We use the predetermined rule to recover the original English phrase. The steps are summarized as follows:

  1. Using the Euclidean algorithm to compute \text{GCD}(e,\phi(N)), the greatest common divisor (GCD) of E and \phi(N)=(p-1) \times (q-1).
  2. The GCD in the previous step should be 1. Performing Euclidean algorithm is so that we can work backward (the extended Euclidean algorithm) to compute d=e^{-1} modulo \phi(N).
  3. Use the fast powering algorithm to compute c^d \equiv m \ (\text{mod} \ N).
  4. Use the predetermined mapping scheme to obtain the plaintext.


Step 1. Euclidean algorithm

The following is the product of p-1 and q-1.


Next, use Euclidean algorithm to find \text{GCD}(9007,N), which should be 1. The point is that we will work backward to find the multiplicative inverse of e= 9007. The Euclidean algorithm consists of a series of divisions until reaching the GCD. The divisions are shown in the following matrix:

    \left[\begin{array}{rrrrrrrr}      \phi(N) & = & 9007 & \times & T & + & 5166 & \text{ }\\      9007 & = & 5166 & \times & 1 & + & 3841 & \text{ } \\      5166 & = & 3841 & \times & 1 & + & 1325 & \text{ } \\      3841 & = & 1325 & \times & 2 & + & 1191 & \text{ } \\      1325 & = & 1191 & \times & 1 & + & 134 & \text{ } \\      1191 & = & 134 & \times & 8 & + & 119  & \text{ } \\      134 & = & 119 & \times & 1 & + & 15 & \text{ } \\      119 & = & 15 & \times & 7 & + & 14 & \text{ } \\      15 & = & 14 & \times & 1 & + & 1 & \text{*} \\      14 & = & 1 & \times & 14 & + & 0 & \text{ }    \end{array}\right]

where T is the following number:


The last column in the above table shows the remainders in the divisions. The last non-zero remainder (marked by *) is the GCD of 9007 and \phi(N), in this case 1. Next, we apply extended Euclidean algorithm to solve the following equation:

    9007 \times x + \phi(N) \times y = 1

The solution x will be the decryption exponent d.

Step 2. Extended Euclidean algorithm

The extended Euclidean algorithm is to take the above table of divisions and perform back substitutions. The process of doing back substitutions is logically clear but can be tedious. We use a tabular formulation of this process as shown below.

    \left[\begin{array}{rrrrrrrrr}      \text{divisor} & \text{ } & \text{quotient} & \text{ } & \text{back substitution} & \text{ } & \text{sign} & \text{ }\\      \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ }\\      \phi(N) & \text{ } & \text{ } & \text{ } & W & \text{ } & - & \text{ } & \text{Step 9}\\      9007 & \text{ } & T & \text{ } & 605 & \text{ } & + & \text{ } & \text{Step 8}\\      5166 & \text{ } & 1 & \text{ } & 347 & \text{ } & - & \text{ } & \text{Step 7}\\      3841 & \text{ } & 1 & \text{ } & 258 & \text{ } & + & \text{ } & \text{Step 6}\\      1325 & \text{ } & 2 & \text{ } & 89 & \text{ } & - & \text{ } & \text{Step 5}\\      1191 & \text{ } & 1 & \text{ } & 80 & \text{ } & + & \text{ } & \text{Step 4}\\      134 & \text{ } & 8 & \text{ } & 9 & \text{ } & - & \text{ } & \text{Step 3}\\      119 & \text{ } & 1 & \text{ } & 8 & \text{ } & + & \text{ } & \text{Step 2}\\      15 & \text{ } & 7 & \text{ } & 1 & \text{ } & - & \text{ } & \text{Step 1} \\      \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ } & \text{ }\\      14 & \text{ } & 1 & \text{ } & \bold 1 & \text{ } & + & \text{ }\\      1 & \text{ } & 14 & \text{ } & \bold 0 & \text{ } & - & \text{ }          \end{array}\right]

where W is the following number:


Let’s explain the table for the back substitution. The first number in the first column is \phi(N). The rest of the numbers in the first column are the divisors in the Euclidean algorithm. Note that the bottom number is always the GCD. The second column consists of the quotients in the Euclidean algorithm. The third column is where the back substitution takes place. It works from bottom up. At the bottom of the third column, the numbers are always 0 and 1 (in bold). To make it clear to see, there is blank row separating the bottom two rows from the rows above. The fourth column has the signs of the back substitution numbers. Starting with minus at the bottom, the signs are alternating: minus, plus, minus, plus etc (going from bottom to top).

We now demonstrate the calculation for the back substitution. The following shows the calculation for the rows labeled Step 1, Step 2, Step 3 and Step 9:

    1=1 \times 1 + 14 \times 0 (Step 1)

    1=7 \times 1 + 1 \times 1 (Step 2)

    1=1 \times 8 + 7 \times 1 (Step 3)

    W=T \times 605 + 1 \times 347 (Step 9)

The idea is that each number is the sum of the cross multiplications of the two rows below. This recurrence relation is a more efficient way of performing extended Euclidean algorithm.

The first two rows of the above table give the solution to the equation 9007 \times x + \phi(N) \times y = 1. In the first two rows, cross multiply the divisor and back substitution columns (along with the sign) as follows:

    9007 \times (-W) + \phi(N) \times 605 = 1

Then -W is the solution to ed \equiv 1 \ (\text{mod} \ \phi(N)). We need d to be positive. To find d, simply add \phi(N) to -W. Thus d is the least positive integer such that -W \equiv d \ (\text{mod} \ \phi(N)) where


One additional comment about the above table. When you cross multiply divisor column and back substitution column for any two rows, the result is the GCD. In particular, when this is done on the first two rows, we solve the equation 9007 \times x + \phi(N) \times y = 1 and find the multiplicative inverse of e as a result.


Step 3 and Step 4. Decipher

With the decryption key d solved in the above step, we are now ready to calculate the following modular exponentiation:

    c^d \equiv m \ (\text{mod} \ N)

The standard approach is to use the fast powering algorithm. The idea is to use the binary expansion of the exponent d to convert the exponentiation c^d into a series of squarings and multiplications. The exponent d is a 426-bit numbers, which means that the fast powering algorithm only requires 425 squarings and at most 425 multiplications. The algorithm is fast and efficient since it runs in polynomial time. Our calculator produces the following result:


Using the predetermined mapping of A = 01, B = 02 (and so on), the numeric plaintext is converted to the following English phrase:

    the magic words are squeamish ossifrage

which is the title of this blog post.



The above demonstration shows how to derive the decryption key d from the knowledge of the two prime factors p and q of the modulus N. Thus the prime factors p and q should be kept secret. Of course, the decryption key d should also be kept secret so that the originator/creator of the public key is the only one who can read the encrypted messages sent to him or her.



Suppose the public key N and e is as above. Verify that encrypting the plaintext m will derive the ciphertext c given in the challenge problem. The plaintext m is repeated below.




  1. Atkins D., Graff M., Lenstra, A. K. Lenstra, Leyland P. C., The magic words are squeamish ossifrage, ASIACRYPT-94, Lecture Notes Comput. Sci. Volume 917, Springer-Verlag, Berlin, 1995.

\copyright \ \ 2014 \ \text{Dan Ma}