css.php

Is RSA Safe?

There has been some talk in the news recently that the security provided by the RSA encryption algorithm isn’t as secure as it used to be.

RSA is an acronym standing for Rivest, Shamir, and Adleman, the individuals who designed the algorithm at MIT in 1977.

Rivest, Shamir, Adleman

Rivest, Shamir and Adleman

An equivalent system was developed by British mathematician Clifford Cocks in 1973, though at the time he was working for a clandestine branch of the government, and his work went unattributed for many years.

Clifford Cocks

Clifford Cocks

The RSA algorithm is one of the cryptographic workhorses of the internet, helping to put the “s” in “https” on the websites we use every day.  Through ingenious means, which we won’t discuss here, it is also used to produce digital signatures, which guarantee that messages originate from their specified sources.

The article linked above, concerning looming insecurities in RSA, discusses a recent talk delivered by Alex Stamos at the annual Black Hat conference in Las Vegas.  From the article it is difficult to fathom how dire the crisis might be.  It’s said that Stamos is disconcerted by recent progress made by French mathematician Antoine Joux on the discrete logarithm problem.  There is no comment from Joux, whose word I would find more definitive.  How certain can we be that the alarm isn’t an illusion?

Those things will become more clear in coming weeks no doubt.  But what is the discrete logarithm problem, and how does it relate to RSA?

We should start with a review of the ordinary logarithm function, invented by the Scottish wizard Napier in the 17th century. We then give a short description of modular arithmetic, define the discrete logarithm problem, and review RSA.  Finally, we’ll see how all these fit together, and why the ability to compute discrete logarithms quickly challenges current security protocols.

A logarithm is a function which takes two positive real numbers, a base b together with an input a and returns the number c such that b^c = a. In notation we express this by \log_b(a) = c.

The fact that there is such a number c is interesting in itself.  The existence of c can quickly be confirmed by examining a graph of the exponential function.  The following figure is for b = 2, though for other values of b the figure is essentially the same.  Note that for any output (on the y axis) there is an input (on the x axis) such that the function strikes the output with the input.  This justifies the definition of the logarithm.

exp2

The graph of the curve described by y = 2^x

To understand the discrete logarithm, it is necessary to understand the discrete context.  Here we are not concerned with real numbers, but rather with integers (whole numbers).  In fact, we are only concerned with the numbers \{0,1,2,\ldots,n-1\} for a fixed integer n.  We can do arithmetic in this finite realm provided we are willing to “wrap around” when our sums an products go out of scope.  The exact nature of what happens is discussed in this introductory article from the Khan Academy.  If you would like something more serious, the article by Gauss himself is not difficult, and uses (in fact introduces) all the modern notation.  Amazingly I cannot find an English edition of Disquisitiones Arithmeticae online, and so I refer the reader to the excerpts found in the collection God Created the Integers.

To give a few fast examples, we write
5^2 \equiv 1 \mod{4}

to mean that if we were to consider 25 = 5^2 stones and count them out in groups of 4, in the last pile we would have a single stone.  We call 1 the residue of 25 modulo 4. Note that if we were to take 5 stones and count them out in groups of 4, then in the last pile we would have only 1 stone.  That is, the residue of 5 mod 4 is already 1.  Note also that in this case the product of the residues is the residue of the product.  In other words, using C++ notation,

5^2 \% 4 == (5\%4)(5\%4) because 1 = 1\cdot 1.

In fact this is true in general, and explains many of the properties of numbers we learn in grade school.  For instance we learn that a number is divisible by 3 if and only if its digits sum to three.  I will use modular arithmetic to show why this is true for the arbitrary example of 2349. First expand using the definition of a decimal number.

2349 = 2\cdot 10^3 + 3\cdot 10^2 + 4 \cdot 10 + 9

 

Now note that 10 \equiv 1 \mod{3} and so the same is true for any power of 10.  Thus 2349 \equiv 2+3+4+9 \mod{3}.

Now 2349 is divisible by three if and only if 2349 \equiv 2+3+4+9 \equiv 0 \mod{3}.  That is, it is equivalent to the condition that the sum of the digits is also 0 mod 3, or in other words that the sum of the digits is divisible by 3.  This gives something of the flavor of discrete arithmetic.

What should a logarithm be in the discrete context?  We can use the old definition, with an additional twist.  For numbers b and a in \{0,1,\ldots,n-1\} define the discrete logarithm (base b ) of a to be the number k in \{0,1,\ldots,n-1\} such that b^k \equiv a \mod{n}.  For instance, because we know that 11^5 \equiv 10 \mod{17}, it follows that the discrete logarithm of 10 base 11 mod 17 is 5.

Again, we have the question of whether the logarithm is well defined.  Is it the case that for any choice of a,b and n, the integer k exists such that b^k \equiv a \mod{n}?  The answer is no — you should find it easy to produce a counter example.  Also, unlike the continuous cases, the discrete exponential function is not one-to-one.  This means that the uniqueness of k is also an issue.

In group theoretic terms, the base now has to be a generator of the multiplicative group of integers mod n in order for the definition of logarithm to make sense.  These details don’t matter for a rough discussion of discrete logs as they apply to cryptography.

Questions of efficient means of computing discrete logarithms arise in many cryptographic systems, but we will focus our attention of RSA.  At this point we need some account of what RSA is, for which I have written the this using the iPython notebook to produce an annotated example.

After following the link and diligently reading, you must now know that the crux of the RSA algorithm is the decryption step M = C^{d} \mod{n}.   Can we cast this as a question about a logarithm?  Recall from the RSA example that the secret parts of the above equation are d and M.  The ciphertext C is public knowledge, as is the modulus n.  But anyone is free to encrypt a message using any public key.  This means that we can pick M, and so we can know that value too.  Thus the real mystery is d, the private exponent.

To find d, what we need to know is:  To what power must C be raised in order to be congruent to M modulo n?

In other words, to crack RSA we want to know the discrete logarithm of M base C modulo n.

For this reason, if Joux or his colleagues ever do find a fast method for computing discrete logarithms, the current implementations of many common cryptographic systems, including systems for producing digital signatures, will become obsolete.

This is not the only way to break RSA.  It seems like it should be easier in fact to crack RSA for a particular message M rather than find d and unravel the whole system.  To do this, imagine that the message M is produced not by us but by someone communicating with the victim.  To recover M we must solve for it in the equation

C = M^e \mod{n},

in which it is the only unknown.  This is not a logarithm problem, but is instead the problem of discrete root extraction.  In fact this problem has its own name — it is called the RSA problem.  Obviously no practical means is yet known for solving this problem either.

RSA could fall because of advances in the science of number factoring.  While this has not yet led to the gelding of RSA as far as anyone is saying, still the speed with which numbers can be factored has improved in dramatic and unexpected ways.

Shortly after RSA was announced, the popular mathematics writer Martin Gardner asked Rivest, Shamir, and Adleman for an encrypted message with which he could tease his readers.  They agreed, and produced an encoded message using a 129-digit public key.  The value of n in the key was: 1143816257578888676692357799761466120102182 967212423625625618429357069352457338 97830597123563958705058989075147599290026879543541

The prize for producing a solution was $100.  Rivest calculated, based on mathematical technology existing at the time, that factoring this number would require 40 quadrillion years.  This figure assumed a machine capable of performing 1 billion modular multiplications per second, which seems to have been achieved at the PC level only in 2009.

As this article explains, Rivest’s figures were off by many orders of magnitude, but not because he underestimated the growth in computing power.  Rather, he was overly optimistic about innovations in factoring large numbers, in particular sophisticated variants of the quadratic sieve. This article by Pomerance outlines some of the history.

For those who are curious about the solution to Gardener’s puzzle, it may interest you to know that (as a team of hackers found in 1994)

1143816257578888676692357799761466120102182 967212423625625618429357069352457338 97830597123563958705058989075147599290026879543541 = 34905295108476509491478496199038 98133417764638493387843990820577 \times 32769132993266709549961988190834 461413177642967992942539798288533

The message which was encoded read: THE MAGIC WORDS ARE SQUEAMISH OSSIFRAGE

I am indebted to Julian Brown’s book The Quest for the Quantum Computer for this anecdote.  Incidentally Brown’s book is a good starting place for reading about that other perennial threat to our online security: quantum computing.

 

 

This entry was posted in Uncategorized. Bookmark the permalink.

6 Responses to Is RSA Safe?

  1. John Smith says:

    Thanks Hunter Johnson for such grateful insights about RSA. I was reading it on many online sources but did not get it briefly. I would love to have this in my bookmarks and tweet it for my other mates to read it.

    Regards
    John.

  2. Wow, very good read.

  3. funmath says:

    Great Article about RSA and why it is difficult to break.

  4. Dear Michael,

    Thank you for these thoughtful corrections!

  5. Michael Pace says:

    Just in terms of general fastidiousness or the lack thereof:

    “OSIFRAGE” should be OSSIFRAGE.

    “Gardener” should be Gardner.

    “Pommerance” should be Pomerance.

    And “. . . whether the logarithm is well defined. Is it the case that for any choice of a, b, and n, the integer k exists such that a^k = b (mod n) ?” should also include the issue of whether k is unique (among {0, 1, . . ., n-1}).

    Finally, since the article is about the discrete logarithm, these details -do- matter.

Leave a Reply

Your email address will not be published. Required fields are marked *