The foundation of Web security rests on the notion that two very large prime numbers, numbers divisible only by themselves and 1, once multiplied together are irreducibly difficult to tease back apart. Researchers have discovered, in some cases, that a lack of entropy—a lack of disorder in the selection of prime numbers—means by analogy that most buildings on the Web would stand in spite of gale winds and magnitude 10 earthquakes, while others can be pushed over with a finger or a breath. The weakness affects as many as 4 in 1,000 publicly available secured Web servers, but it appears in practice that few to no popular Web sites are at risk.

**Why primes are useful for security**

Web security relies on generating two large prime numbers with sufficient randomness that it is vanishingly unlikely that any two parties in the world would ever stumble across the same number twice—ever. It's impossible to assure, but uniqueness is nevertheless a requirement of the most nearly universally relied-upon algorithm, RSA, to secure Web browser/server transactions via HTTPS.

Here's how it works. Take two big prime numbers that are also similar in size and multiply them:

*858599503 * 879190747 = 754872738416398741*

The only legitimate divisors (the factors of the product) of such a number are 1, the number itself, and the two primes. The fact that it divides by 1 and itself are obvious. But even using all available mathematical and computational approaches, it takes an inordinate amount of time to tease out the two prime numbers used to create it.

So long as enough entropy is involved, larger primes and thus a larger product—at least 1024 bits long or over 300 digits in decimal, but, as we'll see, far better at 2048 or 4096 bits—result in better security. (The example above, only 23 bits long in binary, would be woefully inadequate, taking perhaps seconds to factor.)

Less entropy means that the same large prime numbers wind up being used repeatedly, which makes finding their factors simple for publicly reachable sites, routers, and other hardware.

**What's wrong with RSA**

The flaw in randomness that has been revealed has to do with the *public *part of public-key cryptography; RSA is a public-key algorithm and virtually all public Web servers rely on it. With the RSA algorithm, the two primes are used as part of a method that derives two related exponents—further multiplications of a number by itself. The exponents and the primes' product are used to encrypt and decrypt a message. The product of the primes and the public exponent may be distributed freely without compromising the private exponent (nor the two prime factors, for that matter, which aren't used again). A sender with a recipient's public key sends a message; the recipient uses the private key to reverse the process (using a *multiplicative inverse* operation, as I just learned).

The weakness found by two different sets of researchers doesn't relate to the algorithm. It's already known that sufficiently small products may be factored to their original primes through the best-known computational method. Products of 512 bits (150 decimal digits) or smaller are essentially broken; 768 bits is within reach, but only with substantial effort.

No, the problem isn't with a new method of speeding up factorization. Rather, as noted early, it's about entropy and the public nature of public keys. Web sites that use SSL/TLS for encrypting connections with clients publish a certificate that the client receives as part of an initial handshake to establish security. This certificate includes the public key information along with other identifiers and, in nearly all circumstances, a signature from a certificate authority (CA) to allow additional validation. (There are problems with certificate authorities' authoritativeness, too, but that's a different issue.)

These certificates from public Web sites may be freely retrieved. This is happening ever more frequently because the growing lack of trust in the security measures employed by CAs to keep SSL/TLS certificates from being generated for illegitimate parties has led to various projects scanning all public Web sites on a continuous basis to determine and alert users and site operators if the certificate for a given domain changes over time. This led the Electronic Frontier Foundation (EFF), for one, to pull down most recently 7.1 million certificates as part of its SSL Observatory. Another research group harvested millions of public certificates through their own efforts in a relatively fast and painless operation.

Once you have a large database of certificates, you can extract the public keys and look at the products. Researchers from the École Polytechnique Fédérale de Lausanne (Switzerland) and one independent collaborator released an academic paper on February 14th that examined both public keys in SSL/TLS certificates and PGP-style public keys used typically for person-to-person communications. The PGP keys revealed almost no problems; RSA keys in Web certificates, on the other hand, were very troublesome. The authors found that about 0.2% (2 in a 1,000) of the dominant 1024-bit RSA keys they checked in a first set and closer to 0.4% of a larger set with more recent data were vulnerable because two or more certificates shared a prime factor.

It's not impossible that the RSA key generation software would have "collisions," in which the same primes were generated on multiple occasions by different systems. But it should be far more unlikely. Thus, there is some flaw that prevents the degree of randomness necessary to ensure the least possible repetition. In private use, this wouldn't matter. But because public-key cryptography relies on publishing keys, such overlap may be found easily. (CAs don't typically generate the public/private key pair. System administrators generate an RSA key using software like openssl on a computer under the admin's control. That key is then bound up in a certificate signing request that's submitted to a CA to get its imprimatur.)

I'm not a mathematician, so I can't translate the specifics of how sets of different products sharing a single prime factor can result in a relatively easy extraction of the other prime. Once you have both primes, the calculation necessary to create the private key exponent may performed without any fuss. That eliminates all obfuscation from transactions conducted using that RSA key.

SSL/TLS and other methods of establishing secure network connections use public-key cryptography for the initial part of the setup to exchange a short symmetrical key used by both parties that's much faster to compute, and is used only for a session or part of a session. The public-key portion ensures that the session key isn't sniffed. But if the private key is known, the initial exchange of the session key may be decrypted through passive data sniffing, and thus the session key revealed. In some implementations, as EFF notes in a post on this research, learning the private key could allow decryption of previous encrypted communications if someone attempting to intercept such traffic retained records for later cracking.

The Lausanne researchers worried about disclosure, as their work could be easily replicated. In the end, they notified as many certificate owners with extractable primes as they could, based on information in the certificates and at Web sites at which the certifications are used. They found that to be quite spotty in practice. EFF is also engaged in notifying vulnerable parties.

**The fallout**

This is all troubling, but what's the upshot for practical threats?

First, the authors of the Lausanne paper found that 2048-bit RSA keys aren't invulnerable, but a literal handful (10 keys out of 3.2 million in the latest dataset examined) could be factored. That compares to about 20,000 out of 3.7 million 1024-bit keys. EFF's recent scan also shows that Web sites are migrating to 2048 bits, with a shift of nearly a quarter of all certificates moving from 1024 to 2048 bits from its August 2010 to its 2012 scan.

Second, for a broken certificate to be used to sniff data, an observer has to be on the wire, in a position between clients and the server. If you can break into a server to monitor traffic, you don't need to subvert SSL/TLS. You can grab the decrypted data on the server as it arrives. Being able to sniff at a major data center or Internet exchange where SSL/TLS is assumed to cover any risks would possibly be easier to arrange, but not trivial. Being able to be a man in the middle at the right point in the topology is largely relegated to corporate spies and government agencies.

Third, a separate group of researchers outed themselves on February 15th, as they had been engaged in a similar set of research. In a post by Nadia Heninger at Princeton's Freedom to Tinker blog, she explained we shouldn't get our IP panties in a wad over this. Her group, which has a forthcoming paper it will release after finishing disclosures to affected hardware manufacturers, found that the primary position for these weak public keys was in embedded devices, such as consumer routers and corporate firewalls. She suggests that embedded device makers have the most work to do.

This is the dilemma with all security and encryption issues. This is a problem—a serious problem! And it's not theoretical, but the practical impact appears to be small, and it's fixable in a reasonable amount of time without an insane amount of effort. But there's no way to know which popular sites used weak RSA keys unless someone replicates this work and publishes it openly. Nor can we know whether any data was extracted from sessions conducted with them. The odds seem to favor few sites and no data, but it's simply impossible to know.

You can ignore any hype about this problem. RSA, public-key cryptography, and SSL/TLS certificates aren't broken or even damaged. We just need a little more random acts of key generation (and an option when making keys to check them against a public database of identical prime products), and the problem will disappear.

I am a mathematician, so I can tell you how it works:

If two numbers share a prime factor, then a fraction of those two numbers can be reduced by that factor.

The usual way it’s taught is that we have to find the prime factors first, but that’s not true: we can use something called the Euclidean Algorithm, which is a lot faster and you don’t have to find any primes. Basically it goes like this.

If we have a fraction, say, 15/27, it’s relatively obvious that its reciprocal (27/15) will be reducible by the same amount. And the mixed number version of that (1 12/15) will, too, and the whole part of the mixed number doesn’t matter when we try to reduce, so 12/15 will be reducible by the same amount. Repeat this process (15/12 -> 1 3/12 -> 3/12 -> 12/3 -> 4) and eventually the mixed number version will have no remainder! When that happens, the denominator it would have (3) is the common factor between the numbers.

A lot of this though is just moving numbers around; the only actual arithmetic here is that we divide and get the remainder, an operation programmers call “mod” and use the symbol % to enact.

in python:

`def gcf(a, b):`

while b:

a, b = b, a % b

return a

27 % 15 = 12

15 % 12 = 3

12 % 3 = 0

common factor is 3.

Now, since we know that the keys are the product of two prime numbers, we know that their common factor is either 1, in which case they don’t share anything, or one of those prime factors, in which case they share that factor, and simple division can find the other, compromising the private key.

Now I get it! Thanks to you and Eric0142.

Say P and Q are both products of exactly two prime numbers. If by chance P = a*b, and Q = a*c, for a,b,c prime, then we can recover the factors given only the products P and Q.

How? The Euclidean algorithm for computing the greatest common divisor (gcd).

Why? The definition of the gcd of two integers (ie. gcd(P,Q)) is the largest integer which divides both P and Q. If the gcd is 1, then we say the numbers are relatively prime: they have no prime factors in common. If they did, then we could use that number to divide both and have a number larger than 1 that divided them both.

So here if P = a*b, and Q = a*c because a prime was reused, then gcd(P,Q) = a != 1. Because a will divided both P and Q. And since we know that P and Q are products of exactly two prime numbers, it must be the case that a is one of the prime factors (otherwise, we would have a number which divided a prime that wasn’t 1 or the prime itself). And from there we can compute b = P/a, and c = Q/a. We know that P/a and Q/a will be integers because of the definition of the gcd (the value of the gcd must divide the arguments)

The procedure given above can compute the GCD using the Euclidean algorithm. It is very fast and “the worst case behavior requires a number of steps that can never be more than five times the number of its digits (in base 10)” (wikipedia). So even though 2048 bit integers are on the order of 10^600, it’ll never take more than 10,000 arithmetic operations to compute the GCD. Even though it’s working with large numbers, they get small very fast.

You an extract the common factor using the Eucli… oh, someone said that already?

What I’m interested in is how the same primes got “randomly” chosen in the first place.

Primes are a lot more common than one might think, so a fairly common method to generating them is to pick a random number and start scanning from there (ex: at random, I chose 699, which isn’t prime, but 701 is, or picking 104 to get the prime 107). I wonder if there is an abnormally long stretch of composites in the middle of the common 1024-bit search space so the first prime greater than that gets picked more often than it should.

Even if there isn’t an abnormally long stretch of composites, there is always variation in the probability of generating a particular prime when doing it this way: if we’re generating 3-digit primes, 907 will show up ten times as often as 883.

It seems like a mix of the “Birthday Problem” and poor key generation. With N possible primes of particular size, a sample of some proportion of sqrt(N) is likely to have collisions. Mostly though it looks like flaws in the key generation routines of things like routers.

“Birthday Problem” is doubtful. There’s approximately 2^1013 primes with 1024 bits, which would imply a possible collision after generating over 2^500 key-pairs.

Definitely the key generation algorithm is the major problem. But the “birthday problem” tends to bite people when they calculate something like “the probability of a collision even with this shortcut is 10^(-18)” and think that’s good enough.

The article says that it’s mostly embedded devices. My guess would be that the problematic part is “pick a random number”, because this is something computers are not naturally good at. Unless the device contains a specialised circuit (extra cost, would have to be designed in early), it just has to collect randomness and hope it’s good enough. What is it going to use? Number of milliseconds since it was turned on? There’s just not that many milliseconds… not enough to get 300 random digits for two 150-digit random numbers.

Of course, it’s also possible that the implementations are simply wrong. There’s a standard function called “rand” or “random”, and if you don’t know what you’re doing, it’s very easy to just go use that. Normally, that will give you 10 digits’ worth of randomness, total, at best, no matter how many times you call it. That randomness will be spread thin among the 300 digits you need, but it’ll still only be 10 digits’ worth.

I get around this by using primes so low that no one would think to check them. My public key is 21. Good luck factoring that, suckers!

Thanks for the explanation, Glenn, I saw a blurb on this yesterday but didn’t have a chance to look into the details. Very helpful.

And thanks everyone else for the stuff on the Euclidean algorithm.

Grammar alert: “The weakness *effects* as many as 4 in 1,000 publicly available secured Web servers, but it appears in practice that few to no popular Web sites are at risk.” That should be AFFECTS.

Grammar crisis averted! Fixed, and thanks.

This description cites “a lack of disorder in the distribution of prime numbers”, and implies this is the problem throughout. But my understanding is that the problem lies in selecting a large prime in a random way — some primes are selected more frequently than others. So it is not a problem with how primes are spread out in the field of numbers (illustrated in the graphic), but a problem in one or more selection algorithms resulting in “a lack of disorder in the distribution of prime numbers selected for use in encryption”.

Excellent point. I changed it to “selection” as the issue is how primes are pulled out of the RNG, not how they distribute.

So, in other words, Sneakers for real, amirite?

I decided to see how long factoring the 32 bit example above with the factor linux command line utility:

:~$ time factor 754872738416398741

754872738416398741: 858599503 879190747

real 0m2.482s

user 0m2.480s

sys 0m0.000s

Yep, it took 2.482 seconds.

Now compare this to factoring both 754872738416398741 and 759883073492568163, knowing that they have a factor in common.

Using the code I posted above, without any optimizations and in a language that isn’t exactly known for speed, this task takes 3.75 microseconds. For 1024-bit numbers it will take a few hundred times longer than that, which is still under a millisecond.

Using the best known prime factorization algorithm, simply factoring a 1024-bit number will take tens of millions times as long as your attempt.

Oh come on, you guys are overthinking this. The discoverers said they would not reveal the fault because it is too easy to take advantage.

My guess is that the random number generators are being fed the same seed and are therefore delivering the same ‘random’ number to many requests. It has got to be this dumb for the developers to miss it to begin with.

It’s not a fault as such. And the work is easily reproducible by anyone with a modest amount of interest who can scan all the public certificates in the world (which is easier than it sounds).

I like the illustration. Reminds me of this which I read as a kid… Gardner, M. “Mathematical Recreations: The Remarkable Lore of the Prime Number.” Sci. Amer. 210, 120-128, Mar. 1964.

Hey I’m working on a cinematic educational series which explains this in detail called “Gambling with Secrets” on YouTube. Check it out if interested:

http://www.youtube.com/artoftheproblem

It’s not clear from this whether it’s a few certificates with very low entropy, or more certificates with only moderately low entropy. Assuming it’s the latter, Eve could use the certificate generating programs to generate many more certificates than are publicly available, and thus find more compromised public certificates.

That is a superb point. If one could determine the RNG in question, it would be possible to create prime collisions for factoring.

How many pairs did they compare? Even though GCD is a very fast operation, if they did this for all distinct pairs amongst 6 million keys, that’s a lot of pairs… but is that what they did?

only 18 trillion. Written in C and given to a small compute center and it’d be done in a week or two.

No, we had a more efficient way, and I think Heniger et al. used yet another.

All in all it took a few hours on a single desktop machine.

Very nice top image for the post, Glenn. Is it available in larger sizes?