Prime Suspect, or Random Acts of Keyness

28 Responses to “Prime Suspect, or Random Acts of Keyness”

  1. Dan Uznanski says:

    I am a mathematician, so I can tell you how it works:

    If two numbers share a prime factor, then the fraction formed from those two numbers can be reduced by that factor.

    The usual way it’s taught is that we have to find the prime factors first, but that’s not true: we can use something called the Euclidean Algorithm, which is a lot faster and doesn’t require finding any primes at all.  Basically it goes like this.

    If we have a fraction, say, 15/27, it’s relatively obvious that its reciprocal (27/15) will be reducible by the same amount.  And the mixed number version of that (1 12/15) will, too, and the whole part of the mixed number doesn’t matter when we try to reduce, so 12/15 will be reducible by the same amount.  Repeat this process (15/12 -> 1 3/12 -> 3/12 -> 12/3 -> 4)  and eventually the mixed number version will have no remainder!  When that happens, the denominator it would have (3) is the common factor between the numbers.

    A lot of this, though, is just moving numbers around; the only actual arithmetic here is dividing and keeping the remainder, an operation programmers call “mod” and write with the symbol %.

    in python:

    def gcf(a, b):
      # Repeatedly replace (a, b) with (b, a % b); when the remainder
      # hits zero, a holds the greatest common factor.
      while b:
        a, b = b, a % b
      return a

    27 % 15 = 12
    15 % 12 = 3
    12 % 3 = 0
    common factor is 3.

    Now, since we know that the keys are the product of two prime numbers, we know that their common factor is either 1, in which case they don’t share anything, or one of those prime factors, in which case they share that factor, and simple division can find the other, compromising the private key.
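
    In code, under that same assumption that each key is the product of exactly two primes (toy numbers here; real keys are 1024 bits or more):

```python
from math import gcd

# Toy moduli that secretly share the prime factor 1009.
p, q, r = 1009, 2003, 3001
n1 = p * q           # one victim's public modulus
n2 = p * r           # another victim's public modulus

shared = gcd(n1, n2)      # the Euclidean algorithm finds the reused prime
other1 = n1 // shared     # simple division recovers the remaining factor
other2 = n2 // shared
print(shared, other1, other2)   # 1009 2003 3001 -- both keys compromised
```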

  2. Eric0142 says:

    Say P and Q are both products of exactly two prime numbers. If by chance P = a*b, and Q = a*c, for a,b,c prime, then we can recover the factors given only the products P and Q.

    How? The Euclidean algorithm for computing the greatest common divisor (gcd).

    Why? The definition of the gcd of two integers (i.e. gcd(P,Q)) is the largest integer which divides both P and Q. If the gcd is 1, then we say the numbers are relatively prime: they have no prime factors in common. (If they did share one, that factor would be a number larger than 1 dividing them both, so the gcd couldn’t be 1.)

    So here, if P = a*b and Q = a*c because a prime was reused, then gcd(P,Q) = a != 1, because a divides both P and Q. And since we know that P and Q are products of exactly two prime numbers, a must be one of the prime factors (otherwise some number other than 1 or the prime itself would divide a prime). From there we can compute b = P/a and c = Q/a. We know that P/a and Q/a will be integers because of the definition of the gcd (the gcd must divide its arguments).

    The procedure given above can compute the GCD using the Euclidean algorithm. It is very fast: “the worst case behavior requires a number of steps that can never be more than five times the number of its digits (in base 10)” (Wikipedia). So even though 2048-bit integers are on the order of 10^600, it’ll never take more than 10,000 arithmetic operations to compute the GCD. Even though it starts with large numbers, they get small very fast.
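
    That worst-case bound is easy to see experimentally. Consecutive Fibonacci numbers are the classic worst case for the Euclidean algorithm (Lamé’s theorem); here’s a quick sketch with a step counter bolted onto the gcd loop from above:

```python
def gcd_steps(a, b):
    """Euclidean algorithm that also counts its division steps."""
    steps = 0
    while b:
        a, b = b, a % b
        steps += 1
    return a, steps

# Consecutive Fibonacci numbers force the maximum number of steps.
fib = [1, 1]
while len(fib) < 100:
    fib.append(fib[-1] + fib[-2])

g, steps = gcd_steps(fib[-1], fib[-2])
digits = len(str(fib[-1]))
print(g, steps, digits)   # gcd 1 in 98 steps for a 21-digit number (<= 5*21)
```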

  3. Buddha Buck says:

    You can extract the common factor using the Eucli… oh, someone said that already?

    What I’m interested in is how the same primes got “randomly” chosen in the first place.   

    Primes are a lot more common than one might think, so a fairly common method of generating them is to pick a random number and start scanning from there (e.g. at random I chose 699, which isn’t prime, but 701 is; or picking 104 to get the prime 107).  I wonder if there is an abnormally long stretch of composites in the middle of the common 1024-bit search space, so that the first prime greater than it gets picked more often than it should.
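
    That generate-and-scan method is easy to sketch. Here’s a rough version using a Miller–Rabin primality test (the scanning strategy is the point, not any particular real implementation):

```python
import random

def is_probable_prime(n, rounds=20):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False   # a is a witness that n is composite
    return True

def next_prime_from(start):
    """Scan upward from a starting point until we hit a prime."""
    n = start | 1                  # make it odd first
    while not is_probable_prime(n):
        n += 2
    return n

print(next_prime_from(699), next_prime_from(104))   # 701 107
```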

    • Dan Uznanski says:

      Even if there isn’t an abnormally long stretch of composites, there is always variation in the probability of generating a particular prime when doing it this way: if we’re generating 3-digit primes, 907 will show up ten times as often as 883.
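
      That bias can be measured exhaustively for 3-digit primes (trial division is fine at this size):

```python
# Exhaustively simulate "pick a random 3-digit number, scan up to the
# next prime" and count how often each prime gets produced.
def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

counts = {}
for start in range(100, 1000):     # every possible 3-digit starting point
    n = start
    while not is_prime(n):
        n += 1
    counts[n] = counts.get(n, 0) + 1

# 887 -> 907 is a long prime gap; 881 -> 883 is a short one.
print(counts[907], counts[883])    # 20 vs 2: the tenfold bias
```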

    • taintofevil says:

      It seems like a mix of the “Birthday Problem” and poor key generation.  With N possible primes of a particular size, a sample on the order of sqrt(N) is likely to contain collisions.  Mostly, though, it looks like flaws in the key-generation routines of things like routers.

      • Buddha Buck says:

        “Birthday Problem” is doubtful.  There are approximately 2^1013 primes with 1024 bits, which would imply a possible collision only after generating over 2^500 key-pairs.

        • taintofevil says:

          Definitely the key generation algorithm is the major problem.  But the “birthday problem” tends to bite people when they calculate something like “the probability of a collision even with this shortcut is 10^(-18)” and think that’s good enough.
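
          A rough back-of-the-envelope version of that birthday estimate, assuming (generously) that every prime is drawn uniformly from the ~2^1013 candidates mentioned above:

```python
# Expected number of birthday collisions among k draws from N choices is
# roughly k*k / (2*N); working in log2 keeps the huge numbers manageable.
def log2_expected_collisions(log2_k, log2_n=1013):
    return 2 * log2_k - 1 - log2_n

print(log2_expected_collisions(23))    # -968: ~8 million keys, essentially zero
print(log2_expected_collisions(507))   # 0: collisions expected near 2**507 keys
```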

    • sabik says:

      The article says that it’s mostly embedded devices. My guess would be that the problematic part is “pick a random number”, because this is something computers are not naturally good at. Unless the device contains a specialised circuit (extra cost, would have to be designed in early), it just has to collect randomness and hope it’s good enough. What is it going to use? Number of milliseconds since it was turned on? There’s just not that many milliseconds… not enough to get 300 random digits for two 150-digit random numbers.

      Of course, it’s also possible that the implementations are simply wrong. There’s a standard function called “rand” or “random”, and if you don’t know what you’re doing, it’s very easy to just go use that. Normally, that will give you 10 digits’ worth of randomness, total, at best, no matter how many times you call it. That randomness will be spread thin among the 300 digits you need, but it’ll still only be 10 digits’ worth.
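
      A toy model of that failure mode (a hypothetical device whose only entropy source is milliseconds since boot):

```python
import random

# If "milliseconds since boot" is the only entropy, there are only as many
# possible keys as there are plausible boot times -- whatever the key size.
def weak_keygen(ms_since_boot):
    rng = random.Random(ms_since_boot)   # the seed fully determines the output
    return rng.getrandbits(1024)         # stand-in for "generate p and q"

keys = {weak_keygen(ms) for ms in range(60_000)}   # first minute of uptime
print(len(keys))   # at most 60,000 distinct keys, however many devices exist
```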

  4. semiotix says:

    I get around this by using primes so low that no one would think to check them. My public key is 21. Good luck factoring that, suckers! 

  5. wysinwyg says:

     Thanks for the explanation, Glenn, I saw a blurb on this yesterday but didn’t have a chance to look into the details.  Very helpful.

    And thanks everyone else for the stuff on the Euclidean algorithm.

  6. Nancy J Gill says:

    Grammar alert: “The weakness *effects* as many as 4 in 1,000 publicly available secured Web servers, but it appears in practice that few to no popular Web sites are at risk.” That should be AFFECTS.

  7. Mark Dow says:

    This description cites “a lack of disorder in the distribution of prime numbers”, and implies this is the problem throughout. But my understanding is that the problem lies in selecting a large prime in a random way — some primes are selected more frequently than others. So it is not a problem with how primes are spread out in the field of numbers (illustrated in the graphic), but a problem in one or more selection algorithms resulting in “a lack of disorder in the distribution of prime numbers selected for use in encryption”.

  8. Halloween_Jack says:

    So, in other words, Sneakers for real, amirite?

  9. Jim Jones says:

    I decided to see how long factoring the 32-bit example above would take with the factor Linux command-line utility:

     :~$ time factor 754872738416398741
    754872738416398741: 858599503 879190747

    real    0m2.482s
    user    0m2.480s
    sys    0m0.000s

    Yep, it took 2.482 seconds.

    • Dan Uznanski says:

      Now compare this to factoring both 754872738416398741 and 759883073492568163, knowing that they have a factor in common.

      Using the code I posted above, without any optimizations and in a language that isn’t exactly known for speed, this task takes 3.75 microseconds.  For 1024-bit numbers it will take a few hundred times longer than that, which is still under a millisecond.

      Using the best known prime factorization algorithm, simply factoring a 1024-bit number will take tens of millions of times as long as your attempt.
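
      For anyone who wants to reproduce the comparison with those same two numbers (timings will vary by machine, but the gcd really is near-instant):

```python
import time
from math import gcd

n1 = 754872738416398741   # the number factored above
n2 = 759883073492568163   # a second modulus sharing one of its prime factors

t0 = time.perf_counter()
g = gcd(n1, n2)
elapsed = time.perf_counter() - t0
print(g, n1 // g, n2 // g, elapsed)   # shared prime 858599503, in microseconds
```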

  10. timquinn says:

    Oh come on, you guys are overthinking this. The discoverers said they would not reveal the fault because it is too easy to take advantage of.
     
    My guess is that the random number generators are being fed the same seed and are therefore delivering the same ‘random’ number to many requests. It has got to be this dumb for the developers to miss it to begin with.
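
    That guess is easy to illustrate: two devices that seed a deterministic generator identically derive identical “random” values (Python’s generator shown purely for illustration):

```python
import random

# Two "devices" that happen to start from the same seed...
a = random.Random(1234).getrandbits(512)
b = random.Random(1234).getrandbits(512)
print(a == b)   # True: same seed, same "random" number, same key material
```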

    • It’s not a fault as such. And the work is easily reproducible by anyone with a modest amount of interest who can scan all the public certificates in the world (which is easier than it sounds).

  11. I like the illustration. Reminds me of this which I read as a kid… Gardner, M. “Mathematical Recreations: The Remarkable Lore of the Prime Number.” Sci. Amer. 210, 120-128, Mar. 1964.

  12. Brit Cruise says:

    Hey I’m working on a cinematic educational series which explains this in detail called “Gambling with Secrets” on YouTube. Check it out if interested:

    http://www.youtube.com/artoftheproblem

  13. main says:

    It’s not clear from this whether it’s a few certificates with very low entropy, or more certificates with only moderately low entropy.  Assuming it’s the latter, Eve could use the certificate generating programs to generate many more certificates than are publicly available, and thus find more compromised public certificates.

  14. Max Flander says:

    How many pairs did they compare? Even though GCD is a very fast operation, if they did this for all distinct pairs amongst 6 million keys, that’s a lot of pairs… but is that what they did?
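
    Naively that would be about 1.8 × 10^13 gcds (6 million choose 2). The researchers reportedly avoided this with a batch-gcd/product-tree trick; a much-simplified version of the idea is to compare each modulus against the product of all the others:

```python
from math import gcd, prod

def shared_factors(moduli):
    # One big product lets each modulus be checked against all the
    # others with a single gcd instead of millions of pairwise ones.
    total = prod(moduli)
    hits = {}
    for n in moduli:
        g = gcd(n, total // n)   # gcd of n with the product of everyone else
        if g > 1:
            hits[n] = g
    return hits

# Toy demo: four "keys", two of which secretly share the prime 1009.
keys = [1009 * 2003, 1013 * 3001, 1009 * 4001, 1019 * 5003]
print(shared_factors(keys))   # the two bad keys map to their shared prime
```

    The real batch-gcd method uses a product tree and remainder tree to keep the intermediate numbers manageable at the scale of millions of keys; this sketch only shows the core idea.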

  15. Nick O'Neill says:

    Very nice top image for the post, Glenn. Is it available in larger sizes?
