The password breaches are getting stronger and worser, and hardly a week goes by without a dump that's a couple zeroes bigger than the biggest to date — but not all password breaches are created equal, and a lot depends on whether and how the passwords were hashed.
Password hashing uses "one-way functions" to convert passwords like "p4ssw0rd" into strings of numbers like "2a9d119df47ff993b662a8ef36f9ea20" (these numbers are called "hashes"). When you enter your password, the system hashes it and compares the hash to decide whether you've typed in the correct password. It doesn't save the password at all. In theory, it's very very hard to convert the hash back into the password, but it's an arms race.
Some hashing algorithms are old and have known weaknesses. But even for modern, strong hashing algorithms, attackers can use "rainbow tables" to break them. These are files that contain tens of millions of possible passwords (such as all four- or five-character combinations, as well as words and phrases, with and without common substitutions like 0 for "o") and their hashes.
Enter "salting," which combines the password with a random string, forcing attackers to produce new rainbow tables for each salt. When combined with algorithms that are computationally expensive enough to make this a huge expense (but not so computationally expensive as to preclude being used for actually authenticating users!), these make the job of attackers much, much harder — maybe even practically impossible.
But it's a big maybe. The kinds of computers used in big data analysis and bitcoin mining are also very efficient at producing rainbow tables, and there's a lot of money being poured into making these computers faster and cheaper, meaning the cost (in time and money) of generating a new rainbow table to attack a salted password file keeps on dropping.
I wrote a novella about the end-game of this, about a group of computer scientists who realize that they can generate rainbow tables quickly and cheaply enough to break any password file, and who decide to use this power for good — they invent a chivalric code for their new power, and dub themselves the "Knights of the Rainbow Table" (there's also a seven-part reading of the story online).
Hash-cracking schemes have for decades been locked in a cat-and-mouse game with the security community's attempts to make hashing more secure. Switching from normal computer processors or CPUs to graphics processors or GPUs allowed password crackers to exploit those chips' ability to perform many simple tasks in parallel, accelerating their guessing as much as a thousandfold. Hash-crackers have developed so-called "rainbow tables," immense lists of pre-computed hashes for every possible password. And modern password crackers don't merely guess passwords at random, but use "dictionary attacks" to cycle through real words, collections of known common passwords from past breaches, and to use statistical analyses of those passwords to find patterns that allow new passwords to be guessed faster. (LinkedIn's 177 million passwords will no doubt give password crackers plenty of new material to study for developing future hash-cracking techniques.)
The security world has responded with its own tricks to slow, if not altogether stop, password hash-cracking. To prevent pre-computation, hashing schemes now use a trick called "salting," adding random data to a password before hashing it and then storing that "salt" value along with the hash. (LinkedIn didn't even go that far with its 2012 password collection.) And modern hashing techniques like bcrypt and Argon2 don't simply run a password through a function like SHA1, but do so thousands of times, rehashing the resulting data again and again. Those functions require that data is stored in memory and then accessed again, creating a bottleneck: the work can't be split into many parallel tasks by a GPU without having to access memory at each step.
Hacker Lexicon: What Is Password Hashing? [Andy Greenberg/Wired]
(Image: Double-alaskan-rainbow, Eric Rolph, CC-BY-SA)