XKCD on the password paradox: human factors versus computers' brute force

Today's XKCD, "Password Strength," neatly illustrates the research from this paper (PDF) by Philip Inglesant and M. Angela Sasse from University College London, with the ironic conclusion that we've trained our users to use passwords that computers can easily guess and humans can't possibly remember.

Password Strength



  1. Ironically, as I recall from my time at UCL, they had the worst/most obnoxious password policy of any of the institutions I’ve been affiliated with…

    1. I saw the paper in question being presented, and IIRC, it was inspired by that selfsame obnoxious policy.

  2. Question is, how many places will such longer passwords be either not allowed or silently truncated?

    1. It shouldn’t matter.  You can type in “thisismylongpassword”, and if truncated to 10 characters (“thisismylo”), it’s still a password.

      Actually it does matter in the sense that 10 character passwords don’t offer as much protection against brute force as longer ones.  

    1. Dictionary attacks won’t work, as they try single words, not combinations. If they did try combinations, we’re talking about far more combinations than the short mixes of letters, numbers and symbols most people actually use will ever provide.

        1. Dictionary attacks target individual words. Take the number of words in your dictionary and assume that it tries one at random, then figure out the number of tries it would take to find the right one. Then compare that to doing the same thing on a set of all permutations of between one and four dictionary words (i.e., a dictionary of n+n^2+n^3+n^4 entries). Then compare it to a dictionary of all permutations of between one and eight dictionary words… The number of possibilities very quickly grows past the point where dictionary attacks stop being useful.

        2. The comic explicitly states that it’s not considering stolen hashes, but the point is the same regardless. If you’re attacking hash files to get the original password, you’re still trying to generate possible passwords and see if they match the hash (you may have done this before with a rainbow table, but it’s the same thing).

          “correcthorsebatterystaple” is still an almost impossible thing for your hash-checker to have tried out, because the number of possible combinations of four random English words is staggeringly high.

        3. That’s one method. Another is to find collisions if they’re using a weak hashing algorithm. But more likely an attacker these days is going to use a Rainbow Table of pre-generated hashes (really a dictionary attack + optional salt. All the heavy lifting is done pre attack).
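
To put rough numbers on the growth described in this thread, here is a minimal sketch. It assumes a hypothetical dictionary of 2,048 words (the size is an illustration, not a figure from the comments) and compares a single-word dictionary with the n + n^2 + n^3 + n^4 space of one-to-four-word phrases, the one-to-eight-word space, and 8 characters drawn at random from 95 printable ASCII symbols.

```python
def phrase_space(n_words: int, max_words: int) -> int:
    """Count ordered phrases of 1..max_words words, repetition allowed."""
    return sum(n_words ** k for k in range(1, max_words + 1))


N = 2048  # hypothetical dictionary size, chosen only for illustration

single_word = N
up_to_four = phrase_space(N, 4)    # n + n^2 + n^3 + n^4
up_to_eight = phrase_space(N, 8)
random_8_chars = 95 ** 8           # 8 characters from 95 printable ASCII symbols

for label, size in [
    ("single dictionary word", single_word),
    ("1 to 4 dictionary words", up_to_four),
    ("1 to 8 dictionary words", up_to_eight),
    ("8 random ASCII characters", random_8_chars),
]:
    print(f"{label:<28}{size:.3e}")
```

With these assumed sizes, four words from a small list still fall short of eight truly random characters; the phrase space pulls ahead as the word list or the number of words grows.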

  3. As turn_self_off mentions, truncation ruins this: his password would end up being “correcth” at our university, for example.
    Also, if we had trained people to use word concatenation to form passwords, wouldn’t they simply be as vulnerable to attacks focused on random concatenation of dictionary words?

  4. Are those spaces between the four words?

         correct horse battery staple


    Spaces are rarely allowed. And both passwords are too long for many, many websites.

    So I don’t think this is a solution.

    1. You’ve hit on the key point – historically, most systems required passwords to be very short, forcing us to rely on letter substitution tricks and abbreviation mnemonics.  With modern computers, there’s no technical or practical reason password length should be limited, but we’d basically become conditioned to use short passwords, especially as many sites still inexplicably demand it.

    2. The POINT is that such a simple and effective solution is not used due to 20 years of policies which include “you must use short passwords.”  Get it now?

    3. Restricting passwords to e.g. 8 characters is a legacy of ancient crypto algorithms that couldn’t cope with longer passwords than that (or silently discarded anything after the 8th character). Or it’s database designers assuming that that’s the way passwords are (we’re asking for a “password” not a “passphrase”).

      Not allowing spaces, though, is just Unix prejudice. RAM is cheap, modern programming languages are good at string handling; goddammit, let us have spaces in account names, passwords and file names.

    4. True but that is more a problem in the password system used by websites than it is a problem with the actual pass phrases themselves. 

      Much the same as the comic suggesting we’ve spent 20 years developing engines to make passwords people can’t remember, we’ve also spent 20 years developing code that expects a machine made password and freaks when it’s fed something a human could actually deal with.

      This obviously has to change as people can not remember 16+ characters of what is essentially random bollocks without writing it down somewhere.

  5. Indiana University changed to pass “phrases” in 2006. It’s amazing how much trouble people still have just trying to come up with 4 words. 

    This is from the KB:

    Network ID passphrases must:

    Contain at least 15 and no more than 127 characters.

    Use at least four unique characters (letters, numbers, or symbols).

    Use at least four words. “Word” is defined here as two or more distinct letters; words must be separated by one or more spaces or other non-letters, not including numbers or the underscore character ( _ ). I.e.:

    little pink houses-4unme contains four “words”, and would therefore be a valid passphrase.

    hoagy_carmichael plays123stardust only contains two “words” (the numbers and underscore do not act as separators), and would therefore not be a valid passphrase.

    1. Your post raises a good point – the term “password” itself implies a single short word, while “pass phrase” leads the reader to naturally consider the longer types of passwords sought here.

      Terminology is a big part of a good design.

    2. It’s the must contain four unique characters & a word must be 2+ characters bit that’d confuse people.  Four words is easy:

      The Quick Brown Fox
      I like Boing Boing

      Each of those is four words, as defined by what most people think, one of them however would fail those rules and confuse the hell out of someone when it was rejected. So the system needs to accept a single, space separated character (I and A spring to mind) in order for it to fit common western English usage.

      Yes the leading single character would weaken the strength of the phrase but you still gain the advantage of it not being something people’d need to write down and the whole phrase vs word/random rubbish thing.

      Obviously you’d have something that threw out A B C D (or similar) as invalid in the mix as well (error msg:  I am NOT a Speak & Spell, try again).
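
The Indiana University rules quoted above can be sketched as a small validator. This is one reading of the KB text, not the university's actual implementation: it assumes ASCII input, treats any run of characters other than letters, digits and the underscore as a separator, and compares "distinct letters" case-insensitively. The function names are hypothetical.

```python
import re

# Separators: anything that is not a letter, digit, or underscore
# (the quoted rule says numbers and "_" do not split words).
_SEPARATORS = re.compile(r"[^A-Za-z0-9_]+")


def count_words(passphrase: str) -> int:
    """Count tokens that contain at least two distinct letters."""
    tokens = _SEPARATORS.split(passphrase)
    return sum(
        1
        for token in tokens
        if len({ch.lower() for ch in token if ch.isalpha()}) >= 2
    )


def is_valid_passphrase(passphrase: str) -> bool:
    """Apply the three quoted rules: length, unique characters, word count."""
    return (
        15 <= len(passphrase) <= 127
        and len(set(passphrase)) >= 4
        and count_words(passphrase) >= 4
    )


for phrase in [
    "little pink houses-4unme",
    "hoagy_carmichael plays123stardust",
    "The Quick Brown Fox",
    "I like Boing Boing",
]:
    print(f"{phrase!r}: {is_valid_passphrase(phrase)}")
```

Under this reading, “The Quick Brown Fox” passes while “I like Boing Boing” is rejected, because the single-letter “I” does not count as a word.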

  6. In 1Password “correcthorsebatterystaple” received an “excellent” rating, and is apparently just as safe as, say, “A7mM(3aSgw6!6k”

    And if I add “cow” at the end of “correcthorsebatterystaple” I get the maximum strength possible in 1Password (green bar all the way to the right, “fantastic” rating).

  7. Ironically UCL use a system that makes you change passwords every 90 days to one of these stupid ones…. every member of staff and student just has to write them down……

  8. It’s just a ploy by Randall to get access to our stuff. Correct Horse Battery Staple will now join “Swordfish” as a very common password, giving him a good chance for entry with any given username.

  9. Not arguing with the basic idea in the comic, but I do question the amount of entropy he assigns to common and uncommon words. I can’t say I’ve researched the matter; maybe he has or has found some research.

    But 11 bits for common words sounds suspiciously like it’s based on the 2k+ words competent English speakers are meant to understand (2^11 = 2048). 16 bits seems rather higher than the set I’d expect people to actually choose from (I’ve taken some of this from http://www.qgroupplc.com/category/howmanywords which may or may not be a trustworthy source).

    I’m not really questioning the numbers themselves, but the implied assumptions about the set of words from which people would choose a password. My gut tells me that for uncommon words, people would scour a set far too small to allow for 16 bits of entropy, and for common words a set slightly too small to allow for 11 bits of entropy.

    I would like to see some research on *that*, though.

    1. I’m wondering how he calculated the entropy as well… Since we are talking about bruteforcing in the common-word-phrase example, it probably makes sense to use the entropy Shannon calculated for letter patterns in English (or the entropy he calculated for an average common English word, for the purpose of ballpark calculations). Since ‘correct horse battery staple’ is arguably grammatically correct (or at least easily parsed by all of us) it probably has less entropy in terms of a space-tokenized Markov model based off a large English corpus than, say, “pretending automatic organization greenish”.

      1. I’m not a cryptography expert but it seems to me that you should be testing the algorithm, not the example: “four random common words”. 

        I personally don’t make any more or less sense of “pretending automatic organisation greenish” than I do “correct horse battery staple”.  But I don’t think that that’s relevant, because we’re talking about the usefulness of ANY four random words as a password.

    2. I think that these are supposed to be assigned passphrases as opposed to chosen ones. I think that’s a key difference, as otherwise we wouldn’t expect the distribution of chosen words to be equiprobable. Also, there would be non-trivial correlations between the chosen words, since people would be likely to choose actual English phrases.

      Also, the estimates in the 1st panel make no sense unless the form has been predetermined.
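
The arithmetic behind this thread can be sketched as follows. The figures of 11 and 16 bits per word come from the comments above; the skewed probability distribution is a made-up illustration of what happens when words are chosen rather than assigned, not measured data.

```python
import math


def uniform_entropy_bits(choices: int, picks: int = 1) -> float:
    """Entropy when each pick is uniform and independent over `choices` options."""
    return picks * math.log2(choices)


def shannon_entropy_bits(probabilities) -> float:
    """Shannon entropy of one pick from a (possibly non-uniform) distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Four words assigned at random from a 2,048-word list: 4 * 11 bits.
print(uniform_entropy_bits(2048, picks=4))

# A 16-bit word needs a list of 2**16 equally likely words.
print(2 ** 16)

# Hypothetical skew: one favourite word is picked far more often than the rest.
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.70, 0.10, 0.10, 0.10]
print(shannon_entropy_bits(uniform), shannon_entropy_bits(skewed))
```

The skewed case always comes out below the uniform one, which is the concern raised above about human-chosen words.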

  10. I first heard of this system of “phrases” from this site: http://www.baekdal.com/tips/password-security-usability. I actually presented it to my students (high school computer tech) and we had an interesting discussion. People are so ingrained to make passwords specific ways it is difficult to change people’s perceptions of secure.

    1. much to my surprise, the hardest thing for me to get over was the fact that i was putting OMG SPACESSS!!  in a password field.

      Kind of like putting your underwear on backward, it just didn’t feel right. I adjusted after a couple weeks.

  11. we’ve trained our users to use passwords that computers can easily guess and humans can’t possibly remember.

    And easily fit into a small database field, which is where a majority of them are now stored.  Before that they were stored on file systems with relatively limited size HDs.

    I think this is an excellent idea, but the allowed length of passwords is firmly rooted in the 80s.

    1. *security fail*
      Passwords should never be stored in any database field of any size. Only their hashes (salted for flavor) should be stored anywhere.

    1. One great example is Gawker, since pretty much all their lax security policies were made public.  The website would accept passwords of longer lengths, but in the background it would truncate to 8 characters before running DES crypt(3) on it.
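
A minimal sketch of the two points in this thread: storing only a salted hash, and what silent truncation does to a long passphrase. It uses PBKDF2-HMAC-SHA256 from Python's standard library as a stand-in; this is not the DES crypt(3) scheme mentioned above, and the iteration count and the `max_len` parameter are assumptions made for illustration.

```python
import hashlib
import hmac
import os
from typing import Optional


def hash_password(password: str, salt: bytes, max_len: Optional[int] = None) -> bytes:
    """Derive a salted hash; max_len mimics a system that silently truncates."""
    if max_len is not None:
        password = password[:max_len]
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def verify(password: str, salt: bytes, stored: bytes, max_len: Optional[int] = None) -> bool:
    """Recompute the hash and compare in constant time."""
    return hmac.compare_digest(hash_password(password, salt, max_len), stored)


salt = os.urandom(16)  # stored next to the hash; the password itself is never stored
stored_full = hash_password("correcthorsebatterystaple", salt)
stored_truncated = hash_password("correcthorsebatterystaple", salt, max_len=8)

# Without truncation, a guess sharing only the first 8 characters fails.
print(verify("correcthXXXX", salt, stored_full))
# With truncation to 8 characters, the same guess is accepted.
print(verify("correcthXXXX", salt, stored_truncated, max_len=8))
```

In the truncating variant, everything after the eighth character contributes nothing, so a long passphrase is only as strong as its first eight characters.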

  12. This needs to be read by every IT department in the world.
    One of the systems I am forced to use requires password change every month, must be 8-16 characters, must have numerals, must NOT have two consecutive identical characters.
    It’s horrendous. 

    And by not allowing consecutive repeat characters they are simply reducing the possibility-space and therefore security of the passwords I assume.

    1. And I assume you’d do what anyone else in that position would.  Find a password generator, tell it the rules you need to follow and enter what it spits out.

      Then write it down somewhere ‘safe’ because it spat out random bollocks which you’ll never remember in a million years.
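
How much the “no two consecutive identical characters” rule shrinks the space can be estimated with a short sketch. It assumes a 62-symbol alphabet (upper- and lower-case letters plus digits) and ignores the policy's other requirements, such as mandatory numerals.

```python
import math


def total_passwords(alphabet: int, length: int) -> int:
    """All strings of the given length over the alphabet."""
    return alphabet ** length


def without_adjacent_repeats(alphabet: int, length: int) -> int:
    """Strings in which no character equals its immediate predecessor."""
    # First character: any symbol; each later one: anything but the previous symbol.
    return alphabet * (alphabet - 1) ** (length - 1)


ALPHABET = 62  # assumption: a-z, A-Z, 0-9

for length in (8, 12, 16):
    everything = total_passwords(ALPHABET, length)
    allowed = without_adjacent_repeats(ALPHABET, length)
    lost_bits = math.log2(everything) - math.log2(allowed)
    print(f"length {length}: {allowed / everything:.1%} remains, {lost_bits:.2f} bits lost")
```

Under these assumptions the rule does reduce the space, as the comment suspects, though by a fraction of a bit at these lengths.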

  13. As someone who has used some obnoxiously long phrases for my SSH key, let me tell you…typing long phrases while not being able to see them makes for some real difficulty.  It’s one thing to type 8-12 characters blindly.  When you’re typing 35-50, having to slow down and make sure you don’t mess up without being able to get feedback makes for a lot of retries.  Then, of course, the system has to be more permissive of multiple retries, and there are issues with that, no?

    1. For lockouts we had to increase the attempt limit from 15 to 25 (I think 15… it’s been a while). Also set the 25 attempt limit to within a 2 hour period rather than simply ticking the failure count. 

    2. Muscle memory. Not kidding. I have a 50ish character passphrase for my PGP encrypted disk on my work laptop. I have to enter that every day just to boot up. I can type that passphrase off in seconds without looking. Every now and then my rhythm is off and the spaces get messed up and I have to re-enter, but I’ve never had it enter the realm of inconvenience.

      There’s no rotation and I’ve used the same passphrase in PGP/GPG for years (at least for the work account, I have a separate one for my personal key). It’s an obscure quote randomly pulled from a work I like (not one that most would associate with said work –it’s fairly banal in context). My hands move fast enough over the keys that I really would only worry about shoulder surfing with a video camera or a BIOS/UEFI keylogger (remember: laptop)

      1. Is there a security term for the phenomenon where the more cleverly secure you make your password, the stronger desire you feel to give everyone hints about what it is?

        1. Ha :)

          You’re right. I’ve narrowed it down to the trillions of possible 50ish-character phrases in some
          “work”. Such work could be a play, book, poem, movie, etc… literally millions of possibilities.

          Hell, I’m feeling reckless, I’ll even say it’s an English “work”.

          (Cue the intense murmuring of the crowd with one voice piping up, “He’s mad, I tell you, mad!”)

          Gee, I better go change it now.

    3. This is why there is a movement to eliminate the mostly-pointless blind password fields. If I’m sitting at home on my computer, why the hell would I not want to be able to see my password? 

      Forcing blind password fields everywhere encourages people to use weaker passwords, for the very reason you described.

      Sites ought to all have checkboxes next to password fields to allow users to optionally see *s or the actual password.

      1. It’s a relic of TEMPEST, from the ’80s.  CRT monitors could be read remotely, due to their EM emissions.  Do LCDs  have the same problem?

        1. I suspect it has much more to do with being an entrenched method to reduce vulnerability to shoulder surfing in public/semipublic places, blindly applied to all circumstances without consideration of whether it’s actually necessary or advisable in all applications.

        2. > It’s a relic of TEMPEST, from the ’80s.  CRT monitors could be read
          remotely, due to their EM emissions.  Do LCDs  have the same problem?

          They do, and surprisingly the exploitable radio frequency signals in the case of LCD displays originate from the interface, not from the LCD panel itself.  See http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-577.pdf for more than you ever wanted to know about TEMPEST in the 21st century.  Markus Kuhn’s PhD dissertation is the best survey of the field written since the 1980s.

        3. I’m finding it hard to believe that that’s really why password fields across the world are invisible. Was Tempest ever commonly used, or was it just research papers?

          For that matter, it’s been shown that it is almost trivial to decode key-presses if you are simply recording them using a microphone — each key sounds a little different, so it’s a simple substitution encryption. If you are in a position to capture a nearby person’s computer’s EM emissions, it’s even easier for you to record some sound as well…

      2. FWIW, there are multiple Firefox extensions that allow the password to be shown.  The obscuring is done in the browser whenever the input type is “password” so it’s up to the browser to decide how to display it.

    4. Yes – the solution is to stop hiding passwords from users while they type. The assumption that someone is sitting there looking over your shoulder while you enter your password is really very silly. What percentage of your computer use really involves someone looking over your shoulder? Unless you give sales demos all day it’s less than 1%. Show Password should be the default, with the option to uncheck it if you need it (although frankly that same person can just watch you type – it provides no value if you can’t trust the person watching). 

  14. Sadly, most of the sites that won’t let me use a password containing spaces are banking sites.  Paypal doesn’t allow spaces; IIRC, Bank of Montreal requires that their passwords be exactly 7 characters long, no more, no less, and contain only letters and numbers

  15. Also didn’t read the paper, but I find it much more fascinating why a random word association like “Correct horse battery staple” is much more memorable than “Tr0ub4dor&3”: is it only the introduction of non-letter characters like the “4” for an “A” and the ampersand, or are there neat-o linguisticy “tags,” associative and mnemonic, that are attached to the words “correct horse battery staple”? Is it the scansion of the phrase, the image (or non-image) thus created, or what? Really, why is one more memorable than the other? Linguists, to the field!

    1. It’s the image, which tells a story. It’s a well-known technique in the “memory olympics”. Humans are very good at remembering stories. Also, wiki for Memory Palace, which should give lots of info about memorizing enormous volumes of information easily.

      1. Except not, as the words don’t *really* convey what the image depicts. “Correct,” for example, following standard English word order, is the first thing I think of, as it’s the first word in the set: to put it, as the image does, as the “correct” answer given by a horse regarding a “battery staple” (about which I know nothing, even if it *really* exists), would be like the fifth or even tenth reading I could give that set of words. I read it as “correct horse,” as in the horse is correct (adjective), or “correct [that] horse” (with “correct” as an imperative), or maybe even “correct, horse” (someone telling a horse to correct something). So, no, the words don’t “tell” me the same story the image does.

        The classical-Renaissance memory palace is quite different: here we’re not mnemonically tying these words to any part of a mental edifice: “correct,” for example, isn’t the “door,” etc. Anyhoo, thanks for responding, but I don’t think I agree with you at all. :D

    2. “I find it much more fascinating why a random word association like
      “Correct horse battery staple” is much more memorable than

      It’s probably a combination of the imagery and the distinct chunks – people don’t memorize the letters of the words, but rather the overall thing that the word “is”.  In order to remember more random things, people usually need them broken down into smaller chunks with a pattern, like phone numbers.

      I use completely random passwords, but that just comes from lots of memory practice from the BBS days.  That’s rather difficult to teach to a new user, unfortunately… especially if they already have the mindset of “I can’t possibly remember that string of garbage”.

      1. Except that “correct horse battery staple” isn’t *really* anything, anymore than “Yankee Hotel Foxtrot” or any other random string of words *is* anything (sure, the latter is a Wilco album, but the phrase itself is meaningless, or admits to a plurality of ambiguous meanings, and is functionally meaningless). Maybe I don’t “get” the picture, but to me it doesn’t fit the phrase at all: there’s a horse, sure, but I wouldn’t construe the meaning of the phrase the way that the image construes it. :(

        1. That’s why I mentioned the concepts of the words, and the “chunks”.  Remembering things by an image is a classic memory trick, but in this case I’d say most people find it easier to remember because we usually remember words by their meaning, not by the strings of letters that represent them.  Think in terms of elementary school spelling bees… the kids know the words, but remembering how to properly spell them is the tricky part.

          Remembering a completely random password is like memorizing how to spell a tricky word in a language with pronunciation rules different from what you grew up with: not impossible, but much harder than memorizing a sentence in your own language.  One requires rote memorization, while the other links concepts together that are already in your mental framework (either in an image, or just linking them in a list).

        2. It’s not about the elements “being” anything at all; it has much more to do with the simplicity of said elements, and the ease with which those elements can be moved from short-term to long-term memory. “Tr0ub4dor&3” asks a person to remember several elements, including a misspelled phrase and three substitutions, which makes it very difficult to use chunking. “correct horse battery staple” is four very common words, and thus four easy-to-remember chunks. 

          The image thing is just one mnemonic device and might not work for everyone; the best mnemonic that uses images will be the one you come up with for yourself. But chunking is a well-understood phenomenon, and, if I remember correctly, is pretty widespread across cultures, etc. It’s how memory works. 

  16. Okay I admit that remembering ‘horsebatterystaplejuice’ is easier than ‘ie4tlot5ofpickle$23’ and probably easier to type. My question is now how are you going to have a *unique* password for every site you use? One thing I have been trying to do is have a unique password for sites I sign up for, that way if Gawker gets hacked because of lax passwords, no one knows my Gmail or banking passwords; even having multiple “security tiers” seems too easy. I want to mitigate my risk with unique passwords so if I hear LulzSec posts PayPal passwords to the whole world, I don’t worry about my bank.  For me I use a tabula recta to generate unique passwords with high levels of entropy without storing passwords in a vault, and without ever forgetting them. If you want to try that technique see here for a generator and explanation: https://spreadsheets.google.com/ccc?key=0AhKa_IsEMANEdHZuaTVhU2JnT3kyS3JrVm8tVTJxcEE&hl=en#gid=0

    1. I use Password Safe on a usb flash drive with a strong password, then just use randomly generated passwords (with more than enough overkill on the password length) stored in it for all my various websites. This mainly applies to a bazillion internet forums… I’ve got the common Gmail and such memorised.

      1. I actually use LastPass.  There’s a version available that works as Password Safe does.  I like LastPass better because I don’t have to bother keeping my password files sync’d across all my machines and thumb drives, as it’s stored on a central server (encrypted).  Also LastPass works with popup authentication (htaccess) boxes, where PasswordSafe does not.  And it will realize when a site is asking for a new password and offer to generate one for you.

  17. I’ve long railed against making password checks case sensitive.

    Case sensitivity adds just one bit per character and results in tech support overhead and account loss (and hence sales loss).

    You can get the same security by just increasing the minimum password length by about 20%.
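
The “one bit per character” and “about 20%” figures can be checked with a quick calculation. This sketch assumes every character is drawn uniformly from the stated alphabet, which is an idealisation; the alphabet sizes are the only inputs.

```python
import math


def bits_per_char(alphabet_size: int) -> float:
    """Entropy of one uniformly chosen character."""
    return math.log2(alphabet_size)


def length_factor(small_alphabet: int, large_alphabet: int) -> float:
    """How much longer a password over the small alphabet must be to match."""
    return bits_per_char(large_alphabet) / bits_per_char(small_alphabet)


# Letters only: 26 case-insensitive versus 52 case-sensitive symbols.
print(bits_per_char(52) - bits_per_char(26))
print(length_factor(26, 52))

# Letters and digits: 36 versus 62 symbols.
print(length_factor(36, 62))
```

For letters only, doubling the alphabet adds exactly one bit per character and the required length grows by roughly a fifth; once digits are included the gap is somewhat smaller, since digits have no case.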

  18. Getting systems to increase the number of characters/bits they will store and use as passwords would be an essential step forward.

    But the _right_ answer may be something stronger than passwords — challenge-and-response and/or encryption and/or computational. My bank uses a multiple authentication, so a simple password attack seems middling-unlikely to break through it.

    BTW, you can increase the security of the “base word and dumb substitutions” approach by starting with base multiword phrases (random or not) and extracting the chunk used for your password on something other than word boundaries. “rdbounda” isn’t _strong_, since it’s still statistically close to English, but it’s better than “boundary”; probably good enough to keep the script kiddies out.

  19. I’ve been using 5 or 6 word sentences for passwords for a few years now, and it’s been pretty rare to come across a site that won’t let me do it. Use proper punctuation on whatever sentence you’ve chosen, and you’ll even satisfy the sites that require a few non-letter characters.

    It’s night and day how much easier it is to remember login info. 

  20. 2 days ago I just got an access request approved and had to set my password on first login. It took me about 10 tries to enter a password it would accept. It is limited to 8 characters, must contain at least one symbol, and the symbols it accepts are limited. Right now I have a password set on that system that doesn’t follow any of the rules I have for my passwords and will be difficult for me to remember.

    Personally, I prefer passphrases, and I use them wherever possible. More and more sites and systems are adopting this strategy as acceptable, but not enough are. My password database is protected by a phrase that’s actually kinda short, but tends to fit very few services (hence becoming my passdb passphrase to protect my passwords).

  21. Too many sites restrict the password length to be stupidly small. There’s no good reason for it – HD space is quite cheap, and a few extra bytes isn’t going to be a problem. Similarly, no good reason not to allow the entire ASCII set – or better yet, all of Unicode.

  22. I’ve been reading Surely You’re Joking, Mr. Feynman, and his chapter on reverse-engineering safe combinations left me even more cynical about password security than I already was.  He related anecdotes about a colleague who set all his safes to the first few digits of e, a general who made a big show of getting a steel safe that had to be lifted into his office with a crane and then leaving it set to the default combination, and a time he demonstrated that he could figure out a file cabinet’s combination by looking at it if the owner left the drawer open — and they “solved the problem” by not letting him into people’s offices anymore.

    Keep in mind that in all of these cases, these were people protecting ATOMIC SECRETS in the late 1940’s/early 1950’s.

    Nothing’s changed in the decades since.  This is useful information for people who already know how to set a secure password, but we will quite simply never train end users.  At this point I think any ideas on increasing password security are best treated as a stopgap until we abandon the password system entirely in favor of something that might actually work.

  23. Six months into a new job the idiots in IT demanded I change my easy-to-remember but difficult-to-guess password. So I changed it to “123456”, and to make sure I didn’t forget it, I put it on a post-it attached to my monitor.

  24. I use a set of lyrics from a song that is memorable to me, then take the first letter of each word, mix in some upper case & punctuation in a fairly standard fashion (for me).  It produces fairly random strings of letters & punctuation that are easily memorable.  For common web sites with a low need for security, I use a single phrase and append or prepend some letters derived from either the name or URL.  Now the thing that REALLY drives me nuts is that the vast majority of sites don’t provide you with the rules for generating a password until AFTER you’ve tried and failed.

  25. If I use a password manager (and I do), then the password is random upper/lower/numbers/symbols, and each character is worth about 6 bits (more than that, actually, with symbols), so 10 such characters come to > 60 bits of entropy.

    This is something that’s possible to implement now.  Most places won’t allow you to use very long passwords; heck, I have at least one bank that only allows 8 character passwords.
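
What a password manager does here can be sketched with Python's `secrets` module. The alphabets and the default length of 10 are assumptions for illustration; real managers let you configure both.

```python
import math
import secrets
import string

ALNUM = string.ascii_letters + string.digits   # 62 symbols
PRINTABLE = ALNUM + string.punctuation         # 94 symbols


def random_password(length: int = 10, alphabet: str = ALNUM) -> str:
    """Draw each character independently from a cryptographically secure source."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


def entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy of a uniformly random password of the given length."""
    return length * math.log2(alphabet_size)


print(random_password())
print(f"{entropy_bits(10, len(ALNUM)):.1f} bits with letters and digits")
print(f"{entropy_bits(10, len(PRINTABLE)):.1f} bits with punctuation added")
```

Ten characters of letters and digits alone land just under 60 bits; adding punctuation to the alphabet pushes the figure past it.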

  26. My PGP password used the first sentence from a book, and it was nine words long. The first word was all caps. All I had to do was remember which book!

  27. And this is why I use phrases that are titles from my favourite TV show (which I am not telling you!). Sadly, the work system still insists on at least one punctuation mark or number in addition to the upper/lowercase mix. So !

    1. Your favorite show is Dr. Who and your password is based on ‘the angels have the phonebox’.
      You’re welcome.

  28. Now let’s see…was that horsecorrectbatterycable?…horsebattlestapleremover?…staplebatterycorrecthouse? I know it was something like that.

  29. I use password managing software now because I try to use unique passwords for different sites and I can’t remember which ones are used where, even if I use pass phrases made of several common english words.

    Besides, I think the chances of someone targeting me specifically for a dictionary attack are a lot lower than the chances of someone re-using a password that was leaked in a gawker-style breach. Dictionary attack thousands of passwords at once, then take all the cracked ones and try those username/password combinations at places like paypal and large banks. I’m sure there are plenty that will work, but not for people who use different passwords everywhere.

    1. Shouldn’t Tr0ub4dor&3 have a search space of about 5.7 x 10^21? It has 11 characters and 95 possibilities for each position.

      No, because the whole point is that “Tr0ub4dor&3” is not completely random, but is relying on Humans needing to invent simplifying systems for supposedly-random passwords.

      The password is based on an English word. It then has a few extremely common substitutions (“0” for “o” etc.), and it has a single random symbol and number at the end. This pattern probably accounts for a huge percentage of supposedly-“strong” passwords.

      So the cracker doesn’t need to search through all 95^11 possibilities. It just has to take regular English words and a few quite-common variations on each one. Still a very big set, but now that’s a more manageable size for a computer to crack.
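
The pattern described in this reply can be sketched as a candidate generator. The substitution table, the symbol set and the suffix shapes below are hypothetical choices for illustration, not the rule set of any real cracking tool.

```python
from itertools import product

# Hypothetical rules: each letter maps to the characters a cracker might try for it.
SUBSTITUTIONS = {"a": "a4", "e": "e3", "i": "i1", "o": "o0", "s": "s5"}
SUFFIX_SYMBOLS = "!@#$%&*"
DIGITS = "0123456789"


def variants(word: str):
    """Yield substitution, capitalisation and symbol/digit-suffix variants of a word."""
    options = [SUBSTITUTIONS.get(ch, ch) for ch in word.lower()]
    for letters in product(*options):
        base = "".join(letters)
        # Try the base as-is and with its first character capitalised.
        for cased in dict.fromkeys((base, base[:1].upper() + base[1:])):
            for symbol, digit in product(SUFFIX_SYMBOLS, DIGITS):
                yield cased + symbol + digit
                yield cased + digit + symbol


candidates = set(variants("troubador"))
print(len(candidates))
print("Tr0ub4dor&3" in candidates)
```

This rule set produces a couple of thousand candidates per base word, so even multiplied by a very large word list the total stays many orders of magnitude below 95^11.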

  30. I like your new system for signing in using other accounts.  I guess I have to trust that BBG won’t steal my password.

    I think the biggest problem with passwords is the number we are forced to create.  I could have maybe five really good passwords, but the way the internet and technology now work, I am forced to personally have maybe 30-40 accounts.  For them all to be good would be completely impossible, unless I decided that I myself would never again access most of my accounts….

    Then there is the bother that one account will give me a security alert and tell me to change my password, and then I have to decide whether to change all the account passwords…. 

  31. If you’re getting your security advice from a Web comic — even one as clever and funny as XKCD — you’ve already lost.

  32. Anecdote:

    Years ago I worked for an organization that had a very sophisticated password system:  Password changes were required every two weeks; passwords had to contain numbers and letters; they were not accepted if they resembled any word in English or French; they were rejected if they were similar to any of the user’s previous passwords.

    The end result … you could turn over any keyboard in the office and find that user’s current password – it was the only way anyone could remember them.

    Now we’re to the point, as Reed points out above, that we have this kind of requirement on multiple systems.  I now keep a spreadsheet with all of my usernames and passwords – yes, I know that’s a security risk but without it I would have to stop using online services.

    BTW I just had to create another one – which I won’t remember tomorrow – in order to make this post.

  33. Is the work you pulled your passphrase from on the Internet anywhere?  In Google Books, perhaps?  If so, it’s probably in at least some cracking dictionaries.  If I were out to crack passwords and had the resources of a large company or government I’d definitely try every substring less than ~256 characters from every search engine/data repository I could get my hands on.  If Google can crawl the entire web every few days it could easily crack any passphrase taken from its database in at least the same amount of time.  Trying 256 passphrases for every character crawled would be trivial given the speed of memory/CPU versus the network.

    1. I think you’re making the exercise more trivial than it really is. I also think you underestimate the sheer amount of data you will need to hash to compare results. The cost is not aggregating the “dictionary” pool. The cost is hashing and comparing results. Plus, you don’t truly know how many characters are being used in the phrase, since a good hash gives absolutely zero clues as to the plaintext.

      Let’s say you set your upper bound for cracking a passphrase at 256 characters. You start at the first 256 chars and start hashing strings with the 1st char. Then with the 1st and 2nd character. Then with… you get it. No result. I’m going to be generous and say there will be no phrases that start mid-word. So now we shift over (on avg.) 5 characters and start all over again after the next white space or punctuation mark.

      This will take a *lot* of time. Less time than brute forcing, but still much more computationally expensive than it’s most likely worth. Much more expensive than the proprietary info I have on my work laptop (no customer stuff, just tech stuff). A much more cost-effective password cracking technique is a crescent wrench to my kneecaps.
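The sliding-window search described in this exchange can be sketched in a few lines of Python. This is only an illustration of the enumeration step under the reply's generous assumption that phrases start at a word boundary; the function names are made up here, and the hashing of each candidate (the expensive part) is left out.

```python
import re


def candidate_phrases(text, max_len=256):
    """Yield every substring of `text` that starts at the first character
    of a word and is at most `max_len` characters long."""
    for match in re.finditer(r"\b\w", text):
        start = match.start()
        last_end = min(start + max_len, len(text))
        for end in range(start + 1, last_end + 1):
            yield text[start:end]


def count_candidates(text, max_len=256):
    """Count how many candidates (and so how many hashes) a text produces."""
    return sum(1 for _ in candidate_phrases(text, max_len))


print(count_candidates("correct horse"))
print(count_candidates("correct horse", max_len=4))
```

On a long text every word start contributes up to `max_len` candidates, so the number of hashes grows as (number of words) × 256, which is where the disagreement between "trivial" and "a *lot* of time" comes from.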

  34. The only problem with this article is that now that I’ve remembered this really good password, I’m probably going to use correcthorsebatterystaple for everything but of course you know that, so I’m going to get really clever and put all sorts of ampersands, umlauts and other substitutions in and promptly forget how to log back in to my account and edit this comment in which I complain that The only problem with this article is that…

  35. I have one password that is seven words in a foreign language. It is a quote, but, I learned after I had been using it for several years that I hadn’t memorized it right, so it’s just almost a quote (and grammatically incorrect, at that).

    1. that reminds me of reading something about a guy who made his password by switching to DVORAK keyboard layout, typed a phrase as if it were in QWERTY, then would switch back to QWERTY keyboard layout. lol

    2. And if you’ve used that password across many websites, and one of those websites was among the many that had security break-ins in the past year or will have in the years to come, all your sites are compromised.

      There is a reason that one should occasionally change their passwords; it’s not only to satisfy IT nerds.

  36. Odd you should bring up UCL because when I was at UCL-CS, I wrote the password distribution system for the compsci department (pre Sun yellow pages) in Oracle 4GL. At that time, we were still fighting 6 letter password cracking online, and a colleague (ex GCHQ) wrote a checker which found 3, 4, 5 and 6 letter combos from stripes up/down/diagonal on the standard keyboard…
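For the curious, a keyboard-stripe check along those lines might look like the Python sketch below. It is not the colleague's actual checker: the simplified four-row grid (which ignores the physical stagger between rows) and the function name are assumptions made for illustration.

```python
# Simplified US QWERTY grid; real keyboards stagger the rows, so "diagonal"
# here means diagonal on this grid only.
ROWS = ["1234567890", "qwertyuiop", "asdfghjkl;", "zxcvbnm,./"]
POSITION = {key: (r, c) for r, row in enumerate(ROWS) for c, key in enumerate(row)}


def is_keyboard_stripe(password):
    """True if every keystroke moves by the same one-key step on the grid
    (left/right, up/down, or diagonal)."""
    keys = password.lower()
    if len(keys) < 3 or any(ch not in POSITION for ch in keys):
        return False
    points = [POSITION[ch] for ch in keys]
    steps = {(r2 - r1, c2 - c1)
             for (r1, c1), (r2, c2) in zip(points, points[1:])}
    if len(steps) != 1:
        return False
    dr, dc = next(iter(steps))
    return (dr, dc) != (0, 0) and abs(dr) <= 1 and abs(dc) <= 1


for guess in ["qwerty", "1qaz", "zse4", "horse"]:
    print(guess, is_keyboard_stripe(guess))
```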

  37. I really don’t see how the memorization difficulty is accurate. Years and years ago, my first ISP assigned me a password that was a random mix of upper/lowercase letters, numbers and symbols, not even based on a word. It was really quite simple to make up a nonsense narrative out of the sequence (similar to the horse picture) such that I still remember it with ease.

  38. One weird thing about muscle memory is that it can be broken and then very difficult to put back together. Once I used a friend’s computer that had one of those ergonomic split keyboards and my hands went completely blank when I tried my email’s login password. Furthermore, when I went back to a regular keyboard, I couldn’t reconstruct what it was. Very frustrating, but a fascinating discovery about muscle memory.

  39. Is there value in offsetting the characters on the keyboard by one with your password, or is that a waste of time? For instance I might remember “applecomputer” but I make the actual password one character to the right on the keyboard so it’s actually “s[[;rvp,[iyrt”

    In my head this seems like it would be more difficult to crack? Or is it just a waste of time?
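One way to look at the question: the shift is a fixed one-to-one mapping, so it is easy to reproduce in code. The Python sketch below assumes a US QWERTY layout and unshifted keys only; the names are made up for illustration.

```python
# US QWERTY rows, unshifted keys only (an assumption; other layouts differ).
ROWS = ["`1234567890-=", "qwertyuiop[]\\", "asdfghjkl;'", "zxcvbnm,./"]

# Map each key to its right-hand neighbour; the last key in a row has none.
SHIFT_RIGHT = {row[i]: row[i + 1] for row in ROWS for i in range(len(row) - 1)}


def shift_right(text):
    """Replace each character with the key to its right; leave others alone."""
    return "".join(SHIFT_RIGHT.get(ch, ch) for ch in text)


print(shift_right("applecomputer"))  # s[[;rvp,[iyrt
```

Because the transformation is this mechanical, an attacker who thinks of the trick needs only one extra candidate per dictionary entry, so the shifted password is about as guessable as the phrase underneath it.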

  40. Huh.  Almost 100 comments and so far I don’t see anyone wondering why in the heck a web site would even allow 1000 attempts per second over a period of 3 days (note the caption is specifically excluding the scenario where the attacker has the hashed password already).

    Seems to me, a limit of 1 attempt per second, with a 1000 attempt absolute limit would suffice to keep even these relatively weak passwords safe enough.

    Granted, sites should be more permissive about passwords, allowing multi-word phrases with spaces. But if a user prefers a weakly-encrypted (via character-replacement) approach, that should be strong enough against a basic brute force attack.
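The arithmetic behind that suggestion, as a small Python sketch. The 28-bit figure is the comic's estimate for the substituted-word password; the 1 attempt per second and 1000-attempt cap are the numbers proposed in the comment above. The function names are invented for the example.

```python
SECONDS_PER_DAY = 86_400


def days_to_exhaust(bits, guesses_per_second):
    """Days needed to try every password in a space of 2**bits."""
    return 2 ** bits / guesses_per_second / SECONDS_PER_DAY


def chance_within_cap(bits, max_attempts):
    """Chance that a uniformly chosen password falls within the first
    `max_attempts` guesses before the account locks."""
    return min(1.0, max_attempts / 2 ** bits)


COMIC_BITS = 28  # the comic's estimate, not a measured value

print(f"{days_to_exhaust(COMIC_BITS, 1000):.1f} days at 1000 guesses/second")
print(f"{days_to_exhaust(COMIC_BITS, 1):.0f} days at 1 guess/second")
print(f"{chance_within_cap(COMIC_BITS, 1000):.2e} chance with a 1000-attempt cap")
```

This only covers online guessing against a live login form. Once an attacker holds the hashed passwords, the site's throttle no longer applies.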

  41. Hello, I do security for a living and modern day password cracking is done on the order of over a million guesses per second.  A thousand is from maybe ten years ago.

    1. The comic is about brute forcing passwords remotely, over the web, to a web service that doesn’t notice lots and lots of failed logins.  If you tried doing a million (or 32 billion) guesses per second, this would just ddos the web service.

  42. So, the average vocabulary is around 10,000 words http://en.wikipedia.org/wiki/Vocabulary_development (yes, it’s uncited in the wiki, but that’s not the first time I’ve seen numbers around the 10k mark).  It’s probably safe to say that a great portion of that vocabulary size is shared among people with the same linguistic background (or else communication would be impossible).  Now let’s do some real analysis!

    The common character set available for passwords is about 80-104 unique characters (or, 52 for English letters, 20 for numerals and their Shift+ symbols, and a handful of bits [3-5] of available symbols).  10,000 is merely 100^2, meaning each additional random standard vocabulary word you use is approximately as secure as 2 random characters.  The use of 4 regular words of English is about as secure as a single random string of 8 regular characters.  Which is to say, about 10 quadrillion possible combinations in either case.  Add another word to one type of password methodology?  Just add 2 more characters to the other…and vice versa.

    Ultimately, each of these methods is probably about as secure as the other.

    The real problem isn’t with the method.  The real problem is regularity.  Why do we regularly allow only about 100 unique characters for passwords?  There is a common standard for a 16-bit characterset, UTF-16.  Can’t I make a couple characters in my password katakana?  Now I have a huge per character entropy.  And why would we teach people to use common (or even uncommon) English words?  Why not teach people to use words from multiple languages, and even mix them up?  “Jin el Ich am” ~~ “Person the I am”.  Now I probably have a similarly huge per word entropy as using UTF-16 on a per character basis.

    The rub is, you’re not trying to teach people a new password methodology.  Teaching people a new password methodology isn’t going to make a significant advancement in password security…not after everyone is using the same methodology.  You need to be teaching people to come up with their *own* method, and give them the freedom to implement it.  But that requires teaching people to take it seriously enough to be smart about it.  That’s probably a hard sell when huge portions of people are still using “123456” and “password”.

    And it’s all moot when password databases get stored completely unencrypted and when the most assured way to get someone’s password with certainty remains completely in the hands of social engineering.  Fixing that requires teaching whole other levels of intellect to people, and if we can’t even get people to be unpredictable about their passwords to actually increase mathematical complexity, we’ll never get people to that level.
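The arithmetic in comment 42 (a 10,000-word vocabulary against roughly 100 typeable characters) is easy to check with a short Python sketch. The constants are the comment's round numbers, and the 65,536 figure assumes one 16-bit code unit per character. All of it holds only if every word or character is picked uniformly at random.

```python
import math

VOCABULARY = 10_000      # words, the comment's figure
CHARSET = 100            # typeable characters, the comment's round number
CODE_UNITS_16BIT = 2 ** 16  # assumes one 16-bit code unit per character


def bits_per_choice(options):
    """Entropy in bits of one uniformly random pick among `options`."""
    return math.log2(options)


words_space = VOCABULARY ** 4   # four random words
chars_space = CHARSET ** 8      # eight random characters

print(words_space == chars_space)
print(f"{bits_per_choice(VOCABULARY):.1f} bits per random word")
print(f"{bits_per_choice(CHARSET):.1f} bits per random character")
print(f"{bits_per_choice(CODE_UNITS_16BIT):.1f} bits per random 16-bit code unit")
```

Both spaces come out at 10^16, so the "one word is worth two characters" rule holds for these round numbers. The practical difference is which of the two a person can actually remember.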

Comments are closed.