Passphrases suck less than passwords, but they still suck

In "Linguistic properties of multi-word passphrases" (PDF, generates an SSL error) Cambridge's Joseph Bonneau and Ekaterina Shutova demonstrate that multi-word passphrases are more secure (have more entropy) than average user passwords composed of "random" characters, but that neither is very secure. In a blog post, Joseph Bonneau sums up the paper and the research that went into it.

Some clear trends emerged—people strongly prefer phrases which are either a single modified noun (“operation room”) or a single modified verb (“send immediately”). These phrases are perhaps easier to remember than phrases which include a verb and a noun and are therefore closer to a complete sentence. Within these categories, users don’t stray too far from choosing two-word phrases the way they’re actually produced in natural language. That is, phrases like “young man” which come up often in speech are proportionately more likely to be chosen than rare phrases like “young table.”

This led us to ask, if in the worst case users chose multi-word passphrases with a distribution identical to English speech, how secure would this be? Using the large Google n-gram corpus we can answer this question for phrases of up to 5 words. The results are discouraging: by our metrics, even 5-word phrases would be highly insecure against offline attacks, with fewer than 30 bits of work compromising over half of users. The returns appear to rapidly diminish as more words are required. This has potentially serious implications for applications like PGP private keys, which are often encrypted using a passphrase. Users are clearly more random in “passphrase English” than in actual English, but unless it’s dramatically more random the underlying natural language simply isn’t random enough. Exploring this gap is an interesting avenue for future collaboration between computer security researchers and linguists. For now we can only be comfortable that randomly-generated passphrases (using tools like Diceware) will resist offline brute force.

Some evidence on multi-word passphrases (via Schneier)
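
To put the Diceware remark in numbers: a standard Diceware list has 7,776 words (one per outcome of five dice rolls), so each uniformly random word adds log2(7776), or roughly 12.9 bits. The sketch below is illustrative only and is not taken from the paper. The function names and the tiny stand-in word list are assumptions, and a real list would be loaded from a file.

```python
import math
import secrets

DICEWARE_LIST_SIZE = 6 ** 5  # 7776 entries: one per outcome of five dice rolls


def passphrase_entropy_bits(wordlist_size: int, num_words: int) -> float:
    """Entropy of num_words words drawn uniformly and independently."""
    return num_words * math.log2(wordlist_size)


def random_passphrase(wordlist: list[str], num_words: int) -> str:
    """Pick words with the OS CSPRNG; words a human 'picks at random' do not count."""
    return " ".join(secrets.choice(wordlist) for _ in range(num_words))


# Hypothetical stand-in list; substitute a real 7776-word list in practice.
toy_words = ["monkey", "light", "pants", "aloe", "potato", "table"]

for n in (4, 5, 6):
    bits = passphrase_entropy_bits(DICEWARE_LIST_SIZE, n)
    print(n, "words:", round(bits, 1), "bits")
print(random_passphrase(toy_words, 5))
```

Under these assumptions, five random words come to about 64.6 bits and six to about 77.5. The figures only hold if the words really are drawn uniformly at random, which is exactly what the quoted research says people do not do on their own.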


  1. Good lord, quality passwords are the easiest damn thing in the world to generate and remember.

    1) Pick a song that you know the lyrics to.
    2) Select a couple lines and use the first letter from each word
    3) L33T speak your way into using some numbers and punctuation.

    Ziggy Played Guitar, Jamming Good With Weird & Gilly & The Spiders From Mars


    find me a dictionary scan that will guess that password.


      Somebody had to.

      Surely, though, using a password manager (strictly offline) with a Diceware-generated random passphrase of 6 or so words with spaces is the way to go. Until I read something else which throws all that right in the shitter.

    2. Sorry, Jonathan, dictionary scans are so 90’s. Hash scans and brute force polynomials are now simple with parallel processors. Thanks, Playstation GPUs! Then further, your account on ILoveFurries.Net may not store it very well, so when they’re hacked, somebody knows your bank password.

    3. As others have said, dictionary attacks are passe and computers are
      immune to nonsense phrasing. “zpg,jgww&g&t5fm” is no more
      obscure (to a computer) than “abcdefghijklmno” or “ThisIsAPassword”.

      More importantly, your password scheme is NOT easy for humans to remember. Given the multitudes of passwords each of us has, how do you remember what phrase corresponds to each? Does Ziggy Stardust get me into my bank? Or is it Aladdin Sane? Or Lady Stardust? Hunky Dory?

      Finally, even if you do correctly associate your bank with Ziggy Stardust, and remember the appropriate line, L33T speak is hardly a one-to-one transformation. Is your password:








      …or any of the other countless variations?
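
To make the "not one-to-one" point concrete, the sketch below derives the first-letter string from the Ziggy line and counts its possible L33T spellings. The substitution table is a made-up and deliberately small assumption, as are the function names. Real habits vary, and capitalization choices would multiply the count further.

```python
from itertools import product

LYRIC = "Ziggy Played Guitar, Jamming Good With Weird & Gilly & The Spiders From Mars"

# Hypothetical substitution table: each letter maps to the forms a user might type.
LEET = {
    "a": ["a", "4", "@"],
    "e": ["e", "3"],
    "g": ["g", "9"],
    "i": ["i", "1", "!"],
    "o": ["o", "0"],
    "s": ["s", "5", "$"],
    "t": ["t", "7", "+"],
}


def first_letters(lyric: str) -> str:
    """First character of each word, keeping any punctuation that ends a word."""
    out = []
    for token in lyric.split():
        out.append(token[0].lower())
        if len(token) > 1 and token[-1] in ",.;!?":
            out.append(token[-1])
    return "".join(out)


def leet_variants(base: str) -> list[str]:
    """Every spelling obtained by independently substituting each character."""
    options = [LEET.get(ch, [ch]) for ch in base]
    return ["".join(combo) for combo in product(*options)]


base = first_letters(LYRIC)
variants = leet_variants(base)
print(base)
print(len(variants))
print("zpg,jgww&g&t5fm" in variants)
```

With this table the base string zpg,jgww&g&tsfm has 72 spellings, one of which is the zpg,jgww&g&t5fm quoted above. That is a lot for a person to keep straight, but only about 6 extra bits of work for an attacker who knows the scheme.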

      1. FWIW – I know enough about security to know I’d hire experts if I had responsibility for security.  But – I can’t think of a single web site that I use that has anything more complex than a X character password, maybe with some rules for number and punctuation use.    

        I’m really fond of the passfaces idea: the human brain is hardwired to recognize faces.

  2. Passphrases are better than passwords, this has been known for a long time.  The problem is that in real life, many places force you to use passwords that are within a certain character limit.  I have a credit card where the password MUST be between 6-8 chars.  Not very useful — or secure.

    1.  This annoys the fuck out of me. Out of all the passwords I use online, my bank’s is the #1 shortest. And Paypal comes in second. WHY?!

        1. Assuming they’re not storing them as plaintext. Which, if they’re dumb enough to limit to eight characters, wouldn’t surprise me that much.

          1. @MrEricSir Sometimes storing a password plain text accessible is required, for example my bank asks me for my customer ID, then for random characters from an online banking pin and my online banking password.

            This prevents keyloggers stealing my full login credentials (the most likely way of losing them) but it means the bank’s backend system must be able to read my password in plain text.

            Of course if I were designing a system I would have an initial password and encrypt a 2nd password with the first to use for random character logins.

      1. One of my cousins is a programmer for a bank. He tells me some of the things he’s seen in the codebase make him want to keep his money as bills stuffed in his mattress, ‘cos it’d feel more secure.

  3. I wrote my honours thesis on mnemonic text passwords (though, sadly, without large-enough entropy to make them secure nor long-enough strings to tax recall).  

    Passphrases certainly have apparent advantages in memorability and resistance to brute force attacks, but they’re hamstrung by many of the same problems that all password systems share:

    – good security practice means you’ll need to remember a completely different string for each system or site that you have an account on

    – if your string is compromised, you’ll have to get a new one and associate it with the same account, meaning competing and conflicting recall (old password, new password, 24th password, etc.)

    – strings must be generated randomly to be truly useful from a security standpoint, which usually means their memorability is impaired

    – there is no natural association between the string and the account, so nothing about the actual password/phrase will tell you anything about the account, nor will the account or site give you any help recalling the password/phrase (was “monkey light pants aloe potato” for Facebook or Gmail?)

    – etc etc

    All of that notwithstanding, just about anything will be better than using the same password at every site.

      1. “It depends.”

        If the one you can remember is for a password manager and the ones stored in the password manager are long, random, and unique, then you’re doing well.

        If you’re using the same ultra-memorable password on every site, one breach exposes everything.

        The intersection of security and human factors is a series of trade-offs.

        1. On the other hand, if the password manager’s data file ever gets corrupted, or the HD crashes or whatever, you’re looking at having an annoying time getting back into everything.

          1. True.  It’s a question of Things You Can Control vs. Things You Can’t Control.  Making 5 backups on 5 physically separate devices is easier than policing user data protection at your 20 favourite websites.

        2. I was talking about important passwords, like for BGP routers and certificate issuance.  Website passwords aren’t really important, in the relative scheme of things; I can always make a new one of those, even for a bank site.

          If anyone gets even one of my important passwords, they can do so much damage it really won’t matter if they get more than one. If you are on fire and you can’t put it out, does it matter if you have an extra ounce of paraffin in your pocket?

    1.  Good advice, but also don’t use just one other language. And for extra points use the names of obscure fictional characters or other fictional details, again not just from one source. Mix together and liberally l33t it up.

    2. That’s what I do but the more obscure the language the better and only one phrase in each language. Then all I have to do to associate any particular password/passphrase with a particular account is number them 1 = Hungarian = ……. 2 = Georgian = ……. and so on. It helps if you were a bit obsessive as a kid and read grammars for fun (maybe I was just weird).

  4. “The results are discouraging: by our metrics, even 5-word phrases would be highly insecure against offline attacks, with fewer than 30 bits of work compromising over half of users.”

    Downside: Passphrases suck.
    Upside: Researchers discover awesome new compression algorithm for ebooks.

  5. As the authors say, this only holds true if you restrict yourself to naturally-occurring n-grams.  They justify this by saying that’s what people tend to use – but is this because people can’t remember more random passphrases, or because they don’t generate them spontaneously when asked to pick a passphrase?

    In the latter case, you just need an appropriate generator to circumvent the problem.  Just because people don’t naturally come up with “young table” when asked to think up a passphrase, that doesn’t mean they won’t be able to use it if it’s assigned to them.

    The xkcd version falls down because it’s ungrammatical and so the user can’t easily remember the order the four words go in.  It’ll take up to 24 tries to get right, which is more than anyone will put up with.

    However, if you use randomly-generated phrases of the form “noun verbs adjective noun”, that should have the desired combination of high entropy and high memorability, since the order of the words is specified by the natural grammar of the phrase.

    Cheesecake paints blue hairnet
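
A rough sketch of the "noun verbs adjective noun" generator described above. The word lists are tiny placeholders invented for illustration, and the function names are assumptions. The entropy is the sum of log2 of each slot's list size, assuming uniform and independent picks.

```python
import math
import secrets

# Placeholder lists; real ones would need thousands of entries each.
NOUNS = ["cheesecake", "hairnet", "table", "lantern"]
VERBS = ["paints", "juggles", "mails", "tickles"]
ADJECTIVES = ["blue", "young", "wobbly", "silent"]

TEMPLATE = [NOUNS, VERBS, ADJECTIVES, NOUNS]  # noun verbs adjective noun


def template_passphrase(template=TEMPLATE) -> str:
    """Fill each grammatical slot with a word chosen by the OS CSPRNG."""
    words = [secrets.choice(slot) for slot in template]
    return " ".join(words).capitalize()


def template_entropy_bits(slot_sizes) -> float:
    """Sum of per-slot entropies for uniform, independent choices."""
    return sum(math.log2(size) for size in slot_sizes)


print(template_passphrase())
print(template_entropy_bits(len(slot) for slot in TEMPLATE))  # toy lists above
print(template_entropy_bits([2048] * 4))  # hypothetical 2048-word lists
print(math.factorial(4))  # possible orderings of four distinct words
```

The toy lists give only 8 bits, so they are useless as they stand. Hypothetical 2048-word lists per slot would give 44 bits, and the fixed grammar means the user never has to work through the 24 possible orderings.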
