Rob Beschizza at 7:10 am Fri, Nov 23, 2012
There’s an image of the code in question over at the BBC website:
Milk, bread, butter, biscuits, shaving cream, marmalade (orange),
Love the classic 5-letter groups.
Have you been googling about pigeons, Rob?
Help, I’m stuck in a fireplace.
Were there no false codes — complete gibberish that intentionally or unintentionally sucked up hundreds of man hours trying to decipher the code? I’m assuming every known code has been tried.
‘When you eliminate the impossible, whatever remains, however improbable, must be the truth.’ – Holmes
I think the key phrase missing from the story is “one-time pad”, which is indeed effectively impossible to crack, “if used correctly”.
I only know this because of Cryptonomicon.
Surely it could be brute-forced, with the results matched against valid words.
if each letter is equally likely then the message could be any that you like. With OTP a valid cipher text could be 11111111 and there’d be no way to decipher without having the key. (edit: replaced “message” with “cipher text”)
You mean there might be a code inside the code? Fair enough but how far does it go? Maybe you could just calculate the entropy on different possible solutions then attack the solutions with low entropy again.
No. With a one-time pad each letter or word position could be using a different code.
For example, my pad could be 254, where each number is the offset to use. So the word ‘the’ would become ‘vmi’. If the pad was 837 then ‘the’ would be ‘bkl’. It’s called a one-time pad because an agent would only use it once. Pads are generated randomly so there’s no pattern to discern; each letter is random.
Letter offsets are just one way to do one-time pads. There are more complex versions, but what they all have in common is that without the pad they are virtually unbreakable. The downside is that the same applies if you lose the pad yourself.
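Here is a rough Python sketch of the digit-offset idea, for anyone who wants to check the examples. The function name `shift_with_pad` and the rule that letters wrap around past ‘z’ are my own assumptions, not anything from the article.

```python
import string

ALPHABET = string.ascii_lowercase


def shift_with_pad(text, pad_digits, sign=1):
    """Shift each letter by the matching pad digit, wrapping around past 'z'.

    sign=1 encrypts, sign=-1 decrypts. Hypothetical helper for illustration.
    """
    if len(pad_digits) < len(text):
        raise ValueError("pad must be at least as long as the message")
    out = []
    for ch, digit in zip(text, pad_digits):
        idx = (ALPHABET.index(ch) + sign * int(digit)) % 26
        out.append(ALPHABET[idx])
    return "".join(out)


print(shift_with_pad("the", "254"))      # vmi
print(shift_with_pad("the", "837"))      # bkl
print(shift_with_pad("bkl", "837", -1))  # the
```

The same plaintext gives unrelated ciphertexts under different pads, which is why nothing can be learned from the letters alone.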
So here’s a basic One Time Pad scheme. Pad=key
Creating a key: using a 27-sided die, roll 10 times and write down each result. Each roll must be unrelated to the previous roll and the die must be perfectly fair.
Encrypting the message: Use a message of length 10 letters and spaces or less.
a-z=[1-26], spaces=27 (I’ll use underscores to denote spaces)
message = “nazis_suck” / “14, 1, 26, 9, 19, 27, 19, 21, 3, 11”
key = “abddh_tacl” / “1, 2, 4, 4, 8, 27, 20, 1, 3, 12”
add the key to the message, subtracting 27 whenever a sum goes over 27:
cipher text = “occm__lvfw” / “15, 3, 3, 13, 27, 27, 12, 22, 6, 23”
The key must be shared with the receiver ahead of time. The receiver will know to subtract the key from the cipher text (adding 27 whenever the result drops below 1) to decrypt it.
As the name implies, the key must only be used once or else it’s attackable as a two-time pad/many-time pad.
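The scheme above can be sketched in a few lines of Python. This is only an illustration: the helper names are made up, and `secrets.randbelow` stands in for the fair 27-sided die.

```python
import secrets

SYMBOLS = "abcdefghijklmnopqrstuvwxyz_"  # a-z = 1..26, underscore (space) = 27


def to_numbers(text):
    return [SYMBOLS.index(ch) + 1 for ch in text]


def to_text(numbers):
    return "".join(SYMBOLS[n - 1] for n in numbers)


def make_key(length):
    """Stand-in for rolling a fair 27-sided die `length` times."""
    return [secrets.randbelow(27) + 1 for _ in range(length)]


def encrypt(message, key):
    if len(key) < len(message):
        raise ValueError("key must be at least as long as the message")
    # Add key to message, wrapping sums above 27 back into 1..27.
    return to_text([(m + k - 1) % 27 + 1 for m, k in zip(to_numbers(message), key)])


def decrypt(cipher, key):
    if len(key) < len(cipher):
        raise ValueError("key must be at least as long as the cipher text")
    # Subtract key from cipher text, wrapping results below 1 back into 1..27.
    return to_text([(c - k - 1) % 27 + 1 for c, k in zip(to_numbers(cipher), key)])


key = to_numbers("abddh_tacl")
cipher = encrypt("nazis_suck", key)
print(cipher)                # occm__lvfw
print(decrypt(cipher, key))  # nazis_suck
```

For a real message you would call `make_key(len(message))` once, share it in advance, and never reuse it.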
The problem is that with a one-time pad, effectively EACH letter has its own cipher. So when you make guesses and match against valid words, every single possibility will eventually match.
Here is an example. Say my ciphertext is “aaaa”. One possible guess for that is “bake” (+1,+0,+10,+4). Hooray, a match! But then you keep trying different substitutions for each letter and find that it also matches “kale” (+10,+0,+11,+4) and “make”, and “four” and “fork”. Eventually, you find that there is a match for every single four-letter word in the dictionary.
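To make that concrete, here is a small Python sketch that works out the offsets that would “explain” any guess. The function name `offsets_for` is hypothetical, and it assumes plain a-z with wrap-around.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def offsets_for(ciphertext, guess):
    """Return the per-letter offsets that would turn `ciphertext` into `guess`."""
    if len(ciphertext) != len(guess):
        raise ValueError("guess must be the same length as the ciphertext")
    return [
        (ALPHABET.index(g) - ALPHABET.index(c)) % 26
        for c, g in zip(ciphertext, guess)
    ]


for guess in ["bake", "kale", "make", "four", "fork"]:
    print(guess, offsets_for("aaaa", guess))
```

Every guess of the right length gets a perfectly valid set of offsets, so a “match” tells the attacker nothing about which word was actually sent.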
That’s the strength of using what are, in essence, few-time pads: if you don’t have the pad, there’s little that can be used to decrypt the message.
Regarding “few-time pads”: As long as the attacker isn’t aware which messages were encoded with the same pad.
“This pigeon is a traitor, kill him immediately!”
ah the Uriah message. (cuz y’see king david wanted to do bathsheba without her husband, uriah the hittite, hanging about. so he sent uriah to the war-front with a message to hand to ol’ general joab presumably written in some (semitic) language the hittite… well it’s an old story)
When I was in England I found a codebook in a secondhand bookstore in N. Yorkshire. Five-letter codes were assigned to words. It was a huge book, and not terribly overpriced, but it was just a bit out of my reach at that time. If only…
a lot of the ones that were published weren’t specifically meant for secrecy purposes, but rather to reduce the cost per word on telegrams (SXZDQ == “sell all stock holdings of”…). i always assumed it was mostly for that purely economic notion that Claude Shannon founded information theory (largely unrelated to him being a great expert on cryptology and juggling)
I think we need to immediately investigate the chimney for possible German sympathies.
Regarding one-time pads: Why not seed the pad into the message itself? I’m thinking that an arbitrary sequence of “bytes,” known to the recipient (e.g. “second letter of every other line” or somesuch) could be a date code, from which one could derive the key (say, the number of seconds since the stroke of midnight on New Year’s Eve A.D. 1, with each digit being an offset.) That way the recipient wouldn’t have to carry around a pad which could be compromised, he’d only have to know which bytes were the date code, and what baseline date to count seconds from. If that information is cracked or extracted, headquarters can simply change one or both of those for future messages, communicating the change to recipients in a compact and obscure manner.
For a one-time pad, the “key” is as long as the message itself: it’s a sequence of offsets applied to the sequence of characters in the message. The strength of a one-time pad is that there is no correlation between the coding for different characters.
Lots of interesting info on codes used for SOE agents in Europe in Leo Marks’s book “Between Silk and Cyanide”: poem ciphers, security checks that an agent omitted if captured and forced to communicate, false traffic, deciphering “indecipherable” messages, codes in one system that were made to look like they are in another system, one-time pads with letters instead of numbers. Most of the one-time “pads” used by the SOE in the field were actually rolls of silk, where the used part of the pad could be cut off and destroyed. Silk was preferred to paper, because silk doesn’t crinkle like paper does when you’re being patted down.
Right, I know how they work, but the key doesn’t have to be there in toto, is what I’m getting at. Any operation that provides a long enough sequence of numbers to cover offsetting the message content would do. Take bytes 2, 12, and 22, raise each one to the power of the next one and you’d have a pad big enough to encode the King James Bible. (Multiple encodings would probably be beyond the capabilities of the era, but with these computers we got nowadays you could probably come up with an alternate false encoding to give up in case of torture that would resolve to a plausible-sounding message.)
You’ll have only 2^24 possible sequences. I can grind through them all.
What you are describing are pseudo-random number generators/functions, which are the basis for all modern cryptography. But your output has to be indistinguishable from actual random numbers, which can be tricky. As DrDave points out, your example would be susceptible to a brute-force attack against the seed.
Even the tiniest amount of predictable pattern can lead to a code being compromised and information being leaked.
When you see key length discussed with things like SSL, it is the length of the key (in bits) that is used with pseudo random functions to generate the pads big enough to encrypt large amounts of data, so you are on the right track. It’s just that you have to take it much further.
For example, key lengths for SSL certificates are now considered insecure if they are not at least 1024 bits.
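For the curious, here is a toy Python sketch of the brute-force attack on the seed. It is not the scheme proposed above: Python’s `random.Random` (which is not a cryptographic generator) stands in for the pad-from-seed step, the seed is cut to 16 bits so the loop finishes in a few seconds, and the attacker is assumed to know one word of the message. All names and the sample message are made up.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz_"


def keystream(seed, length):
    """Toy pad: offsets drawn from a PRNG seeded with `seed` (an assumption)."""
    rng = random.Random(seed)
    return [rng.randrange(27) for _ in range(length)]


def apply_pad(text, pad, sign=1):
    """Add (sign=1) or subtract (sign=-1) the pad, wrapping within 27 symbols."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + sign * k) % 27] for ch, k in zip(text, pad)
    )


def brute_force(ciphertext, crib, seed_bits):
    """Try every seed; keep decryptions that contain the known word `crib`."""
    hits = []
    for seed in range(2 ** seed_bits):
        guess = apply_pad(ciphertext, keystream(seed, len(ciphertext)), sign=-1)
        if crib in guess:
            hits.append((seed, guess))
    return hits


SEED_BITS = 16  # shrunk from 24 so the demo runs quickly
secret_seed = 40321
message = "meet_at_the_old_bridge_at_dawn"
ciphertext = apply_pad(message, keystream(secret_seed, len(message)))

hits = brute_force(ciphertext, "bridge", SEED_BITS)
print(hits)
```

Going from 16 to 24 bits only multiplies the work by 256, which is why a pad derived from a short seed is nothing like a true one-time pad.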
Dammit! We’ll never beat the Hun at this rate!