Following on from my last set of ramblings about the illusion of certainty that is projected by our lords and masters, the technospastic elite1, I thought I’d give a technical example of how the tools of uncertainty (probability) can be used to generate a degree of certainty.
I’m always amused by thrillers that feature some notion centred around an “unbreakable code”. It’s usually some plot device that helps to move the story along - or the story itself is about protecting this code. Or something.
Bendytoy Cabbagepatch playing Alan Turing in The Imitation Game - a movie about the breaking of the “unbreakable” Enigma code.
It’s amusing because we’ve known how to do unbreakable encryption since 1917, at least. The Vernam Cipher, more usually called The One-Time Pad (OTP), has been around (and used) since then.
I’m going to show you how to do it. It’s really simple.
An Unbreakable Encryption Algorithm
Let’s take some message that is 4 bits long - let’s just write 1100 for the sake of argument. Generate another 4 bits, at random, which we shall assume is 0110 (again for the sake of argument). And now we just do the bitwise binary addition of these two things. Binary addition is straightforward
0 + 0 = 0
0 + 1 = 1 + 0 = 1
1 + 1 = 0
Did he just write 1 + 1 = 0?
Woke bastard.
Yes, because here the + symbol doesn’t mean ordinary addition; it means addition modulo 2.
You can think of modular addition as “clock arithmetic”. If you’re working at night and it’s 10pm and you know you have another 4 hours to go, you don’t think you’ll finish at 14 o’clock do you? No, you know you’ll finish at 2am. In other words, you’ve done the modular addition of 10 + 4 = 2.
The + symbol here means something different to the + symbol used with “ordinary” addition. When we write an equation like 2 + 2 = 4, each of these symbols, including the plus sign, means something very specific.
So we apply this modular addition rule to our 4 bits as above and we get
m = 1 1 0 0
k = 0 1 1 0
c = 1 0 1 0
where c = m + k (and the rule is applied bitwise so that the first bit of c is generated from the binary addition of the first bits of m and k, and so on).
Why have I chosen these letters m, k and c? Well, we’re going to think of the first 4 bits we had (1100) as our message, the second lot of 4 bits (0110) as the key, and the new set of 4 bits we generate from our addition rule as the ciphertext.
And that’s it. This is unbreakable. If you’ve chosen the key at random and no one else knows it, then you can give anyone the ciphertext (1010) and ask them to figure out what the message was. They won’t be able to do any better than guessing.
In this case there are 16 possible initial messages that could have been encrypted; but if all you have is the ciphertext, it is not possible to determine which one of those 16 possibilities was the actual original message.
You now know how to do unbreakable encryption.
So why is there such a fuss in electronic communications surrounding encryption techniques? Don’t we already know how to make everything unbreakably secure?
Yes.
And no.
The OTP I’ve described above has some limitations
The key must be as long as the message
The key must be chosen entirely at random
The key must be used only once (hence the name “one-time pad”)
The key must be kept secret and known only to the legitimate users
So if Alice wants to send a large file (say 1MB long) to Bob, and she wants this to be kept secret then she can do it provided she and Bob share a secret key that is as long as the message (in this case the key must be 1MB long).
That’s a problem. How does Alice get this secret key to Bob in the first place? Isn’t the key just another kind of “message” that must be transmitted securely? And this key can’t be re-used. Once you’ve used it to secure one message, you have to throw it away and do it all over again.
There’s a nice physical analogy here. You can think of the message as a piece of paper that gets put in a box. The algorithm (in this case the algorithm is bitwise binary addition) is the locking mechanism. You lock the box with the key. The ciphertext is the locked box.
The locked box can be given to anyone who doesn’t possess the key and they can’t open it to see what’s inside.
How does the box get “unlocked” with our binary addition algorithm? This is also very simple. Alice has “locked” the message 1100 with the binary addition algorithm using the key 0110.
The box is 1010. Everyone can know this (we assume everyone can eavesdrop on the ciphertext that gets sent).
Bob, who also has the key 0110, just takes the box and does bitwise binary addition again. So, Bob computes c + k
c = 1 0 1 0
k = 0 1 1 0
m = 1 1 0 0
It’s hard to think of a simpler encryption/decryption process than the OTP - and yet it is (when used with the listed conditions) unbreakable.
So, What’s the Problem?
The problem, as it is with all encryption in some form or another, is that in order to communicate in secret we must be able to communicate in secret. We need to be able to shift these keys around in secret before we can send our actual message in secret.
Sending secret messages (encryption), then, is dependent on being able to distribute these keys securely2.
In times gone by3 (and doubtless in some instances still today) there’d be a bunch of people employed by the spooks to literally ferry huge lists of random numbers (to be used as one-time keys in the OTP) from place to place.
That’s a bit of an issue. It’s OK for a few messages where the utmost security is required, but it quickly becomes unscalable and unmanageable.
So we need a way to scramble (encrypt) our messages using shorter keys4. This can be achieved at the expense of security. All such ciphers are vulnerable to an attack known as exhaustive key search5. With these algorithms the attacker can always simply try out every possible key.
With the OTP this doesn’t help - because there’s no way to distinguish the ‘right’ message from all of the other possible ‘wrong’ messages. This indistinguishability is only possible when the key is (at least) as long as the message.
There’s another important thing I haven’t yet mentioned. We assume that any attacker wanting to read our secret message knows the detail of the locking mechanism. Eve, the eavesdropper, knows the algorithm. We try to create algorithms so that this doesn’t matter.
But how do we prove the “security” here? What do we mean by “security” in this context? What you’re actually trying to achieve with an encryption algorithm is to ensure that an exhaustive key search (which can always be done) is the best attack available to an eavesdropper. If we can do this, then we say the algorithm is “secure”.
Let’s give an example here. Suppose we’re sending our 1MB file and we’ve used some fancy encryption algorithm and locked it with a key that is 16 bits in length. Then it’s not going to take an eavesdropper very much time to cycle through all of the 2^16 possible key combinations here. There are 65,536 possible combinations to check - and that’s an easy job for a modern computer.
But what if we used a key that was 128 bits long? Now the problem is one of having to cycle through 2^128 possible combinations - and that’s a very big number.
Adding a single bit to a key length doubles the time it takes to do an exhaustive key search.
What we’re getting a hint of here is the relationship between security and computing power for modern encryption techniques.
One of the previously widely-used encryption algorithms was DES (the Data Encryption Standard) which employed in its standard implementation a key that was 56 bits long. If you had a computer that could do this exhaustive key search on DES in just 1 second, it would take the same computer 149 trillion years to break the current standard AES which uses a 128 bit long key.
The modern definitions of security when it comes to encryption build in this notion of available computing power. This is why (amongst other things) we have the field of computational complexity which seeks to prove things about the classes of problems that can be solved with a given degree of computational power. You may have heard various phrases like NP complete, or “polynomial time”, and these stem from this field.
A problem that can be solved in “polynomial time” is said to be within the capabilities of modern computers. What this means is that if we increase the size of the problem then the time scales as some polynomial function of the input size. An exhaustive key search scales as an exponential function of the input size.
When it comes to encryption, then, we definitely don’t want our algorithm to be able to be broken in “polynomial time” - this would mean there’s a faster attack than exhaustive key search.
So how do we go about assessing whether our encryption algorithms are secure? And how do we define (and prove) security?
Defining Security For Encryption
The way security is defined is rather neat. It’s based around a thought experiment. The most basic form of this experiment is as follows :
Eve, who is restricted to polynomial time computational power, is given two messages of equal length by Alice : m(0) and m(1)
Alice chooses one of these messages at random to encrypt and then hands the resulting ciphertext, c, to Eve
Eve now sends a message, either a 1 or a 0, back to Alice. This is her selection of which message has been encrypted.
The encryption scheme is said to be secure if Eve succeeds in choosing the right encrypted message with a probability that is at most negligibly greater than 1/2
There’s a very technical definition of what “negligible” means in this context which I won’t go into. The essence here is that Eve cannot do very much better (negligibly better) than guessing which of her initial messages has been encrypted.
Notice that it’s all built around probability and that we’re (effectively) only asking Eve to provide one bit of information.
The idea is to create an encryption algorithm so that you can’t even get one bit of information about the messages that have been encrypted (you can’t say, for example, it’s definitely not this message that has been encrypted) if you only have access to polynomial time resources.
The difficult job is, of course, proving that any given encryption algorithm meets this condition. Usually you can make a statement along the lines of “providing this permutation is pseudorandom6 the algorithm is secure against this kind of eavesdropping experiment”.
There are different kinds of attack potentially available to an eavesdropper. An eavesdropper is not necessarily just a passive actor simply “listening in” on the channel. An attacker might be able to create a whole list of ciphertext/message pairs in which she can choose the messages to be encrypted. We require security against these forms of attacks, too. In this case we’d construct a very similar, but slightly different, thought experiment to serve as our security definition. In this adapted form we would have Eve generating the messages to be encrypted herself, rather than being given them by Alice.
I’m sure your brains have been quite sufficiently scrambled enough for a Saturday. It must be nearly time to get the cheese out of the fridge and crack open that bottle of Nero d’Avola.
Of course there’s a lot of fuss made about encryption and key length. But consider the following exchange between Whit Diffie7 and a former NSA employee I witnessed at a conference. Whit made the statement that if you wanted to be secure against security agencies you should be using a 90 bit key rather than a 64 bit key (at the time that was a sensible statement). The former spook piped up “Whit, are you telling me that the NSA would find it more difficult to steal a 90 bit key than a 64 bit key?”.
There are lots of ways to skin cats in the security world.
The certainty is like some unconscious spasm that leaders feel they have to project. It’s particularly noticeable around technical issues and leads to idiot slogans such as “follow the science”
Public Key Encryption tries to get around this problem by using what are called public keys you can give to anyone. The physical analogy here is that you have a box with two locks. If Alice wants to send a secret message to Bob she just gets Bob’s public key off some (public) directory and uses it to lock a box with her message inside. The locks are such that the box can’t be opened with the public key. Only Bob, with a key known only to himself, can now use this second (private) key to open the box. The problem of distributing the keys has been “solved”, or rather it’s been shifted into one of authentication. How do we know that we’ve got Bob’s genuine public key off the directory?
Indeed, the Germans used OTP’s in the war, as well as the Enigma code
Much shorter than the messages that are being encrypted.
Also known as a “brute-force attack”
Again, there’s a technical definition of what this means which is not necessary to go into here. It involves not being able to distinguish, in polynomial time, whether a sequence is truly random or not.
Whit is a very smart guy and a really lovely and interesting bloke, and one of the inventors of public key techniques.
There's very memorable passage in Hasek's 'Brave Soldier Svejk' (highly recommended!) when the austrian-hungarian forces are handed a "brand new unbreakable cipher", consisting of a two-part novel. Certain letters in volume one corresponds to other letters in colume two, with pages numbers acting as guides to how to put simple abbreviated messages together.
Only, volume one and two of the actual books delivered are from different editions with different numbering of pages...
There's nothing man can make, that man can't break. Load a trooper down with gadgets and someone creep up on him and bash his head in with a rock, while he's trying to read a vernier.
(Internet cookie for recognising where that comes from.)
How about two or more people decided on 100 books, Example Orwells 1984, and we had 10 books on 10 shelves. The shelves are A, B, C to J and the books are numbered 1 to 10. So B6 could be our book 1984. and so our coded message begins with B6. Then we start at chapter 1, 2, 3 etc, so now we have B63 for example. Then we simply choose numbers counting in from the beginning of the first letter of the first word in chapter 3 to indicate letters of the alphabet.
Using 1984 chapter 1, here is my message to you all..... 67, 4,10, 6. ( You need to own a copy of 1984 in English) For variables we could have another set of a 100 books written in French .... so our our Key would be B6F4......You could also decide to use Chapter 2 so now it would be B6F2.
The only way I can see to break the code is to know all 100 books and how they were arranged etc, details that only those involved in creating the code might be tortured or incentivised to tell.
On a U-Boat, the captain could have 10 books on his cabin shelves and not share the significance with any other officer. You could of course have all the chapters in the books as computer files and a program randomly selecting the numbers corresponding to letters when you typed in your message, before sending as a list of numbers. Or you could simply text on Wechat and trust the Chinese Communist Governments end to end encryption promise.
I am sure there have been many similar ideas used...all rely on only certain people knowing the codes and sources. Besides actually knowing the books and what the numbers referred to, I don't think my code is breakable...... Rudolph ?
I now own a copy of 1984 with some random numbers written on page one of chapter one..... I hope it does not start a conspiracy theory in my future family members...... what was Great Grandad Fenn trying to tell us? Providing of course that 1984 is not banned by then ( I suspect it will be reason for immediate re education, better known as execution) If the Democrats can cheat their way into power again in 2024 then 10 years from now the book will almost certainly be banned in the USSA and USSK.