Encryption

From Jonathan Gardner's Tech Wiki

Introduction

In today's internet and communications world, you need to understand the risks of sending your data across a network and how to protect yourself. One of the best ways to protect yourself in certain circumstances is with encryption.

How Does Encryption Work?

Encryption is a mathematical process. This is okay, because ultimately all data you want to send or receive is just a really, really big number.
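To make the "big number" idea concrete, here is a minimal sketch in Python. The message text is just an example chosen for illustration; any data that can be written as bytes works the same way.

```python
# Turn a short message into one big integer and back again.
message = "Hello, world!"
data = message.encode("utf-8")

# Read the bytes as the digits of a single base-256 number.
number = int.from_bytes(data, "big")

# Reverse the process: the number turns back into the same bytes.
length = (number.bit_length() + 7) // 8
recovered = number.to_bytes(length, "big").decode("utf-8")

print(number)
print(recovered)
```

One caveat of this simple round trip is that leading zero bytes would be lost, which is why real formats also record the length of the data.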

The kind of encryption described here, public-key encryption, relies on two keys. These keys are really just numbers. One key is called the private key. This number is kept completely secret from the outside world. Even within a company, the keys may be kept so secret that they aren't even archived. Only a handful of people--sometimes none!--are allowed even to look at the private key. The other key is called the public key. This is also a number, but one that is handed out to anyone who wants it.

Private keys and public keys are generated in a keypair. This is a set of matching private and public keys that, although they are different numbers, have an interesting property. You'll see below.
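As an illustration of what a keypair can look like, here is a toy sketch that assumes the RSA algorithm, which this page does not name. The helper make_keypair and the tiny primes 61 and 53 are chosen only for the example; real keys are built from primes hundreds of digits long.

```python
from math import gcd

def make_keypair(p, q, e=17):
    """Build a toy RSA keypair from two primes (illustrative helper only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must share no factor with (p-1)*(q-1)")
    d = pow(e, -1, phi)  # modular inverse; needs Python 3.8 or later
    return (n, e), (n, d)

# The public key is (3233, 17); the private key is (3233, 2753).
public_key, private_key = make_keypair(61, 53)
print(public_key, private_key)
```

The two exponents are different numbers, but they are tied together mathematically: whatever one of them does, the other can undo.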

There are two operations you can do with your keys: signing, and encrypting.

Signing

Signing is an act wherein you generate a magic number called the signature. The signature is built from the content you wish to sign and your private key. No one else can generate the signature you can generate, because no one else has the private key.

When you distribute the data (which isn't hidden), you also distribute the signature. If someone has the matching public key, they can verify the signature is correct. (They can't, however, generate a signature with the public key!)

Signing is useful when you want everyone to know the content, but you want everyone to know it comes from you. It isn't useful when you want to pass secrets around.
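A rough sketch of signing and verifying, again assuming toy RSA with a deliberately tiny keypair. The names digest, sign and verify are invented for this example, and reducing the hash modulo N is a shortcut that only makes sense for a demonstration.

```python
import hashlib

# Toy keypair: N and E are public, D is the private number.
N, E, D = 3233, 17, 2753

def digest(content: bytes) -> int:
    """Boil the content down to a number smaller than N."""
    h = hashlib.sha256(content).digest()
    return int.from_bytes(h, "big") % N

def sign(content: bytes) -> int:
    """Only the holder of D can compute this."""
    return pow(digest(content), D, N)

def verify(content: bytes, signature: int) -> bool:
    """Anyone who knows N and E can check a signature."""
    return pow(signature, E, N) == digest(content)

document = b"I owe Bob five dollars."
signature = sign(document)
genuine_ok = verify(document, signature)           # True: the signature matches
forged_ok = verify(document, (signature + 1) % N)  # False: an altered signature fails
print(genuine_ok, forged_ok)
```

Notice that verify never touches D, which is why the public key is enough to check a signature but not to create one.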

Encryption

To pass secrets around, you use encryption. In this process, you generate a large number that is based on the original content you wish to send and the public key your intended reader has given you. To the outside world, this number is gobbledygook. It is practically impossible for them to decode it to the original without getting the private key. That's right, even with the public key, they can't access the encrypted content.

Once the intended recipient receives the encrypted data, they can decrypt it with their private key. Only people who know the secret key can do this.

Encryption is nice if you have a secret you only want the recipient to hear.
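The same toy RSA assumption can illustrate encryption. The functions encrypt and decrypt are made up for this sketch, and encrypting one byte at a time is done only because the example modulus is so small; it is not how real software does it.

```python
# Toy keypair: N and E are public, D is the recipient's private number.
N, E, D = 3233, 17, 2753

def encrypt(plaintext: bytes) -> list:
    """Anyone can do this with the recipient's public key."""
    return [pow(byte, E, N) for byte in plaintext]

def decrypt(ciphertext: list) -> bytes:
    """Only the holder of the private number D can undo it."""
    return bytes(pow(c, D, N) for c in ciphertext)

# 'A' is byte 65, which encrypts to 2790 with this key.
ciphertext = encrypt(b"A secret")
plaintext = decrypt(ciphertext)
print(ciphertext)
print(plaintext)
```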

Weaknesses

This encryption scheme isn't perfect. Some algorithms used in times past have been found to have fatal flaws. These manifest in several ways.

Signature Collisions

It is possible for two different documents signed with the same key to produce the same signature. This is because the signature is typically much, much shorter than the document, so many documents must share each possible signature. With a sound algorithm, finding another document that has the same signature as the original is incredibly difficult, and even if one is found, it is highly unlikely to mean anything useful. With a flawed algorithm, however, an attacker may be able to craft a meaningful document that collides, attach the genuine signature to it, and pass it off as if the owner of the private key had signed it.
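To see why short signatures collide, this sketch shrinks a SHA-256 hash down to 16 bits and searches for two documents with the same short value. The 16-bit size, the function names and the "document N" texts are all assumptions made so the search finishes instantly; real digests are far longer.

```python
import hashlib
from itertools import count

def short_digest(content: bytes, bits: int = 16) -> int:
    """Keep only the first few bits of a SHA-256 hash."""
    h = int.from_bytes(hashlib.sha256(content).digest(), "big")
    return h >> (256 - bits)

def find_collision(bits: int = 16):
    """Try documents one by one until two share a short digest."""
    seen = {}
    for i in count():
        doc = f"document {i}".encode()
        tag = short_digest(doc, bits)
        if tag in seen:
            return seen[tag], doc
        seen[tag] = doc

first, second = find_collision()
print(first, second, short_digest(first))
```

With only 65,536 possible values a match turns up almost immediately. Each extra bit makes the search harder, which is why sound algorithms use digests hundreds of bits long.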

Key Discovery

It is possible for someone to guess the private key. However, this is incredibly unlikely. It may also be possible to work the private key out from the public key or from other things you know. This is also incredibly difficult, even with today's powerful computers. However, with weaker encryption schemes or short keys, it isn't quite impossible. Some believe the best encryption schemes are unbeatable, but others believe that secret government agencies may already know algorithms to beat them.
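Here is a small sketch of reverse-engineering a key, assuming toy RSA with a public key of (3233, 17). The function name is made up for the example. It works only because the number is tiny; the same search against a real key would not finish in any reasonable time.

```python
from math import isqrt

def recover_private_exponent(n: int, e: int) -> int:
    """Factor n by trial division, then rebuild the private number."""
    p = next(c for c in range(2, isqrt(n) + 1) if n % c == 0)
    q = n // p
    return pow(e, -1, (p - 1) * (q - 1))  # needs Python 3.8 or later

# The attacker starts with nothing but the public key.
d = recover_private_exponent(3233, 17)
print(d)  # 2753, the private number
```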

Most of the methods to decode the secret key rely on getting hold of a bunch of encrypted data, guessing what the data should contain, and then trying to guess the keys. This is obviously a difficult problem, but it is made easier if there is lots and lots of encrypted data, all encrypted with the same key. (See below on how this is handled.)

By far, the easiest way to get the private key is to steal it. That is, hack into the computer with the key on the disk, read the key, and copy it.

For this reason, several companies have a policy of frequently changing the keys. This has the effect that even if you steal the key, it is only good until the new keys are used and the old keys abandoned. Also, if you discover a private key was stolen, you can quickly advertise that the old key is no longer used and a new one is being used.

You'll see below how most protocols frequently change the keys they use for traffic, for security's sake.


Symmetric Keys

Because of the weakness of having too much data lying around all encrypted with the same key, modern protocols limit the amount of data encrypted with the public/private keypair. The first encrypted message passed between the host and client is usually a request to use a newly generated symmetric key. A symmetric key is special because the same key both encrypts and decrypts the data, and symmetric algorithms tend to be much faster than public/private key algorithms.

The traffic between the host and client is sent using the symmetric key generated just for this session. After a while, they agree to use a new symmetric key. This means the keys keep changing. In the end, even if someone is able to monitor all the traffic between the client and server, they are left with very little data encrypted under any one key. Trying to guess the private key is all but impossible.
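The exchange can be sketched roughly as follows. Everything here is an assumption made for illustration: the keypair is toy RSA built from two well-known Mersenne primes, and keystream_xor is a homemade stand-in for a real symmetric cipher. Neither is safe for actual use.

```python
import hashlib
import secrets

# Host's toy keypair: N and E are public, D is private.
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # needs Python 3.8 or later

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream.
    The same call both encrypts and decrypts."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)

# Client: pick a fresh session key and wrap it with the host's public key.
session_key = secrets.token_bytes(16)
wrapped = pow(int.from_bytes(session_key, "big"), E, N)

# Host: unwrap the session key with the private number.
unwrapped = pow(wrapped, D, N).to_bytes(16, "big")

# From here on, both sides use only the shared session key.
message = b"The rest of the conversation uses the fast symmetric key."
ciphertext = keystream_xor(session_key, message)
plaintext = keystream_xor(unwrapped, ciphertext)
print(plaintext)
```

Only the 16-byte session key is ever encrypted with the public key. When the two sides want a new key, they repeat the first step and throw the old session key away.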

Man-in-the-middle Attacks

Consider this scenario.

A client sends a message to the host. The host reads the message and responds to the client.

This sounds fine and dandy, until you realize that it is possible to hijack IP addresses and domain names on the internet. What the client sends to the host may or may not end up on the host. Someone could've hijacked the IP address or domain name of the host.

The hijacker could pass the packet on to the real host, then relay the host's response back to the client as if nothing had happened. This would mean that neither the client nor the server can detect that their communication has been hijacked. This is the man-in-the-middle attack.

If the client already knows the public key for the host he wants to talk to, then he can send his message encrypted with the public key. The man-in-the-middle, who doesn't have the private key, won't be able to read what the client said.

But what about the other way around? Will the man-in-the-middle be able to read what the host said to the client? Not if the client sent a message with his public key in it, and the host encrypted the message with the public key.

But consider this. What if the client doesn't know the public key for the host in the first place? Well, the first message has to be made out in the open, where the man-in-the-middle can read it. The man-in-the-middle can replace the public key with his own public key, and forward the message on. Then the host will encrypt the message with the man-in-the-middle's public key, and the man-in-the-middle will see the response. (He can then encrypt it with the client's public key so the client is none the wiser.)

So, the man-in-the-middle is thwarted only if the client knows the host's public key before he tries to communicate with him. If he doesn't, there is no guarantee that he isn't being fooled into thinking the man-in-the-middle is really the host.
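One way a client might check a key it receives against a key it already knows is to compare fingerprints. This is only a sketch: the fingerprint format, the function names and both toy keys are invented for the example.

```python
import hashlib
import hmac

def fingerprint(public_key: tuple) -> str:
    """Short, fixed-length summary of a public key."""
    n, e = public_key
    return hashlib.sha256(f"{n}:{e}".encode()).hexdigest()

def is_trusted(received_key: tuple, known_fingerprint: str) -> bool:
    """Accept a key only if it matches what the client already knew."""
    return hmac.compare_digest(fingerprint(received_key), known_fingerprint)

host_key = (3233, 17)     # the real host's toy public key
attacker_key = (3127, 3)  # a substitute key slipped in by a man-in-the-middle

# The client learned this fingerprint earlier, over a channel it trusts.
known = fingerprint(host_key)

print(is_trusted(host_key, known))      # True
print(is_trusted(attacker_key, known))  # False
```

The check is only as good as the way the client learned the fingerprint in the first place. If that arrived over the same hijacked connection, it proves nothing.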

That's what the SSL certificate process is all about: distributing public keys. That's what GPG/PGP keyrings are all about--collecting known public keys. Only if the client already has the right public key can we be sure that the client is really talking to the real host.