In this article, I will explain why plain encryption does not meet the goals of modern security. I will then make the case for authenticated encryption as the solution.
Plain encryption provides confidentiality; in other words, an adversary in theory cannot recover the plaintext without knowing the secret key. However, plain encryption does not provide any way to detect whether a ciphertext has been damaged or tampered with. As a result, an active adversary can tamper with data in storage and in transit. The target application will consume the modified data and react in undefined and potentially devastating ways, perhaps even leaking information about the plaintext. To investigate, let's craft random ciphertexts and ask a decryption oracle to decrypt them. We'll use green squares to indicate successful decryptions and red squares to indicate unsuccessful decryptions:
The first thing to note is that there are no invalid ciphertexts! Each ciphertext contains only enough information to recover the original plaintext and nothing more; we have no consistent and secure way of determining whether the ciphertext has been tampered with. This means an adversary can flip arbitrary bits of a known ciphertext and always arrive at another valid ciphertext. It should be noted that both stream ciphers and CBC-mode encryption are malleable, meaning an adversary can effect local change in the plaintext via ciphertext tampering. With a stream cipher, flipping a ciphertext bit flips the corresponding plaintext bit; with CBC, flipping a bit of one ciphertext block garbles the corresponding plaintext block and flips the same bit position in the next block, and the corruption propagates no further. The correct solution is to use authenticated encryption, which provides strong confidentiality and authenticity guarantees.
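To see how easy such tampering is in practice, here is a small Python sketch of the classic bit-flipping attack against unauthenticated CTR-mode encryption. The library choice (pyca/cryptography), keys, and message are my own illustrative assumptions:

```python
# Sketch of a bit-flipping attack on unauthenticated AES-CTR encryption.
# The library (pyca/cryptography), key, nonce, and message are illustrative.
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key, nonce = os.urandom(16), os.urandom(16)

def ctr_apply(data: bytes) -> bytes:
    # CTR encryption and decryption are the same keystream XOR.
    ctx = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
    return ctx.update(data) + ctx.finalize()

plaintext = b"PAY ALICE $0000100"
ciphertext = bytearray(ctr_apply(plaintext))

# The adversary knows (or guesses) part of the plaintext and XORs the
# difference between the known and desired bytes into the ciphertext.
known, desired = b"$0000100", b"$9999999"
offset = plaintext.index(known)
for i, (k, d) in enumerate(zip(known, desired)):
    ciphertext[offset + i] ^= k ^ d

print(ctr_apply(bytes(ciphertext)))  # b'PAY ALICE $9999999' -- decrypts "fine"
```

The adversary never learns the key, yet steers the plaintext exactly where she wants it; this is precisely the malleability described above.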
Imagine we are able to sprinkle lots of decoy (invalid) ciphertexts amongst the valid ones, such that the valid ciphertexts are distributed throughout the space uniformly at random. To this end, we must define what it means for a ciphertext to be valid versus invalid. The way we achieve this is by employing verifiable redundancy. We need this redundancy to be a pseudorandom function of the input; in other words, if the adversary flips even a single bit of the encrypted plaintext then every bit of redundancy should flip with 50% probability. One possible method is to compute the MAC of the encrypted plaintext and use the result as our redundancy. Because the adversary does not know the secret key, she cannot feasibly predict the redundancy for a plaintext. Continuing the toy example, let's look at what happens when we add a single bit of redundancy:
We have effectively halved the density of valid ciphertexts; now, 1 in 2^1 ciphertexts is valid. This means an adversary would have to make an average of 2^0 forgery attempts in order to arrive at a valid ciphertext. What if we add another bit of redundancy?
We've halved the density of valid ciphertexts yet again; now, 1 in 2^2 ciphertexts is valid. The adversary would have to make an average of 2^1 forgery attempts in order to arrive at a valid ciphertext. Next, let's cut the density in half 8 more times:
Now, 1 in 2^10 ciphertexts is valid, requiring an average of 2^9 forgery attempts from the adversary in order to locate a valid ciphertext. Another way to think of this is that for every valid ciphertext, there exist 2^10 - 1 invalid ciphertexts.
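If you want to play with these numbers yourself, the arithmetic behind the toy example is just this (illustrative only):

```python
# Illustrative arithmetic only: with t bits of verifiable redundancy,
# 1 in 2^t ciphertexts is valid and a forger needs ~2^(t-1) attempts on average.
for t in (1, 2, 10, 128):
    density = 2.0 ** -t          # fraction of the ciphertext space that is valid
    avg_attempts = 2 ** (t - 1)  # average forgeries before landing on a valid one
    print(f"t = {t:3}: density {density:.3g}, about 2^{t - 1} = {avg_attempts} attempts")
```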
To actually make forgeries infeasible, we need enough verifiable redundancy to make the space sufficiently large and sparse. For 128-bit security, we use 128 bits of verifiable redundancy to ensure that it would take the adversary an average of 2^127 attempts to land on a valid ciphertext. This would be equivalent to halving the density of valid ciphertexts 128 times in a row, resulting in a ciphertext space populated with 2^128 - 1 invalid ciphertexts for every valid ciphertext. With a ciphertext space this sparse, it would actually be easier to just brute-force search for the 128-bit key!
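To make the MAC-as-redundancy idea from earlier concrete, here is a rough encrypt-then-MAC sketch. AES-CTR plus HMAC-SHA-256 truncated to 128 bits is my own illustrative choice (as are the pyca/cryptography dependency and the function names); it is not a specification of any particular scheme:

```python
# Rough encrypt-then-MAC sketch: the MAC over the encrypted plaintext is the
# verifiable redundancy. Library, names, and key handling are illustrative.
import hmac
import hashlib
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

TAG_LEN = 16  # 128 bits of verifiable redundancy

def seal(enc_key: bytes, mac_key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    enc = Cipher(algorithms.AES(enc_key), modes.CTR(nonce)).encryptor()
    encrypted = enc.update(plaintext) + enc.finalize()
    tag = hmac.new(mac_key, nonce + encrypted, hashlib.sha256).digest()[:TAG_LEN]
    return encrypted + tag  # ciphertext = encrypted plaintext || redundancy

def open_sealed(enc_key: bytes, mac_key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    encrypted, tag = ciphertext[:-TAG_LEN], ciphertext[-TAG_LEN:]
    expected = hmac.new(mac_key, nonce + encrypted, hashlib.sha256).digest()[:TAG_LEN]
    if not hmac.compare_digest(tag, expected):
        raise ValueError("invalid ciphertext")  # redundancy did not check out
    dec = Cipher(algorithms.AES(enc_key), modes.CTR(nonce)).decryptor()
    return dec.update(encrypted) + dec.finalize()
```

The tag covers the nonce as well as the encrypted plaintext, and it is checked with a constant-time comparison before anything is decrypted.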
As far as authenticated encryption algorithms go, we have the following inputs: a secret key, a nonce (a value that must be unique per encryption), the plaintext itself, and, in many schemes, optional associated data that is authenticated but not encrypted.
The output from the authenticated encryption algorithm is the ciphertext. The ciphertext is always larger than the plaintext due to the embedded redundancy used to achieve the sparseness property described above. Depending on the exact algorithm being used, the ciphertext may be a composition of multiple strings. In the case of Deck-PLAIN, the ciphertext is composed of the encrypted plaintext and the authentication tag. The length of the encrypted plaintext is identical to that of the plaintext, and the authentication tag is the verifiable redundancy, which is typically 128 bits long.
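To make the shape of the output concrete, here is a quick sketch using ChaCha20-Poly1305 from the pyca/cryptography package as a stand-in AEAD; it is not Deck-PLAIN, but its input/output shape is analogous:

```python
# Stand-in AEAD (ChaCha20-Poly1305, not Deck-PLAIN): the ciphertext is the
# encrypted plaintext plus a 128-bit authentication tag.
import os
from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305

key = ChaCha20Poly1305.generate_key()
nonce = os.urandom(12)            # 96-bit nonce, unique per message
aead = ChaCha20Poly1305(key)

plaintext = b"attack at dawn"
ciphertext = aead.encrypt(nonce, plaintext, None)

print(len(plaintext), len(ciphertext))  # 14 30 -> 16 bytes (128 bits) of overhead
```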
The sender transmits the ciphertext and the receiver authenticates it by validating the embedded redundancy. If the redundancy does not check out, then the ciphertext is thrown out and an error is returned. Otherwise, the receiver decrypts to obtain the plaintext, which is guaranteed safe for consumption. The beauty of authenticated encryption is that the consuming application is given a surefire method by which to validate the ciphertext, regardless of the structure and contents of the plaintext. If a ciphertext is determined to be invalid, then the application can handle this in a consistent manner, and never has to come into contact with the invalid data. With plain encryption, the application might silently chug along with corrupted data, or possibly explode in some undefined way.
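Continuing the stand-in sketch from above, the receiving side looks something like this; a single flipped bit is enough for the whole ciphertext to be rejected:

```python
# Continuing the stand-in example: the receiver either gets the authentic
# plaintext or a clean error; tampered data is never handed to the application.
from cryptography.exceptions import InvalidTag

tampered = bytearray(ciphertext)
tampered[0] ^= 0x01               # flip a single ciphertext bit

try:
    aead.decrypt(nonce, bytes(tampered), None)
except InvalidTag:
    print("ciphertext rejected; nothing was decrypted")

recovered = aead.decrypt(nonce, ciphertext, None)  # untampered -> b"attack at dawn"
```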
Given the vast sparseness described above, it may come as a surprise that the data overhead associated with authenticated encryption is minimal compared to the guarantees it provides. As stated above, to achieve a forgery probability of 2^-t, we generally need only t bits of redundancy embedded within the ciphertext. As a concrete example, consider an application that wishes to protect arbitrary binary data in 64 KiB chunks. If we store t = 128 bits of redundancy per chunk, then the data overhead is only 0.024%. If the overall size of the data is 1 GiB, then the total amount of space needed for the redundancy is only 256 KiB.
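Spelled out in code, the arithmetic for this example looks like this (numbers only, nothing cryptographic):

```python
# Overhead arithmetic for the example above (no cryptography involved).
chunk_size = 64 * 1024                 # 64 KiB per chunk
tag_size = 128 // 8                    # 16 bytes of redundancy per chunk
total_size = 1024 ** 3                 # 1 GiB of data

print(f"{tag_size / chunk_size:.3%}")                        # 0.024% overhead
print((total_size // chunk_size) * tag_size // 1024, "KiB")  # 256 KiB total
```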
The key takeaway here is that authenticated encryption gives the best bang for the buck when used on a reasonable block size. In other words, trying to apply authenticated encryption on a field-by-field basis would lead to a lot of overhead. Instead, it is best to apply it to a batch of data, such as a fixed-size chunk of data or an entire record.
For some applications, 128 bits of redundancy may be overkill. For example, consider a web application with rate limiting to prevent an adversary from sending too many requests at once. In this case, it may be acceptable to have a forgery probability of e.g. 1 in 4 billion (32 bits of redundancy).
It may be tempting to think that using redundancy longer than 128 bits leads to even more difficult odds for the adversary, but if we are using a 128-bit key then the key would become the weakest link in the chain. The adversary would simply target the key instead, which is the easier attack to mount since key search is offline by nature (i.e., it does not require access to a decryption oracle).