Cryptographic modes

In a previous article, I made the bold claim that Farfalle provides the ideal abstraction for building every keyed symmetric cryptographic mode. In this article, I would like to demonstrate how we can effortlessly build a stream cipher, message authentication code and simple authenticated encryption mode on top of a Farfalle instance.

Stream cipher

A stream cipher generates an arbitrary-length stream of bits as a function of a secret key K and a public nonce N. This stream of bits, known as the keystream, has the property of being computationally indistinguishable from randomness to an adversary who does not know the secret key. To perform the actual encryption, we simply add the plaintext P and the keystream together in GF(2), which is just XOR. To build a stream cipher on top of Farfalle instance F, we plug in nonce N as input and request |P| bits of output, which are then added to P:

Stream cipher

To demonstrate, let's encrypt some text:

As shown above, the two ciphertexts appear completely uncorrelated thanks to the nonce providing diversification. It is very important that a nonce is never reused, otherwise the adversary can calculate the sum of the plaintexts simply by summing the ciphertexts together to cancel out the keystream.

Message authentication code (MAC)

A message authentication code (MAC) generates a short authentication tag T as a function of a secret key K and an arbitrary-length message M. The tag T has the property of being computationally indistinguishable from randomness. To build a MAC on top of Farfalle instance F, we plug in message M as input and request t bits of output:

Message authentication code

To demonstrate, let's calculate the MAC of P₁ and represent the authentication tag in hexadecimal. To make things interesting, let's create a P₂ such that it differs from P₁ by only a single byte and MAC that as well for comparison:

As demonstrated above, it does not matter how small of a change is made to the input message; the MAC function maps every possible input down to a much smaller space in a random-looking way. Due to the pigeonhole principle, there are an infinite number of inputs that map to each of the 2¹²⁸ output buckets. How much effort would it take to craft a message that MACs to the same tag as P₁? It would take 2¹²⁷ attempts on average!

Authenticated encryption

In any situation where an adversary can actively tamper with a message, which I would argue is the vast majority of situations, a stream cipher by itself does not provide adequate security. The biggest problem is malleability: flipping a bit of ciphertext at location n flips the corresponding bit of plaintext at location n. The adversary can flip individual bits at any location, and the recipient has no robust and consistent way of detecting that this happened. To demonstrate this, I have crafted ciphertext C₁ such that its decrypted plaintext specifies a very different transfer amount, using only E_stream(K₁, 0, P₁):

Stated another way, plain encryption provides confidentiality, but does not provide any integrity or authenticity guarantees; we have no way of determining whether the ciphertext has been tampered with. The solution is to use authenticated encryption, which provides confidentiality, integrity and authenticity guarantees. We have the following inputs:

Key K: Both the sender and receiver need to negotiate this selection out-of-band
Nonce N: A unique value per message; does not need to be random, and can be sent in the clear
Metadata A: Data that is authenticated but not encrypted, such as context or routing information; either known at both ends, or sent in the clear
Plaintext P: Data that is encrypted and authenticated

This can be as easy as combining the stream cipher and MAC together:

AEAD

The sender transmits the ciphertext and the receiver authenticates and decrypts it. If the ciphertext is not authentic, then it is thrown out and an error is returned. If the ciphertext is authentic, then it is decrypted and used. To demonstrate this, let's encrypt and authenticate P₁:

Just like with the MAC example, it would take about 2¹²⁷ attempts on average to achieve a successful forgery. Why is this? Going back to the plain stream cipher, the space of valid ciphertexts is very dense: every ciphertext is considered valid and will decipher without error. In this authenticated encryption mode, we are effectively making the space of valid ciphertexts very sparse by adding an authentication tag. Because the tag is 128 bits in length, in the space of all possible ciphertexts, approximately only 1 in every 2¹²⁸ ciphertexts is considered valid!

This mode has a number of things going for it:

Inverse-free
Encipher oracle is online
Can factor out static metadata

However, not all is perfect:

Decipher oracle must buffer up the entire plaintext/ciphertext since its authenticity must be verified before releasing it
Adversary could potentially submit many large invalid ciphertexts to waste CPU time (DoS attack)

We could work around these issues by breaking up the plaintext into smaller pieces and then encrypting/authenticating each piece by appending a counter to the nonce. This would work fine and is secure, but is there a more robust solution?

Sessions

We can use the incrementality property of Farfalle to our advantage. Consider the following mode:

At first glance, this may appear to be a totally impractical mode: do we really need to store a growing list of strings in memory? Not at all! Thanks to the incrementality property of Farfalle, strings are compressed into a fixed-size state. You could exchange a single message or a million messages and the state size remains constant. This stateful mode allows us to exchange a sequence of messages, and produce an intermediate authentication tag after each message. Thanks to the beauty of incrementality, each authentication tag authenticates not only the most recent message, but all messages up to that point in the session. This means that if any messages are dropped, duplicated, reordered or tampered with in any way, the authentication tag will not match and we can terminate the session immediately. Not only that, but the state would become corrupt as well, which scrambles all subsequent messages.

This mode solves the problems with the previous mode, and actually introduces an additional feature: a startup tag is generated at session initialization, which authenticates the nonce used to initialize the session. If an adversary tries establishing a session with a server, the tag will inevitably not match, and the server can drop the connection before any data is processed. This saves valuable CPU time.

Farfalle provides a powerful design tool. For example, consider that most applications need only process metadata at the beginning of a session. We can use this observation to simplify the mode:

Now, the metadata and nonce are combined into a single string provided at session startup. The metadata is compressed into the state, which diversifies it and also produces the startup tag. To demonstrate, let's initialize a session and wrap an identical plaintext repeatedly:

One of the nice things about a stateful session mode is that inputting a nonce once at the beginning of the session diversifies the entire session. In addition, note how each ciphertext is different despite wrapping an identical plaintext repeatedly. This is thanks to the fact that strings are incrementally accumulated into the state. This means that each authentication tag authenticates everything absorbed up to that point. It is also possible to allow for optional metadata or optional plaintext throughout a session by using frame bits; this is left as an exercise to the reader. Hopefully it is clear by now why I claimed Farfalle provides the ideal level of abstraction.