Traditionally, cryptographic modes are built on top of a hermetic fixed-width block cipher primitive. I would argue that this model puts both the primitive designer and the mode designer at a disadvantage. The primitive designer creates a hermetic fixed-width block cipher and hands it to the mode designer, who must then build an inevitably ad hoc scheme for handling variable-length inputs and outputs on top of it. The primitive designer might add a dedicated tweak input to alleviate this, but doing so complicates the key schedule (e.g., Tweakey). Worse yet, because the primitive is hermetic, the mode designer may have to resort to clever uses of the inverse direction of the primitive to maintain efficiency (e.g., OCB3, Deoxys-I), which in turn limits the primitive designer's freedom. Alternatively, the mode designer must carefully cull rounds from the primitive (e.g., AEZ's use of AES4), or resort to special tricks to minimize the number of calls to the primitive (e.g., PMAC's handling of the last block).
Taking a step back, we see that more or less every keyed symmetric cryptographic mode requires variable-length input and/or output; in other words, it needs something that behaves like a random oracle. What if we move this design requirement into the primitive? At first blush, it would seem that we are merely shifting responsibilities around without actually improving the situation. On the contrary, giving the primitive designer a view of the bigger picture improves the situation significantly. The primitive designer can now design an efficient random oracle primitive on top of non-hermetic functions; cryptanalysis occurs at the primitive level as opposed to the function level. The mode designer no longer needs to use the inverse direction of the primitive to be efficient, which means the primitive designer no longer needs to be concerned about the inverse direction of the function. This saves space in hardware and allows additional design freedom in the function, since the existence and performance of the inverse direction are irrelevant.
Farfalle enjoys the following features:
In this figure, only the first input string is visualized to keep it simple and compact. To handle another input string, simply skip a mask and begin compressing at mask rollc(i + 1). Repeat this process for each input string. A complete algorithm listing follows.
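Ahead of the full listing, the mask bookkeeping alone can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `toy_roll` and `toy_perm` are hypothetical stand-ins for rollc and the compression permutation (they are not secure and are not taken from any Farfalle instance), the block width is 8 bytes, the padding is a simple byte-level rule, and mask indices are counted from 0, which may differ from the numbering in the figure. The only point is to show one mask being consumed per block and one mask being skipped between input strings.

```python
MASK64 = (1 << 64) - 1
BLOCK_BYTES = 8  # toy block width; an assumption for illustration only


def toy_roll(mask: int) -> int:
    """Hypothetical stand-in for rollc: rotate a 64-bit word left by one bit."""
    return ((mask << 1) | (mask >> 63)) & MASK64


def toy_perm(x: int) -> int:
    """Hypothetical stand-in for the compression permutation (NOT secure)."""
    x = (x * 0x9E3779B97F4A7C15) & MASK64  # odd multiplier, so invertible mod 2**64
    return x ^ (x >> 29)                   # xor with a right shift is invertible too


def pad_blocks(data: bytes) -> list[int]:
    """Append a 0x01 byte, zero-fill to a block multiple, split into 64-bit words."""
    padded = data + b"\x01"
    padded += b"\x00" * (-len(padded) % BLOCK_BYTES)
    return [int.from_bytes(padded[i:i + BLOCK_BYTES], "little")
            for i in range(0, len(padded), BLOCK_BYTES)]


def compress(mask: int, strings: list[bytes]) -> tuple[int, list[list[int]]]:
    """Fold every block of every string into one accumulator.

    Returns the accumulator and, per string, the mask indices it consumed.
    """
    acc = 0
    index = 0
    used = []
    for s in strings:
        indices = []
        for block in pad_blocks(s):
            acc ^= toy_perm(block ^ mask)  # mask the block, permute, accumulate
            indices.append(index)
            mask = toy_roll(mask)          # advance to the next mask
            index += 1
        used.append(indices)
        mask = toy_roll(mask)              # skip one mask before the next string
        index += 1
    return acc, used


acc, used = compress(0x0123456789ABCDEF, [b"a" * 20, b"b" * 8])
print(used)  # [[0, 1, 2], [4, 5]]
```

In this run the first string pads to three blocks and uses masks 0 to 2, mask 3 is skipped, and the second string starts at mask 4. The skipped mask is what separates the strings: without it, the blocks of two consecutive strings would be indistinguishable from the blocks of one longer string.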