SethVPN is a client-server pair providing end-to-end layer 2 and 3 virtual private network services. The protocol is secure, fully roamable and efficient. The server side has been tested on FreeBSD, while the client side has been tested on FreeBSD, Windows (UWP VpnPlugin and Wintun) and Android. I wrote SethVPN as a way to learn how VPNs work, and also because I am not satisfied with current VPN solutions. Its protocol is based on a novel Feistel construction.
The basic idea behind a VPN is to intercept frames/packets, encrypt/decrypt them, and then pass them along. The VPN software listens for activity from two sources: an Internet-facing socket and a virtual network interface.
In the case of an L3 VPN, the virtual network interface is assigned an IP address and intercepts/injects IP packets. A route is configured to direct packets to it. For example, when browsing the web on the client, the IP packets from the browser are routed to the virtual device. The VPN client reads IP packets from the device, encrypts them, then sends them to the VPN server via a regular UDP or TCP socket over the Internet. The VPN server receives the encrypted IP packets over the socket, decrypts them, then writes them to its virtual device. The opposite direction is similar: the VPN server reads IP packets from the device, encrypts them, then sends them to the VPN client. The VPN client receives the encrypted IP packets, decrypts them, then writes them to its virtual device.
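The L3 data path described above can be sketched as a small event loop over the two sources. This is an illustration only, not SethVPN's code: `seal` and `open_` are hypothetical stand-ins for the real packet encryption (a reversible byte flip with no security whatsoever), and the virtual device is simulated with a Unix datagram socket pair instead of a real tun device. The names `pump_once`, `make_endpoint` and `demo` are likewise made up for this sketch.

```python
import selectors
import socket


def seal(packet: bytes) -> bytes:
    # Hypothetical placeholder for encryption: NOT secure, merely reversible.
    return bytes(b ^ 0xFF for b in packet)


def open_(datagram: bytes) -> bytes:
    # Hypothetical placeholder for decryption (inverse of seal).
    return bytes(b ^ 0xFF for b in datagram)


def pump_once(sel, device, wire, peer, timeout=1.0) -> int:
    """Service every source that is readable right now; return the event count."""
    handled = 0
    for key, _events in sel.select(timeout):
        if key.fileobj is device:
            # Outbound: plaintext IP packet read from the virtual device.
            packet = device.recv(65535)
            wire.sendto(seal(packet), peer)
        else:
            # Inbound: encrypted datagram read from the Internet-facing socket.
            datagram, _addr = wire.recvfrom(65535)
            device.send(open_(datagram))
        handled += 1
    return handled


def make_endpoint():
    """Build a simulated endpoint: (app side, device side, UDP socket, selector)."""
    # The socket pair stands in for the virtual interface (Unix-only).
    app, device = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    wire = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    wire.bind(("127.0.0.1", 0))
    sel = selectors.DefaultSelector()
    sel.register(device, selectors.EVENT_READ)
    sel.register(wire, selectors.EVENT_READ)
    return app, device, wire, sel


def demo() -> bytes:
    """Send one fake packet from a simulated client to a simulated server."""
    c_app, c_dev, c_wire, c_sel = make_endpoint()
    s_app, s_dev, s_wire, s_sel = make_endpoint()
    try:
        c_app.send(b"fake IP packet")                            # "browser" emits a packet
        pump_once(c_sel, c_dev, c_wire, s_wire.getsockname())    # client: device -> UDP
        pump_once(s_sel, s_dev, s_wire, c_wire.getsockname())    # server: UDP -> device
        s_app.settimeout(1.0)
        return s_app.recv(65535)
    finally:
        c_sel.close()
        s_sel.close()
        for s in (c_app, c_dev, c_wire, s_app, s_dev, s_wire):
            s.close()
```

Calling `demo()` pushes one packet through the client's loop, across loopback UDP, and out of the server's simulated device. The reverse direction uses the same `pump_once` with the roles swapped.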
The L2 VPN case is similar, except that it deals in Ethernet frames as opposed to IP packets. This enables powerful use cases such as the bridging of remote networks.
In the latest revision of my VPN software, I have eliminated the handshake entirely to achieve total statelessness. This allows for robust roaming and simplifies the protocol. SethVPN uses a special Feistel-based mode. In a nutshell, the mode is efficient and secure: a timestamp and counter provide diversification and are also encrypted. The redundancy is 128 bits in length, which means approximately 2^128 online operations are required in order to forge a packet.
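As a rough sanity check on the forgery figure, assume the mode behaves ideally, so that any forged packet passes the 128-bit redundancy check with probability 2^-128. This is an idealization for illustration, not a proven bound for this construction. Under that assumption, with q denoting the number of online forgery attempts:

```latex
\Pr[\text{at least one forgery in } q \text{ attempts}] \le \frac{q}{2^{128}},
\qquad
\mathbb{E}[\text{attempts until the first success}] = 2^{128}.
```

The first expression is a union bound over the q attempts. The second is the mean of a geometric distribution with success probability 2^-128.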
The client and server communicate via UDP packets. UDP is lightweight, stateless, unordered and potentially unreliable. Therefore, both the client and server must be able to exchange packets in a stateless way. Each packet is encrypted and decrypted using the mode visualized below. If the timestamp of the sender's packet falls outside of a +/- 15 second window relative to the receiver's clock, then the packet is dropped. To further protect against replay attacks, the newest packet's timestamp is cached at the receiver's end and any packet more than 3 seconds older than the newest packet is dropped. The 64-bit counter is initialized to a random value at startup and is incremented once per packet to ensure semantic security.
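The two timestamp checks and the counter can be sketched as follows. This is a minimal illustration under assumptions, not the actual implementation: timestamps are taken to be seconds as floating-point numbers, the counter is assumed to wrap modulo 2^64, and the names `ReplayFilter` and `PacketCounter` are hypothetical. Since the timestamp is encrypted, a filter like this would run only after a packet has been decrypted successfully.

```python
import secrets
from dataclasses import dataclass

CLOCK_WINDOW = 15.0   # allowed difference from the receiver's clock, in seconds
REPLAY_WINDOW = 3.0   # allowed lag behind the newest packet seen, in seconds


@dataclass
class ReplayFilter:
    newest: float = float("-inf")  # newest accepted timestamp so far

    def accept(self, packet_ts: float, now: float) -> bool:
        # Check 1: the sender's timestamp must be within +/- 15 s of our clock.
        if abs(packet_ts - now) > CLOCK_WINDOW:
            return False
        # Check 2: the packet may not be more than 3 s older than the newest one.
        if self.newest - packet_ts > REPLAY_WINDOW:
            return False
        self.newest = max(self.newest, packet_ts)
        return True


class PacketCounter:
    def __init__(self) -> None:
        # Random 64-bit starting value chosen at startup.
        self.value = secrets.randbits(64)

    def next(self) -> int:
        # Return the current value, then advance by one per packet.
        current = self.value
        self.value = (self.value + 1) % 2**64  # wrap-around is an assumption
        return current
```

For example, with a receiver clock of 100.0, a packet stamped 100.0 is accepted, a later packet stamped 98.0 is still accepted because it is only 2 seconds older than the newest, and one stamped 96.0 is dropped.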
At first sight, it may seem that using a protocol in which packets can be lost is a bad thing. In fact, packet loss is simply a property that UDP, a very lightweight layer on top of IP, inherits from raw IP packets. A lost UDP packet does not cause any sort of malfunction: if a client application is performing TCP communications over the UDP-based VPN and a UDP packet is lost, then TCP uses its built-in mechanisms to resend the lost data. If a client application is performing UDP communications over the UDP-based VPN and a UDP packet is lost, then the situation is no different from not using a VPN at all. It is for this reason that UDP is desirable for VPN communications: it is the closest you can get to raw IP packets while still being accessible from user space.
The mode enjoys many properties:
In my FreeBSD implementation, I replaced if_tun with netgraph and externalized all network configuration. This provides better separation of concerns and generalizes the code. The client and server no longer care about whether they are processing frames or packets; instead, they simply communicate with a specified netgraph path and hook via ng_socket. This enables various scenarios without needing any special logic in the code:
Expansion tapers quickly from 48 bytes to 33 bytes:
Measurements taken on an Intel Xeon W5-2455X CPU with AVX-512:
Note that I don't measure in terms of cycles since the results still differ across architectures and require disabling a number of CPU features. Instead, I prefer real-world measurements with meaningful numbers.
To demonstrate this custom Feistel mode, I generated a random key and encrypted the plaintext "Hello, world!" 20 times in a row, all with the same timestamp:
1. 31 2D B0 01 73 26 45 3A BF 5C 4B 83 2B 5F 43 9B 1E 85 4F 3A 73 60 5B FC F4 A1 34 5F AA A4 72 B3 08 93 DD 63 FB 34 64 95 46 F6 F2 F0 81 E7 E5 ED
2. 0C 73 BF 6B 25 97 59 90 25 96 61 EF 83 0D 84 44 D4 58 DA 08 D5 5A FB 55 8D 09 67 4E 96 36 C7 80 A1 89 A9 0A 13 2F A6 2F 6C 45 C7 B8 EC DC 74 DD
3. CE 83 9B 2E 14 0B C9 0E F0 C6 32 F6 83 AB 02 0C 57 7C D6 CB 04 C3 A2 03 0C 3F 00 59 71 4C 15 41 94 B2 C1 E3 90 20 B9 0D F0 15 46 07 49 3F 07 A4
4. 35 CC 90 66 3F 44 79 AC 4F 33 72 D2 D5 94 38 04 16 37 3F 50 5E 72 B8 8A DC EB 35 2A 23 BD EE 93 57 55 C3 9B 47 3D 44 F1 E6 E0 FF 9F F2 0E 0C 2F
5. 48 FB A0 54 6C 32 AA C5 54 35 40 F1 B9 1A 45 CC 3F 40 08 FC BF C7 A7 1C 83 79 66 87 27 69 30 5F 19 16 CA 59 5D D1 7A EB 10 58 A7 E4 45 8B 9D 2D
6. 47 DA 0E D4 B8 7B D4 08 44 09 0B FE 7E 1B 66 5E AD F9 54 10 16 4A FC D2 9E DC 67 A3 51 0C 25 3E C8 37 67 56 06 23 7A B0 B8 FD 67 28 F3 13 B6 A2
7. 15 E7 AB 6E 94 64 3F C4 3D 9E 52 39 3A 05 0A B8 A4 FC 16 EF 6D 7C 84 F9 E6 D0 82 CB 23 94 71 92 EE 31 F9 93 CB 5C BE 53 77 14 F1 7C 13 B2 3A 31
8. B8 D0 68 F4 AE CC 0C 3C AF 4B 2A D2 39 52 4F DC CA CC 01 0F 65 2E B8 9D AB 86 53 3E A7 1F EB 79 C6 08 60 D9 E3 E1 51 E1 76 0E 05 53 12 6E 9F E1
9. 2A 53 79 1D FE 51 86 91 69 DB 0A A1 B4 B6 3E B6 57 C7 6B 84 D8 A4 1F 96 E0 4C 81 84 43 5E 82 2D 51 51 6A 45 F2 68 BF 23 58 7F DA 35 D6 8B 01 19
10. 3F 2B 2A BD 31 EB 99 F7 86 1F 3A 13 AE 8D 24 9E 80 EA 02 4F 84 8F 97 10 39 EF 98 9A 46 9F 35 48 3A 84 F9 7F 06 CA D6 EB 4B 38 8A 3E 01 9E B2 67
11. CD 17 4E B7 E0 7D 12 52 E2 3D 8B D3 3F B0 C2 76 56 D4 97 6F 3E BC 39 72 8E F6 EA B5 92 3D 96 42 41 ED C6 10 0F 82 28 94 97 4B B5 B8 91 CB DF D4
12. 31 63 AA 18 30 44 99 01 C4 B3 B0 BE 5C 01 F4 D6 63 C5 51 25 50 D9 52 4C 4E 3F 91 E4 5F 3C 8D 63 02 A8 D6 62 51 8F 55 AE D2 E5 EE 75 8A 59 1E 5A
13. 88 BD 9C BE FB 1E D0 C7 27 32 84 0C 86 09 C2 F4 E0 CD 0D 8B 12 1C 3C 09 4D D5 A4 AE 52 B7 32 41 40 C9 D7 BA FA 13 86 04 FA 36 C4 73 FA 76 70 44
14. 33 D7 2E 2A BF F4 7A 7C 5F 41 A3 CF CA CE 7A 64 A4 A0 E9 1B BA 90 B0 E2 90 67 08 20 63 9C 81 8F CE 2E 5E 34 09 E9 CA FA 61 53 FB BD B4 45 27 AA
15. 60 7F 52 82 61 4C 87 36 DE CE EF 33 E0 8B B5 5B C4 37 A6 52 16 DC 2D 0B 52 93 AD 39 D8 76 8E 0B 62 4D AB 58 A1 F9 74 BB 76 52 91 32 34 A1 C9 2A
16. 85 34 4E A8 69 93 81 EB 56 32 28 24 35 80 4D 72 FC F0 C0 1E 76 62 6B 81 18 97 19 BB F7 8A 16 E4 FB CB 3C 8B 8B BF 50 E3 06 EA A5 5D C4 EE E4 A3
17. 76 5C 95 CC FA 53 96 29 BF 68 ED 95 6D BA 75 E6 D4 1B EF A4 E9 92 7D 28 63 C7 7F C4 E7 9E AE D2 BE 11 39 D7 A6 6F 4D 6F ED 7D EF 34 C6 63 65 AB
18. 35 61 FF 74 B0 22 DA BD BB DA E2 E4 B6 80 EA BD 0C 8D 57 1E 15 61 03 F8 CA 83 98 A5 35 79 E7 95 45 47 AF 65 0A 87 B3 23 B3 2F 5D D2 46 D6 88 4F
19. 90 24 3F CE 46 B5 46 21 F8 4B FC 1E 30 29 36 B2 2D EA 79 65 48 8D 00 5F 07 5D F4 76 34 4E 67 4A 82 1A 8E EE B4 2A FE 8B ED F3 A1 B5 80 AE 24 97
20. DE 41 C8 AF C6 A7 6C 1A 0A C8 CE 92 13 A4 4D 3A B9 51 AB 58 CE A5 CA C9 35 55 AF 08 3F D2 C3 CE 4E BC 09 DD 93 33 E6 5B 01 FE DF 15 B3 04 9F A7
The advantage of this mode is that the timestamp and counter are encrypted; the adversary is unable to glean any useful information thanks to the semantic security provided by the mode.
The Android client is written in Kotlin and C++. The UI portion uses Jetpack Compose: