I recently needed to transfer a large number of pictures from my phone to my server, but was unable to find an acceptable way to do it. Both MTP and Bluetooth are slow and only work at close range; the SFTP clients I found are packed full of ads and are unable to saturate my bandwidth. I decided to write my own program to solve this problem, and the result is SethFTP: a secure, powerful and efficient file transfer protocol. The protocol has been implemented on FreeBSD (server + client), Android (client) and Windows (client).
Both SCP and SFTP tend to be chatty when lots of small files are being transferred, resulting in slow speeds over high-latency links. I decided to solve this problem out of the gate by designing SethFTP to be heavily pipelined. Whether you are transferring hundreds of small files or a few large ones, the protocol-level performance remains the same. This is accomplished by coalescing multiple operations together until the message reaches a reasonable size, at which point it is encrypted and sent over the wire. Think of it as an on-the-fly tar and untar process.
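The coalescing idea can be illustrated with a minimal sketch. This is not SethFTP's actual wire format: the `Coalescer` class, the `split_batch` helper, the 4-byte length prefix and the 64 KiB threshold are all assumptions made for illustration, and encryption is left to whatever `send` callback is supplied.

```python
import struct
from typing import Callable, List


class Coalescer:
    """Buffers small operations and emits them as one larger message."""

    def __init__(self, send: Callable[[bytes], None], threshold: int = 64 * 1024):
        self._send = send            # hypothetical: would encrypt, then write to the socket
        self._threshold = threshold  # assumed "reasonable size"; not taken from the protocol
        self._pending: List[bytes] = []
        self._size = 0

    def add(self, op: bytes) -> None:
        # Length-prefix each operation so the receiver can split the batch again.
        framed = struct.pack(">I", len(op)) + op
        self._pending.append(framed)
        self._size += len(framed)
        if self._size >= self._threshold:
            self.flush()

    def flush(self) -> None:
        # Send whatever is buffered as a single message, if anything is pending.
        if self._pending:
            self._send(b"".join(self._pending))
            self._pending.clear()
            self._size = 0


def split_batch(batch: bytes) -> List[bytes]:
    """Receiver side: recover the individual operations from one batch."""
    ops = []
    pos = 0
    while pos < len(batch):
        (length,) = struct.unpack_from(">I", batch, pos)
        pos += 4
        ops.append(batch[pos:pos + length])
        pos += length
    return ops
```

With this shape, a hundred tiny requests cost roughly the same number of round trips as one large request, which is the property the pipelined design is after.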
The protocol supports robust file resumption. Before resuming, the client and server each compute a MAC over the part of the file that already exists and compare the results to be sure the two copies match; contrast this with SFTP, where no such comparison is done, which can result in Frankenstein files. The protocol is mode-based; see below for a complete list of modes and commands.
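As a rough sketch of the resumption check, assuming HMAC-SHA256 as the MAC and a key already shared by both sides (the post does not say which MAC or key schedule SethFTP uses), the client-side decision could look like the following. The function names `prefix_mac` and `resume_offset` are hypothetical.

```python
import hashlib
import hmac
import io
from typing import BinaryIO


def prefix_mac(stream: BinaryIO, length: int, key: bytes, block: int = 64 * 1024) -> bytes:
    """MAC over the first `length` bytes of `stream`, read block by block."""
    mac = hmac.new(key, digestmod=hashlib.sha256)  # assumed MAC; not confirmed by the post
    remaining = length
    while remaining > 0:
        data = stream.read(min(block, remaining))
        if not data:
            raise ValueError("stream is shorter than the requested length")
        mac.update(data)
        remaining -= len(data)
    return mac.digest()


def resume_offset(local: BinaryIO, local_size: int, peer_mac: bytes, key: bytes) -> int:
    """Return the offset to resume from: local_size if the prefixes match, else 0."""
    if local_size == 0:
        return 0
    ours = prefix_mac(local, local_size, key)
    # Constant-time comparison of our MAC against the one reported by the peer.
    return local_size if hmac.compare_digest(ours, peer_mac) else 0
```

If the MACs differ, the sketch simply restarts from offset 0 rather than stitching new data onto a prefix that does not match.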
Returns a directory listing, including name, mode, size, access time and modification time. The results are paged using iterator iter-id and the first num-items items are returned. If recursive-flag is set to 1, then a recursive directory listing is returned. This satisfies the recursive download use case.
Returns the next num-items items from iterator iter-id.
Clears iterator iter-id.
Flushes output and exits mode.
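The paging behaviour of the directory listing mode can be sketched as a small server-side table of iterators keyed by iter-id. The `IteratorTable` class and its method names are hypothetical and plain strings stand in for directory entries; only the iter-id and num-items parameters come from the description above.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


class IteratorTable:
    """Keeps one listing iterator per iter-id so results can be paged."""

    def __init__(self) -> None:
        self._iters: Dict[int, Iterator[str]] = {}

    def start(self, iter_id: int, entries: Iterable[str], num_items: int) -> List[str]:
        # Begin a listing and return the first page of at most num_items entries.
        self._iters[iter_id] = iter(entries)
        return self.next(iter_id, num_items)

    def next(self, iter_id: int, num_items: int) -> List[str]:
        # Return the next page; an empty list means the listing is exhausted.
        return list(islice(self._iters[iter_id], num_items))

    def clear(self, iter_id: int) -> None:
        # Forget the iterator; clearing an unknown id is harmless.
        self._iters.pop(iter_id, None)
```

Because the iterator is consumed lazily, a very large or recursive listing never has to be held in a single message.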
Returns a directory listing based on the glob function.
Flushes output and exits mode.
Returns whether a file/folder exists. If path refers to a file and it exists, then this command also returns the file size and a streamed MAC of its contents up to mac-to-length. This satisfies the file resumption use case.
Cancels tag streaming of an item. This allows the client to cancel gracefully, e.g. as soon as a mismatch is detected. If tag streaming has already completed for the specified item, then this is a no-op.
Flushes output and exits mode.
Downloads a file starting at the provided offset. The offset allows for file resumption.
Flushes output and exits mode.
Uploads a file. The append-flag parameter specifies whether the file should be created or appended to. The data is streamed in fixed-size chunks; the last chunk is always partial and indicates the end of the transfer. The server returns a status to the client approximately every second, indicating the amount of data successfully received thus far. This status message is time-based as opposed to data-based to avoid flooding the wire on fast connections.
Creates a folder. This satisfies the recursive upload use case.
Flushes output and exits mode.
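The chunk framing used for uploads can be sketched as follows. This reflects my reading of "the last chunk is always partial": every chunk except the last is exactly the fixed size, and a chunk shorter than that (possibly empty) marks the end of the transfer. The function names and the chunk size are assumptions; the periodic status messages are not modelled here.

```python
import io
from typing import BinaryIO, Iterable, Iterator


def _read_up_to(stream: BinaryIO, n: int) -> bytes:
    """Read until n bytes are collected or the stream ends."""
    parts = []
    got = 0
    while got < n:
        data = stream.read(n - got)
        if not data:
            break
        parts.append(data)
        got += len(data)
    return b"".join(parts)


def iter_chunks(stream: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Sender side: full chunks, then one short (possibly empty) final chunk."""
    while True:
        chunk = _read_up_to(stream, chunk_size)
        yield chunk
        if len(chunk) < chunk_size:
            return  # a short chunk marks the end of the transfer


def receive_chunks(chunks: Iterable[bytes], chunk_size: int, out: BinaryIO) -> int:
    """Receiver side: write chunks until a short one arrives; return bytes written."""
    total = 0
    for chunk in chunks:
        out.write(chunk)
        total += len(chunk)
        if len(chunk) < chunk_size:
            break
    return total
```

A file whose size is an exact multiple of the chunk size ends with an empty chunk, so the receiver never needs to know the length in advance.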
Ends the session.
I experimented with supporting metadata preservation (such as timestamps) but ultimately decided against it because such data is non-portable. For example, not every filesystem supports setting the file birth time, nor do they all store timestamps to the same precision. Not every filesystem has the concept of 'owner' and 'group' either. If such data must be preserved, then the solution is to wrap the files in a container format (such as a tar file) and then transmit that. This protocol is data-focused.
The main activity is pretty simple and contains connectivity information.
When 'download' is tapped, the client issues the dir command to get a directory listing. I wrote a custom RecyclerView adapter and layout to render the returned listing. It allows the user to navigate between directories and select multiple files.
Once files have been selected and 'download' is tapped, the client first issues an info command for files that already exist locally, in order to check if resumption is needed. It then requests the files from the server via the download command. The main activity is periodically updated to reflect download progress. The app also maintains a notification with some basic information about the transfer.
When 'upload' is tapped, the client displays a file picker activity. In this example, I am selecting all of the files that I just downloaded from the server.
Once files have been selected and 'open' is tapped, the client first issues an info command for the selected files to check if resumption is needed. It then streams the data to the server via the upload command. The main activity is periodically updated to reflect upload progress as well as server acknowledgement. The app also maintains a notification with some basic information about the transfer.
I also wrote a simple console client that runs on FreeBSD and Windows. It is interactive and feels a lot like an SFTP session. In this example, I am performing a glob to locate files of interest, then downloading one of them from the server. I then upload the file back to the server.
~/sethftp/client% ./sethftp
sethftp> open ...
Session established
sethftp> glob /usr/lib/*usb*
-r--r--r-- 554926 [10/28/2020 09:21:56] /usr/lib/libusb.a
-r--r--r-- 90616 [10/28/2020 09:18:46] /usr/lib/libusb.so
-r--r--r-- 90616 [10/28/2020 09:18:46] /usr/lib/libusb.so.3
-r--r--r-- 582956 [10/28/2020 09:21:56] /usr/lib/libusb_p.a
-r--r--r-- 70878 [10/28/2020 09:21:57] /usr/lib/libusbhid.a
-r--r--r-- 15912 [10/28/2020 09:18:46] /usr/lib/libusbhid.so
-r--r--r-- 15912 [10/28/2020 09:18:46] /usr/lib/libusbhid.so.4
-r--r--r-- 72622 [10/28/2020 09:21:57] /usr/lib/libusbhid_p.a
sethftp> download /usr/lib/libusb_p.a recv/libusb_p.a
4.59649e+08 bytes/second
sethftp> upload recv/libusb_p.a recv/libusb_p.a
server_files=1 server_bytes=582956
4.47075e+08 bytes/second
sethftp> quit
% ls ~/sethftp/client/recv
libusb_p.a
% ls ~/sethftp/server/recv
libusb_p.a