This file is about “how to design a good networked FS in 2022”.
1. TODO Body
1.1. What we have now.
At the moment we have three kinds of network synchronisation tools:
- NFS
- Google Drive
- rsync
1.1.1. NFS
NFS is immune to conflicts, because it does not keep a local copy of the files on the target machine. It is also quite fast and does not occupy much space on the clients.
However, it cannot pre-cache anything, and it relies on fairly low network latency, because it synchronises constantly.
1.1.2. Google Drive
Google Drive allows selective sync and can work completely offline, but has difficulty resolving conflicts. It also requires the user to select the cached files manually.
1.1.3. rsync
rsync makes synchronisation efficient, but is completely manual.
1.2. What is the problem?
Fundamentally, the problems are:
- “what to cache” (we do not want to choose files manually)
- “when to cache” (we do not want to waste mobile traffic)
- “how to resolve conflicts” (we do not want to be forced to merge)
For simplicity, let us assume we have a client-server infrastructure. This is not necessarily the case, but it is a reasonable place to start.
As said above, the main problems are synchronisation and conflicts. When do they happen?
- The user should define the maximal cache size. This can be a “mount” option, like “-o cache_size=66G”.
- Opening files for reading should be transparent :: this should not even be an issue, and should work as in NFS. However, as soon as a file is opened, it should be downloaded to the client if it fits into the cache. If it does not fit, it may be cached partially. The cached percentage can be exposed in an extended attribute (xattr).
- When a file is cached, opening it for reading should still be possible when the client is offline.
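
The three requirements above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names parse_size, Cache, open_for_read and readable_offline are hypothetical, sizes are read as binary units, and eviction is left out entirely. It only shows how a size limit, partial caching and a reported "cached percentage" could fit together, not how a real implementation would have to do it.

```python
from dataclasses import dataclass, field

# Binary size suffixes (assumption: "66G" means 66 * 2**30 bytes).
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def parse_size(text: str) -> int:
    """Parse a size such as '66G' into a number of bytes."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


@dataclass
class Cache:
    capacity: int                               # bytes, from the hypothetical cache_size option
    cached: dict = field(default_factory=dict)  # path -> bytes held locally

    def used(self) -> int:
        return sum(self.cached.values())

    def open_for_read(self, path: str, size: int) -> float:
        """Cache as much of the file as fits; return the cached percentage."""
        already = self.cached.get(path, 0)
        free = self.capacity - self.used()
        # Fetch the missing part, but never more than the free space allows.
        fetch = max(0, min(size - already, free))
        self.cached[path] = already + fetch
        return self.percent(path, size)

    def percent(self, path: str, size: int) -> float:
        """Value that could be exposed to the user, e.g. through an xattr."""
        if size == 0:
            return 100.0
        return 100.0 * self.cached.get(path, 0) / size

    def readable_offline(self, path: str, size: int) -> bool:
        """Only a fully cached file can be read without the server."""
        return path in self.cached and self.cached[path] >= size


cache = Cache(capacity=parse_size("1K"))
first = cache.open_for_read("a.txt", 512)    # fits completely
second = cache.open_for_read("b.txt", 1024)  # only half fits
```

In this sketch the second file is cached only partially, so it would not be readable offline, while the first one would. Deciding what to evict when the cache is full is a separate question that the sketch does not attempt to answer.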