qri-io/qfs

Configurable Multiplexed Filesystem

b5 opened this issue · 8 comments

b5 commented

RFC 0029 requires changes to qfs to make it configurable. Using some combination of qri and qfs, we need a function that returns a qfs.Filesystem implementation from configuration data that looks like this:

Filesystems:
- type: ipfs
  options:
    path: ~/.ipfs
    api: true
    pubsub: false
- type: local
- type: http

One of the big questions is which of qri-io/config and qfs should do which parts of parsing, but before we get to that question, we know a few things:

  • The returned Filesystem will be a Mux filesystem, because it composes one or more Filesystems
  • anything that processes this configuration input will need to import all the various possible implementations, so config processing can't be defined in the base qfs package, because it's an "interface declaration package"
  • We have strong patterns for New functions that accept "Options" for configuration, like lib.NewInstance

Based on all that, I think the right answer is to move qfs.Mux into a new muxfs sub-package (github.com/qri-io/qfs/muxfs), and use muxfs.New as the place where all this configuration happens. We should also be working to make sure every other package's configuration declaration is uniform, and can accept a map[string]interface{} set of configuration options that will be passed down from the higher configuration file. muxfs.New can do the routing to different filesystem implementations by switching on the provided type field.
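A minimal sketch of that routing, assuming a hypothetical MuxConfig struct and stub filesystems in place of the real qipfs, localfs, and httpfs implementations. The Filesystem interface here is a one-method stand-in, not the actual qfs.Filesystem:

```go
package main

import "fmt"

// Filesystem is a simplified stand-in for qfs.Filesystem (assumption:
// the real interface has more methods).
type Filesystem interface {
	Type() string
}

// MuxConfig mirrors one entry of the "Filesystems" list in the config.
// The struct and field names are hypothetical.
type MuxConfig struct {
	Type    string
	Options map[string]interface{}
}

// stubFS is a placeholder for a real filesystem implementation.
type stubFS struct {
	kind string
	opts map[string]interface{}
}

func (s stubFS) Type() string { return s.kind }

// Mux composes one or more filesystems, keyed by their type string.
type Mux struct {
	handlers map[string]Filesystem
}

// New builds a Mux by switching on each entry's Type field and passing
// the untyped options down to the matching constructor (stubbed here).
func New(cfgs []MuxConfig) (*Mux, error) {
	m := &Mux{handlers: map[string]Filesystem{}}
	for _, cfg := range cfgs {
		if _, exists := m.handlers[cfg.Type]; exists {
			return nil, fmt.Errorf("duplicate filesystem type %q", cfg.Type)
		}
		switch cfg.Type {
		case "ipfs", "local", "http":
			// A real implementation would call each package's own New here.
			m.handlers[cfg.Type] = stubFS{kind: cfg.Type, opts: cfg.Options}
		default:
			return nil, fmt.Errorf("unrecognized filesystem type %q", cfg.Type)
		}
	}
	return m, nil
}

func main() {
	mux, err := New([]MuxConfig{
		{Type: "ipfs", Options: map[string]interface{}{"path": "~/.ipfs", "api": true, "pubsub": false}},
		{Type: "local"},
		{Type: "http"},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(mux.handlers), "filesystems configured")
}
```

Because the switch lives in one place, only this package needs to import every implementation, which keeps the base qfs package free of those dependencies.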

I was originally going to file this issue in qri, but since it only references qri/repo/buildrepo code and is implemented here, I think it makes more sense to expand on this issue.

From qri/repo/buildrepo:

https://github.com/qri-io/qri/blob/6a208223778f87682a69e57dd668004349203a43/repo/buildrepo/build.go#L88-L102

We expect local, http, cafs, and ipfs as options.

And from the same buildrepo file, a NewCAFSStore (which is stored in the mux fs under cafs) can be an ipfs, an ipfs_http, or a map.

My initial draft of muxfs.New doesn't make sense now that I understand better how it's used and what the fields are supposed to be.
mux fs fields:
ipfs - (qipfs.Filestore)
local - (localfs.Filestore)
http - (httpfs.Filestore)
cafs - any cafs.Filestore (possibilities are ipfs, ipfs_http, or map)

In buildrepo.NewFilesystem, we take a passed-in store and use the store in place of the cafs filesystem in the returned mux. Is this the paradigm we should be working toward? Or will we build the cafs filesystem in our qfs.New and pull the store from the muxfs? Basically, should this qfs.New be expected to create the store itself?

If an ipfs config is passed in, should we assume this will fill the cafs and the ipfs slots? Or should there be a separate cafs section in the config?

Okay after chat with @b5 here is the game plan.

We really do want each type of filesystem to have its own field in a MuxFS, rather than obfuscating mem, ipfs (sometimes), and ipfs_http underneath cafs. "cafs" is a characteristic a filesystem can have, and one muxfs can have multiple stores that each are content addressed. We also need to update the way that qfs determines which filesystem a path is trying to access.

To that effect, muxfs.New will switch on:
ipfs
local
http
ipfs_http
mem

However, we will also need to update the qfs.PathKind function to use path prefixes to determine which of those filesystems a path is trying to access.
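As a rough illustration of prefix-based routing, the sketch below maps a path to a filesystem type string. The specific prefixes ("/ipfs/", "/mem/", "http://", "https://") and the "none" and "local" fallbacks are assumptions for the example, not a statement of what qfs.PathKind actually does:

```go
package main

import (
	"fmt"
	"strings"
)

// PathKind returns the type of filesystem a path appears to address,
// judged only by its prefix. The prefixes used here are assumptions.
func PathKind(path string) string {
	switch {
	case path == "":
		return "none"
	case strings.HasPrefix(path, "http://"), strings.HasPrefix(path, "https://"):
		return "http"
	case strings.HasPrefix(path, "/ipfs/"):
		return "ipfs"
	case strings.HasPrefix(path, "/mem/"):
		return "mem"
	default:
		// Anything without a recognized prefix is treated as a local path.
		return "local"
	}
}

func main() {
	for _, p := range []string{"/ipfs/QmExample", "https://example.com/data.csv", "/mem/abc", "./data.csv"} {
		fmt.Printf("%s -> %s\n", p, PathKind(p))
	}
}
```

Note that a scheme like this cannot distinguish ipfs from ipfs_http, since both would address content under the same prefix.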

Thinking about defaults:

  1. If a MuxConfig just has a Type filled out, but no Config associated with that Type, each subpackage should use a DefaultConfig
  2. We need a muxfs.OptSetIPFSPath that takes a string. If it's empty, we should try to get the ipfs path relative to the QRI_PATH and set the MuxConfig for the qipfs.Filesystem accordingly.

@b5 running into a little bit of a logic snag, and want to run my thoughts by you

I'm understanding more why we had one "cafs" section to obscure ipfs and ipfs_http.

Both ipfs and ipfs_http filesystems use the ipfs prefix. So any PathKind function will always return "ipfs" as the type, and ipfs_http would never be called.

  • Do we want users to be able to set up both ipfs and ipfs_http options? If not, I'm proposing we figure this out during config: you can only have one ipfs option, but if you only have an ipfsApiUrl field in the config, we actually set up an ipfs_http fs rather than a typical ipfs fs.

  • If we want users to be able to set up both... how do we differentiate? One option is to expect hashes prefixed by ipfs_http to resolve via ipfs_http, and we do the hash/path adjustment in ipfs_http. However, I'm not sure if it is reasonable to expect an ipfs_http prefix, as I'm not sure what context this would get used in.

b5 commented

Your hunch is right. The http implementation of the Ipfs filestore should be a configurable fallback. Users shouldn’t be able to declare a file system config type ipfs_http.

b5 commented

we'll need some sort of apiAddr config key that wants a multi address value on the ipfs filesystem type. ipfs_http configuration should be sensitive to this. The ipfs filesystem initializer should fall back to trying to connect over this address if it cannot obtain the repo lock and the value is populated.

To clarify, this "falling back" should happen in the cafs/ipfs.New function? If we can't create an ipfs fs, we should instead (inside the qipfs.New func) call cafs/ipfs_http.New and pass in an ipfs_http fs (if the apiAddr field is populated)?

b5 commented

Yes exactly