Feature request: Add fsspec FileSystem support in FileSet
The fsspec library provides a generic interface for interacting with filesystems regardless of where they are located. This means users can search for files on remote filesystems, for example through the s3fs implementation for Amazon S3 buckets, or wrap such a filesystem in a CachingFileSystem instance that caches requested data locally in sparse files. For users who don't have all satellite data available locally, the ability to download only the data needed can save significant time.
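To illustrate what "generic interface" means here, a minimal sketch follows. It uses fsspec's in-memory filesystem as a stand-in for a remote store so that it runs offline; the file names are made up for illustration and are not real GLM products. The commented-out lines show how a remote or cached filesystem could be created instead, assuming s3fs is installed; the cache directory shown there is a hypothetical path.

```python
import fsspec

# In-memory filesystem used as a stand-in for a remote store (runs offline).
fs = fsspec.filesystem("memory")

# With s3fs installed, a remote filesystem could be created instead, e.g.:
#   fs = fsspec.filesystem("s3", anon=True)
# or wrapped in a block cache (cache_storage is a hypothetical local path):
#   fs = fsspec.filesystem("blockcache", target_protocol="s3",
#                          target_options={"anon": True},
#                          cache_storage="/tmp/fsspec-cache")

# Create a few placeholder files; these names are illustrative only.
fs.makedirs("/glm-demo", exist_ok=True)
for name in ("first_LCFA.nc", "second_LCFA.nc", "readme.txt"):
    fs.pipe(f"/glm-demo/{name}", b"placeholder")

# The same glob/open calls work whichever filesystem implementation is behind fs.
matches = sorted(fs.glob("/glm-demo/*_LCFA.nc"))
basenames = [path.rsplit("/", 1)[-1] for path in matches]

with fs.open(matches[0], "rb") as f:
    content = f.read()
```

The searching code only touches `fs.glob` and `fs.open`, so swapping the in-memory filesystem for a remote or caching one would not require changes elsewhere.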
My specific use case is to find GLM LCFA files covering a specified time period and pass those to glmtools (ideally as files opened through a CachingFileSystem instance, but as fully downloaded files if needed), so that I can read the resulting gridded products using satpy.
Satpy contains functionality to search for files based on file patterns, similar to what exists in a typhon FileSet, and this functionality supports fsspec FileSystem instances. However, it only supports sensors for which Satpy has a reading routine (Satpy reads gridded GLM, but not ungridded GLM LCFA). Satpy also has ongoing work on generic filesystem support, but this will not help for GLM as long as glmtools interfacing happens outside of satpy (and the latter does not support fsspec filesystems).
Therefore, it would be very nice if the user could pass any instance implementing fsspec.spec.AbstractFileSystem, so that the FileSet functionality is available regardless of where the files are located.
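A rough sketch of what such an interface could look like is given below. This is not typhon's actual API: `find_files` and `LocalGlobber` are hypothetical names invented for illustration. The idea is that the search code relies only on a `glob` method, which fsspec filesystem instances also provide, and falls back to the local filesystem when no filesystem is passed.

```python
import glob
import os
import tempfile


class LocalGlobber:
    """Hypothetical default: mimics the glob() method of an fsspec filesystem."""

    def glob(self, pattern):
        return glob.glob(pattern)


def find_files(pattern, fs=None):
    """Hypothetical helper: return sorted paths matching pattern on fs.

    fs may be any object with a glob(pattern) method returning paths,
    such as an fsspec filesystem instance; defaults to the local filesystem.
    """
    if fs is None:
        fs = LocalGlobber()
    return sorted(fs.glob(pattern))


# Demonstrate with the local default on a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a_LCFA.nc", "b_LCFA.nc", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    found = [
        os.path.basename(path)
        for path in find_files(os.path.join(tmp, "*_LCFA.nc"))
    ]
```

Under this design, a caller with remote data would pass their own filesystem object as `fs`, and the pattern-matching logic would stay the same.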