/ateliere-goofys

a high-performance, POSIX-ish Amazon S3 file system written in Go

Primary LanguageGoApache License 2.0Apache-2.0

Goofys is a high-performance, POSIX-ish Amazon S3 file system written in Go

Build Status Github All Releases Twitter Follow Stack Overflow Questions

Overview

Goofys allows you to mount an S3 bucket as a filey system.

It's a Filey System instead of a File System because goofys strives for performance first and POSIX second. Particularly things that are difficult to support on S3 or would translate into more than one round-trip would either fail (random writes) or faked (no per-file permission). Goofys does not have an on disk data cache (checkout catfs), and consistency model is close-to-open.

‼️ Ateliere uses a fork because the official repo is not mentained.

Installation

  • For Ateliere use-case, build from source with Go 1.17 or later, and add the resulting goofys binary to service repos that uses it:
$ env GOOS=linux GOARCH=amd64 go build github.com/OwnZones/ateliere-goofys && mv ateliere-goofys goofys

Usage

$ cat ~/.aws/credentials
[default]
aws_access_key_id = AKID1234567890
aws_secret_access_key = MY-SECRET-KEY
$ $GOPATH/bin/goofys <bucket> <mountpoint>
$ $GOPATH/bin/goofys <bucket:prefix> <mountpoint> # if you only want to mount objects under a prefix

Users can also configure credentials via the AWS CLI or the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.

To mount an S3 bucket on startup, make sure the credential is configured for root, and can add this to /etc/fstab:

goofys#bucket   /mnt/mountpoint        fuse     _netdev,allow_other,--file-mode=0666,--dir-mode=0777    0       0

See also: Instruction for Azure Blob Storage, Azure Data Lake Gen1, and Azure Data Lake Gen2.

Got more questions? Check out questions other people asked

Benchmark

Using --stat-cache-ttl 1s --type-cache-ttl 1s for goofys -ostat_cache_expire=1 for s3fs to simulate cold runs. Detail for the benchmark can be found in bench.sh. Raw data is available as well. The test was run on an EC2 m5.4xlarge in us-west-2a connected to a bucket in us-west-2. Units are seconds.

Benchmark result

To run the benchmark, configure EC2's instance role to be able to write to $TESTBUCKET, and then do:

$ sudo docker run -e BUCKET=$TESTBUCKET -e CACHE=false --rm --privileged --net=host -v /tmp/cache:/tmp/cache OwnZones/ateliere-goofys-bench
# result will be written to $TESTBUCKET

See also: cached benchmark result and result on Azure.

License

Copyright (C) 2015 - 2019 Ka-Hing Cheung

Licensed under the Apache License, Version 2.0

Current Status

goofys has been tested under Linux and macOS.

List of non-POSIX behaviors/limitations:

  • only sequential writes supported
  • does not store file mode/owner/group
    • use --(dir|file)-mode or --(uid|gid) options
  • does not support symlink or hardlink
  • ctime, atime is always the same as mtime
  • cannot rename directories with more than 1000 children
  • unlink returns success even if file is not present
  • fsync is ignored, files are only flushed on close

Compatibility with non-AWS S3

goofys has been tested with the following non-AWS S3 providers:

  • Amplidata / WD ActiveScale
  • Ceph (ex: Digital Ocean Spaces, DreamObjects, gridscale)
  • EdgeFS
  • EMC Atmos
  • Google Cloud Storage
  • Minio (limited)
  • OpenStack Swift
  • S3Proxy
  • Scaleway
  • Wasabi

Additionally, goofys also works with the following non-S3 object stores:

  • Azure Blob Storage
  • Azure Data Lake Gen1
  • Azure Data Lake Gen2

References