fsspec filesystem for Alibaba Cloud (Aliyun) Object Storage Service (OSS)

OSSFS

OSSFS is a Python file system interface to Alibaba Cloud OSS (Object Storage Service). It lets users operate on OSS objects through fsspec's standard API.

Installation

You can install OSSFS via pip from PyPI:

$ pip install ossfs

An up-to-date package is also available from the conda-forge channel:

$ conda install -c conda-forge ossfs

Quick Start

Here is a simple example of locating and reading an object in OSS.

>>> import ossfs
>>> fs = ossfs.OSSFileSystem(endpoint='http://oss-cn-hangzhou.aliyuncs.com')
>>> # List the object; ls on a single object returns a one-element list
>>> fs.ls('/dvc-test-anonymous/LICENSE')
[{'name': '/dvc-test-anonymous/LICENSE',
  'Key': '/dvc-test-anonymous/LICENSE',
  'type': 'file',
  'size': 11357,
  'Size': 11357,
  'StorageClass': 'OBJECT',
  'LastModified': 1622761222}]
>>> # Open the object as a binary file and read its first line
>>> with fs.open('/dvc-test-anonymous/LICENSE') as f:
...     print(f.readline())
b'                                 Apache License\n'

For more use cases and APIs, please refer to the fsspec documentation.
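As a minimal sketch of what using the standard fsspec API can look like, the helper below relies only on `info` and `open`, two methods that fsspec filesystems provide. The names `describe_object` and `demo` are hypothetical and exist only for illustration; the endpoint and object path are the ones from the example above, and `demo` assumes ossfs is installed and the network is reachable.

```python
def describe_object(fs, path):
    """Return the name, size, and first line of one object.

    `fs` is any fsspec-style filesystem, for example ossfs.OSSFileSystem.
    Only the standard fsspec methods `info` and `open` are used.
    """
    info = fs.info(path)            # metadata dict with 'name', 'size', 'type'
    with fs.open(path, "rb") as f:  # file-like object opened in binary mode
        first_line = f.readline()
    return {"name": info["name"], "size": info["size"], "first_line": first_line}


def demo():
    # Hypothetical wrapper: needs `pip install ossfs` and network access.
    import ossfs

    fs = ossfs.OSSFileSystem(endpoint="http://oss-cn-hangzhou.aliyuncs.com")
    return describe_object(fs, "/dvc-test-anonymous/LICENSE")
```

Because the helper touches nothing OSS-specific, it should work unchanged with other fsspec backends, which is the main appeal of going through fsspec rather than a storage-specific client.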

Async OSSFS

Async OSSFS is a variant of ossfs that uses the third-party async OSS backend aiooss2 instead of the official synchronous one, oss2. It allows concurrent calls within bulk operations such as cat, put, and get, even from normal (synchronous) code, and enables the direct use of fsspec in async code without blocking. Usage is similar to the synchronous variant; simply replace OSSFileSystem with AioOSSFileSystem:

>>> import ossfs
>>> fs = ossfs.AioOSSFileSystem(endpoint='http://oss-cn-hangzhou.aliyuncs.com')
>>> print(fs.cat('/dvc-test-anonymous/LICENSE'))
b'                                 Apache License\n'
...

aiooss2 is not officially supported, and some features are still missing. However, in tests that put and get 1200 small files, the async version of ossfs ran roughly ten times faster than the synchronous variant (the exact speedup depends on the size of the concurrency pool).

Task                                         Time cost in seconds (ratio to fastest)
put 1200 small files via OSSFileSystem       35.2688 (13.53)
put 1200 small files via AioOSSFileSystem    2.6060 (1.0)
get 1200 small files via OSSFileSystem       32.9096 (12.63)
get 1200 small files via AioOSSFileSystem    3.3497 (1.29)
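
To illustrate why concurrency helps with many small files, here is a self-contained simulation using only the standard library. It does not call ossfs or aiooss2: `fake_put`, `put_sequential`, `put_concurrent`, `timed`, the 10 ms latency, and the pool size of 32 are all assumptions made for this sketch, with `asyncio.sleep` standing in for one network round trip.

```python
import asyncio
import time


async def fake_put(name, latency=0.01):
    """Stand-in for one small upload; sleeps instead of doing network I/O."""
    await asyncio.sleep(latency)
    return name


async def put_sequential(names):
    # One simulated request at a time, like a loop over a blocking client.
    return [await fake_put(n) for n in names]


async def put_concurrent(names, pool_size=32):
    # At most `pool_size` simulated requests are in flight at once.
    limit = asyncio.Semaphore(pool_size)

    async def bounded(name):
        async with limit:
            return await fake_put(name)

    # gather preserves the order of its arguments in the result list.
    return await asyncio.gather(*(bounded(n) for n in names))


def timed(coro):
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


names = [f"file-{i}" for i in range(60)]
seq_result, seq_seconds = timed(put_sequential(names))
con_result, con_seconds = timed(put_concurrent(names))
print(f"sequential: {seq_seconds:.2f}s, concurrent: {con_seconds:.2f}s")
```

When per-request latency dominates, the sequential total grows with the number of files, while the concurrent total grows with the number of files divided by the pool size. This is consistent with the note above that the measured speedup depends on the concurrency pool size.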

Contributing

Contributions are very welcome. To learn more, see the Contributor Guide.

License

Distributed under the terms of the Apache 2.0 license, OSSFS is free and open source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.