
IO library to access various filesystems with unified API


PFIO

PFIO is an IO abstraction library developed by PFN, optimized for deep learning training with batteries included. It supports

  • Filesystem API abstraction with unified error semantics,
  • Explicit user-land caching system,
  • IO performance tracing and metrics, and
  • Fileset container utilities to save metadata.
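
To make the idea of explicit user-land caching concrete, here is a minimal sketch in plain Python. It is not PFIO's API: the class `ExplicitCache`, its `get` method, and the `slow_read` stand-in are hypothetical names chosen for illustration. The assumption is that a training loop reads the same samples every epoch, so only the first epoch should pay for the slow read.

```python
from typing import Callable, Dict


class ExplicitCache:
    """Hypothetical in-memory cache keyed by sample index (not PFIO's API)."""

    def __init__(self, loader: Callable[[int], bytes]) -> None:
        self._loader = loader  # slow path, e.g. a remote filesystem read
        self._store: Dict[int, bytes] = {}
        self.misses = 0

    def get(self, index: int) -> bytes:
        # The application opts in by calling get(); nothing is cached
        # implicitly by the operating system or the filesystem client.
        if index not in self._store:
            self.misses += 1
            self._store[index] = self._loader(index)
        return self._store[index]


def slow_read(index: int) -> bytes:
    # Stand-in for an expensive read from HDFS or object storage.
    return f"sample-{index}".encode()


cache = ExplicitCache(slow_read)
for _epoch in range(3):
    for i in range(4):
        cache.get(i)
```

After three epochs over four samples, `slow_read` has been called only four times; every later access is served from memory. Refer to the official documentation for the caching classes PFIO actually provides.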

Dependency

  • HDFS client and libhdfs for HDFS access
  • CPython >= 3.6

Installation and Documentation Build

Installation

$ git clone https://github.com/pfnet/pfio.git
$ cd pfio
$ pip install .

Documentation

$ cd pfio/docs
$ make html
$ open build/html/index.html

Test

$ cd pfio
$ pip install .[test]
$ pytest tests/

How to use

Please refer to the official documentation for more information about usage.

Release

Check the official documentation for the latest release procedure.

Run tests locally:

$ pip install --user -e .[test]
$ pytest

Bump the version number in pfio/version.py.
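
Before tagging, it can help to confirm that the bumped version has the X.Y.Z shape used for the tag. The sketch below assumes that pfio/version.py contains an assignment of the form `__version__ = 'X.Y.Z'`; that layout, and the helper names `read_version` and `is_release_version`, are assumptions for illustration and not part of PFIO.

```python
import re

# Assumed layout of version.py: a line like  __version__ = '1.2.3'
VERSION_RE = re.compile(r"""^__version__\s*=\s*['"]([^'"]+)['"]""", re.MULTILINE)
RELEASE_RE = re.compile(r"\d+\.\d+\.\d+")


def read_version(source: str) -> str:
    """Extract the version string from the text of a version.py file."""
    match = VERSION_RE.search(source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def is_release_version(version: str) -> bool:
    """Return True only for plain X.Y.Z versions (no rc/dev suffix)."""
    return RELEASE_RE.fullmatch(version) is not None


sample = "__version__ = '1.2.3'\n"
version = read_version(sample)
```

In a real checkout the text would come from `pathlib.Path("pfio/version.py").read_text()` instead of the `sample` string.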

Push and open a pull request to invoke CI. Once CI has passed and the pull request has been merged, tag a release:

$ git tag -s X.Y.Z
$ git push --tags

Build:

$ rm -rf dist
$ pip3 install --user build
$ python3 -m build
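
By default, `python3 -m build` writes a source distribution (`.tar.gz`) and a wheel (`.whl`) into `dist/`. A quick check before uploading might look like the sketch below; the helper `missing_artifacts` and the example file names are hypothetical.

```python
from typing import Iterable, List


def missing_artifacts(filenames: Iterable[str]) -> List[str]:
    """Return which of the two expected build outputs are absent."""
    names = list(filenames)
    missing = []
    if not any(name.endswith(".tar.gz") for name in names):
        missing.append("sdist (.tar.gz)")
    if not any(name.endswith(".whl") for name in names):
        missing.append("wheel (.whl)")
    return missing


# Illustrative file names only; real names depend on the released version.
complete = ["pfio-X.Y.Z.tar.gz", "pfio-X.Y.Z-py3-none-any.whl"]
sdist_only = ["pfio-X.Y.Z.tar.gz"]
```

To check an actual build, pass the directory listing, for example `missing_artifacts(p.name for p in pathlib.Path("dist").iterdir())`, and upload only when the result is an empty list.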

Release to PyPI:

$ python3 -m pip install --user --upgrade twine
$ python3 -m twine upload --repository testpypi dist/*