/s3contents

An S3-backed ContentsManager implementation for Jupyter

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

GFContents

An HDFS-, S3- and GS-backed Jupyter ContentsManager implementation built on tensorflow.gfile.

It aims to be a transparent, drop-in replacement for Jupyter's standard filesystem-backed storage system.

Features

Supports local directories, HDFS, S3 and GS filesystems. Supports downloading and uploading multiple large files.

Prerequisites

Write access (valid credentials) to an S3/GCS bucket; this could be on AWS/GCP or a self-hosted S3.

TensorFlow.

Installation

$ pip install gfcontents

Jupyter config

Edit ~/.jupyter/jupyter_notebook_config.py, filling in the missing values:

S3, HDFS and local directory

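from s3contents import GFContentsManager, HybridContentsManager
# LargeFileManager ships with the notebook package (5.1+); the old IPython.html path is a deprecated shim.
from notebook.services.contents.largefilemanager import LargeFileManager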

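# Use the hybrid manager, which delegates each top-level directory to its own manager.
c.NotebookApp.contents_manager_class = HybridContentsManager
# Map each top-level directory name to a manager class; "" handles all other paths.
c.HybridContentsManager.manager_classes = {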
    "S3": GFContentsManager,
    "HDFS": GFContentsManager,
    "": LargeFileManager,
}
# Keyword arguments passed to the manager registered under the same key.
c.HybridContentsManager.manager_kwargs = {
    "S3": {
        "prefix": "s3://bucket_name/notebooks",
    },
    "HDFS": {
        "prefix": "hdfs://hdfsip:8020/user/logname",
    },
    "": {
        "root_dir": "/home/logname/",
    },
}
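
To make the effect of the keys in manager_classes more concrete, here is a minimal sketch of how a path could be routed to a manager. It assumes that the hybrid manager dispatches on the first path component and falls back to the "" entry otherwise, as the configuration above suggests. The function name split_route is hypothetical and is not part of this package; this is an illustration, not the actual HybridContentsManager code.

```python
# Hypothetical illustration only: not the actual HybridContentsManager code.
def split_route(path, roots):
    """Return (root_key, remaining_path) for a Jupyter API path.

    `roots` holds the keys used in manager_classes; "" is the fallback.
    """
    parts = path.strip("/").split("/", 1)
    head = parts[0]
    if head and head in roots:
        # A named root matched: the rest of the path is handed to that manager.
        rest = parts[1] if len(parts) > 1 else ""
        return head, rest
    # No named root matched, so the whole path goes to the "" manager.
    return "", path.strip("/")


roots = {"S3", "HDFS", ""}
print(split_route("S3/reports/q1.ipynb", roots))   # ('S3', 'reports/q1.ipynb')
print(split_route("HDFS", roots))                  # ('HDFS', '')
print(split_route("scratch/notes.ipynb", roots))   # ('', 'scratch/notes.ipynb')
```

Under this assumption, a notebook opened at S3/reports/q1.ipynb would be read from under s3://bucket_name/notebooks, while scratch/notes.ipynb would be read from under /home/logname/.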