/hdfs

API and command line interface for HDFS

Primary LanguagePythonMIT LicenseMIT

HdfsCLI Build badge Pypi badge Downloads badge

API and command line interface for HDFS.

$ hdfscli --alias=dev

Welcome to the interactive HDFS python shell.
The HDFS client is available as `CLIENT`.

In [1]: CLIENT.list('models/')
Out[1]: ['1.json', '2.json']

In [2]: CLIENT.status('models/2.json')
Out[2]: {
  'accessTime': 1439743128690,
  'blockSize': 134217728,
  'childrenNum': 0,
  'fileId': 16389,
  'group': 'supergroup',
  'length': 48,
  'modificationTime': 1439743129392,
  'owner': 'drwho',
  'pathSuffix': '',
  'permission': '755',
  'replication': 1,
  'storagePolicy': 0,
  'type': 'FILE'
}

In [3]: with CLIENT.read('models/2.json', encoding='utf-8') as reader:
  ...:     from json import load
  ...:     model = load(reader)
  ...:

Features

See the documentation to learn more.

Getting started

$ pip install hdfs

Then hop on over to the quickstart guide. A Conda feedstock is also available.

Testing

HdfsCLI is tested against both WebHDFS and HttpFS. There are two ways of running tests (see scripts/ for helpers to set up a test HDFS cluster):

$ HDFSCLI_TEST_URL=http://localhost:50070 nosetests # Using a namenode's URL.
$ HDFSCLI_TEST_ALIAS=dev nosetests # Using an alias.

Contributing

We'd love to hear what you think on the issues page. Pull requests are also most welcome!