Intake is a lightweight package for finding, investigating, loading and disseminating data.

Primary language: Python. License: BSD 2-Clause "Simplified" (BSD-2-Clause).

Intake: A general interface for loading data

Join the chat at https://gitter.im/ContinuumIO/intake

Intake is a lightweight set of tools for loading and sharing data in data science projects. Intake helps you:

  • Load data from a variety of formats (see the current list of known plugins) into containers you already know, like Pandas dataframes, Python lists, NumPy arrays, and more.
  • Convert boilerplate data loading code into reusable Intake plugins.
  • Describe data sets in catalog files for easy reuse and sharing between projects and with others.
  • Share catalog information (and data sets) over the network with the Intake server.
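
As a rough illustration of the catalog idea, the sketch below writes a tiny CSV file and a catalog file that describes it, using only the standard library. It assumes the classic YAML catalog layout (a `sources` mapping with `driver` and `args`) and the `csv` driver. The entry name `example_table`, the file names and the helper functions are hypothetical. `load_example` needs Intake and Pandas installed and is not called here.

```python
import csv
import tempfile
from pathlib import Path

# Assumed classic Intake catalog layout; "example_table" is a made-up entry name.
# This is a plain string, so the {{ CATALOG_DIR }} template is written literally.
CATALOG_TEMPLATE = """\
sources:
  example_table:
    description: Small illustrative table
    driver: csv
    args:
      urlpath: "{{ CATALOG_DIR }}/example.csv"
"""


def write_example_catalog(directory):
    """Write a tiny CSV file and a catalog describing it; return the catalog path."""
    directory = Path(directory)
    with open(directory / "example.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "value"])
        writer.writerows([["a", 1], ["b", 2]])
    catalog_path = directory / "catalog.yml"
    catalog_path.write_text(CATALOG_TEMPLATE)
    return catalog_path


def load_example(catalog_path):
    """Open the catalog with Intake and read the entry into a dataframe."""
    import intake  # imported lazily so the file-writing part runs without Intake

    cat = intake.open_catalog(str(catalog_path))
    return cat["example_table"].read()


demo_dir = tempfile.mkdtemp()
catalog_path = write_example_catalog(demo_dir)
print(catalog_path.read_text())
```

With Intake installed, `load_example(catalog_path)` should return the two-row table. The same catalog file can then be shared with other projects instead of copying the loading code.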

Documentation is available at Read the Docs.

The status of Intake and related packages is available at the Status Dashboard.

Weekly news about this repo and other related projects can be found on the wiki.

Install

Recommended method using conda:

conda install -c conda-forge intake

You can also install using pip, in which case you can choose how many of the optional dependencies to install. The simplest option has the fewest requirements:

pip install intake

Additional optional sections are [server], [plot] and [dataframe]. To include everything:

pip install intake[complete]

Note that you may well need specific drivers and other plugins, which usually have additional dependencies of their own.
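
To see which optional pieces are present in an environment after installing, a small standard-library check along these lines may help. The module names listed are illustrative assumptions only, not the definitive contents of any extras section, and `check_modules` is a hypothetical helper.

```python
from importlib.util import find_spec

# Illustrative top-level module names only (an assumption); the real set
# depends on which drivers and plugins your catalogs actually use.
OPTIONAL_MODULES = ["pandas", "dask", "tornado", "hvplot"]


def check_modules(names):
    """Return a dict mapping each top-level module name to whether it is importable."""
    return {name: find_spec(name) is not None for name in names}


for name, found in check_modules(["intake"] + OPTIONAL_MODULES).items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

A missing module here does not necessarily mean a broken install. It only indicates that the corresponding optional feature or driver would need an extra package.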

Development

  • Create a development Python environment, ideally with conda. The requirements can be found in the recipe in the conda/ directory of this repo or in the sister feedstock.
  • Install using pip install -e .[complete]
  • Add pytest to the environment to be able to run tests.
  • Create a fork on GitHub to be able to submit PRs.
  • We respect, but do not enforce, PEP 8 standards; all new code should be covered by tests.