Retriever

Julia wrapper for the Data Retriever software.

Data Retriever automates the tasks of finding, downloading, and cleaning up publicly available data, and then stores the data in a local database or as .csv files. Simply put, it's a package manager for data. This lets data analysts spend most of their time analysing data rather than cleaning up or managing it.

Installation

To use the Retriever Julia package, you first need to install the Data Retriever itself, which is the core Python package.

Database Management Systems

Depending on the database management system you wish to use, follow the "Setting up servers" section of the retriever documentation. You can change the credentials to suit your server settings.

The Retriever Julia package depends on a few Julia packages that will be installed automatically.

Ensure that PyCall is using the same Python path where the retriever Python package is installed.

You can change the path to the desired one as shown below.

julia> ENV["PYTHON"] = "Python path where the retriever python package is installed"
# Rebuild PyCall so that it picks up the new path
julia> using Pkg  # needed on Julia 0.7 and later
julia> Pkg.build("PyCall")
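
One way to confirm that the setting took effect is to ask PyCall which interpreter it was built against and then try importing the Python package. This is a minimal sketch, assuming PyCall is already installed and built; it is only a sanity check and not part of the Retriever Julia package.

```julia
using PyCall

# Path of the Python executable PyCall was built against.
println("PyCall is using: ", PyCall.python)

# The Data Retriever Python package is imported under the name "retriever".
# pyimport throws an error if this interpreter cannot find the module.
try
    pyimport("retriever")
    println("The retriever Python package is importable.")
catch err
    println("Could not import retriever: ", err)
end
```

If the import fails, the printed path most likely points to a different Python installation than the one where retriever was installed.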

To install the Retriever Julia package

julia> Pkg.add("Retriever")

To install from source, download or check out the source from the GitHub page.

Go to the Retriever.jl directory and run Julia.

julia> include("src/Retriever.jl")

Example of installing datasets

# Using the default parameters
julia> Retriever.install_postgres("iris")
# Passing user-specific arguments
julia> Retriever.install_postgres("iris", user="postgres",
           password="Password12!", host="localhost", port=5432)
julia> Retriever.install_csv("iris")
julia> Retriever.install_mysql("iris")
julia> Retriever.install_sqlite("iris")
julia> Retriever.install_msaccess("iris")
julia> Retriever.install_json("iris")
julia> Retriever.install_xml("iris")
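
When several datasets are needed, the same calls can be wrapped in a loop. The sketch below is illustrative only: it assumes the package has been installed so that `using Retriever` works, the `datasets` list is a placeholder that contains just the "iris" dataset used above, and the error handling is a suggestion rather than something the package requires.

```julia
using Retriever

# Placeholder list; extend it with the names of the datasets you need.
datasets = ["iris"]

for name in datasets
    try
        # Same call as in the example above, with default parameters.
        Retriever.install_csv(name)
        println("Installed ", name)
    catch err
        # Report the failure and continue with the next dataset.
        println("Could not install ", name, ": ", err)
    end
end
```

Swapping `install_csv` for one of the other install functions listed above changes the storage backend without altering the loop.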

Creating docs

To create the docs, first refer to the Documenter docs. To build and test the docs locally, run make.jl

julia --color=yes make.jl

or simply

julia make.jl

Using Docker

To run the tests using Docker

docker-compose run --service-ports retrieverj julia test/runtests.jl

To run the image interactively

docker-compose run --service-ports retrieverj /bin/bash

To test the docs in Docker

docker-compose run --service-ports retrieverj bash -c "cd docs && julia make.jl"

Acknowledgments

Development of this software is funded by the Gordon and Betty Moore Foundation's Data-Driven Discovery Initiative through Grant GBMF4563 to Ethan White. The package started as Shivam Negi's Google Summer of Code project.