Our analytics library for getting data scientists up to speed quickly on the Python platform.
User documentation can be found at https://bigdatarepublic.github.io/bdr-analytics-py/
Installation is done through the pip command line utility.
pip install bdranalytics
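After installing, one way to confirm that pip registered the package is to look up its version from Python. This is a minimal sketch, assuming Python 3.8 or later (for importlib.metadata); installed_version is a hypothetical helper written for this example and is not part of bdranalytics.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="bdranalytics"):
    """Return the installed version of a distribution, or None if absent.

    Hypothetical helper for illustration; not part of bdranalytics.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print("bdranalytics version:", found if found else "not installed")
```

The lookup uses the distribution name given to pip, so it does not depend on what the importable module is called.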
Some notebooks in the notebooks folder use Spark. Check the Spark documentation for running Jupyter with a Spark context.
In short, on Windows:
set PYSPARK_DRIVER_PYTHON_OPTS=notebook
set PYSPARK_DRIVER_PYTHON=jupyter
[spark_install_dir]\bin\pyspark
And on *nix:
export PYSPARK_DRIVER_PYTHON_OPTS=notebook
export PYSPARK_DRIVER_PYTHON=jupyter
[spark_install_dir]/bin/pyspark
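The same setup can be sketched from Python when the launch needs to be scripted. This is only an illustration under stated assumptions: pyspark_notebook_launch is a hypothetical helper (not part of bdranalytics or Spark), and the Spark install directory is whatever you pass in. It sets the two environment variables shown in the commands above and builds the path to the pyspark launcher.

```python
import os
from pathlib import Path


def pyspark_notebook_launch(spark_install_dir, base_env=None):
    """Return (command, env) for starting pyspark with Jupyter as driver.

    Hypothetical helper for illustration only.
    """
    # Copy the environment so the caller's mapping is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    # The same two variables that the shell commands set.
    env["PYSPARK_DRIVER_PYTHON"] = "jupyter"
    env["PYSPARK_DRIVER_PYTHON_OPTS"] = "notebook"
    # Launcher script inside the Spark installation.
    command = [str(Path(spark_install_dir) / "bin" / "pyspark")]
    return command, env
```

The returned pair can be handed to subprocess.run(command, env=env). On Windows the launcher is a batch script, so the call may additionally need shell=True; treat that as something to verify on your own setup.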
To contribute, please fork or branch from master and submit a pull request.
Guidelines for an acceptable pull request:
- PEP 8 compliant code
- At least one line of documentation per class, function and method.
- Tests covering edge cases of your code.
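As an illustration of these guidelines, here is a small sketch of what a compliant contribution could look like. The clip function and its test class are hypothetical examples invented for this purpose; they are not part of bdranalytics, and the choice of unittest is an assumption about the test style.

```python
import unittest


def clip(value, lower, upper):
    """Limit value to the closed interval [lower, upper]."""
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    return max(lower, min(value, upper))


class ClipTest(unittest.TestCase):
    """Edge-case tests for clip."""

    def test_value_inside_interval_is_unchanged(self):
        """A value already inside the interval is returned as is."""
        self.assertEqual(clip(5, 0, 10), 5)

    def test_values_outside_interval_are_limited(self):
        """Values outside the interval map to the nearest bound."""
        self.assertEqual(clip(-1, 0, 10), 0)
        self.assertEqual(clip(11, 0, 10), 10)

    def test_bounds_are_inclusive(self):
        """Values equal to a bound are returned unchanged."""
        self.assertEqual(clip(0, 0, 10), 0)
        self.assertEqual(clip(10, 0, 10), 10)

    def test_inverted_bounds_raise(self):
        """An interval with lower > upper is rejected."""
        with self.assertRaises(ValueError):
            clip(1, 10, 0)
```

Every function, class and method carries at least a one-line docstring, and the tests target the boundaries and the invalid input rather than only the typical case.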
To create the development environment with conda, run:
conda env create -f environment-dev.yml
source activate bdranalytics-dev
To run all tests:
source activate bdranalytics-dev
python setup.py test
To create a dist from a local checkout (when developing on this module):
source activate bdranalytics-dev
python setup.py sdist
This uses the setup.py script directly, which is useful for testing how the dist will be installed without creating the dist.
To install just the package and its main dependencies from a local checkout (when you only intend to use this module):
python setup.py install
To update the HTML files:
source activate bdranalytics-dev
cd doc
make clean && make source && make html