An ML pipeline orchestrated by Docker and Luigi, with results presented by Pweave.
- download_data. Download data.
- process_data. Process data. Generate features. Make Train/Test split.
- train_models. Train models. Train linear regression and lightgbm on Train dataset.
- evaluate_models. Evaluate models. Calculate metric performance on Test dataset for both models. Plot some charts.
- make_report. Make report. Present results of the whole pipeline.
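
The task list above can be read as a dependency chain in which each task waits for the one before it. The sketch below shows that idea in plain Python. It does not use the Luigi API. The `DEPENDENCIES` map and the `execution_order` helper are hypothetical names introduced for illustration, and the strictly linear chain is an assumption based on the order of the list, not something taken from the repository.

```python
# Hypothetical dependency map: each task lists the tasks it needs first.
# The linear chain is an assumption based on the order of the task list.
DEPENDENCIES = {
    "download_data": [],
    "process_data": ["download_data"],
    "train_models": ["process_data"],
    "evaluate_models": ["train_models"],
    "make_report": ["evaluate_models"],
}


def execution_order(target, dependencies, done=None):
    """Return the tasks that must run to complete `target`, upstream first."""
    if done is None:
        done = []
    # Resolve every upstream task before the target itself.
    for upstream in dependencies[target]:
        execution_order(upstream, dependencies, done)
    # Add the target once, after all of its upstream tasks.
    if target not in done:
        done.append(target)
    return done


if __name__ == "__main__":
    for step, task in enumerate(execution_order("make_report", DEPENDENCIES), start=1):
        print(f"{step}. {task}")
```

Asking for `make_report` pulls in all five tasks, while asking for an earlier task such as `process_data` only pulls in what it needs. An orchestrator like Luigi resolves requested targets in a similar upstream-first way.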
- Build docker images:
    bash build-task-images.sh 0.1
- Run pipeline, write logs to output file:
    docker-compose up orchestrator |& tee ./output.log
- Clean containers:
    bash docker-clean.sh
- Create a base docker image with most of the libraries and add layers to it, instead of building each image from python:3.6-slim every time. Building the images from scratch on a clean system currently takes about 90 sec.
- Use more sophisticated ML algorithms, more feature engineering, and parameter tuning.