If you use the code from this repository, please cite the original paper:
@inproceedings{DBLP:conf/bpm/RizziSFGKM19,
  author    = {Williams Rizzi and
               Luca Simonetto and
               Chiara Di Francescomarino and
               Chiara Ghidini and
               T{\~{o}}nis Kasekamp and
               Fabrizio Maria Maggi},
  title     = {Nirdizati 2.0: New Features and Redesigned Backend},
  booktitle = {Proceedings of the Dissertation Award, Doctoral Consortium, and Demonstration
               Track at {BPM} 2019 co-located with 17th International Conference
               on Business Process Management, {BPM} 2019, Vienna, Austria, September
               1-6, 2019},
  pages     = {154--158},
  year      = {2019},
  url       = {http://ceur-ws.org/Vol-2420/paperDT8.pdf},
  timestamp = {Fri, 30 Aug 2019 13:15:06 +0200},
  biburl    = {https://dblp.org/rec/bib/conf/bpm/RizziSFGKM19},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
Django backend server for machine learning on event logs.
The Docker build is available at https://hub.docker.com/r/nirdizatiresearch/predict-python/. If you prefer to set up your environment on your own, you can refer to the Dockerfile.
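For example, you can pull the prebuilt image with:
docker pull nirdizatiresearch/predict-python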
On the first run, to set up the database, you can run:
docker-compose run server python manage.py migrate
To run the project:
docker-compose up redis server scheduler worker
To access a remote Django server you can use SSH tunneling, as shown in the following sample:
ssh -L 8000:127.0.0.1:8000 <user>@<host>
If you are familiar with docker-compose, the docker-compose file is available; otherwise, if you use PyCharm as your IDE, run the provided run configurations.
Finally, from the command line you can use the following sample commands to interact with our software.
Start the server with:
python manage.py runserver
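By default Django binds to 127.0.0.1:8000; if the server must be reachable on other interfaces (for example when tunneling from another machine, as shown above), you can bind the address explicitly:
python manage.py runserver 0.0.0.0:8000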
Run tests with one of the following:
python manage.py test
./manage.py test
NB: always run a redis-server in the background if you want your server to accept incoming POST requests!
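For example, you can start one locally with:
redis-server
or run it with Docker using the official redis image:
docker run -d -p 6379:6379 redis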
Start by running migrations and adding sample data:
python manage.py migrate
python manage.py loaddata <your_file.json>
Start jobs from the command line:
curl --request POST \
  --header 'Content-Type: application/json' \
  --data-binary '{
    "type": "classification",
    "split_id": 1,
    "config": {
      "encodings": ["simpleIndex"],
      "clusterings": ["noCluster"],
      "methods": ["randomForest"],
      "label": {"type": "remaining_time"},
      "encoding": {"prefix_length": 3, "generation_type": "only", "padding": "zero_padding"}
    }
  }' \
  http://localhost:8000/jobs/multiple
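Once submitted, you can inspect the created jobs; a minimal sketch, assuming the job list is exposed at /jobs/ on the same server:
curl http://localhost:8000/jobs/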
Options for creating a single split:
- $SPLIT_TYPE has to be one of split_sequential, split_random, split_temporal, or split_strict_temporal. The default is split_sequential.
- test_size has to be between 0 and 1. The default is 0.2.
curl --request POST \
  --header 'Content-Type: application/json' \
  --data-binary '{
    "type": "single",
    "original_log": 1,
    "config": {
      "test_size": 0.2,
      "split_type": $SPLIT_TYPE
    }
  }' \
  http://localhost:8000/splits/
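For instance, a concrete invocation that replaces $SPLIT_TYPE with split_temporal and holds out 30% of the log for testing would be:
curl --request POST \
  --header 'Content-Type: application/json' \
  --data-binary '{
    "type": "single",
    "original_log": 1,
    "config": {
      "test_size": 0.3,
      "split_type": "split_temporal"
    }
  }' \
  http://localhost:8000/splits/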
Prediction methods accept configuration for sklearn classification/regression methods.
The Job config must contain a dict with only the supported options for that method.
The dict name must take the format "type.method"; for classification with randomForest this would be classification.randomForest. Advanced configuration is optional; look at jobs/job_creator.py for default values.
For example, the configuration for classification with KNN would look like this:
curl --request POST \
  --header 'Content-Type: application/json' \
  --data-binary '{
    "type": "classification",
    "split_id": 1,
    "config": {
      "encodings": ["simpleIndex"],
      "clusterings": ["noCluster"],
      "methods": ["knn"],
      "classification.knn": {
        "n_neighbors": 5,
        "weights": "uniform"
      },
      "label": {"type": "remaining_time"},
      "encoding": {"prefix_length": 3, "generation_type": "up_to", "padding": "no_padding"}
    }
  }' \
  http://localhost:8000/jobs/multiple
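The same naming convention applies to the other job types; as a hypothetical sketch, a regression job with a tuned random forest would pass its options under regression.randomForest (check jobs/job_creator.py for the methods and defaults actually supported):
curl --request POST \
  --header 'Content-Type: application/json' \
  --data-binary '{
    "type": "regression",
    "split_id": 1,
    "config": {
      "encodings": ["simpleIndex"],
      "clusterings": ["noCluster"],
      "methods": ["randomForest"],
      "regression.randomForest": {
        "n_estimators": 100
      },
      "label": {"type": "remaining_time"},
      "encoding": {"prefix_length": 3, "generation_type": "up_to", "padding": "no_padding"}
    }
  }' \
  http://localhost:8000/jobs/multiple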
Log encoding and labelling can be tested before prediction. A labelling job supports the same values as classification and regression jobs, except for the method and clustering options.
curl --request POST \
  --header 'Content-Type: application/json' \
  --data-binary '{
    "type": "labelling",
    "split_id": 5,
    "config": {
      "label": {"type": "remaining_time"},
      "encoding": {"prefix_length": 3, "generation_type": "up_to", "padding": "no_padding"}
    }
  }' \
  http://localhost:8000/jobs/multiple
This project allows documentation to be built automatically using Sphinx. All the documentation-related files are in the docs/ folder, structured as follows:
└── docs/
    ├── build/
    │   ├── doctrees/
    │   └── html/
    ├── source/
    │   ├── _static/
    │   ├── _templates/
    │   ├── api/
    │   ├── readme/
    │   ├── conf.py
    │   └── index.rst
    ├── generate_modules.sh
    └── Makefile
- html/ holds the built HTML files, whereas source/ contains all the source files.
- _static/ contains the images used in the final HTML files, such as the logo; place eventual screenshots here.
- api/ contains the files used for automatically fetching docstrings from the project; you shouldn't edit them, as they are all replaced when re-building the documentation.
- readme/ contains the .rst copies of the readmes; when updating the project's readme, please also update those accordingly.
- conf.py contains all the Sphinx settings, along with the theme used (sphinx-rtd-theme).
The index.rst file is the documentation entry point; change this file to adjust the main documentation structure as needed. After updating the docstrings in the project, please re-run the generate_modules.sh script, which simply uses the sphinx-apidoc command to re-create the api .rst files. Finally, the Makefile is used to build the entire documentation; run
make clean
make html
whenever you want updated docs.
To summarize, after changing docstrings or the readme.rst files, simply run:
sh generate_modules.sh
make clean
make html
Documentation is also hosted on readthedocs.com and built automatically after each commit on the master or development branch; make sure the api files are updated in advance.
As this project detects when a compatible GPU is present in the system and tries to use it, please set the
CUDA_VISIBLE_DEVICES=0
environment variable if you encounter problems.
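For example, when starting the server outside Docker:
CUDA_VISIBLE_DEVICES=0 python manage.py runserver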
- @stebranchi Stefano Branchi
- @dfmchiara Chiara Di Francescomarino
- @TKasekamp Tõnis Kasekamp
- @mrsonuk Santosh Kumar
- @fmmaggi Fabrizio Maggi
- @WilliamsRizzi Williams Rizzi
- @HitLuca Luca Simonetto
- @Musacca Musabir Musabayli