This is an approach to automatically classify SEPA-Bookings to one of seven predefined classes.
The booking-classifier uses a Support Vector Machine that was trained on a set of several bookings.
Note: the words 'classify' and 'categorize' are used synonymously below.
Also part of this project is a 'comparison'-package which compares the performance of Naive Bayes Classifier, K-Nearest-Neighbor and Support Vector Machine on a provided training set. Consider the training set is not part of this repository due to privacy reasons.
comparison:
- comparison of different classifier algorithms
- hyperparameter estimation with grid search
- plotter for plotting confusion matrix
file_handling:
- reading from and writing to training set
flaskr:
- REST-API and templates
resources:
- ML model and extracted features
To start the booking classifier simply start the 'flaskr/app' module. The API offers following routes:
Route | Description | HTTP methods | Message body |
---|---|---|---|
/howto | Root path and common description | GET | - |
/categorize | Categorizes a booking and returns the category and probability. | POST | {"booking_date":" ","valuta_date":" ","text":" ","usage":" ","creditor_id":" ","owner":" ","iban":" ","bic":" ","amount':" "} |
/inputform | HTML form for manually categorize a booking. After submitting the categorize path is used. | GET | - |
/addbooking | Adds a booking to the training set and retrains the classifier model. Note: Needs the training set which is not part of this repository! |
POST | {"category":" ", "booking_date":" ","valuta_date":" ","text":" ","usage":" ","creditor_id":" ","owner":" ","iban":" ","bic":" ","amount':" "} |
/feedback | Adds an unrecognized booking to the training set and retrains the classifier model. The category must be provided by the user. Note: Needs the training set which is not part of this repository! |
POST | {"category":" ", "booking_date":" ","valuta_date":" ","text":" ","usage":" ","creditor_id":" ","owner":" ","iban":" ","bic":" ","amount':" "} |
For '/feedback' there also need to be a running monogodb at localhost to temporarily save the bookings.