A program that classifies arbitrary data ("records") into a tree structure of categories and calculates aggregations over the records in each category.
The program takes as input a configuration YAML file. This file defines:
- the data source (CSV file, etc.)
- the data fields of each record
- the category tree definition
- the aggregations to be calculated
- the output format (JSON, HTML, etc.)
Every category is assigned a "condition" which is a boolean expression that determines whether a certain record matches that specific category.
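As a rough illustration of the idea (not the program's actual implementation), the sketch below models a category tree in Java. The `Category` class, the `classify` method, the field names, and the simple count aggregation are all hypothetical. It also assumes that a record is only tested against the children of categories it has already matched, which the description above does not state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class CategoryTreeSketch {

    /** Hypothetical category node: a name, a condition, and child categories. */
    static final class Category {
        final String name;
        final Predicate<Map<String, String>> condition;
        final List<Category> children = new ArrayList<>();
        int count; // a simple "count" aggregation, assumed for illustration

        Category(String name, Predicate<Map<String, String>> condition) {
            this.name = name;
            this.condition = condition;
        }

        Category add(Category child) {
            children.add(child);
            return this;
        }
    }

    /** Assumption: children are only tested when the parent condition matched. */
    static void classify(Category category, Map<String, String> record) {
        if (!category.condition.test(record)) {
            return;
        }
        category.count++;
        for (Category child : category.children) {
            classify(child, record);
        }
    }

    /** Builds a small example tree; names and fields are made up. */
    static Category buildExampleTree() {
        Category apples = new Category("Apples",
                r -> r.getOrDefault("name", "").startsWith("apple"));
        Category fruit = new Category("Fruit",
                r -> "fruit".equals(r.get("type"))).add(apples);
        Category vegetables = new Category("Vegetables",
                r -> "vegetable".equals(r.get("type")));
        return new Category("All", r -> true).add(fruit).add(vegetables);
    }

    static void print(Category category, String indent) {
        System.out.println(indent + category.name + ": " + category.count);
        for (Category child : category.children) {
            print(child, indent + "  ");
        }
    }

    public static void main(String[] args) {
        Category root = buildExampleTree();
        classify(root, Map.of("type", "fruit", "name", "apple"));
        classify(root, Map.of("type", "fruit", "name", "banana"));
        classify(root, Map.of("type", "vegetable", "name", "carrot"));
        // Prints:
        // All: 3
        //   Fruit: 2
        //     Apples: 1
        //   Vegetables: 1
        print(root, "");
    }
}
```

The sketch uses `Map.of`, so it needs Java 9 or later.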
Currently, the program reads records from a local CSV file and produces either a JSON file or an HTML file. The HTML file contains the output data and a simple UI ("Category Explorer") that allows the user to navigate the category tree.
To run the program, execute the JAR file with a single command-line parameter pointing to the configuration YAML file. Any further customization is defined within the configuration file itself. E.g.:
# java -jar record-classifier.jar configuration.yml
Every category has a condition, which is a boolean expression in a simple domain-specific language, for example:
someField == 'some value' && ! someOtherField contains 'something'
Equality
someField == 'some value'
someField equalsIgnoreCase 'sOmE vAluE'
Substring matching
someField startsWith 'abc'
someField endsWith 'xyz'
someField contains 'klm'
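The operator names above match methods of `java.lang.String`, so one plausible reading of their semantics is sketched below. This is an assumption for illustration only: the `apply` helper is hypothetical, and whether the program's operators are case-sensitive in the same way as the Java methods is not confirmed by this document.

```java
public class OperatorSketch {

    /**
     * Hypothetical helper: applies one comparison operator to a field value.
     * Assumes each operator behaves like the String method of the same name.
     */
    static boolean apply(String fieldValue, String operator, String literal) {
        switch (operator) {
            case "==":
                return fieldValue.equals(literal);
            case "equalsIgnoreCase":
                return fieldValue.equalsIgnoreCase(literal);
            case "startsWith":
                return fieldValue.startsWith(literal);
            case "endsWith":
                return fieldValue.endsWith(literal);
            case "contains":
                return fieldValue.contains(literal);
            default:
                throw new IllegalArgumentException("Unknown operator: " + operator);
        }
    }

    public static void main(String[] args) {
        System.out.println(apply("some value", "equalsIgnoreCase", "sOmE vAluE")); // true
        System.out.println(apply("abcdef", "startsWith", "abc"));                  // true
        System.out.println(apply("abcdef", "endsWith", "abc"));                    // false
        System.out.println(apply("abcklmxyz", "contains", "klm"));                 // true
    }
}
```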
Logical conjunction
someField == 'abc' && someOtherField == 'xyz'
Logical disjunction
someField == 'abc' || someOtherField == 'xyz'
Logical negation
! someField == 'abc'
By default, expressions are evaluated from left to right. Use parentheses to change the order of evaluation:
someField == 'abc' && (someOtherField == 'klm' || someOtherField == 'xyz')
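To see why the grouping matters, the sketch below writes the same three comparisons in plain Java, once grouped as in the example and once in the order they would be evaluated without parentheses, which is assumed here to be `(A && B) || C`. The class and method names are hypothetical, and records are modelled as a simple map.

```java
import java.util.Map;

public class GroupingSketch {

    /** Without parentheses, read left to right: (A && B) || C. */
    static boolean ungrouped(Map<String, String> r) {
        return ("abc".equals(r.get("someField")) && "klm".equals(r.get("someOtherField")))
                || "xyz".equals(r.get("someOtherField"));
    }

    /** With parentheses as in the example: A && (B || C). */
    static boolean grouped(Map<String, String> r) {
        return "abc".equals(r.get("someField"))
                && ("klm".equals(r.get("someOtherField"))
                    || "xyz".equals(r.get("someOtherField")));
    }

    public static void main(String[] args) {
        Map<String, String> record = Map.of("someField", "def", "someOtherField", "xyz");
        System.out.println(ungrouped(record)); // true: the last comparison alone matches
        System.out.println(grouped(record));   // false: someField is not 'abc'
    }
}
```

For a record whose `someField` is not 'abc', the grouped form can never match, while the ungrouped form still matches whenever `someOtherField` is 'xyz'.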
Boolean literals
true
false
In the examples directory you can find some examples with dummy data and configuration.
This is a screenshot of what the generated UI can look like:
- Add regular expression operator
- ParseException should contain meaningful messages
- Date formatting
If you have any questions or need help, please email me at: giannis.tsakiris [at] gmail.com