Improve calculated columns logic/flow
junaidzm13 commented
A few of the issues:
- string functions treat missing data as the string 'null', which isn't what a user would expect.
- in some places we try to cast Boolean values to Int, which will not work.
- some of the clauses have incorrect data types.
- math-based clauses, e.g. SubtractClause, do not handle non-numeric types (they should error gracefully).
- LessThanClause.apply creates a GreaterThanClause object.
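To illustrate the "error gracefully" point, here is a minimal sketch of a subtraction clause that rejects non-numeric and missing operands instead of throwing. It assumes the codebase is Scala, and everything in it (the `Clause` trait, `ColumnClause`, the `Numbers` helper, rows modelled as `Map[String, Any]`, and this shape of `SubtractClause`) is hypothetical and simplified, not the project's actual classes.

```scala
// Hypothetical, simplified model for illustration only.
sealed trait Clause {
  // Left carries an error message, Right carries the computed value.
  def calculate(row: Map[String, Any]): Either[String, Any]
}

// Reads a column from the row; a missing or null value is an error,
// not the string "null".
final case class ColumnClause(name: String) extends Clause {
  def calculate(row: Map[String, Any]): Either[String, Any] =
    row.get(name) match {
      case Some(null) | None => Left(s"no value for column '$name'")
      case Some(value)       => Right(value)
    }
}

object Numbers {
  // Widens supported numeric types to Double; anything else
  // (String, Boolean, ...) is reported as an error.
  def toDouble(value: Any): Either[String, Double] = value match {
    case i: Int    => Right(i.toDouble)
    case l: Long   => Right(l.toDouble)
    case d: Double => Right(d)
    case other     => Left(s"expected a number but got '$other'")
  }
}

final case class SubtractClause(left: Clause, right: Clause) extends Clause {
  // The first failing step short-circuits the whole computation.
  def calculate(row: Map[String, Any]): Either[String, Any] =
    for {
      l  <- left.calculate(row)
      r  <- right.calculate(row)
      ld <- Numbers.toDouble(l)
      rd <- Numbers.toDouble(r)
    } yield ld - rd
}
```

With a row such as `Map[String, Any]("price" -> 10.5, "qty" -> 3, "name" -> "abc")`, subtracting `qty` from `price` yields a `Right`, while subtracting `name` or a column that is absent yields a `Left` with a message rather than an exception.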
Also, the current error handling is inconsistent: e.g. if an error occurs on a Double type we return Double.NaN, but in other places we return a string error message regardless of the data type. This could mean we're assigning string values to a column with a non-string data type, which would eventually blow up.
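One possible way to make this consistent is to keep errors in a separate channel from values and to check each value against the column's declared type before it is stored. The sketch below assumes Scala; `DataType`, `CalcError` and `ColumnWriter.check` are made-up names for illustration and do not reflect the existing code.

```scala
// Hypothetical names throughout; a sketch, not the actual implementation.
sealed trait DataType
case object DoubleType  extends DataType
case object StringType  extends DataType
case object BooleanType extends DataType

// Errors travel separately from values, so an error message can never
// be mistaken for a String column value.
final case class CalcError(message: String)

object ColumnWriter {
  // Passes through existing errors, and turns a value whose runtime type
  // does not match the column's declared type into an error.
  def check(dataType: DataType, result: Either[CalcError, Any]): Either[CalcError, Any] =
    result.flatMap { value =>
      (dataType, value) match {
        case (DoubleType, _: Double)   => Right(value)
        case (StringType, _: String)   => Right(value)
        case (BooleanType, _: Boolean) => Right(value)
        case _ =>
          Left(CalcError(s"value '$value' does not match column type $dataType"))
      }
    }
}
```

Under this approach a caller decides in one place what a `Left` means for the column (for example, an empty cell plus a logged message), instead of each clause choosing between `Double.NaN` and an error string.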
A major reason for all of the above is that we have low unit-test coverage. This ticket deals with simplifying the logic, introducing better typing support, improving error handling and massively increasing the test coverage.