[BUG] Row based output incorrect when using satisfies check and assertion with upper bound < 1
arsenalgunnershubert777 opened this issue · 3 comments
Describe the bug
When using the satisfies check, the row-level output is unexpected given the assertion that is passed in. This occurs specifically when the assertion's upper bound is < 1 (for example, d < 1.0).
To Reproduce
Steps to reproduce the behavior:
- Create a custom check using satisfies with a SQL column condition.
- Pass in an assertion function whose upper bound is < 1.
- Run the check on an input DataFrame where some rows pass and some rows fail the column condition.
- The row-level output returned by rowLevelResultsAsDataFrame shows every row as false (fail).
Code:
Check(CheckLevel.Error, id.value)
  .satisfies(
    sqlColumnCondition,
    "name",
    (d: Double) => d > 0 && d < 1.0
  )
Output row based dataframe:
+-----+------+------+
|index|values|result|
+-----+------+------+
| 1| blue| false|
| 2| green| false|
| 3| blue| false|
| 4| red| false|
| 5|purple| false|
+-----+------+------+
- However, if the assertion is adjusted so that the upper bound is < 1.1 (instead of < 1), the row-level results look correct.
Code:
Check(CheckLevel.Error, id.value)
  .satisfies(
    sqlColumnCondition,
    "name",
    (d: Double) => d > 0 && d < 1.1
  )
Output row based dataframe (this is correct behavior):
+-----+------+------+
|index|values|result|
+-----+------+------+
| 1| blue| true|
| 2| green| true|
| 3| blue| true|
| 4| red| false|
| 5|purple| false|
+-----+------+------+
Expected behavior
The row based output should show rows that passed and rows that failed based on the columnCondition and shouldn’t be impacted by the assertion. The row based output shouldn’t show every row as false when there are certain rows that passed the columnCondition. The correct example is the one shown directly above.
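To make the expected behavior concrete, here is a minimal plain-Scala sketch (no Spark or deequ involved). The object name, the helper names, and the condition itself are hypothetical; the real sqlColumnCondition is not shown in this report, so a condition that passes "blue" and "green" is assumed because it matches the correct table above.

```scala
object ExpectedRowLevelSemantics {
  // Hypothetical stand-in for the elided sqlColumnCondition.
  def condition(value: String): Boolean = Set("blue", "green").contains(value)

  // Row-level output: depends on the column condition only.
  def rowLevel(values: Seq[String]): Seq[Boolean] = values.map(condition)

  // Check status: the assertion is applied once, to the fraction of passing rows.
  def checkPasses(values: Seq[String], assertion: Double => Boolean): Boolean = {
    val compliance = rowLevel(values).count(identity).toDouble / values.size
    assertion(compliance)
  }

  def main(args: Array[String]): Unit = {
    val values = Seq("blue", "green", "blue", "red", "purple")
    val assertion: Double => Boolean = d => d > 0 && d < 1.0

    values.zip(rowLevel(values)).foreach { case (v, r) => println(s"$v -> $r") }
    println(s"check passes: ${checkPasses(values, assertion)}")
  }
}
```

In this model the first three rows are true and the last two are false regardless of which assertion is supplied; the assertion only decides the overall check status.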
Screenshots
N/A
Additional context
This row output issue may be due to this line from Verification result constraintResultToColumn. I'm not sure if that line is needed for some other functionality.
Also, the overall verification result check status (Success or Error) seems to be working correctly.
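One reading that would be consistent with both tables above is that the assertion is evaluated per row, against 1.0 for a passing row and 0.0 for a failing row, instead of only against the overall compliance ratio. This is a guess and not confirmed from the deequ source; the plain-Scala sketch below only models that assumption, and every name in it (including the condition) is hypothetical.

```scala
object SuspectedPerRowAssertion {
  // Hypothetical stand-in for the elided sqlColumnCondition.
  def condition(value: String): Boolean = Set("blue", "green").contains(value)

  // Assumed faulty behavior: each row's outcome becomes 1.0 or 0.0,
  // and the assertion is then applied to that per-row number.
  def rowLevel(values: Seq[String], assertion: Double => Boolean): Seq[Boolean] =
    values.map(v => assertion(if (condition(v)) 1.0 else 0.0))

  def main(args: Array[String]): Unit = {
    val values = Seq("blue", "green", "blue", "red", "purple")

    // Upper bound < 1.0: 1.0 fails d < 1.0 and 0.0 fails d > 0, so every row is false.
    println(rowLevel(values, d => d > 0 && d < 1.0))

    // Upper bound < 1.1: 1.0 now passes, 0.0 still fails, so rows follow the condition.
    println(rowLevel(values, d => d > 0 && d < 1.1))
  }
}
```

Under this assumption the two assertions reproduce the all-false table and the correct table respectively, which is why the row-level column appears to depend on the assertion bounds.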
Thanks for the help!
- Result column's value is not based on assertion
- Result column's value depends on sqlCondition
Hi @Sat30 thanks for the response, can you clarify what you mean by those bullet points?
Yes, the row-level result should depend on sqlCondition only, but changing the assertion function affects the result when it shouldn't.
Thank you so much for reporting this issue. It has been fixed as part of PR #553
We will be releasing this to Maven as part of our next release cycle.