Question - "threw error" <> 0 but no failures in plot
jonysafi opened this issue · 1 comment
Hi,
First, great job on the package. I have been testing it the whole week and want to adopt it for a lot of things I am doing.
I have the following scenario:
I am building a set of rules to validate a data frame (the source is a CSV file).
The dataframe structure is:
> str(df)
'data.frame': 585 obs. of 6 variables:
$ dt_collect : POSIXct, format: "2019-09-20" "2019-09-21" "2019-09-22" "2019-09-23" ...
$ attrib : chr "A" "A" "A" "A" ...
$ num1 : num 0.99 1 1.01 0.99 0.99 1 1 0.96 0.94 0.95 ...
$ num2 : num 21.9 22.4 22.7 22.8 22.9 ...
$ num3 : num 2.22 2.3 2.2 2.19 2.15 2.56 2.75 2.96 3 2.97 ...
$ num4 : num 22.1 22.6 22.8 22.9 22.9 ...
The rules look like:
rules <- validator(is_unique(attrib, dt_collect),
                   length(unique(dt_collect)) >= 365,
                   in_range(dt_collect, min = min(dt_collect), max = max(dt_collect)),
                   is.POSIXct(dt_collect),
                   is.character(attrib),
                   is.numeric(num1),
                   is.numeric(num2),
                   is.numeric(num3),
                   is.numeric(num4))
When I run the confront:
out <- confront(df, rules)
I see the below:
Object of class 'validation'
Call:
confront(dat = df, x = rules)
Rules confronted: 9
With fails : 0
With missings: 0
Threw warning: 0
**Threw error : 1**
The df_out <- as.data.frame(summary(out)) shows the below:
name items passes fails nNA error warning expression
1 V1 585 585 0 0 FALSE FALSE is_unique(attrib, dt_collect)
2 V2 1 1 0 0 FALSE FALSE length(unique(dt_collect)) >= 365
3 V3 585 585 0 0 FALSE FALSE in_range(dt_collect, min = min(dt_collect), max = max(dt_collect))
4 V4 0 0 0 0 **TRUE** FALSE is.POSIXct(dt_collect)
5 V5 1 1 0 0 FALSE FALSE is.character(attrib)
6 V6 1 1 0 0 FALSE FALSE is.numeric(num1)
7 V7 1 1 0 0 FALSE FALSE is.numeric(num2)
8 V8 1 1 0 0 FALSE FALSE is.numeric(num3)
9 V9 1 1 0 0 FALSE FALSE is.numeric(num4)
plot(out) comes out all green.
My question is: what threw the error? Is it is.POSIXct (the row where error is TRUE)? If so, what is the error, and why is it not reported more clearly?
dt_collect is read as chr from the CSV file, and I can convert it with as.Date or as.POSIXct. However, both is.Date and is.POSIXct in the rules show the same behavior described above.
Thank you,
Hi, please try
errors(out)
to get the explicit error message. You can also get the error thrown immediately, by doing
confront(df, rules, raise="all")
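For reference, here is a minimal sketch of both suggestions on a small stand-in data frame. The two-row df and the reduced rule set are made up for illustration, and it assumes that is.POSIXct is the rule that fails because it is not a base R function and no package providing it is attached. The exact error text may differ on your setup.

```r
library(validate)

# Small stand-in for the data frame in the question (hypothetical values).
df <- data.frame(
  dt_collect = as.POSIXct(c("2019-09-20", "2019-09-21"), tz = "UTC"),
  attrib     = c("A", "A"),
  stringsAsFactors = FALSE
)

rules <- validator(
  is.POSIXct(dt_collect),  # not a base R function; assumed to be the failing rule
  is.character(attrib)
)

# Default behavior: errors are recorded in the result, not raised.
out <- confront(df, rules)

# Which rules threw an error, and with what message?
print(summary(out)[, c("name", "error", "expression")])
print(errors(out))

# Alternatively, let confront() stop as soon as a rule errors.
# tryCatch() is only here so the script keeps running and we can see the message.
res <- tryCatch(
  confront(df, rules, raise = "all"),
  error = function(e) conditionMessage(e)
)
print(res)
```

If the message says the function could not be found, one thing to try is attaching a package that provides it before calling confront (lubridate exports both is.POSIXct and is.Date).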