smnorris/bcfishobs

slight inconsistencies


Are the (very) minor changes to the qa_summary.csv outputs due to imprecise queries, or to changes in the source data?

A diff of two qa_summary.csv files generated with the same input data does show changes, so something in the queries is not quite consistent:

 match_type,n_distinct_events,n_observations
-A. matched - stream; within 100m; lookup,54359,171666
+A. matched - stream; within 100m; lookup,54355,171666
 B. matched - stream; within 100m; closest stream,6869,20368
-C. matched - stream; 100-500m; lookup,4444,30257
+C. matched - stream; 100-500m; lookup,4447,30257
-D. matched - waterbody; construction line within 1500m; lookup,11903,116754
+D. matched - waterbody; construction line within 1500m; lookup,11901,116754
-E. matched - waterbody; construction line within 1500m; closest,1386,16450
+E. matched - waterbody; construction line within 1500m; closest,1389,16450
 TOTAL MATCHED,78961,355495
 F. unmatched - less than 1500m to stream,1745,5339
 G. unmatched - more than 1500m to stream,103,725

Yet if the queries are run several times in succession, the 3rd and 4th outputs both match the 2nd.

But after a fresh download there are more inconsistencies. The total number of observations is consistent but the number of distinct events created shifts slightly.
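
To make the comparison above repeatable, here is a minimal Python sketch that loads two qa_summary.csv outputs and reports the per-match-type change in each count. The function names `load_summary` and `compare_summaries` are hypothetical and not part of bcfishobs. The two CSV strings contain the values from the diff in this issue, assuming the first value in each changed row is from the earlier run.

```python
import csv
import io

# Values from the two qa_summary.csv files compared in this issue.
# Assumption: the first value of each changed row belongs to the earlier run.
FIRST_RUN = """\
match_type,n_distinct_events,n_observations
A. matched - stream; within 100m; lookup,54359,171666
B. matched - stream; within 100m; closest stream,6869,20368
C. matched - stream; 100-500m; lookup,4444,30257
D. matched - waterbody; construction line within 1500m; lookup,11903,116754
E. matched - waterbody; construction line within 1500m; closest,1386,16450
TOTAL MATCHED,78961,355495
F. unmatched - less than 1500m to stream,1745,5339
G. unmatched - more than 1500m to stream,103,725
"""

SECOND_RUN = """\
match_type,n_distinct_events,n_observations
A. matched - stream; within 100m; lookup,54355,171666
B. matched - stream; within 100m; closest stream,6869,20368
C. matched - stream; 100-500m; lookup,4447,30257
D. matched - waterbody; construction line within 1500m; lookup,11901,116754
E. matched - waterbody; construction line within 1500m; closest,1389,16450
TOTAL MATCHED,78961,355495
F. unmatched - less than 1500m to stream,1745,5339
G. unmatched - more than 1500m to stream,103,725
"""


def load_summary(text):
    """Parse qa_summary.csv text into {match_type: (events, observations)}."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["match_type"]: (
            int(row["n_distinct_events"]),
            int(row["n_observations"]),
        )
        for row in reader
    }


def compare_summaries(first, second):
    """Return {match_type: (delta_events, delta_observations)} for changed rows."""
    deltas = {}
    for match_type in sorted(first.keys() | second.keys()):
        # A match type missing from one file is treated as zero counts.
        events_a, obs_a = first.get(match_type, (0, 0))
        events_b, obs_b = second.get(match_type, (0, 0))
        delta = (events_b - events_a, obs_b - obs_a)
        if delta != (0, 0):
            deltas[match_type] = delta
    return deltas


if __name__ == "__main__":
    changed = compare_summaries(load_summary(FIRST_RUN), load_summary(SECOND_RUN))
    for match_type, (d_events, d_obs) in changed.items():
        print(f"{match_type}: events {d_events:+d}, observations {d_obs:+d}")
```

For files on disk, the same parsing works by passing the result of `open(path, newline="").read()` to `load_summary`. With the values above, only categories A, C, D and E change, the event deltas sum to zero, and no observation count changes, which is consistent with events shifting between match types rather than being gained or lost.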