man-group/ArcticDB

Cannot re-use QueryBuilder if date range also specified


Describe the bug

If you make two read calls, both with a date range specified, and re-use the same QueryBuilder for both calls, the second call fails with the error: "ArcticException: In QueryBuilder.prepend: Date range and Resample only supported as first clauses in the pipeline". This error first appeared in version 4.5.0.

Steps/Code to Reproduce

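from datetime import datetime

import pandas as pd
from arcticdb import QueryBuilder

# `lib` is an existing ArcticDB library
df = pd.DataFrame({"A": [0, 1, 2]}, index=pd.date_range("2024-01-01", "2024-01-03"))
lib.write("dummy", df)

q = QueryBuilder()
q = q[q["A"] > 1]

# The first read succeeds; the second, re-using q, raises ArcticException
lib.read("dummy", date_range=(datetime(2024, 1, 2), datetime(2024, 1, 3)), query_builder=q)
lib.read("dummy", date_range=(datetime(2024, 1, 2), datetime(2024, 1, 3)), query_builder=q)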

Expected Results

The second call gives the error "ArcticException: In QueryBuilder.prepend: Date range and Resample only supported as first clauses in the pipeline". This doesn't feel intuitive - the read call shouldn't modify the QueryBuilder passed by the user. Perhaps a solution would be to clone the existing QueryBuilder and prepend the DateRangeClause onto that?
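To illustrate the clone-then-prepend idea, here is a minimal toy model. It is only a sketch: `ToyQueryBuilder`, `read_mutating` and `read_with_clone` are hypothetical names made up for this example, the clause representation is invented, and none of it is ArcticDB's actual implementation. It assumes the read path prepends a date-range clause onto whatever builder it is given, which is what the error message suggests.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToyQueryBuilder:
    """Hypothetical stand-in for a query builder; not ArcticDB's class."""
    clauses: list = field(default_factory=list)

    def prepend(self, other: "ToyQueryBuilder") -> "ToyQueryBuilder":
        # Mimic the reported check: a date-range clause may only come first,
        # so prepending onto a builder that already holds one is rejected.
        if any(name == "date_range" for name, _ in self.clauses):
            raise ValueError("Date range only supported as first clause in the pipeline")
        self.clauses = other.clauses + self.clauses
        return self


def read_mutating(query, date_range):
    """Models the reported behaviour: the caller's builder is modified."""
    query.prepend(ToyQueryBuilder([("date_range", date_range)]))
    return list(query.clauses)


def read_with_clone(query, date_range):
    """Models the suggested fix: prepend onto a private copy instead."""
    local = copy.deepcopy(query)
    local.prepend(ToyQueryBuilder([("date_range", date_range)]))
    return list(local.clauses)


q = ToyQueryBuilder([("filter", "A > 1")])
first = read_with_clone(q, ("2024-01-02", "2024-01-03"))
second = read_with_clone(q, ("2024-01-02", "2024-01-03"))  # re-use is safe
```

In this model, calling `read_mutating` twice with the same builder raises on the second call, because the first call has already left a date-range clause in the caller's object. `read_with_clone` leaves `q` untouched, so it can be re-used for any number of reads.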

For now, I can get around this issue by doing:

q = QueryBuilder()
q = q[q["A"] >= 1]
q.prepend(QueryBuilder().date_range((datetime(2024, 1, 2), datetime(2024, 1, 3))))

and not specifying a date_range in the lib.read calls. It would be good if a fix were made, though. Thanks!

test_lazy_collect_twice_with_date_range in test_lazy_dataframe.py is xfailed on this issue, and should have the pytest mark removed if #1703 is merged before this is fixed.

OS, Python Version and ArcticDB Version

Python: 3.9.17 (main, Jun 6 2023, 20:11:04)
[GCC 9.4.0]
OS: Linux-5.4.0-51-generic-x86_64-with-glibc2.31
ArcticDB: 4.5.0

Backend storage used

MINIO

Additional Context

No response