Maria-Liakata-NLP-Group/long

Bug: prep_gptchat_by_user_source fails for unknown reason


Expected behaviour

Each call to source.get_aggregation should return a new independent dataframe. The order of the tests should not matter.

How to reproduce

As a minimal example, the following loop passes:

for thread_id in [211, 97, 41]:
    by_thread_all = source.get_aggregation(
        entity_group_by="thread_id",
        time_grouper=day_grouper,
        entity_permitted_values=[thread_id],
    )

However, simply reordering the list makes the example fail:

for thread_id in [97, 41, 211]:
    by_thread_all = source.get_aggregation(
        entity_group_by="thread_id",
        time_grouper=day_grouper,
        entity_permitted_values=[thread_id],
    )

Actual Result

The second example above raises ValueError: all keys need to be the same shape. The error comes from the underlying grouped.count() call.

Currently, test_prep_gptchat_by_user_source is marked with @pytest.mark.xfail.
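
To narrow down which orderings fail, it may help to run every permutation of the thread ids against a freshly built source and record the ones that raise. The sketch below is only an illustration of that idea: ToySource is a hypothetical stand-in (not the project's source class) that deliberately leaks state between calls so the harness has something to detect, and find_failing_orders and the per-thread sizes are likewise made up. It says nothing about the actual cause of the bug.

```python
from itertools import permutations


class ToySource:
    """Hypothetical stand-in, NOT the project's source class.

    It deliberately keeps state between calls so that the harness below
    has an order-dependent failure to detect.
    """

    # Made-up group counts per thread_id, for illustration only.
    _SIZES = {211: 3, 97: 2, 41: 1}

    def __init__(self):
        self._first_size = None  # state leaked across calls (the simulated bug)

    def get_aggregation(self, entity_permitted_values):
        size = sum(self._SIZES[v] for v in entity_permitted_values)
        if self._first_size is None:
            self._first_size = size
        if size > self._first_size:
            # Same message as in the report, raised here by the toy only.
            raise ValueError("all keys need to be the same shape")
        return [entity_permitted_values] * size


def find_failing_orders(make_source, thread_ids):
    """Return {ordering: error message} for each ordering that raises ValueError."""
    failures = {}
    for order in permutations(thread_ids):
        source = make_source()  # fresh source, so orderings cannot affect each other
        try:
            for thread_id in order:
                source.get_aggregation(entity_permitted_values=[thread_id])
        except ValueError as exc:
            failures[order] = str(exc)
    return failures


failures = find_failing_orders(ToySource, [211, 97, 41])
for order in sorted(failures):
    print(order, "->", failures[order])
```

With the real source, make_source would be whatever builds the object under test, and the inner call would take the same entity_group_by and time_grouper arguments as in the examples above. If every ordering passes on a fresh source but fails on a shared one, that would point at state carried over between get_aggregation calls.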