Aarhus-Psychiatry-Research/psycop-common

feat: ValidatedFrame when we need to return frames


Returning a ValidatedFrame instead of a bare dataframe makes it much easier to:

  • Interpret the code statically
  • Ensure column names match what is expected
  • Document further expectations of the data
    • Column exists
    • Column type
    • Other checks we can implement (e.g. uniqueness)

Downsides:

  • Makes the code more rigid

Alternatives:

  • Pandera, but its API is awkward and, at the time of writing, it doesn't support Polars

Implementation

ValidatedFrame is a dataclass whose __post_init__ runs all the validation. This means:

  • Check that the column named by each attribute ending in col_name exists in the frame
  • Check that the frame matches the schema given by each attribute ending in _schema
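
A minimal sketch of how the two checks could look, not the actual psycop-common implementation. It assumes the suffix convention described above, that the frame exposes `columns` and a name-to-dtype `schema` mapping (as a Polars DataFrame does), and that all problems are collected before raising. `FrameLike`, `FrameValidationError`, `StubFrame` and `PatientFrame` are hypothetical names used only for illustration; `StubFrame` stands in for a real dataframe so the example runs with the standard library alone.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Mapping, Protocol, Sequence


class FrameLike(Protocol):
    """Hypothetical protocol: anything with column names and a name -> dtype mapping."""

    @property
    def columns(self) -> Sequence[str]: ...

    @property
    def schema(self) -> Mapping[str, Any]: ...


class FrameValidationError(Exception):
    """Hypothetical error type raised when any check fails."""


@dataclass(frozen=True)
class ValidatedFrame:
    frame: FrameLike

    def __post_init__(self) -> None:
        errors: list[str] = []
        for f in fields(self):
            value = getattr(self, f.name)
            # Check 1: attributes ending in "col_name" must name an existing column.
            if f.name.endswith("col_name") and value not in self.frame.columns:
                errors.append(f"{f.name}: column '{value}' is missing")
            # Check 2: attributes ending in "_schema" map column name -> expected dtype.
            elif f.name.endswith("_schema"):
                for col, expected in value.items():
                    actual = self.frame.schema.get(col)
                    if actual != expected:
                        errors.append(
                            f"{f.name}: column '{col}' has dtype {actual}, expected {expected}"
                        )
        if errors:
            raise FrameValidationError("\n".join(errors))


@dataclass
class StubFrame:
    """Stand-in for a real dataframe; only holds a schema."""

    schema: dict[str, Any]

    @property
    def columns(self) -> list[str]:
        return list(self.schema)


@dataclass(frozen=True)
class PatientFrame(ValidatedFrame):
    """Hypothetical subclass documenting what a returned frame must contain."""

    id_col_name: str = "patient_id"
    frame_schema: Mapping[str, Any] = field(
        default_factory=lambda: {"patient_id": int}
    )


# Constructing the object runs the validation; a bad frame raises immediately.
valid = PatientFrame(frame=StubFrame({"patient_id": int, "age": float}))
```

Because validation happens in `__post_init__`, a function annotated as returning `PatientFrame` cannot hand back a frame that lacks `patient_id`, which is what makes the expectations readable statically.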