Roadmap for Dataframe API
MarcoGorelli opened this issue · 1 comments
MarcoGorelli commented
Let's try to get a roadmap together. I'm a little worried that at the current pace it'll take another year until we can publish a non-beta spec. That's too long. So let's zoom out, think about what we'd like to achieve, and see whether that helps us re-prioritise.
Here are some milestones I'd like to aim for:
- by the end of the year: merge (or otherwise resolve) the following topics:
  - ✅ have a Scalar class
  - ✅ cross-dataframe column comparisons
- by February 2024:
  - tag the first non-beta version
- by April 2024, make sure the spec and `dataframe-api-compat` are complete enough that it's possible to rewrite the majority of some dataframe-consuming library using the standard
- by November 2024, have production-ready implementations of the standard for all libraries involved
If we want to achieve the above, then we need to turn things around. This may mean not getting lost in details; in particular, I suggest punting on:
- propagation of persistedness (including whether this should be done at all)
- whether `Scalar.__bool__` is allowed to raise
and leaving these implementation-specific for now. If we do manage to turn things around and make good progress on the roadmap above, we could (and probably should!) bring these up again.
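
To make the second deferred question concrete, here is a minimal sketch of what "`Scalar.__bool__` is allowed to raise" could look like in a lazy implementation. `LazyScalar`, `persist`, and the error message are hypothetical names made up for illustration; none of them come from the spec or from any existing library.

```python
from typing import Callable, Optional


class LazyScalar:
    """Hypothetical scalar whose value is only computed on request."""

    def __init__(self, compute: Callable[[], object]) -> None:
        self._compute = compute
        self._value: Optional[object] = None
        self._persisted = False

    def persist(self) -> "LazyScalar":
        # Force the computation once and remember the result.
        if not self._persisted:
            self._value = self._compute()
            self._persisted = True
        return self

    def __bool__(self) -> bool:
        # One possible policy: refuse implicit truthiness checks
        # until the value has been computed explicitly.
        if not self._persisted:
            raise ValueError("call .persist() before using this scalar as a bool")
        return bool(self._value)


s = LazyScalar(lambda: 3 > 2)

try:
    bool(s)
    raised = False
except ValueError:
    raised = True

result = bool(s.persist())
```

An eager implementation could skip the check and return `bool(value)` directly, which is one reason leaving this implementation-specific seems workable for now.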