CI and packaging suggestions
Opened this issue · 3 comments
Some suggestions on packaging and CI, for discussion.
development flow
- move to GitHub flow,
main
plusdev
, and release on GitHub trigger - requires changing the GHA trigger in
release.yml
workflow toon: release: types: [published]
(that's it) - ensure CI triggers on PR to
main
, I think that's anyway how it is at current
packaging
- I would move all information from
setup.py
topyproject.toml
, it is easier to maintain there - should
__version__
not be exported bysequence
, e.g., insequence.__init__
? Right now the version number issequence.__version__.__version__
, but it is espected by a single__version__
afaik
testing
I would refactor the tests to make the extension contract simpler, i.e., not require users to add a test, at the same level of coverage. This could be done as follows:
SequenceTestSuite
is already great, but it requires the developer to add a test in a file somewhere.- instead, it should automatically collect all sequences from
sequentium
, so developers need to add only the sequence class - the
ground_truth
data etc could be added to the sequence class itself, as it is specific to the sequence class. It is useful not just for testing, but can be used for caching, in cases where numbers are hard to compute.- on a side note, sequences can have a cache mode and a compute mode which could be switched up to configuration. State-of-art frameworks which handle sequences - e.g., digits of pi - such as wolfram/mathematica do this as well, it's a good idea.
requires changing the GHA trigger in release.yml workflow to on: release: types: [published] (that's it)
This trigger make the release happens when the PR is closed (and mergerd) ?
I would move all information from setup.py to pyproject.toml, it is easier to maintain there
Yes I was thinking in doing that and possibly use poetry to release
should version not be exported by sequence, e.g., in sequence.init? Right now the version number is sequence.version.version, but it is espected by a single version afaik
Probably it is better just to have the version number in the init.py
About Testing:
My idea was more or less as you said: there is a file (yml or toml or python script) where you add just the informations about your sequence (name, class name, is_finite, ground_truth) and then a script create all the tests using the SequenceTestSuite, i.e. for every sequence a sequenceTestSuite is triggered. But i didn't menage to make it happen. So you have to add your test class extending from SequenceTestSuite
.
This trigger make the release happens when the PR is closed (and mergerd) ?
No, it happens when you go in the "releases" manager and press the green button on a release, details here:
https://docs.github.com/en/repositories/releasing-projects-on-github/managing-releases-in-a-repository
It's a separate, graphical user interface based workflow, so it is near impossible to trigger by accident (e.g., merge/push/git console magic).
But i didn't menage to make it happen.
I wrote machinery for this idea precisely in scikit-base
, it's fully automated and applicable to any base class with a few lines, see example in #45 (tsbootstrap
). In scikit-base
, the magic happens in skbase.testing
boilerplate, it collects all classes from the package and runs them through a test class via pytest
fixtures.
We use this in sktime
and skpro
for regular testing, and 3rd parties running the check_estimator
utility for checking 3rd party module extensions use this under the hood, so it's quite near the end of the debugging curve (although "optimization" is still ongoing).