Remarks + possible improvements/extensions from v1 PRs
Compiled feedback, remarks, and suggested improvements and extensions from the v1 PR reviews.
Misc
Remove `pandas` dep to make a pure-Python package
"worth considering using the stdlib CSV writer here to avoid the v heavyweight pandas dependency"
#13 (comment)
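As a rough sketch (the results shape, function name, and path below are illustrative, not kotsu's actual internals), the stdlib swap could look something like:

```python
import csv
from pathlib import Path


def store_results_csv(results, results_path):
    """Write a list of flat result dicts to CSV using only the stdlib."""
    Path(results_path).parent.mkdir(parents=True, exist_ok=True)
    with open(results_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(results[0].keys()))
        writer.writeheader()
        writer.writerows(results)


store_results_csv(
    [
        {"validation_id": "v1", "model_id": "m1", "runtime_secs": 1.23},
        {"validation_id": "v1", "model_id": "m2", "runtime_secs": 0.98},
    ],
    "./validation_results.csv",
)
```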
Interface
Always pass an extendable `run_context` object to `Validations`/`Models` during `run`
This will replace the specific `artefact_directory` functionality, and instead the `artefact_directory` will be made available (if the user sets it) through the context object.
Agreed to implement this once we come across the next piece of context beyond just `artefact_directory`. I think I would call it `run_context` to make it clear it is specific to the running and not specific to configuring `Validations` or `Models`.
See #17 (comment) and #17 (review)
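A minimal sketch of the idea, assuming a dict-based context and illustrative signatures rather than kotsu's actual API:

```python
from typing import Any, Callable, Dict, Optional


def run_validation(
    validation: Callable[..., Dict[str, Any]],
    model: Any,
    artefact_directory: Optional[str] = None,
) -> Dict[str, Any]:
    # Bundle all run-specific state into one extendable object, so the next
    # piece of context becomes a new key rather than another argument.
    run_context: Dict[str, Any] = {}
    if artefact_directory is not None:
        run_context["artefact_directory"] = artefact_directory
    return validation(model, run_context=run_context)


def dummy_validation(model: Any, run_context: Dict[str, Any]) -> Dict[str, Any]:
    # Validations read optional context with a default rather than requiring it.
    out_dir = run_context.get("artefact_directory", "./artefacts")
    return {"score": 1.0, "artefact_directory_used": out_dir}


print(run_validation(dummy_validation, model=object(), artefact_directory="./artefacts"))
```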
Make store functionality fully implementable by the user
Probably by passing a callback function into `run`, or by returning the results instead of storing them within `run`
#16 (comment)
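Both options could be sketched roughly as follows; `results_store` and the results shape are hypothetical names for illustration:

```python
import json
from typing import Any, Callable, Dict, List, Optional

Results = List[Dict[str, Any]]


def run(results_store: Optional[Callable[[Results], None]] = None) -> Results:
    # Sketch of both suggestions: accept a user-supplied store callback, and
    # also return the results so callers can store them however they like.
    results: Results = [{"validation_id": "v1", "model_id": "m1", "score": 0.9}]
    if results_store is not None:
        results_store(results)
    return results


# Usage: the user decides on storage, e.g. JSON to stdout instead of a CSV file.
run(results_store=lambda results: print(json.dumps(results, indent=2)))
```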
Move `validation` instantiation outside of the loop so each is only instantiated once per run
"Is there an argument for putting the validation_spec.make() outside of the loop over model specs? I would have thought that a single validation instance could be shared by all models. Also building the validation environment could be an expensive step so we'd want to avoid repeating unnecessarily"
#13 (comment)
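Roughly, the hoisting the comment suggests (with stand-in `Spec`/`make()` names rather than kotsu's exact registry API):

```python
from typing import Any, Callable, List


class Spec:
    """Stand-in for a registry spec; `make()` may be expensive to call."""

    def __init__(self, factory: Callable[[], Any]):
        self.factory = factory

    def make(self) -> Any:
        return self.factory()


def run(validation_specs: List[Spec], model_specs: List[Spec]) -> None:
    for validation_spec in validation_specs:
        # Hoisted out of the model loop: built once and shared by all models,
        # so any expensive validation-environment setup is not repeated.
        validation = validation_spec.make()
        for model_spec in model_specs:
            model = model_spec.make()
            validation(model)


run(
    validation_specs=[Spec(lambda: (lambda model: print("validated", model)))],
    model_specs=[Spec(lambda: "model_1"), Spec(lambda: "model_2")],
)
```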
Stronger typing and validation
Docs
In example usage, remove validations as factories in favour of just using bare functions (see the contrast sketched below)
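For illustration (names like `accuracy_validation` and `model.score()` are made up here, not from kotsu's docs), the contrast this change is after:

```python
# Factory style currently shown in the example usage: an extra layer that
# exists only to return the validation callable.
def make_accuracy_validation():
    def validation(model):
        return {"accuracy": model.score()}
    return validation


# Bare-function style the docs would switch to: the validation is just a
# function, with no factory indirection.
def accuracy_validation(model):
    return {"accuracy": model.score()}
```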
Example project structure
I think this is a worthy addition. We could have a few different "example projects" within the docs. Let's do this once we've got some real world example usage.
#18 (review)
More detailed docs on model/validation registration
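Such docs might walk through something like the following; the registry names and `run` signature here follow the gym-style pattern kotsu uses but are written from memory as assumptions, so the real API should be checked:

```python
import kotsu


def my_model_factory(regularisation: float = 0.1):
    """Stand-in entry point; real docs would register an actual model class."""
    return {"regularisation": regularisation}


def holdout_validation(model):
    """Stand-in validation returning a results dict."""
    return {"score": 0.5}


# ASSUMED API: the registry classes, `register`, and `kotsu.run.run` below are
# illustrative and may not match kotsu's exact signatures.
model_registry = kotsu.registration.ModelRegistry()
model_registry.register(
    id="my_model-v1",              # unique, versioned ID
    entry_point=my_model_factory,  # callable used to instantiate the model
    kwargs={"regularisation": 0.1},
)

validation_registry = kotsu.registration.ValidationRegistry()
validation_registry.register(
    id="holdout_validation-v1",
    entry_point=holdout_validation,
)

kotsu.run.run(model_registry, validation_registry, "./validation_results.csv")
```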
Another thought - it might be useful to think about how to 'package' a benchmark built on kotsu. i.e. if you wanted to develop some ML/data science benchmark and make the code public, so that others can run / extend it, and want to use kotsu to build it, is there some standardised project structure that would work for this use case?
@alex-hh yea that would be super useful. Will mull it over. (Was thinking about it already, thought about packaging up a benchmark as a pypi package, but without having all the deps pinned exactly, some (more) anxiety creeps in that the benchmarks wouldn't be reproducible. Will mull some more. Super open to ideas on it so do share if stuff appears to ya)