mitre/menelaus

Update examples to use "source" jupyter notebooks


  • One jupyter notebook per module (data_drift, concept_drift)
    • A Jupyter notebook gives us formatted text and in-line display of tables, figures, etc. We should be able to include the rendered HTML files via Sphinx.
      • Confirm that including an nbconverted html file is easy to do via sphinx.
        • Seems to be possible to include notebooks directly via nbsphinx, which will execute the notebooks upon creating the docs.
        • nbconvert can also convert a notebook into another notebook, which would let us filter out cells with certain tags and automatically generate a new notebook. This might end up needlessly elaborate.
      • Figure out which nb_extensions we need and how to include them in setup.cfg.
    • Each Jupyter notebook uses the "tag" feature to mark the cells that belong to each example (e.g. "all_examples" and "ADWIN" tags).
    • use nbconvert with the tag option to convert the single jupyter notebook
      • Mock up a script that can be added to the pipeline to do this conversion. May be able to base this on an example. Remember to mark this with @pytest.mark.no_cover so that it doesn't inflate the coverage statistics.
    • Convert the .py scripts to Jupyter notebooks, using the tags.
  • Update the README and setup.cfg, with instructions on how to use each install option:
    • The default install includes the visualization dependencies necessary to run each example.py script.
    • The barebones install is the bare minimum of dependencies, with no visuals.
    • The dev install includes the above, plus Sphinx and everything else.
    • For the non-dev configs, look into not downloading e.g. docs, tests, etc., as they're a waste of bandwidth.

This lets us maintain the examples in Jupyter notebooks with rendered output, so that someone can read the documentation and see what each example does, while still allowing users to run the examples as plain scripts without having to install Jupyter.
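As a rough illustration of the tag idea above, the sketch below pulls the code cells carrying a given tag out of a notebook's JSON and joins them into a script. It uses only the standard library and is a stand-in for the selection step, not nbconvert itself. The helper names (`cells_with_tag`, `notebook_to_script`) and the demo notebook are hypothetical. It assumes the nbformat v4 layout, where tags live under each cell's `metadata.tags`.

```python
from typing import List


def cells_with_tag(notebook: dict, tag: str) -> List[str]:
    """Return the source of every code cell that carries `tag`."""
    selected = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        tags = cell.get("metadata", {}).get("tags", [])
        if tag not in tags:
            continue
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines
        if isinstance(source, list):
            source = "".join(source)
        selected.append(source)
    return selected


def notebook_to_script(notebook: dict, tag: str) -> str:
    """Join the selected cells into a single script body."""
    return "\n\n".join(cells_with_tag(notebook, tag)) + "\n"


# Hypothetical notebook content, for illustration only.
demo_notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "metadata": {"tags": ["all_examples", "ADWIN"]},
            "source": ["import numpy as np\n", "x = np.arange(3)"],
        },
        {"cell_type": "markdown", "metadata": {}, "source": "Some notes"},
        {
            "cell_type": "code",
            "metadata": {"tags": ["all_examples"]},
            "source": "print('other example')",
        },
    ],
}

if __name__ == "__main__":
    print(notebook_to_script(demo_notebook, "ADWIN"))
```

In practice the notebook dict would come from `json.load` on the `.ipynb` file. Note that nbconvert's built-in tag handling removes tagged cells rather than keeping them, so keeping only one example's cells needs either this kind of inverse selection or a tag on everything to be dropped.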

It looks like this isn't so easy to do within the CI pipeline proper, since the runner isn't, and shouldn't be, in a position to write to the repo without a good bit of magic. We also probably don't want to burn through our free GitHub Actions minutes on this.

For now, writing up the current scripts as notebooks, then using nbconvert to generate the .py files and committing both, should be sufficient. If they diverge, we can overwrite as we go if needed.

Maybe we can add a README.md to the examples and/or docs folder that contains the example nbconvert commands.
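One possible shape for those commands is sketched below as a small Python helper that assembles the `jupyter nbconvert` invocation without running it. The helper name `nbconvert_command` and the notebook path are hypothetical. The `--to script`, `--output-dir`, and `TagRemovePreprocessor` options are taken from nbconvert's documented command line, but the syntax for passing several tags differs between versions, so this sketch handles a single tag only.

```python
import shlex
from typing import List, Optional


def nbconvert_command(
    notebook: str,
    output_dir: str = ".",
    remove_tag: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) a `jupyter nbconvert --to script` command."""
    # The notebook path goes first so it cannot be mistaken for a tag value.
    cmd = ["jupyter", "nbconvert", notebook, "--to", "script"]
    if remove_tag is not None:
        # Drops every cell carrying `remove_tag` from the converted output.
        cmd += [
            "--TagRemovePreprocessor.enabled=True",
            "--TagRemovePreprocessor.remove_cell_tags",
            remove_tag,
        ]
    cmd += ["--output-dir", output_dir]
    return cmd


if __name__ == "__main__":
    # Hypothetical notebook path, for illustration only.
    cmd = nbconvert_command("examples/concept_drift_examples.ipynb", "examples")
    print(shlex.join(cmd))
    # To actually run the conversion: subprocess.run(cmd, check=True)
```

Printing the command instead of executing it keeps the helper usable for generating the README text as well as for a local conversion script.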