/repnots

Primary language: Python. License: MIT.

Analysis Framework

Design

The first idea was to have only a series of notebooks that provide an exhaustive description of an analysis.

Problem: Reproducibility / clarity interface.

  • How to integrate containers in jupyter?

    > Should we have a specific homemade container per notebook?

    >> This undermines part of the appeal of containers being “atomic”.

    > Only use conda environments (i.e. one conda environment per notebook).

    >>> Maybe the structure could be one singularity container holding all the conda environments of the analysis.

  • How to deal with the resource-demanding parts of the analysis: in a notebook? As snakemake rules?
  • Structure: maybe each analysis notebook could be separated into:
    • prior: all requirements needed for the analysis
    • main: the actual analysis
    • the most resource-demanding computations > in snakemake only? > easier when migrating to a cluster.
  • Testing:

    I see 2 ways of testing, plus a possible third:

    • Using classical pytest against the output files created by a particular notebook
    • Using papermill and scrapbook from nteract (this looks more cumbersome, i.e. code has to be added to the notebook cells whose output we want to keep)
    • Another option would be to run the tests on the jupytext files (not the notebooks)?
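The first testing option could look roughly like the sketch below: a small checker plus a pytest test that validates a CSV file assumed to be written by a notebook. The file name `summary.csv`, the column names, and the helper `check_output_table` are hypothetical and only serve the illustration; in this sketch the file is written by hand instead of by a notebook so that it runs on its own.

```python
import csv
from pathlib import Path


def read_table(path):
    """Load a CSV file written by a notebook as a list of row dicts."""
    with Path(path).open(newline="") as handle:
        return list(csv.DictReader(handle))


def check_output_table(path, expected_columns, min_rows=1):
    """Return a list of problems found in the output file (empty if fine)."""
    path = Path(path)
    if not path.exists():
        return [f"missing output file: {path}"]
    rows = read_table(path)
    problems = []
    if len(rows) < min_rows:
        problems.append(f"expected at least {min_rows} rows, got {len(rows)}")
    # Columns can only be checked when at least one data row was read.
    header = set(rows[0].keys()) if rows else set()
    missing = sorted(set(expected_columns) - header)
    if rows and missing:
        problems.append(f"missing columns: {missing}")
    return problems


def test_summary_table(tmp_path):
    # Assumption: in a real run this file is produced by the notebook;
    # here it is written by hand so the sketch is self-contained.
    out = tmp_path / "summary.csv"
    out.write_text("sample,count\nA,3\nB,5\n")
    assert check_output_table(out, ["sample", "count"]) == []
```

Keeping the checks in a plain function means the same code can be called from pytest or from a final workflow step, without touching the notebook cells.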

Dev

Premises [2/2] :0.1:

  • [X] Set up a snakemake file launching jupyter notebooks with papermill using a configuration file
  • [X] Define a primary organisation for the jupyter notebooks:
    • 1 for summary
    • 1 for data downloads
    • 1 for general report/figures/tables
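As a rough illustration of the first item, the sketch below builds the papermill command line for one notebook from a configuration dictionary. The configuration keys, paths, and parameter names are assumptions made for the example, not the project's actual layout; the only papermill feature relied on is the CLI form `papermill INPUT OUTPUT -p NAME VALUE`.

```python
import subprocess


def papermill_command(notebook, output, parameters):
    """Build the papermill CLI call for one notebook."""
    cmd = ["papermill", notebook, output]
    for name, value in parameters.items():
        # -p passes one parameter to the cell tagged "parameters".
        cmd += ["-p", name, str(value)]
    return cmd


def run_notebook(notebook, output, parameters):
    """Execute the notebook; raises CalledProcessError on failure."""
    subprocess.run(papermill_command(notebook, output, parameters), check=True)


# Hypothetical configuration, e.g. loaded from the workflow's config file.
config = {
    "notebook": "notebooks/summary.ipynb",
    "output": "results/summary.ipynb",
    "parameters": {"dataset": "demo", "threshold": 0.05},
}

command = papermill_command(
    config["notebook"], config["output"], config["parameters"]
)
```

The same command list could be joined into the shell section of a snakemake rule, with the notebook as input and the executed copy as output.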

Add html export :0.2:

Add singularity support [2/2] :0.3:

  • [X] Add container functionality (singularity) in a notebook
  • [X] Add container functionality in a snakemake rule

    > Still fighting with the install of the new singularity version (the install process looks ugly and depends on .bashrc!)

    > Fixed with AUR, check ~/org/my_manual.org
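One way container support from a notebook cell might look is a thin wrapper around `singularity exec`, sketched below. The image path `envs/analysis.sif` and the helper names are hypothetical, and the project may well do this differently.

```python
import subprocess


def singularity_exec_command(image, command):
    """Build a `singularity exec` call running `command` inside `image`."""
    return ["singularity", "exec", image, *command]


def run_in_container(image, command):
    """Run the command in the container and return its standard output."""
    result = subprocess.run(
        singularity_exec_command(image, command),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Hypothetical image and command, for illustration only.
example = singularity_exec_command("envs/analysis.sif", ["python", "--version"])
```

Calling `run_in_container` requires singularity to be installed and the image to exist; building the command separately keeps that part testable without either.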

Use CookieCutter and add first documentation :0.3:

Using CookieCutter and Sphinx

Add preliminary tests :0.3.1:

Implement jupytext light script as templates :0.3.2:

Centralize papermill rule in snakemake file :0.3.3:

  • Actually, the rules added using jinja templates could be replaced by one “papermill rule” which would run papermill on each notebook. Check how to refactor that.

    > Now implemented in advanced_snakefile in templates

    > But separate per-notebook rules may be better in case we want to trace the inputs/outputs of each notebook through snakemake later on.

    >> So keep both options, in the form of a basic_ and an advanced_snakemake file.
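The idea of a single generic “papermill rule” can be sketched in plain Python as a mapping from every notebook to its executed copy, which is roughly what a wildcard rule would resolve. The directory names and the helper `plan_notebook_runs` are assumptions for the example, not the contents of the template files.

```python
from pathlib import Path


def plan_notebook_runs(notebook_dir, output_dir):
    """Map every notebook in notebook_dir to its executed copy in output_dir."""
    notebook_dir, output_dir = Path(notebook_dir), Path(output_dir)
    plan = {}
    # Sorted so the execution order is stable from run to run.
    for notebook in sorted(notebook_dir.glob("*.ipynb")):
        plan[notebook] = output_dir / notebook.name
    return plan
```

Each (input, output) pair in the returned dictionary corresponds to one papermill call. The trade-off noted above still applies: a generic mapping like this knows nothing about the data files each notebook reads or writes.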

NEXT

  • Create an index/webpage linking all html exports (a jupyter notebook? a django server?)
  • Integrate a voila server?

Metadata

Data

Bibliography