drivendataorg/nbautoexport

Configure "exclude" for the clean command

jayqi opened this issue · 0 comments

Sometimes a user may have other files in their notebooks directory that are intentional. There isn't a good way for clean to anticipate arbitrary files, so we'll need a way for users to specify files to exclude/ignore.


Potential implementation

Here is a potential interface that lets users specify globs:

Passing excludes ad hoc to the clean command:

nbautoexport clean notebooks/ --exclude images/* --exclude README.md

Configuring excludes so that they are reusable:

nbautoexport configure notebooks/ -f script -b extension --clean-exclude images/* --clean-exclude README.md

This would produce a configuration like:

{
  "export_formats": ["script"],
  "organize_by": "extension",
  "clean": {
    "exclude": [
      "images/*",
      "README.md"
    ]
  }
}

Then for a file tree that looks like:

notebooks
├── 0.1-ejm-data-exploration.ipynb
├── script
│   ├── 0.1-ejm-data-exploration.py
│   └── 0.2-ejm-features-creation.py
├── html
│   └── 0.1-ejm-data-exploration.html
├── README.md
└── images
    └── plot.jpg

notebooks/script/0.2-ejm-features-creation.py and notebooks/html/0.1-ejm-data-exploration.html will be marked for deletion.

notebooks/README.md and notebooks/images/plot.jpg will be safe.
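
To make the intended behavior concrete, here is a rough sketch of how the exclude globs could be applied. It is not the existing nbautoexport code: `files_to_clean`, `CONFIG_JSON`, and the hard-coded file lists are hypothetical names for illustration, and it assumes globs are matched against paths relative to the notebooks directory using `fnmatch`-style rules (where `*` also matches across `/`).

```python
import json
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical config contents, mirroring the JSON proposed above.
CONFIG_JSON = """
{
  "export_formats": ["script"],
  "organize_by": "extension",
  "clean": {"exclude": ["images/*", "README.md"]}
}
"""


def files_to_clean(all_files, expected_files, exclude_globs):
    """Return files that are neither expected exports nor excluded by a glob.

    All paths are relative to the notebooks directory and use "/" separators.
    """
    expected = {PurePosixPath(p) for p in expected_files}
    to_clean = []
    for path in map(PurePosixPath, all_files):
        if path in expected:
            continue  # notebook or an up-to-date export: keep
        if any(fnmatchcase(path.as_posix(), glob) for glob in exclude_globs):
            continue  # user asked clean to ignore this file
        to_clean.append(path.as_posix())
    return sorted(to_clean)


config = json.loads(CONFIG_JSON)
# Default to no excludes when the "clean" section is absent.
exclude_globs = config.get("clean", {}).get("exclude", [])

# The example file tree from above, relative to notebooks/.
all_files = [
    "0.1-ejm-data-exploration.ipynb",
    "script/0.1-ejm-data-exploration.py",
    "script/0.2-ejm-features-creation.py",
    "html/0.1-ejm-data-exploration.html",
    "README.md",
    "images/plot.jpg",
]
# With export_formats == ["script"], only the notebook and its script are expected.
expected_files = [
    "0.1-ejm-data-exploration.ipynb",
    "script/0.1-ejm-data-exploration.py",
]

marked = files_to_clean(all_files, expected_files, exclude_globs)
print(marked)
```

With the example tree, only the stale script and the html export end up in `marked`. Without any excludes, README.md and images/plot.jpg would be marked as well, which is the problem this issue is about.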