# Quick tutorial on deep learning for Introduction to Data Science
Two notebooks showing simple deep learning models:

- Regression
- Image classification
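As a taste of what the regression notebook is about, here is a minimal sketch of the core idea (my own illustration, not code from the notebooks, which may use a deep learning framework instead): a single "neuron" with one weight and one bias is just linear regression, and gradient descent is the training loop.

```python
# A single neuron computing y = w*x + b is linear regression;
# "training" means nudging w and b down the gradient of the loss.
# (Hypothetical illustration; the tutorial notebooks may differ.)

def train_linear(xs, ys, lr=0.01, epochs=2000):
    """Fit y = w*x + b by minimizing mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Data generated from y = 2x + 1; training should recover w ≈ 2, b ≈ 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = train_linear(xs, ys)
print(round(w, 2), round(b, 2))
```

The same loop, with more weights and nonlinearities stacked in between, is what the deep learning frameworks in the notebooks automate.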
Note: Deep learning environments are large. If you want to speed up your conda commands, you can use [mamba]; see Installation steps 1-4 of this repo.
## Installation

In order to set up the necessary environment:
1. Download the repository by clicking the big green [Code] button on the GitHub page and downloading the ZIP archive.

   Note: Repositories act as distributed backups and make collaboration on programming projects more efficient, so learning to use git and GitHub together is really useful in research.

2. Unzip the archive to an easy-to-locate place.

3. Open the Anaconda Prompt.

4. Print your current directory to see where the prompt starts:

   ```
   pwd
   ```

5. Change the directory to your easy-to-locate place:

   ```
   cd easy-to-locate-place
   ```

6. Create a local environment stored in `.env` with the help of conda:

   ```
   conda config --set channel_priority strict
   conda env create -f environment.lock.yml -p ./.env
   ```

   Note: the following command restores `channel_priority` to `flexible`; it assumes you had left `channel_priority` at its default before this step:

   ```
   conda config --set channel_priority flexible
   ```

7. Activate the new environment with:

   ```
   conda activate ./.env
   ```

8. Run Jupyter Lab:

   ```
   jupyter lab
   ```

Then take a look in the `notebooks` folder.
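Once Jupyter Lab is up, a quick way to confirm that the notebook kernel is really running from the local `./.env` environment is to inspect the interpreter path (a generic check, not part of the tutorial notebooks):

```python
# Run this in a notebook cell: the interpreter path should point
# into the local ./.env directory when that environment is active.
import sys
from pathlib import Path

print(sys.executable)         # path of the Python interpreter in use
print(Path(sys.prefix).name)  # ".env" when the local environment is active
```

If the path points somewhere else (e.g. the base conda installation), re-activate the environment and restart Jupyter Lab from the same prompt.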
Note: I left this section here for students to see how to properly use conda, but it is not necessary for the tutorial.

## Dependency Management & Reproducibility

1. Always keep your abstract (unpinned) dependencies updated in `environment.yml` and eventually in `setup.cfg` if you want to ship and install your package via `pip` later on.

2. Create concrete dependencies as `environment.lock.yml` for the exact reproduction of your environment with:

   ```
   conda env export -p ./.env -f environment.lock.yml
   ```

   For multi-OS development, consider using `--no-builds` during the export.

3. Update your current environment with respect to a new `environment.lock.yml` using:

   ```
   conda env update -p ./.env -f environment.lock.yml --prune
   ```
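The difference between the abstract file and the pinned lock file can be made concrete with a small helper (a hypothetical illustration, not part of the project; `is_pinned` is my own name):

```python
# Abstract specs ("numpy") say *what* you need; pinned specs
# ("numpy=1.26.4") say *exactly which version* reproduced the
# environment. A hypothetical helper to tell the two apart:

def is_pinned(spec: str) -> bool:
    """True if a conda dependency spec fixes an exact version."""
    return "=" in spec

abstract = ["python", "jupyterlab", "numpy"]  # environment.yml style
pinned = ["python=3.11.5", "numpy=1.26.4"]    # environment.lock.yml style

print(all(not is_pinned(s) for s in abstract))  # True
print(all(is_pinned(s) for s in pinned))        # True
```

Keeping the abstract file hand-edited and regenerating the lock file from the working environment gives you both flexibility and exact reproducibility.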
## Project Organization

```
├── LICENSE.txt             <- License as chosen on the command-line.
├── README.md               <- The top-level README for developers.
├── configs                 <- Directory for configurations of model & application.
├── data
│   ├── external            <- Data from third party sources.
│   ├── interim             <- Intermediate data that has been transformed.
│   ├── processed           <- The final, canonical data sets for modeling.
│   └── raw                 <- The original, immutable data dump.
├── environment.yml         <- The conda environment file for reproducibility.
├── environment.lock.yml    <- The _pinned_ conda environment file for emergencies.
├── models                  <- Trained and serialized models, model predictions,
│                              or model summaries.
├── notebooks               <- Jupyter notebooks. Naming convention is a number (for
│                              ordering), the creator's initials and a description,
│                              e.g. `1.0-fw-initial-data-exploration`.
├── pyproject.toml          <- Build configuration. Don't change! Use `pip install -e .`
│                              to install for development.
├── references              <- Data dictionaries, manuals, and all other materials.
├── reports                 <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures             <- Generated plots and figures for reports.
├── setup.cfg               <- Declarative configuration of your project.
└── src
    └── deep_learn_tutorial <- Actual Python package where the main functionality goes.
```
This project has been set up using PyScaffold 4.5 and the dsproject extension 0.7.2.