# IDP Radio

## Getting Started
- Clone this repository
- Use the `download_dataset.sh` script to download the datasets you require (see the usage examples after this list). By default the script downloads the "Chexpert Dev Dataset" and the "Chextxray14-NIH_256" dataset. Additional information about the datasets can be found in the wiki under Data Set & Preprocessing Methods.
  - Use `-f` to overwrite an existing dataset
  - Use `-s` to download the server versions ("Chexpert Full Dataset" and "Chextxray14-NIH_256")
  - Use `-a` to download all datasets and all their versions ("Chexpert Dev Dataset", "Chexpert Full Dataset", "Chextxray14-NIH_256" and "Chextxray14-NIH_512")
- Set up your environment variables in the `.env` file (a sample is shown after this list):
  - An example `.env` file is provided as `.env.example` in the repository's root folder.
  - The easiest and most reliable way to set this up is to copy the example file using `cp .env.example .env` and edit only the `GOOGLE_APPLICATION_CREDENTIALS` variable.
  - Required variables are:
    - `GOOGLE_APPLICATION_CREDENTIALS`: The path to the service account key JSON that gives access to the "idp-models" bucket on Google Cloud Storage. It is required for uploading/downloading trained models that should be added to the pipeline, as well as for executing the test pipeline. Additional information on the process for experiment logging can be found in the wiki under Experiment Logging. To get the service account key, either contact @Kristian.Schwienbacher or, if you have access to "idp-server-1", go to the repository's root folder and navigate into `gcp_auth`, where you will find the service account key.
    - `CHEXPERT_DATASET_DIRECTORY`: The path to the CheXpert dataset that should be used in notebooks and scripts.
    - `CHESTXRAY14_DATASET_DIRECTORY`: The path to the ChestX-ray14 dataset that should be used in notebooks and scripts.
  - All paths can be absolute or relative to the repository's root folder.
  - Important: Make sure that the paths you provide for the datasets as environment variables match the directories they are downloaded to. This should already be the case for the `.env.example` file.
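Typical invocations of the download script, assuming it is run from the repository root and is executable (otherwise prefix the calls with `bash`):

```bash
# Download the default datasets ("Chexpert Dev Dataset" and "Chextxray14-NIH_256")
./download_dataset.sh

# Overwrite an existing dataset
./download_dataset.sh -f

# Download the server versions ("Chexpert Full Dataset" and "Chextxray14-NIH_256")
./download_dataset.sh -s
```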
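A minimal sketch of a resulting `.env`; the variable names come from this README, while the concrete paths and the key file name are placeholders that you need to adapt to your setup (the dataset paths must match the directories the datasets were downloaded to):

```
GOOGLE_APPLICATION_CREDENTIALS=gcp_auth/service-account-key.json
CHEXPERT_DATASET_DIRECTORY=data/chexpert-dev
CHESTXRAY14_DATASET_DIRECTORY=data/chestxray14-nih-256
```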
## Deploy using Docker
- Clone this repository
- Install Docker
- Build the Docker image: `docker build --tag idp-radio .` This will build a Docker image based on the TensorFlow image, install all dependencies and download the dataset into the container.
- Start the Docker container and run commands in it (a quick sanity check is sketched after this list): `docker run -d -v $PWD:/srv/idp-radio-1 --name radio --gpus all idp-radio`
  - `-d` runs the container in detached mode
  - `-v` mounts the source folder into the Docker container
  - `--name radio` sets the name of the container to `radio`
  - `--gpus all` enables all GPUs
  - The working directory inside the container is `/srv/idp-radio-1`
- To access Jupyter Lab/Notebook or TensorBoard you need the public URL of the ngrok tunnel. The easiest way to get it is to check the logs of the Docker container using `docker logs radio` (a one-liner for filtering the logs follows after this list). There you will find an entry from which you can copy & paste the public URL for the service you'd like to access. The entry is generated every 5 minutes and looks like this:

  ```
  Retrieving open ngrok tunnels...
  URLs open for the following services: [('jupyterlab', 'http://abcde12345.ngrok.io'), ('tensorboard', 'http://abcde12345.ngrok.io'), ('jupyternotebook', 'http://abcde12345.ngrok.io')]
  ```
- To start an interactive bash shell to run commands in the container, use `docker exec -it radio /bin/bash`. To run with root access inside the container, use `docker exec --user root -it radio /bin/bash`.
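As a quick sanity check that the container is up and that the working directory is set as described, something like the following should work (assuming the container name `radio` from the run command above):

```bash
# Show the running container
docker ps --filter "name=radio"

# Print the working directory inside the container; it should be /srv/idp-radio-1
docker exec radio pwd
```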
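Instead of scrolling through the full log, a rough one-liner to pull out the most recent tunnel entry; this assumes the log lines look like the example above:

```bash
# Print the latest list of open ngrok tunnel URLs from the container logs
docker logs radio 2>&1 | grep "URLs open for the following services" | tail -n 1
```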
## Run the demo application

### Locally
- Clone this repository
- Install the dependencies: `pip install -r requirements.txt`
- Run the application (see the sketch after this list): `PYTHONPATH=/path/to/repo streamlit run app/main.py`
- The application will run on port `8501` of your local machine
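Put together, a minimal sketch of the local setup, assuming the commands are run from the repository root (so `$(pwd)` can stand in for `/path/to/repo`):

```bash
# Install the Python dependencies
pip install -r requirements.txt

# Run the Streamlit demo with the repository root on the Python path
PYTHONPATH=$(pwd) streamlit run app/main.py
```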
### Using Docker
- Clone this repository
- Install Docker
- Build the Docker image: `docker build . -f Dockerfile-streamlit --tag idp1-demo`
- Start the Docker container: `docker run -v $PWD:/srv/idp-radio-1 --name radio idp1-demo`
- The application will run on port `8501` of your local machine