NOTE: All commands listed in this README assume your working directory is the project root.
All files for the benchmarks are contained in the benchmarks directory.
To get everything working, you only need to make edits in whisper_benchmarks.py.
All lines of interest in this file are marked with a #TODO comment.
You can run the tests located in tests/test_benchmarks.py with the following command:
pytest
Once all the tests are green, you should be good to go. Now you can run the benchmarks by invoking:
python3 -m benchmarks
This should take 2-3 minutes.
Once the benchmarks are done, you can visualize the results.
Open the notebook benchmark_analysis.ipynb and execute all cells.
You should now see some plots with your results.
Make sure to check out the solutions/benchmark branch before you continue:
git stash
git checkout solutions/benchmark
The KISZ CONDENSOR is a web-based tool that summarizes audio files into text.
It uses OpenAI Whisper to transcribe the audio and the Hugging Face summarization pipeline with LongT5 to create the summary.
The frontend is implemented in HTML5 with Bootstrap CSS and jQuery.
The backend uses FastAPI.
Everything is orchestrated with Docker Compose.
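For orientation, here is a minimal sketch of that transcribe-then-summarize flow. The model sizes and the LongT5 checkpoint are assumptions for illustration, not necessarily what the CONDENSOR ships with:

```python
# Sketch only: model names below are assumptions, not the app's actual choices.
import whisper
from transformers import pipeline

# Transcribe the audio with OpenAI Whisper.
asr_model = whisper.load_model("base")
transcript = asr_model.transcribe("benchmarks/examples/10-min-talk.mp3")["text"]

# Summarize the transcript with a LongT5 checkpoint via the Hugging Face pipeline.
summarizer = pipeline(
    "summarization",
    model="pszemraj/long-t5-tglobal-base-16384-book-summary",  # assumed checkpoint
)
summary = summarizer(transcript)[0]["summary_text"]
print(summary)
```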
All files for this part of the exercise can be found in the condensor directory.
Open app.py.
In this file you will find the summarize function, which handles the POST requests made by the frontend.
As in the previous exercise, your task is to fill in the missing TODOs.
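To give a rough idea of the shape of such a handler, here is a hypothetical sketch of a FastAPI upload endpoint. The route path, parameter name, and placeholder response are assumptions, not the actual app code:

```python
# Hypothetical sketch; route and parameter names are assumptions.
import tempfile

from fastapi import FastAPI, UploadFile

app = FastAPI()

@app.post("/summarize")
async def summarize(file: UploadFile):
    # Persist the upload to a temporary file so Whisper can read it from disk.
    with tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as tmp:
        tmp.write(await file.read())
    # Placeholder response; a real handler would run the transcribe-and-summarize
    # flow sketched earlier on tmp.name and return the resulting summary.
    return {"summary": f"received {file.filename}"}
```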
You can run the tests located in tests/test_app.py with the following command:
pytest
Once all the tests are green, you should be good to go. Now you can run the app by invoking:
uvicorn condensor.app:app
This should start up a web server on port 8000. (If you are working on one of the VMs, VS Code should automatically forward this port to your local machine.)
Now you can open your browser and navigate to: http://localhost:8000
You can test the app by using the example file provided in benchmarks/examples/10-min-talk.mp3.
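Alternatively, you can exercise the endpoint from the command line. The route below is the hypothetical /summarize path from the sketch above; substitute the actual route used by the app:
curl -F "file=@benchmarks/examples/10-min-talk.mp3" http://localhost:8000/summarize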
We have preinstalled Docker, Docker Compose and the NVIDIA Container Toolkit on the laptops and VMs. So for now we will just focus on the configuration and usage of Docker and Docker Compose.
We have provided a complete Dockerfile for you, but the docker-compose.yaml is missing a few important lines.
Use the following command to build or pull all necessary containers and start them:
docker compose up --always-recreate-deps --build
If you exposed the correct port, you should now be able to access the app on port 8000, as before: http://localhost:8000
Make sure to shut down your local development server first to free the port for use with Docker.
If you did not configure the GPU, everything will still work, just much more slowly. To check that the GPU is in fact used in your container, you can run the following command:
nvidia-smi
or
watch -n 1 nvidia-smi
to continuously monitor the GPU.
Alternatively, you can use
nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv --loop=1
which will just print GPU utilization and memory usage.
Now, upload a file and keep an eye on Memory-Usage and GPU-Util. These will tell you whether something (your app) is using the GPU. If nothing happens, double-check the configuration in your docker compose file.
As you should have seen in Step 3, Docker does not automatically pass a GPU to your container. You need to tell Docker to use the NVIDIA runtime and which GPUs you want to pass to the container.
Go to https://docs.docker.com/compose/gpu-support/ to learn how to pass a GPU to a docker container in docker compose.
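For reference, a device reservation in Compose looks roughly like the sketch below. This is a hedged example based on the documentation linked above; adapt the count (or device_ids) to your setup:

```yaml
services:
  app:
    # ... existing build/ports configuration ...
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1            # or list specific GPUs via device_ids
              capabilities: [gpu]
```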
Repeat steps 2 and 3 to restart the container and check that it now actually uses the GPU.
docker compose up --always-recreate-deps --build
Before you continue, check out the solutions/docker branch:
git stash
git checkout solutions/docker
Alternatively, make sure you have specified a port range in your docker-compose file, such as:
ports:
  - "8000-8010:8000"
You can start multiple containers with
docker compose up --scale app=3
and access them at ports 8000, 8001, and so on.