Inspired by Andrej Karpathy's transcripts of the Lex Fridman podcast, I made transcripts for one of my favorite shows, The CSS Podcast.
Here is the summary of the process:
- Parse the RSS of the podcast to get the episode audio URLs, episode titles, and episode slugs.
- Use OpenAI Whisper's large model to transcribe each episode's audio (just give the model the audio URL and it will handle the rest).
- Parse the model's output and print it in a human-readable format.
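
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. The helper names (`slugify`, `parse_episodes`, `transcribe`, `format_timestamp`, `format_transcript`) are hypothetical, and it assumes that the audio URL sits in each item's `<enclosure>` tag, that slugs are derived from titles, and that the local ffmpeg build used by Whisper can read remote URLs.

```python
import re
import xml.etree.ElementTree as ET


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens.

    Assumption: slugs are derived from titles; the real feed may carry its own.
    """
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def parse_episodes(rss_xml: str) -> list:
    """Extract title, slug, and audio URL from each <item> of an RSS feed."""
    root = ET.fromstring(rss_xml)
    episodes = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        enclosure = item.find("enclosure")
        if not title or enclosure is None:
            continue  # skip items without a title or an audio attachment
        episodes.append(
            {
                "title": title,
                "slug": slugify(title),
                "audio_url": enclosure.get("url"),
            }
        )
    return episodes


def transcribe(audio_url: str) -> list:
    """Transcribe one episode with Whisper's large model.

    Requires the openai-whisper package and ffmpeg. Imported lazily so the
    parsing and formatting helpers work without it.
    """
    import whisper

    model = whisper.load_model("large")
    result = model.transcribe(audio_url)
    return result["segments"]  # each segment has "start", "end", "text"


def format_timestamp(seconds: float) -> str:
    """Render a second offset as HH:MM:SS, dropping fractional seconds."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_transcript(segments) -> str:
    """Turn Whisper segments into one timestamped line per segment."""
    return "\n".join(
        f"[{format_timestamp(seg['start'])}] {seg['text'].strip()}"
        for seg in segments
    )
```

With these helpers, a full run would fetch the feed, call `parse_episodes` on its XML, then pass each episode's `audio_url` through `transcribe` and `format_transcript`. If ffmpeg cannot open the URL directly, downloading the file first and passing the local path should work the same way.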
To find out more, visit this Jupyter Notebook.
Created by masonjnguyen.com. Released under the MIT license.
This project uses the document template by Nextra.