ComedyBot transcribes audio files and analyzes them for laughter. It uses GPT models for both tasks, prompts you for metadata through a GUI, and writes the results to an output directory you specify.
- macOS: Minimum version Sonoma 14.6.1
- Homebrew: Recommended for package management. Install Homebrew if you do not already have it.
- Python: Tested with Python 3.12
- Git and GitHub: Ensure Git is installed and you have an active GitHub account. The clone URL used below is an SSH URL, so an SSH key must be registered with that account.
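
The requirements above can be checked before installing. The following is a minimal sketch, not part of ComedyBot: the names `parse_version` and `check_requirements` are hypothetical, and the only facts it relies on are the macOS 14.6.1 minimum and the Python 3.12 test version listed above.

```python
import platform
import sys

# Values taken from the requirements list above.
MIN_MACOS = (14, 6, 1)
TESTED_PYTHON = (3, 12)


def parse_version(text):
    """Turn a dotted version string such as '14.6.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def check_requirements(macos_version, python_version):
    """Return a list of warnings; the list is empty if nothing looks wrong."""
    warnings = []
    if not macos_version:
        warnings.append("Not running on macOS, or the version could not be detected.")
    elif parse_version(macos_version) < MIN_MACOS:
        warnings.append(f"macOS {macos_version} is older than 14.6.1.")
    if tuple(python_version[:2]) != TESTED_PYTHON:
        warnings.append("ComedyBot was tested with Python 3.12; other versions are untested.")
    return warnings


if __name__ == "__main__":
    # platform.mac_ver()[0] is an empty string on non-macOS systems.
    for line in check_requirements(platform.mac_ver()[0], sys.version_info):
        print("Warning:", line)
```

Running the script prints nothing when both checks pass. A different Python version only produces a warning, since the README states what was tested rather than a hard minimum.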
- Open Terminal: Launch the Terminal application on your Mac.
- Clone the Project:
  git clone git@github.com:harryf/comedybot.git
- Navigate to the Project Directory:
  cd comedybot
- Create a Python Virtual Environment:
  - Use the built-in Python 3.x method:
    python3 -m venv venv
- Activate the Virtual Environment:
  - Run the following command:
    source venv/bin/activate
- Install the Required Packages:
  - Use pip to install the dependencies listed in requirements.txt:
    pip install -r requirements.txt
- Prepare Input and Output Folders:
  - Create an input and an output folder, for example, on your Desktop.
- Add Audio Files:
  - Copy your audio files (supported formats: .wav, .mp3, .m4a) into the input folder.
- Run the Transcription and Analysis:
  - Ensure you are in the comedybot directory and the virtual environment is active.
  - Execute the following command, replacing <yourname> with your actual username:
    python ./comedy_set_analysis/audio_transcription_agent/audio_transcript_process.py -i /Users/<yourname>/Desktop/input -o /Users/<yourname>/Desktop/output
  - This process will download the necessary GPT models, transcribe the audio, and analyze it for laughter. A GUI will prompt you to provide metadata, and the results will be saved in the output folder.
- View the Transcript:
  - Copy the file ./transcript_viewer/index.html into one of the output folders on your Desktop, for example:
    cp ./transcript_viewer/index.html /Users/<yourname>/Desktop/output/20241205_Comedy_Brew/index.html
  - Navigate to the output directory using the terminal:
    cd /Users/<yourname>/Desktop/output/20241205_Comedy_Brew/
  - Start a simple HTTP server:
    python3 -m http.server
  - Open your web browser and go to http://localhost:8000 to view the transcript.
- Note: The process may take a significant amount of time depending on the length of the audio file.
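
The folder preparation, audio check, run command, and viewer copy described in the steps above can also be scripted. This is a rough sketch under stated assumptions rather than anything shipped with ComedyBot: the helper names (`ensure_folders`, `find_audio_files`, `build_command`, `copy_viewer`) are hypothetical, and the script path and the `-i`/`-o` flags are taken verbatim from the command in the steps.

```python
import shutil
from pathlib import Path

# Formats and script path as listed in the steps above.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a"}
SCRIPT = "./comedy_set_analysis/audio_transcription_agent/audio_transcript_process.py"


def ensure_folders(base):
    """Create input/ and output/ under base if missing and return both paths."""
    input_dir = Path(base) / "input"
    output_dir = Path(base) / "output"
    input_dir.mkdir(parents=True, exist_ok=True)
    output_dir.mkdir(parents=True, exist_ok=True)
    return input_dir, output_dir


def find_audio_files(input_dir):
    """Return the supported audio files directly inside input_dir, sorted."""
    return sorted(
        p for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def build_command(input_dir, output_dir):
    """Build the argument list for the documented transcription command."""
    return ["python", SCRIPT, "-i", str(input_dir), "-o", str(output_dir)]


def copy_viewer(viewer_html, set_dir):
    """Copy the transcript viewer page into one output set folder."""
    return Path(shutil.copy(viewer_html, Path(set_dir) / "index.html"))
```

For example, `ensure_folders(Path.home() / "Desktop")` would create the two Desktop folders used in the steps, and the list returned by `build_command` could be passed to `subprocess.run` from the comedybot directory with the virtual environment active. Checking `find_audio_files` first avoids starting a long run against an empty input folder.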
- Ensure your system meets all the requirements before proceeding with the installation.
- The initial download of GPT models can be large, so ensure you have sufficient disk space and a stable internet connection.
- Whisper models are stored in the ~/.cache/whisper/ directory on your system. You may want to check this location if you need to manage disk space.
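
To see how much space the downloaded models occupy, the cache directory named above can be measured directly. The sketch below is illustrative only: `directory_size_bytes` and `format_size` are hypothetical helpers, and the only detail taken from this README is the `~/.cache/whisper/` location.

```python
from pathlib import Path

# Cache location as stated in the notes above.
WHISPER_CACHE = Path.home() / ".cache" / "whisper"


def directory_size_bytes(directory):
    """Sum the sizes of all regular files below directory (0 if it is missing)."""
    directory = Path(directory)
    if not directory.is_dir():
        return 0
    return sum(p.stat().st_size for p in directory.rglob("*") if p.is_file())


def format_size(num_bytes):
    """Format a byte count in MiB with one decimal place."""
    return f"{num_bytes / (1024 * 1024):.1f} MiB"


if __name__ == "__main__":
    print(f"{WHISPER_CACHE}: {format_size(directory_size_bytes(WHISPER_CACHE))}")
```

The script only reads file sizes and deletes nothing, so it is safe to run before deciding whether to remove cached models by hand.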
python -m spacy download en_core_web_md