ComedyBot

ComedyBot transcribes audio files and analyzes them for laughter. It relies on pretrained models, including Whisper, to perform these tasks, prompts for metadata through a GUI, and writes its results to an output directory you specify.

Requirements

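  • macOS: Sonoma 14.6.1 or later
  • Homebrew: Recommended for package management; install it first if it is not already present
  • Python: Tested with Python 3.12
  • Git and GitHub: Ensure Git is installed and you have an active GitHub account. The clone command below uses SSH, so an SSH key must be registered with that account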

Installation

  1. Open Terminal: Launch the terminal application on your Mac.

  2. Clone the Project:

    git clone git@github.com:harryf/comedybot.git
  3. Navigate to the Project Directory:

    cd comedybot
  4. Create a Python Virtual Environment:

    • Use Python's built-in venv module:
      python3 -m venv venv
  5. Activate the Virtual Environment:

    • Run the following command:
      source venv/bin/activate
  6. Install the Required Packages:

    • Use pip to install the dependencies listed in requirements.txt:
      pip install -r requirements.txt
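
Before moving on, it can help to confirm that the interpreter in use matches the setup described above. The following is a minimal sketch, not part of ComedyBot itself: the function names and the `TESTED_VERSION` constant are hypothetical, and the check only warns, since versions other than 3.12 may still work.

```python
import sys

# Assumption: 3.12 is the version the README reports as tested. Other
# versions may work, so a mismatch produces a warning, not an error.
TESTED_VERSION = (3, 12)


def in_virtualenv() -> bool:
    """Return True when running inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def environment_warnings() -> list[str]:
    """Collect human-readable warnings about the current interpreter."""
    warnings = []
    current = sys.version_info[:2]
    if current != TESTED_VERSION:
        warnings.append(
            f"Running Python {current[0]}.{current[1]}; "
            f"tested with {TESTED_VERSION[0]}.{TESTED_VERSION[1]}."
        )
    if not in_virtualenv():
        warnings.append(
            "No virtual environment is active; run 'source venv/bin/activate'."
        )
    return warnings


if __name__ == "__main__":
    for message in environment_warnings() or ["Environment looks as expected."]:
        print(message)
```

Saved under any name (for example a hypothetical check_env.py), the script can be run with python from inside the comedybot directory after activating the environment.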

Usage

  1. Prepare Input and Output Folders:

    • Create an input and an output folder, for example, on your Desktop.
  2. Add Audio Files:

    • Copy your audio files (supported formats: .wav, .mp3, .m4a) into the input folder.
  3. Run the Transcription and Analysis:

    • Ensure you are in the comedybot directory and the virtual environment is active.

    • Execute the following command, replacing <yourname> with your actual username:

      python ./comedy_set_analysis/audio_transcription_agent/audio_transcript_process.py -i /Users/<yourname>/Desktop/input -o /Users/<yourname>/Desktop/output
    • This process downloads the required models on the first run, transcribes the audio, and analyzes it for laughter. A GUI prompts you for metadata, and the results are saved in the output folder.

  4. View the Transcript:

    • Copy the file ./transcript_viewer/index.html into one of the subfolders created inside your output folder, for example:

      cp ./transcript_viewer/index.html /Users/<yourname>/Desktop/output/20241205_Comedy_Brew/index.html
    • Navigate to the output directory using the terminal:

      cd /Users/<yourname>/Desktop/output/20241205_Comedy_Brew/
    • Start a simple HTTP server:

      python3 -m http.server
    • Open your web browser and go to http://localhost:8000 to view the transcript.

Note: The process may take a significant amount of time depending on the length of the audio file.
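
Because a run can take a long time, it may be worth checking which files in the input folder have a supported extension before starting. The sketch below is an illustration only: `find_audio_files` and `SUPPORTED_EXTENSIONS` are hypothetical names, and the case-insensitive matching is an assumption, since how ComedyBot itself scans the input folder is not documented here.

```python
from pathlib import Path

# Extensions listed as supported in the Usage section.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a"}


def find_audio_files(folder):
    """Return the files in `folder` whose extension is in the supported set.

    Only the top level of the folder is inspected, and the comparison
    ignores case (an assumption, not confirmed ComedyBot behavior).
    """
    root = Path(folder).expanduser()
    return sorted(
        path
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

For example, `find_audio_files("~/Desktop/input")` returns a sorted list of `Path` objects; an empty list suggests the folder holds nothing the tool is documented to accept.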

Additional Information

  • Ensure your system meets all the requirements before proceeding with the installation.
  • The initial model download can be large, so ensure you have sufficient disk space and a stable internet connection.
  • Whisper models are stored in the ~/.cache/whisper/ directory on your system. You may want to check this location if you need to manage disk space.
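
To see how much space the cached models occupy, the directory size can be totalled with a few lines of Python. This is a rough sketch under the assumption that the cache lives at the path given above; `directory_size_bytes` is a hypothetical helper, and it reports 0 when the directory does not exist yet.

```python
from pathlib import Path


def directory_size_bytes(path):
    """Return the total size in bytes of all files under `path`.

    Returns 0 when the directory does not exist, for example before
    any model has been downloaded.
    """
    root = Path(path).expanduser()
    if not root.is_dir():
        return 0
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    size_gb = directory_size_bytes("~/.cache/whisper") / 1e9
    print(f"Whisper cache: {size_gb:.2f} GB")
```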

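Additional Setup

The spaCy English model en_core_web_md is downloaded separately. With the virtual environment active, run:

    python -m spacy download en_core_web_md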
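
spaCy installs its models as ordinary Python packages, so one way to tell whether the download is still needed is to ask the import system. The sketch below assumes it runs inside the project's virtual environment; `package_installed` is a hypothetical helper and does not load the model, it only checks that the package can be found.

```python
import importlib.util


def package_installed(name: str) -> bool:
    """Return True when a top-level package called `name` can be imported."""
    # find_spec returns None for a top-level name that is not installed.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if package_installed("en_core_web_md"):
        print("en_core_web_md is already installed.")
    else:
        print("en_core_web_md is missing; run the download command.")
```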