Text-from-Audio-for-Justice

Takes an audio recording and transcribes it into a text document that a user can annotate. The app can then process the annotated document into audio clips and a document with hyperlinks to the clips.

  • Clean up audio file using Audacity
  • Make tools:
    • Clean up audio file using pydub and sox
    • Record file locations
    • Make segments from transcription file
    • Make training files from transcription and other files:
      • text: utt_id word1 word2 word3 ...
      • segments: utt_id file_id start_time end_time
      • wav.scp: file_id path/file
      • utt2spk: utt_id spkr
      • spk2utt: spkr utt_id1 utt_id2 utt_id3
    • Transcribe speaker files
    • Spike: compare original transcription with speaker transcription
    • Make output files (rts, word, etc.) to be used for training and communication
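
The five training files listed above follow the usual Kaldi data directory layout. A minimal sketch of how they could be written from a list of utterances is shown here. The sample utterances, the `audio/rec1.wav` path, and the function names are hypothetical and are not taken from the taj code. The utterance ids are prefixed with the speaker id so that sorting by utterance and sorting by speaker agree, which is the convention Kaldi recommends.

```python
from collections import defaultdict
from pathlib import Path

# Hypothetical sample data: (utt_id, file_id, speaker, start_sec, end_sec, words).
UTTERANCES = [
    ("spk1-rec1-0001", "rec1", "spk1", 0.0, 2.5, "good morning"),
    ("spk2-rec1-0002", "rec1", "spk2", 2.5, 4.1, "please be seated"),
    ("spk1-rec1-0003", "rec1", "spk1", 4.1, 6.0, "thank you"),
]
RECORDINGS = {"rec1": "audio/rec1.wav"}  # hypothetical file_id -> path


def build_kaldi_lines(utterances, recordings):
    """Return {file name: list of lines} for the five data files."""
    utts = sorted(utterances)  # sort by utt_id, the first field
    spk2utt = defaultdict(list)
    for utt_id, _file_id, spk, _start, _end, _words in utts:
        spk2utt[spk].append(utt_id)
    return {
        "text": [f"{u} {w}" for u, _f, _s, _a, _b, w in utts],
        "segments": [f"{u} {f} {a:.2f} {b:.2f}" for u, f, _s, a, b, _w in utts],
        "wav.scp": [f"{f} {p}" for f, p in sorted(recordings.items())],
        "utt2spk": [f"{u} {s}" for u, _f, s, _a, _b, _w in utts],
        "spk2utt": [f"{s} {' '.join(ids)}" for s, ids in sorted(spk2utt.items())],
    }


def write_kaldi_files(utterances, recordings, out_dir):
    """Write the five files into out_dir and return the lines written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    lines_by_file = build_kaldi_lines(utterances, recordings)
    for name, lines in lines_by_file.items():
        (out / name).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines_by_file
```

Calling `write_kaldi_files(UTTERANCES, RECORDINGS, "data/train")` would create `text`, `segments`, `wav.scp`, `utt2spk` and `spk2utt` in that folder; the folder name is only an example.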

Getting Started

  • Install ffmpeg
  • Install Docker if not already installed
  • Download the Kaldi Docker image (message/email for link)
  • Load the image with docker load
  • List Docker images and containers
  • Give Docker the maximum resources available to run
  • Run taj
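
Before running taj it can help to confirm that the two command-line prerequisites from the steps above are on the PATH. The following is a small sketch using only the Python standard library; the function name `missing_tools` is hypothetical and not part of taj.

```python
import shutil


def missing_tools(tools=("ffmpeg", "docker")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("ffmpeg and docker found")
```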

Tools

taj transcribe
    --audio_input_folder
    --output_folder

With:
    wav.scp: chunk file paths (files extracted from segments)
    text: one line for each chunk
    segments: links text to the chunk file and its start and end times
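
To work with these files from Python, the `text` and `segments` files can be joined on the utterance id. This is a minimal sketch that assumes the line formats given earlier in this README (`utt_id word1 word2 ...` and `utt_id file_id start_time end_time`). The function name `read_chunks` and the dictionary keys are hypothetical.

```python
from pathlib import Path


def read_chunks(data_dir):
    """Join the 'text' and 'segments' files in data_dir on utt_id."""
    data = Path(data_dir)

    # text: utt_id followed by the transcribed words
    words_by_utt = {}
    for line in (data / "text").read_text(encoding="utf-8").splitlines():
        if line.strip():
            utt_id, _, words = line.strip().partition(" ")
            words_by_utt[utt_id] = words.strip()

    # segments: utt_id file_id start_time end_time
    chunks = []
    for line in (data / "segments").read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        utt_id, file_id, start, end = line.split()
        chunks.append({
            "utt_id": utt_id,
            "file_id": file_id,
            "start": float(start),
            "end": float(end),
            "text": words_by_utt.get(utt_id, ""),  # empty if no transcript line
        })
    return chunks
```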

taj chunk_speaker
    --audio_input_path
    --speech_segmentation_path
    --output_folder
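
The format of the speech segmentation file is not described here, so the sketch below only illustrates the cutting step: given a start and end time in seconds, it copies that span of a WAV file into a new file. It uses the standard library `wave` module rather than pydub, and the function name `cut_wav` is hypothetical.

```python
import wave


def cut_wav(src_path, dst_path, start_sec, end_sec):
    """Copy the span [start_sec, end_sec) of a WAV file; return frames written."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        total = src.getnframes()
        rate = src.getframerate()
        # Clamp both ends to the file so setpos() never gets an invalid position.
        start = min(max(int(start_sec * rate), 0), total)
        end = min(max(int(end_sec * rate), start), total)
        src.setpos(start)
        frames = src.readframes(end - start)

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)    # same channels, sample width and rate
        dst.writeframes(frames)  # header frame count is corrected on write
    return end - start
```

Looping over the (start, end) pairs from a segmentation and calling `cut_wav` once per pair would produce one clip per speaker segment.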

taj clean_up
    --audio_input_folder (original recording)
    --audio_output_folder
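
The exact clean-up that taj performs with pydub and sox is not documented here. As a stand-in, the sketch below shows one common normalisation step using ffmpeg, which is already a prerequisite: converting every file in a folder to mono WAV at a fixed sample rate. The 16000 Hz default and the function names are assumptions; use whatever rate the acoustic model expects.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_commands(input_folder, output_folder, rate=16000):
    """Return one ffmpeg command per input file (mono WAV at `rate` Hz)."""
    out = Path(output_folder)
    commands = []
    for src in sorted(Path(input_folder).iterdir()):
        if src.is_file():
            dst = out / (src.stem + ".wav")
            commands.append([
                "ffmpeg", "-y",      # overwrite existing output
                "-i", str(src),      # input file
                "-ac", "1",          # one audio channel (mono)
                "-ar", str(rate),    # output sample rate in Hz
                str(dst),
            ])
    return commands


def convert_folder(input_folder, output_folder, rate=16000):
    """Create the output folder and run every command, stopping on failure."""
    Path(output_folder).mkdir(parents=True, exist_ok=True)
    for cmd in build_ffmpeg_commands(input_folder, output_folder, rate):
        subprocess.run(cmd, check=True)
```

Building the commands separately from running them makes it easy to print and inspect them before any audio is touched.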

taj convert
    --type (one of rts, pdf, doc)
    --online_folder (url of online folder)
    --chunks_text_path
    --output_folder
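
The core of the conversion is pairing each chunk of text with the URL of its audio clip in the online folder. A minimal sketch of that pairing follows. It assumes the chunks text file has one `utt_id word1 word2 ...` line per chunk and that each clip is stored as `<utt_id>.wav` in the online folder; both the naming scheme and the function names are assumptions, not taj behaviour.

```python
from urllib.parse import quote


def clip_url(online_folder, utt_id, extension=".wav"):
    """Build the URL of one clip, assuming it is named <utt_id><extension>."""
    return online_folder.rstrip("/") + "/" + quote(utt_id + extension)


def link_lines(online_folder, chunk_lines):
    """Turn 'utt_id word1 word2 ...' lines into (text, url) pairs."""
    pairs = []
    for line in chunk_lines:
        utt_id, _, words = line.strip().partition(" ")
        if utt_id:  # skip blank lines
            pairs.append((words.strip(), clip_url(online_folder, utt_id)))
    return pairs
```

Each (text, url) pair can then be rendered as a hyperlink in whichever output type is requested.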

taj create_test_data
    --input_folder
    --output_folder
    --audio_input_folder (original recording(s))

taj retrain
    --input_folder
    --audio_input_folder (original recording(s))