Paper | Website | Dataset - Raw | Dataset - Summarized
WavePulse is an end-to-end framework for recording, transcribing, and analyzing radio livestreams in real time. It processes multiple concurrent audio streams into timestamped, speaker-diarized transcripts and enables content analysis through state-of-the-art AI models.
- Record multiple concurrent radio streams
- Convert speech to text with advanced speaker diarization
- Classify content (political vs non-political, ads vs content)
- Detect emerging narratives and track their spread
- Analyze sentiment and opinion trends
- Interactive visualization dashboard
- Linux-based OS (tested on Ubuntu 20.04+)
- Python 3.8 or higher
- CUDA-enabled GPU (1 per 50 streams)
- At least 16GB RAM
- Storage space proportional to the number of streams (2 GB per stream for transcription only)
- Hugging Face account with API token
- Google AI Studio account with API key
- FFmpeg
- CUDA toolkit (if using GPU)
- Other Python packages (specified in requirements.yml)
# Clone repository
git clone https://github.com/mittalgovind/wavepulse.git
cd wavepulse
# Set up conda environment
conda env create -f requirements.yml
conda activate wavepulse
# Install ffmpeg
sudo apt update && sudo apt install ffmpeg
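Before the first run, you can sanity-check the environment. The short script below is illustrative only (it is not part of the repository) and assumes PyTorch was installed via requirements.yml:

# Illustrative environment check (not part of the repository)
import shutil
import torch

# GPU: the pipeline expects roughly one CUDA GPU per 50 concurrent streams
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))

# FFmpeg is required for stream capture and audio decoding
print("ffmpeg found at:", shutil.which("ffmpeg"))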
- Get a Hugging Face token at https://huggingface.co/settings/tokens
- Accept the required model agreements on Hugging Face (needed for speaker diarization)
- Get a Google AI Studio API key at https://ai.google.dev/aistudio
Edit assets/weekly_schedule.json:
[
{
"url": "https://example.com/stream",
"radio_name": "WXYZ",
"time": [
["08:00", "14:00"],
["17:00", "22:00"]
],
"state": "NY"
}
]
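To catch malformed entries before launching the pipeline, a small check along these lines can help. The field names come from the example above; the script itself is illustrative and not part of the repository:

# Illustrative validation of assets/weekly_schedule.json (not part of the repository)
import json

with open("assets/weekly_schedule.json") as f:
    schedule = json.load(f)

for entry in schedule:
    # every station needs a stream URL, a name, a state, and at least one recording window
    assert entry["url"].startswith("http"), entry
    assert entry["radio_name"] and entry["state"], entry
    for start, end in entry["time"]:
        if not start < end:
            print(f"note: window {start}-{end} for {entry['radio_name']} is reversed or crosses midnight")

print(f"{len(schedule)} stations configured")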
Full pipeline with all features:
python src/wavepulse.py --diarize \
--hf-token YOUR_HUGGINGFACE_TOKEN \
--gemini-api-key YOUR_GEMINI_KEY
Without content classification:
python src/wavepulse.py --diarize \
--hf-token YOUR_HUGGINGFACE_TOKEN \
--stop-classification
Basic recording and transcription only:
python src/wavepulse.py --stop-classification
All parameters are documented in ARGS.md.
Control individual components with these flags:
- --stop-recording: Disable audio recording
- --stop-transcription: Disable transcription
- --stop-classification: Disable content classification
- --diarize: Enable speaker diarization (requires HF token)
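As an illustration of how these flags relate to each other (this sketch is not the actual src/wavepulse.py parser), --diarize only works together with a Hugging Face token, and content classification relies on the Gemini key:

# Illustrative sketch of the stage flags; not the actual src/wavepulse.py parser
import argparse

parser = argparse.ArgumentParser(description="WavePulse pipeline (sketch)")
parser.add_argument("--stop-recording", action="store_true", help="disable audio recording")
parser.add_argument("--stop-transcription", action="store_true", help="disable transcription")
parser.add_argument("--stop-classification", action="store_true", help="disable content classification")
parser.add_argument("--diarize", action="store_true", help="enable speaker diarization")
parser.add_argument("--hf-token", help="Hugging Face token, required when --diarize is set")
parser.add_argument("--gemini-api-key", help="Google AI Studio key, used for content classification")
args = parser.parse_args()

if args.diarize and not args.hf_token:
    parser.error("--diarize requires --hf-token")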
assets/
├── data/
│ ├── recordings/ # Raw audio files
│ ├── audio_buffer_*/ # Processing buffers
│ └── transcripts/
│ ├── unclassified_buffer/ # Raw transcripts
│ └── classified/ # Processed transcripts
└── analytics/
├── transcripts/ # Analysis input
└── sentiment_analysis/ # Analysis results
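To confirm data is flowing through the stages, a quick file count per directory can be run from the repository root; this snippet is illustrative and uses only the paths shown above:

# Illustrative check of the directory layout above; run from the repository root
from pathlib import Path

root = Path("assets")
for sub in ["data/recordings",
            "data/transcripts/unclassified_buffer",
            "data/transcripts/classified",
            "analytics/transcripts",
            "analytics/sentiment_analysis"]:
    path = root / sub
    count = sum(1 for f in path.rglob("*") if f.is_file()) if path.exists() else 0
    print(f"{path}: {count} files")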
# First, copy the classified transcripts to the analytics input folder (a working copy, leaving the originals in place)
cp -r assets/data/transcripts/classified/* assets/analytics/transcripts/
# Run analysis
python src/analytics/sentiment/sentiment_analysis.py
python src/analytics/sentiment/calculate_metrics.py
# Results appear in assets/analytics/sentiment_analysis/
Summarize your transcripts
cd src/analytics/track_narratives
export GCP_API_KEY=YOUR_GCP_KEY
python ../summarizer.py \
-i /path/to/raw/transcripts \
-o /output/path
Create the vector index database, and merge the indices if you have more than one:
python embed_summaries.py -i /path/to/summarized_text_files --batch_size 10
python merge_embeddings.py -i . -o merged_temp.h5
Talk to your database using the top 5 most relevant retrieved summaries:
python run_rag.py --k 5
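Under the hood, run_rag.py retrieves the summaries closest to the embedded query. The sketch below shows that top-k step in isolation; the HDF5 dataset names ("embeddings", "summaries") and the random stand-in query vector are assumptions, not the repository's actual index layout:

# Illustrative top-k retrieval over the merged index; dataset names are assumptions
import h5py
import numpy as np

K = 5
with h5py.File("merged_temp.h5", "r") as f:
    embeddings = f["embeddings"][:]   # assumed dataset: one vector per summary
    summaries = f["summaries"][:]     # assumed dataset: the summary texts

query_vec = np.random.rand(embeddings.shape[1])  # stand-in for an embedded user query

# cosine similarity against every summary, then keep the K best matches
sims = embeddings @ query_vec / (
    np.linalg.norm(embeddings, axis=1) * np.linalg.norm(query_vec) + 1e-9)
for idx in np.argsort(-sims)[:K]:
    print(round(float(sims[idx]), 3), summaries[idx][:120])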
For the election-specific claim in the paper's first case study, see this file.
- Audio recorder captures streams in 30-minute segments
- Transcriber processes audio files through ASR pipeline
- Content classifier categorizes transcript segments
- Analytics pipeline processes classified transcripts
- Results viewable through analytics dashboard
- Stream connection errors: Check URL validity and network connection
- GPU memory errors: Reduce concurrent transcription threads
- Missing transcripts: Verify ffmpeg installation and audio format
- Classification delays: Check Google API quotas and key validity
- Check logs in the logs/ directory
- Open an issue on GitHub
- See documentation in docs/
This project is licensed under the Apache 2.0 License - see LICENSE and NOTICE.
If you use WavePulse or any of its components, please cite:
@article{mittal2024wavepulse,
title={WavePulse: Real-time Content Analytics of Radio Livestreams},
author={Mittal, Govind and Gupta, Sarthak and Wagle, Shruti and Chopra, Chirag and DeMattee, Anthony J and Memon, Nasir and Ahamad, Mustaque and Hegde, Chinmay},
journal={arXiv preprint arXiv:2412.17998},
year={2024},
archivePrefix={arXiv},
primaryClass={cs.IR},
eprint={2412.17998}
}