Paper | Website | Dataset - Raw | Dataset - Summarized
WavePulse is an end-to-end framework for recording, transcribing, and analyzing radio livestreams in real time. It processes multiple concurrent audio streams into timestamped, speaker-diarized transcripts and enables content analysis through state-of-the-art AI models.
- Record multiple concurrent radio streams
- Convert speech to text with advanced speaker diarization
- Classify content (political vs non-political, ads vs content)
- Detect emerging narratives and track their spread
- Analyze sentiment and opinion trends
- Interactive visualization dashboard
- Linux-based OS (tested on Ubuntu 20.04+)
- Python 3.8 or higher
- CUDA-enabled GPU (1 per 50 streams)
- At least 16GB RAM
- Storage space proportional to the number of streams (2 GB per stream for transcription only)
- Hugging Face account with API token
- Google AI Studio account with API key
- FFmpeg
- CUDA toolkit (if using GPU)
- Other Python packages (specified in requirements.yml)
# Clone repository
git clone https://github.com/mittalgovind/wavepulse.git
cd wavepulse
# Set up conda environment
conda env create -f requirements.yml
conda activate wavepulse
# Install ffmpeg
sudo apt update && sudo apt install ffmpeg
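Before the first run, you can sanity-check the environment. The short script below is illustrative only (it is not part of the repository) and assumes PyTorch was installed via requirements.yml:

# Illustrative environment check (not part of the repository)
import shutil
import torch

# GPU: the pipeline expects roughly one CUDA GPU per 50 concurrent streams
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))

# FFmpeg is required for stream capture and audio decoding
print("ffmpeg found at:", shutil.which("ffmpeg"))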
- Get a Hugging Face token at https://huggingface.co/settings/tokens
- Accept the required model agreements on Hugging Face (needed for speaker diarization)
- Get a Google AI Studio API key at https://ai.google.dev/aistudio
Edit assets/weekly_schedule.json:
[
{
"url": "https://example.com/stream",
"radio_name": "WXYZ",
"time": [
["08:00", "14:00"],
["17:00", "22:00"]
],
"state": "NY"
}
]
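To catch malformed entries before launching the pipeline, a small check along these lines can help. The field names come from the example above; the script itself is illustrative and not part of the repository:

# Illustrative validation of assets/weekly_schedule.json (not part of the repository)
import json

with open("assets/weekly_schedule.json") as f:
    schedule = json.load(f)

for entry in schedule:
    # every station needs a stream URL, a name, a state, and at least one recording window
    assert entry["url"].startswith("http"), entry
    assert entry["radio_name"] and entry["state"], entry
    for start, end in entry["time"]:
        if not start < end:
            print(f"note: window {start}-{end} for {entry['radio_name']} is reversed or crosses midnight")

print(f"{len(schedule)} stations configured")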
Full pipeline with all features:
python src/wavepulse.py --diarize \
--hf-token YOUR_HUGGINGFACE_TOKEN \
--gemini-api-key YOUR_GEMINI_KEY
Without content classification:
python src/wavepulse.py --diarize \
--hf-token YOUR_HUGGINGFACE_TOKEN \
--stop-classification
Basic recording and transcription only:
python src/wavepulse.py --stop-classification
All parameters are documented in ARGS.md.
Control individual components with these flags:
- --stop-recording: Disable audio recording
- --stop-transcription: Disable transcription
- --stop-classification: Disable content classification
- --diarize: Enable speaker diarization (requires HF token)
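As an illustration of how these flags relate to each other (this sketch is not the actual src/wavepulse.py parser), --diarize only works together with a Hugging Face token, and content classification relies on the Gemini key:

# Illustrative sketch of the stage flags; not the actual src/wavepulse.py parser
import argparse

parser = argparse.ArgumentParser(description="WavePulse pipeline (sketch)")
parser.add_argument("--stop-recording", action="store_true", help="disable audio recording")
parser.add_argument("--stop-transcription", action="store_true", help="disable transcription")
parser.add_argument("--stop-classification", action="store_true", help="disable content classification")
parser.add_argument("--diarize", action="store_true", help="enable speaker diarization")
parser.add_argument("--hf-token", help="Hugging Face token, required when --diarize is set")
parser.add_argument("--gemini-api-key", help="Google AI Studio key, used for content classification")
args = parser.parse_args()

if args.diarize and not args.hf_token:
    parser.error("--diarize requires --hf-token")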
assets/
├── data/
│ ├── recordings/ # Raw audio files
│ ├── audio_buffer_*/ # Processing buffers
│ └── transcripts/
│ ├── unclassified_buffer/ # Raw transcripts
│ └── classified/ # Processed transcripts
└── analytics/
├── transcripts/ # Analysis input
└── sentiment_analysis/ # Analysis results
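To confirm data is flowing through the stages, a quick file count per directory can be run from the repository root; this snippet is illustrative and uses only the paths shown above:

# Illustrative check of the directory layout above; run from the repository root
from pathlib import Path

root = Path("assets")
for sub in ["data/recordings",
            "data/transcripts/unclassified_buffer",
            "data/transcripts/classified",
            "analytics/transcripts",
            "analytics/sentiment_analysis"]:
    path = root / sub
    count = sum(1 for f in path.rglob("*") if f.is_file()) if path.exists() else 0
    print(f"{path}: {count} files")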
# First, copy the classified transcripts to the analytics input folder (a working copy, leaving the originals in place)
cp -r assets/data/transcripts/classified/* assets/analytics/transcripts/
# Run analysis
python src/analytics/sentiment/sentiment_analysis.py
python src/analytics/sentiment/calculate_metrics.py
# Results appear in assets/analytics/sentiment_analysis/
Summarize your transcripts
cd src/analytics/track_narratives
export GCP_API_KEY=YOUR_GCP_KEY
python ../summarizer.py \
-i /path/to/raw/transcripts \
-o /output/path
Create the vector index database, and merge the indices if you have more than one:
python embed_summaries.py -i /path/to/summarized_text_files --batch_size 10
python merge_embeddings.py -i . -o merged_temp.h5
Talk to your database using the top 5 most relevant retrieved summaries:
python run_rag.py --k 5
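Under the hood, run_rag.py retrieves the summaries closest to the embedded query. The sketch below shows that top-k step in isolation; the HDF5 dataset names ("embeddings", "summaries") and the random stand-in query vector are assumptions, not the repository's actual index layout:

# Illustrative top-k retrieval over the merged index; dataset names are assumptions
import h5py
import numpy as np

K = 5
with h5py.File("merged_temp.h5", "r") as f:
    embeddings = f["embeddings"][:]   # assumed dataset: one vector per summary
    summaries = f["summaries"][:]     # assumed dataset: the summary texts

query_vec = np.random.rand(embeddings.shape[1])  # stand-in for an embedded user query

# cosine similarity against every summary, then keep the K best matches
sims = embeddings @ query_vec / (
    np.linalg.norm(embeddings, axis=1) * np.linalg.norm(query_vec) + 1e-9)
for idx in np.argsort(-sims)[:K]:
    print(round(float(sims[idx]), 3), summaries[idx][:120])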
For the election-specific claim in the paper's first case study, see this file.
- Audio recorder captures streams in 30-minute segments
- Transcriber processes audio files through ASR pipeline
- Content classifier categorizes transcript segments
- Analytics pipeline processes classified transcripts
- Results viewable through analytics dashboard
- Stream connection errors: Check URL validity and network connection
- GPU memory errors: Reduce concurrent transcription threads
- Missing transcripts: Verify ffmpeg installation and audio format
- Classification delays: Check Google API quotas and key validity
- Check logs in the logs/ directory
- Open an issue on GitHub
- See documentation in docs/
This project is licensed under the Apache 2.0 License - see LICENSE and NOTICE.
If you use WavePulse or any of its components, please cite:
@article{mittal2024wavepulse,
title={WavePulse: Real-time Content Analytics of Radio Livestreams},
author={Mittal, Govind and Gupta, Sarthak and Wagle, Shruti and Chopra, Chirag and DeMattee, Anthony J and Memon, Nasir and Ahamad, Mustaque and Hegde, Chinmay},
journal={arXiv preprint arXiv:2412.17998},
year={2024},
archivePrefix={arXiv},
primaryClass={cs.IR},
eprint={2412.17998}
}