A simple tool to detect whether an audio file was generated by NotebookLM.
At Listen Notes, we've encountered a growing number of spammers submitting fake, NotebookLM-generated podcasts to our platform.
We hoped the NotebookLM team would provide a tool to help detect NotebookLM-generated audio. However, after a week of back-and-forth emails, we lost patience.
It's now Friday (Oct 4, 2024), and since we won't hear back from the NotebookLM team until next week, we decided to put together this simple script. Luckily, it seems to work!
To install the dependencies, run:
$ pip install -r requirements.txt
To detect whether an audio file is AI-generated or human-produced, run the following command:
$ python notebooklm_detector.py --action predict --file_path [filename].mp3
You’ll see output like this:
The audio is: AI Generated
or
The audio is: Human
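If you want to call the detector from another Python program, for example to screen a batch of submissions, one option is to run the command above as a subprocess and read the label from its output. The following is a minimal sketch, not part of this repository: the helper names `parse_label` and `predict` are hypothetical, and it assumes the script prints a line starting with `The audio is:` to standard output, as in the sample output above.

```python
import subprocess
import sys
from typing import Optional

# Assumption: the detector prints one line with this prefix (taken from the
# sample output in this README).
PREFIX = "The audio is:"


def parse_label(output: str) -> Optional[str]:
    """Return the text after 'The audio is:', or None if no such line exists."""
    for line in output.splitlines():
        line = line.strip()
        if line.startswith(PREFIX):
            return line[len(PREFIX):].strip()
    return None


def predict(file_path: str, script: str = "notebooklm_detector.py") -> Optional[str]:
    """Run the predict action on one file and return the printed label.

    Hypothetical wrapper: it only shells out to the documented command line
    and does not import anything from the detector script itself.
    """
    result = subprocess.run(
        [sys.executable, script, "--action", "predict", "--file_path", file_path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script exits with an error
    )
    return parse_label(result.stdout)


# Example usage (requires the script, model.pkl, and an audio file):
# label = predict("episode.mp3")
# if label == "AI Generated":
#     print("Flag for review")
```

Shelling out keeps the wrapper independent of the script's internals, at the cost of starting a new Python process for every file.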
You can train the model and regenerate model.pkl by following these steps:
- Place NotebookLM-generated audio files (mp3, wav, or mp4) in the datasets/ai/ folder.
- Place human-produced audio files in the datasets/human/ folder.
To train the model, run:
$ python notebooklm_detector.py --action train --dataset_path datasets
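Before training, it can help to confirm that both class folders exist and hold a similar number of files, since a heavily imbalanced dataset may skew the model. The following is a small sketch, not part of this repository: the function name `count_audio_files` is hypothetical, and it assumes the human folder accepts the same extensions (mp3, wav, mp4) listed for the AI folder.

```python
from pathlib import Path

# Assumption: both folders use the extensions this README lists for datasets/ai/.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".mp4"}
CLASS_FOLDERS = ("ai", "human")


def count_audio_files(dataset_path):
    """Count audio files directly inside <dataset_path>/ai and <dataset_path>/human.

    A missing folder is reported as 0 rather than raising an error.
    Subfolders are not searched.
    """
    root = Path(dataset_path)
    counts = {}
    for name in CLASS_FOLDERS:
        folder = root / name
        if not folder.is_dir():
            counts[name] = 0
            continue
        counts[name] = sum(
            1
            for path in folder.iterdir()
            if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
        )
    return counts


# Example usage:
# counts = count_audio_files("datasets")
# print(counts)
# if min(counts.values()) == 0:
#     print("One of the class folders is empty or missing.")
```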