In this project, we are taking an audio input (keyword) from the user and trying to find this keyword in a audio book.
In order to be able to run the code follow the following steps:
-
Clone this repo to your local machine.
-
Create a new environment and use
requirements.txt
for dependencies. -
Download an audio book from Librevox Recordings
If you want to just test the code you can get Alice's Advantures which is relatively small in size.
-
Unzip the audio book and make sure that you put it under
data
directory and deleted the zipped file. -
Run
main.py
and follow the directions. Note initial set-up might take approximately 15 minutes. after the first run, you can skip the set-up then you can start searching keywords.
Under the notebooks directory you will find the notebook report
where we explained some technical aspects of the project. It will give you more in dept and step by step explanation of the projects.
-
In this project, I used
speech_recognition
library where they provide public access to the Google Cloud Speech API. But their public service has the limitations. For example they don't process longer audio files. This is the reason that we divide the audio file into smaller chunks and then process them which extends the set-up time significantly. -
When we match the keyword with the part of the audio we first converted everthing into text and then search match. More direct approach could be more efficient and quick. However, right now I don't how this can be achievable.
-
Text matching is a very strict process and we tried to loosen this with
fuzzywuzzy
library. With this library we make sure that more flexible search is possible. However, ideally for such app we could add search from the context and relevancy. I think this can be achievable but I didn't experimented such feature yet. -
Right now silent detection is mechanical (We give the minimum dFBS number and min_silence parameters by hand). Unfortunately, this leads that for each chapter, we should find the best parameters. It might be very interesting to see ML solving this problem.
-
Instead of keyword search for some book context search might be interesting.