Cloud-based solution to automatically generate and translate subtitles for video files

Primary language: Jupyter Notebook. License: MIT.

Deprecated

Hello everyone, sorry I haven't had time to maintain this project. I have moved on to working on Faster-Whisper instead, which also transcribes and translates video files. Please check it out if that is what you are looking for. It should be quite similar to SubMe feature-wise.

SubMe

A cloud deployment of Autosub on Google Colab. It leverages Google's cloud computing clusters and GPUs to automatically generate subtitles for uploaded video files in various languages. Audio is first pre-processed using ffmpeg and ffmpeg-normalize. Speech regions are then detected in the pre-processed audio using Auditok, after which ffmpeg is used to split the audio into individual segments. The segments are transcribed using Google's Cloud Speech-to-Text API, then translated into your desired language using py-googletrans.
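To make the segmenting step more concrete, here is a minimal sketch of how detected speech regions could be turned into ffmpeg commands. It is not Autosub's actual code: the function name `build_segment_commands`, the output file pattern, and the choice of mono 16 kHz FLAC are all assumptions made for illustration. The regions are assumed to come from a detector such as Auditok as `(start, end)` pairs in seconds.

```python
from typing import List, Tuple


def build_segment_commands(
    audio_path: str,
    regions: List[Tuple[float, float]],
    out_pattern: str = "segment_{:04d}.flac",  # hypothetical naming scheme
) -> List[List[str]]:
    """Return one ffmpeg argument list per (start, end) speech region."""
    commands = []
    for index, (start, end) in enumerate(regions):
        if end <= start:
            raise ValueError(f"region {index} has a non-positive duration")
        commands.append([
            "ffmpeg", "-y",                 # overwrite existing output files
            "-ss", f"{start:.3f}",          # seek to the start of the region
            "-i", audio_path,
            "-t", f"{end - start:.3f}",     # keep only the region's duration
            "-vn",                          # drop any video stream
            "-ac", "1",                     # mono (assumed, not from the project)
            "-ar", "16000",                 # 16 kHz sample rate (assumed)
            out_pattern.format(index),
        ])
    return commands


if __name__ == "__main__":
    for cmd in build_segment_commands("talk.wav", [(0.0, 1.5), (2.25, 4.0)]):
        print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed. Building the commands separately from running them keeps the sketch easy to inspect.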

Why use our tool?

Firstly, it's free! Other similar products, such as Veed, are paid services with restrictions on video length. We offer a completely free service with no restrictions on video length.

Secondly, our tool significantly speeds up the runtime of Autosub. A study has shown that, despite being a free service, the GPU-enabled Google Colab environment (which is what we use) is significantly faster than a GPU-enabled MacBook Pro, Lenovo Legion and Lenovo ThinkPad, which translates into greater time savings.

Thirdly, without the need to provision and maintain expensive hardware, users will also enjoy greater cost savings. Moreover, you will no longer have to worry about overheated machines or exorbitant electricity bills.

Fourthly, we have an easy-to-use interface. No need to concern yourself with the complex technology of voice extraction and language translation. Just click a few buttons and you're set.

Lastly, our tool will also help the less fortunate, such as those with hearing disabilities, to understand what is being said in videos and movies.

Getting Started

  • Click on Open In Colab to open the notebook in Google Colab.
  • Follow the instructions in the notebook.

Uploading from Google Drive

When the notebook asks you to authorize access to Google Drive, click on the link and a new tab will open.

Click Allow. A code will be generated. Copy the code and paste it into the empty field in the notebook. Press Enter.

After successfully mounting Google Drive, you will be able to access your directory by clicking on Files on the left.

Output

When uploading, you will have to specify the extension of the output files (srt, ass, sub, json or txt). The output files can be found in the same directory as the uploaded video file. For example, if you uploaded a video file from /content/drive/My Drive, the subtitle file will also be found there.
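For readers unfamiliar with the srt output option, the sketch below shows how transcribed segments map onto the SubRip layout: a running number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time line, the text, and a blank line between entries. The helper names `format_timestamp` and `to_srt` are hypothetical and are not taken from Autosub or the notebook.

```python
from typing import Iterable, Tuple


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SubRip time format HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Render (start, end, text) tuples as SubRip text, numbered from 1."""
    blocks = []
    for number, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{number}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # A blank line separates consecutive subtitle entries.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([(0.0, 1.5, "Hello"), (2.25, 4.0, "Welcome to SubMe")]))
```

The resulting string can be written to a file with the `.srt` extension next to the video, and most media players will pick it up.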

When downloading, the output files can be found in /content/drive/My Drive/Torrent.

Updates

10 June 2021 - Subtitling/captioning of video is now possible. Enter the same language code for both the source and destination language, and Autosub will generate subtitles for your video in its original language.

Authors

We are two Singaporean university undergraduates with curiosity and passion for social causes. In particular, we are interested in using technology to benefit the less fortunate people in our society.

Acknowledgments