Video Transcript Summarizer

Primary language: Python. License: MIT.

SCRIVENER

Introduction

Scrivener is a video transcript summarizer for YouTube videos. YouTube is one of the most visited websites, and many viewers rely on captions to follow the language of a video. Scrivener accepts a YouTube URL, collects the captions sentence by sentence, and produces a summary of the complete video. Our first goal is to make the summarizer as accurate as possible and to add further features. Our second goal is to summarize YouTube videos that have captions disabled. The project can be extended to numerous other applications. This document gives users an overview so they can take up the project as open source software and add features of their own. It also helps developers understand the code and serves as a reference point for getting started.
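The flow described above (URL in, captions collected, summary out) can be sketched roughly as follows. This is a minimal illustration, not Scrivener's actual model: the function names `extract_video_id` and `summarize`, the small `STOPWORDS` set, and the word-frequency scoring are all assumptions made for this example, and the captions are passed in as a plain list rather than fetched from YouTube.

```python
import re
from collections import Counter
from typing import List, Optional
from urllib.parse import parse_qs, urlparse

# Hypothetical, deliberately tiny stopword list used only for this sketch.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in",
             "with", "for", "it", "this", "that"}

YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a common YouTube URL shape, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            # Standard links carry the ID in the "v" query parameter.
            return parse_qs(parsed.query).get("v", [None])[0]
        match = re.match(r"^/(?:embed|shorts)/([^/?]+)", parsed.path)
        if match:
            return match.group(1)
    return None


def summarize(captions: List[str], max_sentences: int = 2) -> List[str]:
    """Naive extractive summary: keep the highest-scoring caption sentences.

    A sentence is scored by the average corpus frequency of its
    non-stopword tokens. Selected sentences keep their original order.
    """
    tokenized = [
        [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]
        for sentence in captions
    ]
    freq = Counter(word for words in tokenized for word in words)
    scored = []
    for index, words in enumerate(tokenized):
        score = sum(freq[w] for w in words) / len(words) if words else 0.0
        scored.append((score, index))
    # Highest score first; ties go to the earlier sentence.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:max_sentences]
    return [captions[i] for i in sorted(index for _, index in top)]
```

A real pipeline would replace the word-frequency scoring with a stronger summarization model and would obtain `captions` from the video itself; the sketch only shows how the pieces fit together.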

The project was developed entirely in Python 3. We recommend that developers who take up this project have Python 3 installed and working before proceeding further.

Demo

The project is deployed on both Streamlit Cloud and Heroku.

How to get a MonkeyLearn API key

  1. Go to https://monkeylearn.com/sentiment-analysis-online/
  2. Enter your requirements and sign up. The API key comes with a limited number of queries per month (1000). You can use one of the pretrained models, as we did, or create your own model.
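Once you have a key, one common way to handle it is to keep it out of the source code and to keep an eye on the monthly query limit. The sketch below is only an illustration: the environment variable name `MONKEYLEARN_API_KEY`, the `load_api_key` helper, and the `QueryBudget` class are hypothetical and are not part of Scrivener or of the MonkeyLearn client.

```python
import os


def load_api_key(env_var: str = "MONKEYLEARN_API_KEY") -> str:
    """Read the API key from an environment variable (name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your MonkeyLearn API key."
        )
    return key


class QueryBudget:
    """In-memory counter for a monthly query allowance (1000 by default)."""

    def __init__(self, limit: int = 1000) -> None:
        self.limit = limit
        self.used = 0

    def spend(self, count: int = 1) -> int:
        """Record `count` queries and return how many remain."""
        if self.used + count > self.limit:
            raise RuntimeError("Monthly query budget exhausted.")
        self.used += count
        return self.limit - self.used
```

The counter lives only in memory, so it resets whenever the application restarts; a persistent store would be needed to track usage across sessions.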

Steps for Execution

  1. Clone the Git repository.
  2. Open Command Prompt and change the directory to the location of the cloned repository.
  3. Run pip install -r requirements.txt
  4. Run the command python -m streamlit run ./source/scrivener_user_interface.py
  5. Open your browser and enter localhost:8501 in the address bar to open the web UI of the application.
  6. In the UI, choose between URL, file, or plain text input.

License

This project is licensed under the terms of the MIT license. See the License file for more details.

Contributions

Please see our CONTRIBUTING.md for instructions on how to contribute to the project by completing some of the issues.

Version 1.4 Contributions

  • Enhanced product quality by improving the summarization model.
  • Greatly improved summary formatting for better readability.
  • Added sentiment analysis of the generated summary.
  • Improved Heroku deployment.

Future Scope

We believe this project opens up an exciting range of possibilities. You will be working in domains such as Natural Language Processing, Web Development, Digital Signal Processing, and Information Retrieval. Some of the possibilities you can explore are the following:

  • Is there a way for a sentence in the summary to point back to the place in the video where it was discussed?
  • Currently our application supports YouTube videos and videos with the .mp4 extension. Can you provide support for other video formats?
  • Can you perform summarization for videos in languages other than English?
  • Can you generate summaries of podcasts or other audio files?
  • Is there a way to summarize videos for specific time frames?
  • How can we deliver summaries in the form of audio?
  • How do we diversify by expanding to a Chrome Extension and a Discord Bot?
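As a starting point for two of the questions above (pointing a summary sentence back to the video, and summarizing a specific time frame), here is a minimal sketch. It assumes captions are available with a start time and duration in seconds. The `Caption` dataclass and the helper names are hypothetical and do not come from the Scrivener code base, and the word-overlap matching is just one simple heuristic.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Caption:
    text: str
    start: float     # seconds from the beginning of the video
    duration: float  # seconds the caption stays on screen


def _words(text: str) -> set:
    """Lowercased word set used for overlap matching."""
    return set(re.findall(r"[a-z']+", text.lower()))


def find_source(sentence: str, captions: List[Caption]) -> Optional[Caption]:
    """Return the caption sharing the most words with the sentence, if any."""
    target = _words(sentence)
    best, best_overlap = None, 0
    for caption in captions:
        overlap = len(target & _words(caption.text))
        if overlap > best_overlap:
            best, best_overlap = caption, overlap
    return best


def timestamp_url(video_id: str, start_seconds: float) -> str:
    """Build a YouTube link that starts playback at the given second."""
    return f"https://www.youtube.com/watch?v={video_id}&t={int(start_seconds)}s"


def captions_in_window(captions: List[Caption], start: float, end: float) -> List[Caption]:
    """Keep captions that overlap the half-open time window [start, end)."""
    return [c for c in captions if c.start < end and c.start + c.duration > start]
```

With helpers like these, a summary sentence could be rendered as a link built from `timestamp_url(video_id, find_source(sentence, captions).start)`, and a time-frame summary could be produced by summarizing only the output of `captions_in_window`.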

These are some of the fascinating topics we thought of for you, but you should not limit yourself to these points! Play around with our repo and see what motivates you.

As a bonus, we have taken great care to adhere to software engineering principles, so the repository comes with tools for code coverage, style checking, code formatting, and continuous testing and integration, and, as the cherry on top, many meaningful test cases. This lets you focus on delivering state-of-the-art features!

Team Members

Acknowledgements

We would like to thank Professor Dr. Timothy Menzies for helping us understand the process of building a good Software Engineering project. We would also like to thank the teaching assistants Xiao Ling, Andre Lustosa, Kewen Peng, and Weichen Shi for their support throughout the project.