classifAI-engine

API for classifAI. Provides audio transcription, question categorization, and summarization.

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).



ClassifAI Engine

ClassifAI engine is a RESTful API that provides the heavy lifting for classifAI through audio transcription, question categorization, and insights.

Explore the docs »

Visit Portal · Report Bug · Request Feature · Project Information



Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgments

About The Project

ClassifAI engine does the heavy lifting for classifAI. It is a RESTful API that offers the following services:

  • Transcription of video and audio into text
  • Categorization of questions
  • Engagement insights
  • Turning reports into PDFs or .docx files

Built With

  • Flask
  • Torch
  • OpenAI
  • Redis
  • LLaMA

Getting Started

To get a local copy up and running, follow these steps.

For more instructions, please see the documentation.

Prerequisites

Installation

  1. Clone the repo

    git clone https://github.com/TCU-ClassifAI/classifAI-engine.git
    cd classifAI-engine
  2. Install and run Redis

     sudo apt-get install redis-server 
     redis-server
  3. Install Python packages (using a venv is recommended)

    pip install -r src/requirements.txt -r src/requirements-dev.txt
  4. Add your Hugging Face API key either to your environment variables or to a .env file in the root directory

    HF_TOKEN=your_key_here

    You must also accept the Hugging Face terms and conditions:

    • visit hf.co/pyannote/speaker-diarization-3.1 and accept user conditions
    • visit hf.co/pyannote/segmentation-3.0 and accept user conditions
  5. Launch the API

     python src/app.py
  6. Launch your worker (for asynchronous tasks)

     rq worker -c config.worker_config

    (More information on RQ)

  7. Include your preferred summarization/categorization model through config.py (optional)

     SUMMARIZATION_MODEL = "gpt4"
     CATEGORIZATION_MODEL = "gemma"

    You must launch your own model separately and include its API endpoint in your .env file

     LLAMA_API=your_model_endpoint

For Llama, please see the repository here
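
As a rough sketch of how the variables from steps 4 and 7 might be checked before launching the API, the snippet below reads `HF_TOKEN` and `LLAMA_API` from a mapping such as `os.environ`. The `Settings` class and `load_settings` helper are hypothetical names used for illustration and are not part of the engine. Treating `LLAMA_API` as optional is an assumption based on step 7 being marked optional. The sketch does not parse a .env file; it only sees variables already present in the environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables named in the installation steps."""
    hf_token: str
    llama_api: Optional[str] = None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read HF_TOKEN (required) and LLAMA_API (assumed optional) from `env`."""
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; see installation step 4")
    # An empty or missing LLAMA_API is treated as "no custom model endpoint".
    llama = env.get("LLAMA_API", "").strip() or None
    return Settings(hf_token=token, llama_api=llama)
```

Calling `load_settings()` with no arguments checks the current process environment and fails early with a readable message if the token is missing.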

Testing

Run curl http://localhost:5000/healthcheck; it should return OK.
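
The same check can be scripted, for example in a startup script. This is a minimal sketch using only the Python standard library; `healthcheck_url` and `is_healthy` are hypothetical helper names, and treating any HTTP 200 response as healthy (without inspecting the body) is an assumption.

```python
import urllib.request


def healthcheck_url(base_url: str = "http://localhost:5000") -> str:
    """Build the /healthcheck URL for a given API base URL."""
    return base_url.rstrip("/") + "/healthcheck"


def is_healthy(base_url: str = "http://localhost:5000", timeout: float = 5.0) -> bool:
    """Return True if GET /healthcheck answers with HTTP 200, else False."""
    try:
        with urllib.request.urlopen(healthcheck_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False
```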

Usage

Analyze an Audio File


  • URL: /analyze

  • Method: POST

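  • Data Params:

    • file (file): an audio file, uploaded as multipart form data
    • url (string): a link to the media to analyze, such as a YouTube video, sent in a JSON body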
  • Example:

    curl -X POST -H "Content-Type: application/json" -d '{"url": "https://www.youtube.com/watch?v=t4yWEt0OSpg"}' http://localhost:5000/analyze
    curl -X POST -F "file=@<path_to_your_audio_file>" http://localhost:5000/analyze
  • Success Response: 200

{
  "job_id": "0bc133cb-f519-40a1-96c6-46d2cfe9e4ad",
  "message": "Analysis started"
}
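
For callers working from Python rather than curl, the URL variant of this request might look like the sketch below. It mirrors the first curl example using only the standard library. `build_analyze_request` and `submit_url` are hypothetical helper names, and the code assumes the response body is the JSON object shown above with a `job_id` field.

```python
import json
import urllib.request


def build_analyze_request(
    media_url: str, base_url: str = "http://localhost:5000"
) -> urllib.request.Request:
    """Build a POST /analyze request carrying {"url": ...} as JSON."""
    body = json.dumps({"url": media_url}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_url(media_url: str, base_url: str = "http://localhost:5000") -> str:
    """Send the request and return the job_id from the JSON response."""
    with urllib.request.urlopen(build_analyze_request(media_url, base_url)) as resp:
        payload = json.load(resp)
    return payload["job_id"]
```

The returned `job_id` is what the status endpoint expects.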

Get Analysis Status

  • URL: /analyze/<job_id>

  • Method: GET

  • Example (preferred):

    curl http://localhost:5000/analyze/0bc133cb-f519-40a1-96c6-46d2cfe9e4ad
  • Alternative Example (legacy support):

    curl http://localhost:5000/analyze/?job_id=0bc133cb-f519-40a1-96c6-46d2cfe9e4ad
  • Success Response: 200

  • Example Content:

{
  "meta": {
    "job_id": "0bc133cb-f519-40a1-96c6-46d2cfe9e4ad",
    "job_type": "analyze",
    "message": "Analysis finished",
    "progress": "finished",
    "title": "General Relativity Explained in 7 Levels of Difficulty"
  },
  "result": {
    "transcript": [
      {
        "end_time": 11149,
        "speaker": "Speaker 0",
        "start_time": 7740,
        "text": "General relativity is a physics theory invented by Albert Einstein. "
      }
    ],
    "questions": [
      {
        "question": "What is general relativity?",
        "level": 1
      }
    ],
    "summary": "General relativity is a physics theory invented by Albert Einstein. It describes how gravity works in the universe. "
  }
}
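
Because analysis runs asynchronously, a client typically polls this endpoint until the job is done. The sketch below shows one way to do that. `fetch_status` and `wait_for_result` are hypothetical names, and the polling interval and attempt limit are arbitrary. The only `progress` value documented here is "finished", so the loop treats every other value as "still running"; how the engine reports failures is not covered and would need separate handling.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_status(job_id: str, base_url: str = "http://localhost:5000") -> dict:
    """GET /analyze/<job_id> and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url.rstrip('/')}/analyze/{job_id}") as resp:
        return json.load(resp)


def wait_for_result(
    job_id: str,
    fetch: Callable[[str], dict] = fetch_status,
    interval: float = 5.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until meta.progress == "finished", then return the "result" object."""
    for _ in range(max_attempts):
        status = fetch(job_id)
        if status.get("meta", {}).get("progress") == "finished":
            return status["result"]
        sleep(interval)  # assumption: any other progress value means "not done yet"
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} polls")
```

The `fetch` and `sleep` parameters exist so the loop can be exercised without a running server.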

For more examples, please refer to the Documentation
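
To illustrate how the `result` object above might be consumed, the sketch below turns the transcript segments into readable lines. `format_timestamp` and `format_transcript` are hypothetical helpers. The code assumes `start_time` is in milliseconds, which the example values suggest but the text does not state.

```python
from typing import List


def format_timestamp(ms: int) -> str:
    """Render a millisecond offset as MM:SS (assumes times are in milliseconds)."""
    minutes, seconds = divmod(ms // 1000, 60)
    return f"{minutes:02d}:{seconds:02d}"


def format_transcript(result: dict) -> List[str]:
    """Return one '[MM:SS] speaker: text' line per transcript segment."""
    lines = []
    for seg in result.get("transcript", []):
        start = format_timestamp(seg["start_time"])
        lines.append(f"[{start}] {seg['speaker']}: {seg['text'].strip()}")
    return lines
```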

Summarization


  • URL: /summarize

  • Method: POST

  • Data Params:

    • text (string)
  • Example:

    curl -X POST -H "Content-Type: application/json" -d '{"text": "This is the transcript that I want to have summarized."}' http://localhost:5000/summarize/
  • Success Response: 200 OK

    • Content:
      "This is the summary of the text that was passed in."

    Alternatively, you can pass in a transcript like so:

    [
      {
        "end_time": 2301,
        "speaker": "Speaker 0",
        "start_time": 1260,
        "text": "Why did you bring me here? "
      },
      {
        "end_time": 4263,
        "speaker": "Main Speaker",
        "start_time": 3242,
        "text": "I dont like going out. "
      }
    ]

    So the request would look like this:

    curl -X POST -H "Content-Type: application/json" -d '{"transcript": [{"end_time": 2301,"speaker": "Speaker 0","start_time": 1260,"text": "Why did you bring me here? "},{"end_time": 4263,"speaker": "Main Speaker","start_time": 3242,"text": "I dont like going out. "}]}' http://localhost:5000/summarize/
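
Both request shapes can be produced from one small helper. This is a sketch using only the standard library. `build_summarize_request` and `summarize` are hypothetical names. Requiring exactly one of `text` or `transcript` is a choice made for this sketch, since the engine's behavior when both are sent is not described here. The response is returned as raw text because its exact encoding is not specified.

```python
import json
import urllib.request
from typing import List, Optional


def build_summarize_request(
    text: Optional[str] = None,
    transcript: Optional[List[dict]] = None,
    base_url: str = "http://localhost:5000",
) -> urllib.request.Request:
    """Build a POST /summarize/ request from either plain text or a transcript."""
    if (text is None) == (transcript is None):
        raise ValueError("pass exactly one of text or transcript")
    payload = {"text": text} if text is not None else {"transcript": transcript}
    return urllib.request.Request(
        base_url.rstrip("/") + "/summarize/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(
    text: Optional[str] = None,
    transcript: Optional[List[dict]] = None,
    base_url: str = "http://localhost:5000",
) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_summarize_request(text, transcript, base_url)) as resp:
        return resp.read().decode("utf-8")
```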

Roadmap

  • Add Transcription Service

    • Use Whisper for transcription
    • Integrate WhisperX for faster transcription and diarization
    • Use Redis for better asynchronous processing
    • Add support for YouTube videos
    • Add support for more languages (unofficial)
  • Add Summarization Service

  • Add Question Categorization Service

    • Update service to use a fine-tuned version of Llama
    • Generate summaries of the transcript
    • Add support for more question types
    • Add support for more languages (works for summaries, not categorization)
  • Add Engagement Insights Service

    • Add support for more languages

See the open issues for a full list of proposed features (and known issues).

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

License

This project is licensed under the GNU GPLv3 License - see the LICENSE file for details. The GNU GPLv3 License is a free, copyleft license for software and other kinds of works.

Note that this license only applies to the engine. Please see the classifAI portal for more information on the license for the portal.

Contact

Learn About the Team

Project Link: https://github.com/TCU-ClassifAI/classifAI

View the Portal: https://classifai.tcu.edu/

Acknowledgments
