LCT_Hack_Yakutiya_2023

Summarizer

A class that summarizes input text. It is used only for document summarization.

T5 Summarizer - Language model for summarizing Russian texts. We use it to compress large documents.

Retriever

A class that searches for texts relevant to the user's query. The search uses SBERT embeddings and a FAISS index with the L2 metric. The Retriever also classifies each query into one of the following categories: chatbot interactions, upcoming city events, weather, traffic, and irrelevant queries (meaningless or obscene).

SBERT - Language model for constructing semantic vectors (embeddings) for user queries and documents.

CatBoost - Gradient boosting model for determining the topic of a question and detecting irrelevant and obscene queries.

FAISS - Library for nearest-neighbor search over the embeddings.
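The retrieval step described above can be illustrated with a minimal sketch. It is not the project's code: it replaces SBERT embeddings with hand-written toy vectors and replaces the FAISS index with a brute-force scan, so that only the L2 nearest-neighbor logic is shown. The names `l2_distance`, `search`, and `doc_vecs` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 2,
) -> List[Tuple[int, float]]:
    """Return the k closest documents as (index, distance), nearest first.

    Brute-force stand-in for an index lookup; a real index avoids
    re-scanning every document on each query.
    """
    scored = [(i, l2_distance(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Toy 2-dimensional "embeddings" (assumption: real SBERT vectors have
# hundreds of dimensions and come from the model, not from literals).
doc_vecs = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
hits = search([0.9, 1.2], doc_vecs, k=2)
for doc_id, dist in hits:
    print(doc_id, round(dist, 3))
```

With these toy vectors, the document at index 1 ranks first and the document at index 0 second. Note that a flat L2 index in FAISS reports squared distances, which gives the same ranking as the plain distances used here.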

Chat

A class that conducts a dialogue based on a given question and documents from the database.

FRED T5 Q&A - Language model for answering user questions based on provided documents.
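As a rough illustration of how a question and retrieved documents might be combined into a single model input, here is a minimal sketch. The prompt layout, the `build_prompt` function, and the `max_docs` parameter are assumptions made for this example and may differ from what the Chat class actually does.

```python
from typing import List


def build_prompt(question: str, documents: List[str], max_docs: int = 3) -> str:
    """Join up to max_docs documents and the question into one prompt string.

    Hypothetical layout: numbered documents first, then the question.
    """
    context = "\n".join(
        f"[{i + 1}] {doc.strip()}" for i, doc in enumerate(documents[:max_docs])
    )
    return f"Documents:\n{context}\n\nQuestion: {question.strip()}\nAnswer:"


prompt = build_prompt(
    "When does the festival start?",
    ["The festival starts on June 21.", "Buses run every 15 minutes."],
)
print(prompt)
```

Capping the number of documents keeps the input within the model's context limit; the value 3 is arbitrary.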