This repository contains a Python script for performing topic modeling on multilingual text data. It supports both Russian (using the DeepPavlov model) and other languages (using a multilingual BERT model). The code can automatically detect the language of the input text and choose the appropriate model for topic modeling.
- Supports text analysis using different topic modeling techniques.
- Multilingual support for both Russian and other languages.
- User-friendly web interface for easy interaction.
- Visualization of topics in a graph format.
To get started with TM_graph using Docker, follow these steps:
- Clone this repository to your local machine:
git clone https://github.com/Likich/TM_graph.git
- Navigate to the root of the cloned repository, which contains the Dockerfile:
cd TM_graph
- Build the Docker image:
docker build -t tm_graph .
- Run the container from the image:
docker run -d -p 5000:8080 --name tm_graph_container tm_graph
- Open a web browser and access the server at http://localhost:5000.
Note: The above command maps port 5000 on the host to port 8080 inside the container. If the application listens on a different port within the container, adjust the port mapping accordingly.
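The container may need a few seconds before the web interface responds. As a rough sketch, the check below polls the mapped address until the server answers. The function name `wait_for_server` and its retry parameters are illustrative assumptions and are not part of this repository; only the default URL comes from the instructions above.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url="http://localhost:5000", retries=10, delay=1.0):
    """Return True once `url` answers an HTTP request, False after `retries` failed attempts.

    Hypothetical helper: the name and parameters are assumptions for illustration.
    """
    for _ in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # The server replied, even if with an error status, so it is reachable.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: wait and try again.
            time.sleep(delay)
    return False


if __name__ == "__main__":
    print("server is up" if wait_for_server() else "server did not respond")
```

If the check keeps failing, `docker logs tm_graph_container` shows the container output and is usually the quickest way to see whether the port mapping matches the port the application listens on.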
- Load your text data into the server using the web interface.
- Choose your preferred topic modeling method (e.g., BERT or LDA) and the number of topics.
- Analyze and visualize your text data based on the selected method.
- Explore the topics in the graph.
The server provides multilingual support. It can automatically detect the language of the input text and select the appropriate model (DeepPavlov for Russian or multilingual BERT for other languages).
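The selection logic can be pictured with a small sketch. This is not the code used in this repository: it swaps in a simple share-of-Cyrillic-letters heuristic in place of a real language detector, and the function names, the threshold, and the two model identifiers are assumptions chosen for illustration.

```python
import unicodedata

# Assumed model identifiers, for illustration only; the repository may load different checkpoints.
RUSSIAN_MODEL = "DeepPavlov/rubert-base-cased"
MULTILINGUAL_MODEL = "bert-base-multilingual-cased"


def cyrillic_ratio(text):
    """Return the fraction of alphabetic characters in `text` that are Cyrillic."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    cyrillic = sum(1 for ch in letters if "CYRILLIC" in unicodedata.name(ch, ""))
    return cyrillic / len(letters)


def choose_model(text, threshold=0.5):
    """Pick the Russian model when most letters are Cyrillic, else the multilingual one."""
    if cyrillic_ratio(text) >= threshold:
        return RUSSIAN_MODEL
    return MULTILINGUAL_MODEL


if __name__ == "__main__":
    print(choose_model("Тематическое моделирование текстов"))
    print(choose_model("Topic modeling of text data"))
```

A script-based heuristic like this cannot tell Russian apart from other languages written in Cyrillic, such as Ukrainian or Bulgarian, which is one reason a dedicated language-detection library is the more robust choice in practice.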
- Make_graph_from_TM.py: Python script for creating the topic graph.
- TM.py: Contains the code for topic modeling and text analysis.
- app.py: The main server application script.
- Preprocess.py: Supports preprocessing for text analysis.
- lemmatizator_stem.py: Provides lemmatization support.
- static/ and templates/: Contain static files and HTML templates for the web interface.
The application is fully Dockerized for easy setup and deployment. Follow the Docker instructions above to build and run the application inside a Docker container.
Likich
Feel free to use and modify this server for your text analysis needs!
Note: Ensure that Docker is installed and running on your machine before building and running the Docker container.