This project focuses on topic extraction from news.
The goal of this project is to let the user process large amounts of news in a compressed form. The model we chose to explore is LDA (Latent Dirichlet Allocation). It is a soft-clustering algorithm: each document is assigned a mixture of topics rather than a single one. This makes it a natural choice for topic modeling, since a text usually covers more than one topic.
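To make the soft-clustering idea concrete, here is a minimal sketch of LDA fitted with collapsed Gibbs sampling, using only the Python standard library. It is not the implementation used in this project; the function name `fit_lda`, the hyperparameter defaults, and the toy documents are assumptions chosen for illustration.

```python
import random


def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200,
            top_n=3, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Returns (mixtures, top_words): mixtures[d] is the topic distribution
    of document d, top_words[k] lists the most frequent words of topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    # Count tables updated during sampling.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * V for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][i]
                # Remove the current token from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Probability of each topic given all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + V * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled topic.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1

    # Smoothed per-document topic mixtures (each row sums to 1).
    mixtures = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    top_words = [
        [vocab[i] for i in sorted(range(V), key=lambda i: -topic_word[k][i])[:top_n]]
        for k in range(num_topics)
    ]
    return mixtures, top_words


if __name__ == "__main__":
    # Hypothetical toy corpus; real input would be preprocessed news articles.
    docs = [
        "election vote parliament vote".split(),
        "match goal team goal".split(),
        "parliament vote team match".split(),
    ]
    mixtures, top_words = fit_lda(docs)
    for doc, mix in zip(docs, mixtures):
        print(doc, [round(p, 2) for p in mix])
    print(top_words)
```

Each row of `mixtures` is a probability distribution over topics, so a document that mixes vocabulary from several topics can receive non-trivial weight on more than one of them. For real news data a library implementation would be a more practical choice than this sketch.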
You can get the dataset that we used for training our model here
Since we use the pattern library, you may face the following error during installation:
EnvironmentError: mysql_config not found
Solution:
- Ubuntu/Debian based distros:
sudo apt-get install libmysqlclient-dev
or
sudo apt install default-libmysqlclient-dev
- Arch based distros:
Install libmysqlclient from AUR
- For Windows, macOS, and other distros, install mysqlclient in the way appropriate for your system.
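Before retrying the installation, it can help to confirm that the `mysql_config` tool named in the error is now on your `PATH`. The following is a small sketch using only the standard library; the helper name `find_mysql_config` is an assumption made for this example.

```python
import shutil


def find_mysql_config():
    """Return the full path to mysql_config if it is on PATH, else None."""
    return shutil.which("mysql_config")


if __name__ == "__main__":
    path = find_mysql_config()
    if path is None:
        print("mysql_config not found; install the MySQL client development package first")
    else:
        print("mysql_config found at", path)
```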
Olesia Tretiak | Hermann Yavorskyi |
---|---|
olesyat | wardady |
Ukrainian Catholic University. Ukraine. Lviv. 2020. Artificial Intelligence course. © 2019 Olesya Tretyak, Hermann Yavorskyi.