This repository includes:
- a parser for the https://tatar-inform.tatar/ website
- a comparison of Attention-Based Aspect Extraction (ABAE) and LDA models on a topic modeling task, using the collected dataset
You can find the full dataset, word2vec embeddings and model weights in [Download].
$ pip install -r requirements.txt
$ jupyter notebook
If you use the code, please consider citing the original paper:
@InProceedings{he-EtAl:2017:Long2,
author = {He, Ruidan and Lee, Wee Sun and Ng, Hwee Tou and Dahlmeier, Daniel},
title = {An Unsupervised Neural Attention Model for Aspect Extraction},
booktitle = {Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
month = {July},
year = {2017},
address = {Vancouver, Canada},
publisher = {Association for Computational Linguistics}
}
This repository builds on:
- Attention-Based Aspect Extraction implementation from https://github.com/alexeyev/abae-pytorch
- List of Tatar stopwords from https://github.com/aliiae/stopwords-tt
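
As a rough illustration of how a stopword list is typically applied before topic modeling, here is a minimal sketch. It is not taken from this repository's notebooks: the function names `tokenize` and `remove_stopwords` are hypothetical, and `EXAMPLE_STOPWORDS` is a tiny hand-picked set for demonstration, not the contents of the stopwords-tt list.

```python
import re

# Illustrative subset only (assumption); the real list lives in the
# stopwords-tt repository and is much longer.
EXAMPLE_STOPWORDS = {"һәм", "белән", "бу", "өчен"}


def tokenize(text):
    """Lowercase the text and split it into runs of Unicode word characters."""
    return re.findall(r"\w+", text.lower())


def remove_stopwords(tokens, stopwords):
    """Return the tokens that are not in the stopword set, preserving order."""
    return [token for token in tokens if token not in stopwords]


tokens = tokenize("Бу китап һәм дәфтәр")
filtered = remove_stopwords(tokens, EXAMPLE_STOPWORDS)
print(filtered)
```

In practice the stopword set would be loaded from the downloaded list rather than hard-coded, and the filtered tokens would be passed on to whichever topic model is being trained.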