This repository contains experiments on language modeling for text classification.
- Prepare train/test/val
- Get TM results
- Implement QRNN
- Implement BiQRNN
- Implement simple LSTM
- Implement BiLSTM
- Add CNN model
- Add SVM and XGBoost tools
- Initial classification results
- Implement VAE
- Prepare new test/train/ver with extracted entities
- Find out why loss becomes NaN in QRNN (too high learning rate)
- Preprocess embedding data to zero mean and unit variance (sketch below)
- Check on models_dir/model_1542027692.h5 weights (w2v embedding)
- Add cross-validation
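A minimal sketch of the standardization step mentioned above (zero mean, unit variance per embedding dimension), assuming the pretrained vectors are already loaded into a NumPy matrix; the file path in the comment is hypothetical.

```python
import numpy as np

def standardize_embeddings(embedding_matrix):
    """Scale a (vocab_size, dim) float matrix to zero mean and unit variance per dimension."""
    mean = embedding_matrix.mean(axis=0)
    std = embedding_matrix.std(axis=0)
    std[std == 0] = 1.0  # guard against constant dimensions
    return (embedding_matrix - mean) / std

# embedding_matrix = np.load("embeddings/w2v_matrix.npy")  # hypothetical path
# embedding_matrix = standardize_embeddings(embedding_matrix.astype("float32"))
```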
=======================================================================
- Prepare test/train/verif sets with fewer than 10 tokens
- Check processed comments (reprocess) and the other sets
- Prepare model with fastText embeddings (1 - without preprocessing)
- Prepare model with fastText embeddings (2 - lemmatized)
- Reduce the dictionary and substitute rare words with an OOV token (?) (sketch below)
- Change percentage of positive examples in training set (?)
- Tune model with hyperopt (sketch below)
- Check model_1542229255 on comments with more than 50 tokens
- Prepare report on language model
- Rewrite the code into a proper pipeline
- Add representation in latent space from VAE
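A rough sketch of the dictionary-reduction idea from the list above: count token frequencies on the training texts and map everything below a threshold to an OOV token. The threshold and the token string are assumptions.

```python
from collections import Counter

def build_reduced_vocab(tokenized_texts, min_count=5, oov_token="<oov>"):
    """Keep only words seen at least `min_count` times; everything else maps to the OOV token."""
    counts = Counter(token for text in tokenized_texts for token in text)
    vocab = {word for word, count in counts.items() if count >= min_count}
    return vocab, oov_token

def replace_rare_words(tokenized_texts, vocab, oov_token="<oov>"):
    return [[token if token in vocab else oov_token for token in text] for text in tokenized_texts]

# vocab, oov = build_reduced_vocab(train_tokens, min_count=5)
# train_tokens = replace_rare_words(train_tokens, vocab, oov)
```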
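For the hyperopt item, a hedged sketch of how the search could be wired up with `fmin`/`tpe`; the search space and the `build_and_evaluate` function (train a model with the given parameters and return the validation loss) are hypothetical placeholders for whatever model is being tuned.

```python
from hyperopt import fmin, tpe, hp, Trials, STATUS_OK

# Hypothetical search space for an LSTM/QRNN-style classifier
space = {
    "lr": hp.loguniform("lr", -9, -4),          # roughly 1e-4 .. 2e-2
    "dropout": hp.uniform("dropout", 0.1, 0.6),
    "units": hp.choice("units", [64, 128, 256]),
}

def objective(params):
    # build_and_evaluate() is an assumed helper that trains and returns validation loss
    val_loss = build_and_evaluate(**params)
    return {"loss": val_loss, "status": STATUS_OK}

trials = Trials()
best = fmin(objective, space, algo=tpe.suggest, max_evals=50, trials=trials)
print(best)
```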
=======================================================================
- Try simple BiLSTM
- Find out what is happening inside the neural network (LIME; sketch below)
- Prepare ELMo embeddings on raw texts (in progress)
- Add context (?)
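For the LIME item, a sketch of inspecting a trained classifier with `lime`'s text explainer. It assumes the Keras `tokenizer`, `model` and `MAX_LEN` from the training code are in scope and that the model outputs a single sigmoid probability; those names are assumptions.

```python
import numpy as np
from lime.lime_text import LimeTextExplainer
from keras.preprocessing.sequence import pad_sequences

def predict_proba_on_texts(texts):
    """Wrap the Keras model so LIME can call it on raw strings."""
    seqs = pad_sequences(tokenizer.texts_to_sequences(texts), maxlen=MAX_LEN)
    pos = model.predict(seqs)            # assumed shape (n, 1): probability of the positive class
    return np.hstack([1.0 - pos, pos])   # LIME expects probabilities for every class

explainer = LimeTextExplainer(class_names=["negative", "positive"])
# exp = explainer.explain_instance(some_comment, predict_proba_on_texts, num_features=10)
# exp.save_to_file("lime_explanation.html")
```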
=======================================================================
More data
- Divide into chunks
- Add TripAdvisor processed comments to train/test
- Introduce new train/test/ver v5
Cleaner data
- Fix dirty data issue (html, phone instead of id); cleaning sketch below
- Create set with source labels (TA, PS_pos, PS_neg, OR_pos, OR_neg, OT)
- Create synonym replacement
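A possible starting point for the dirty-data fix above (strip leftover HTML, mask phone numbers that ended up where ids should be); the regexes are rough guesses and will need tuning on the real comments.

```python
import re

HTML_TAG = re.compile(r"<[^>]+>")
PHONE = re.compile(r"\+?\d[\d\-\s()]{7,}\d")   # rough phone-number pattern

def clean_comment(text):
    text = HTML_TAG.sub(" ", text)        # drop html remnants
    text = PHONE.sub(" <phone> ", text)   # replace phone numbers with a placeholder token
    return re.sub(r"\s+", " ", text).strip()

# cleaned = [clean_comment(c) for c in raw_comments]
```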
- BiQRNN fastText/w2v (time, results, loss plot)
- BiLSTM fastText/w2v (time, results, loss plot)
- VAE fastText/w2v: latent-space clustering (if possible), example transition from a negative to a positive comment (interpolation sketch below) (time, results, loss plot)
- Optimization with hyperopt
- Experiment with ELMo embeddings, pretrained or fine-tuned (if possible); not sure how yet (perhaps feed them directly to the input)
- Check how fastText behaves on unlemmatized input for the best-performing models above (prepare this input)
- Try hierarchical attention network
- Create a new verification set with 500 samples (50 positive, the rest negative)
- Look at how comment length is distributed across correctly and incorrectly classified examples
- Change fit to fit_generator and add batch generation (generator sketch below)
- Calculate averaged word embeddings (sketch below, combined with cross-validation)
- Perform cross-validation for XGBoost and SVM
- Obtain results for short and long texts
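For the VAE transition item, a sketch of linear interpolation in latent space. It assumes a separate Keras `encoder` that returns the latent mean with shape `(1, latent_dim)` and a `decoder` that returns per-position token probabilities; both models and the `ids_to_text` helper are hypothetical names.

```python
import numpy as np

def interpolate_in_latent_space(encoder, decoder, seq_neg, seq_pos, steps=8):
    """Walk linearly between the latent codes of a negative and a positive comment."""
    z_neg = encoder.predict(seq_neg[None, :])[0]   # assumed latent mean, shape (latent_dim,)
    z_pos = encoder.predict(seq_pos[None, :])[0]
    for alpha in np.linspace(0.0, 1.0, steps):
        z = (1.0 - alpha) * z_neg + alpha * z_pos
        decoded = decoder.predict(z[None, :])      # assumed shape (1, max_len, vocab)
        yield decoded.argmax(axis=-1)[0]           # most likely token ids per position

# for token_ids in interpolate_in_latent_space(encoder, decoder, neg_seq, pos_seq):
#     print(ids_to_text(token_ids))                # ids_to_text is another assumed helper
```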
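For the fit → fit_generator item, a sketch of a simple infinite batch generator over pre-padded NumPy arrays; `x_train`, `y_train`, `x_val`, `y_val` and `model` are assumed from the training code.

```python
import numpy as np

def batch_generator(sequences, labels, batch_size=64, shuffle=True):
    """Yield (x, y) batches forever, as expected by fit_generator."""
    n = len(sequences)
    indices = np.arange(n)
    while True:
        if shuffle:
            np.random.shuffle(indices)
        for start in range(0, n, batch_size):
            batch_idx = indices[start:start + batch_size]
            yield sequences[batch_idx], labels[batch_idx]

# steps = int(np.ceil(len(x_train) / 64))
# model.fit_generator(batch_generator(x_train, y_train, 64),
#                     steps_per_epoch=steps, epochs=10,
#                     validation_data=(x_val, y_val))
```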
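For the averaged-embeddings and cross-validation items, one combined sketch: represent each comment as the mean of its word vectors and evaluate SVM and XGBoost with `cross_val_score`. Here `w2v` is assumed to be a dict-like word → vector lookup (e.g. loaded gensim vectors) and `dim` its dimensionality.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from xgboost import XGBClassifier

def average_embedding(tokens, w2v, dim):
    """Mean of the word vectors of a comment; zero vector if nothing is in the vocabulary."""
    vectors = [w2v[t] for t in tokens if t in w2v]
    return np.mean(vectors, axis=0) if vectors else np.zeros(dim)

# X = np.stack([average_embedding(t, w2v, dim) for t in train_tokens])
# y = np.array(train_labels)
# for name, clf in [("svm", SVC(kernel="rbf")), ("xgboost", XGBClassifier())]:
#     scores = cross_val_score(clf, X, y, cv=5, scoring="f1")
#     print(name, scores.mean(), scores.std())
```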
## Notes
- Most of the examples in the test set (15 samples from a VK group with negative comments) were misclassified due to the ORG tag
- Pretrained LM for Russian: https://github.com/ppleskov/Russian-Language-Model
- Stephen Merity, Nitish Shirish Keskar, and Richard Socher. Regularizing and Optimizing LSTM Language Models. arXiv:1708.02182, 2017. http://arxiv.org/abs/1708.02182
- Jeremy Howard and Sebastian Ruder. Universal Language Model Fine-tuning for Text Classification. ACL, 2018.
- James Bradbury, Stephen Merity, Caiming Xiong, and Richard Socher. Quasi-Recurrent Neural Networks. arXiv:1611.01576, 2016.
- Andrey Kutuzov and Elizaveta Kuzmenko. WebVectors: A Toolkit for Building Web Interfaces for Vector Semantic Models. In: Analysis of Images, Social Networks and Texts (AIST 2016), CCIS vol. 661, Springer, Cham, 2017.
- Augustus Odena and Ian Goodfellow. TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing. arXiv:1807.10875, 2018.
- Andrej Karpathy, Justin Johnson, and Fei-Fei Li. Visualizing and Understanding Recurrent Networks. arXiv preprint, 2015.
- Stephen Merity, Nitish Shirish Keskar, and Richard Socher. An Analysis of Neural Language Modeling at Multiple Scales. arXiv:1803.08240, 2018.