This is a survey on deep learning models for text classification and will be updated frequently with testing and evaluation on different datasets.
Natural Language Processing tasks (part-of-speech tagging, chunking, named entity recognition, text classification, etc.) have seen a tremendous amount of research over the decades. Text classification has been the most competed NLP task on Kaggle and similar competitions. Count-based models are being phased out as new deep learning models emerge almost every month. This project is an attempt to survey most of the neural models for the text classification task. The selected models, based on CNNs and RNNs, are explained with code (Keras and TensorFlow) and block diagrams. The models are evaluated on a medical dataset from one of the Kaggle competitions.
Update: Non-stop training and power issues in my geographic location burned out my motherboard. By the time I had done 2 RMAs with ASROCK and gotten the system up and running, the competition was over :( but I still learned a lot.
- Download and install anaconda3, say at `~/Programs/anaconda3`.
- Create a virtual environment using `cd ~/Programs/anaconda3 && mkdir envs` and `cd envs && ../bin/conda create -p ~/Programs/anaconda3/envs/dsotc-c3 python=3.6 anaconda`.
- Activate the environment: `source /home/bicepjai/Programs/anaconda3/envs/dsotc-c3/bin/activate dsotc-c3`
- Install `~/Programs/anaconda3/envs/dsotc-c3/bin/pip` using `conda install pip` (anaconda has issues with using pip, so use the full path).
- Execute `pip install -r requirements.txt` to install all dependencies.
- For enabling jupyter extensions: `jupyter nbextensions_configurator enable --user`
- For enabling configuration options: `jupyter contrib nbextension install --user`
- Some extensions to enable: `Collapsible Headings`, `ExecuteTime`, `Table of Contents`
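After the setup steps, a quick check from inside Python can confirm that the notebook kernel is really running in the `dsotc-c3` environment with Python 3.6. This is only a minimal sketch: the function name `environment_problems` is hypothetical, and matching on the end of `sys.prefix` assumes the environment was created at the path used above.

```python
import sys

EXPECTED_ENV = "dsotc-c3"   # environment name used in the setup steps
EXPECTED_PYTHON = (3, 6)    # version passed to `conda create`


def environment_problems(prefix, version_info,
                         env_name=EXPECTED_ENV, python=EXPECTED_PYTHON):
    """Return a list of human-readable problems; empty means the setup looks fine."""
    problems = []
    # The interpreter prefix of a conda environment ends with the environment name.
    if not prefix.rstrip("/").endswith(env_name):
        problems.append(
            "interpreter prefix {!r} is not the {!r} environment".format(prefix, env_name))
    # Compare only major.minor, since the patch version is not pinned.
    if tuple(version_info[:2]) != tuple(python):
        problems.append(
            "Python {}.{} found, expected {}.{}".format(
                version_info[0], version_info[1], python[0], python[1]))
    return problems


if __name__ == "__main__":
    for problem in environment_problems(sys.prefix, sys.version_info):
        print(problem)
```

Running it in a notebook cell or as a script prints nothing when both checks pass.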
Now we should be ready to run this project and perform reproducible research. The details regarding the machine used for training can be found here
Version reference for some important packages used:
- Keras==2.0.8
- tensorflow-gpu==1.3.0
- tensorflow-tensorboard==0.1.8
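To compare an installed environment against the versions listed above, something like the following could be used. It is a sketch under assumptions: `find_mismatches` and `installed_version` are hypothetical helper names, and on Python 3.6 the standard-library `importlib.metadata` is not available, so the code falls back to the `importlib_metadata` backport (which would need to be installed separately).

```python
try:
    from importlib import metadata          # Python 3.8+
except ImportError:                          # e.g. the Python 3.6 environment
    import importlib_metadata as metadata    # assumption: backport is installed

# Versions copied from the list above.
PINNED = {
    "Keras": "2.0.8",
    "tensorflow-gpu": "1.3.0",
    "tensorflow-tensorboard": "0.1.8",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map package name -> (wanted, found) for every package that differs."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINNED).items():
        print("{}: expected {}, found {}".format(name, wanted, found or "not installed"))
```

The `lookup` argument exists only so the comparison can be exercised without the packages being installed.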
Details regarding the data used can be found here
This project is completed and the documentation can be found here. The papers explored in this project:
- Convolutional Neural Networks for Sentence Classification, Yoon Kim (2014)
- A Convolutional Neural Network for Modelling Sentences, Nal Kalchbrenner, Edward Grefenstette, Phil Blunsom (2014)
- Medical Text Classification using Convolutional Neural Networks, Mark Hughes, Irene Li, Spyros Kotoulas, Toyotaro Suzumura (2017)
- Very Deep Convolutional Networks for Text Classification, Alexis Conneau, Holger Schwenk, Loïc Barrault, Yann Lecun (2016)
- Rationale-Augmented Convolutional Neural Networks for Text Classification, Ye Zhang, Iain Marshall, Byron C. Wallace (2016)
- Multichannel Variable-Size Convolution for Sentence Classification, Wenpeng Yin, Hinrich Schütze (2016)
- MGNC-CNN: A Simple Approach to Exploiting Multiple Word Embeddings for Sentence Classification, Ye Zhang, Stephen Roller, Byron Wallace (2016)
- Generative and Discriminative Text Classification with Recurrent Neural Networks, Dani Yogatama, Chris Dyer, Wang Ling, Phil Blunsom (2017)
- Deep Sentence Embedding Using Long Short-Term Memory Networks: Analysis and Application to Information Retrieval, Hamid Palangi, Li Deng, Yelong Shen, Jianfeng Gao, Xiaodong He, Jianshu Chen, Xinying Song, Rabab Ward
- Multiplicative LSTM for sequence modelling, Ben Krause, Liang Lu, Iain Murray, Steve Renals (2016)
- Hierarchical Attention Networks for Document Classification, Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, Eduard Hovy (2016)
- Recurrent Convolutional Neural Networks for Text Classification, Siwei Lai, Liheng Xu, Kang Liu, Jun Zhao (2015)
- Ensemble Application of Convolutional and Recurrent Neural Networks for Multi-label Text Categorization, Guibin Chen, Deheng Ye, Zhenchang Xing, Jieshan Chen, Erik Cambria (2017)
- A C-LSTM Neural Network for Text Classification
- Combination of Convolutional and Recurrent Neural Network for Sentiment Analysis of Short Texts, Xingyou Wang, Weijie Jiang, Zhiyong Luo (2016)
- AC-BLSTM: Asymmetric Convolutional Bidirectional LSTM Networks for Text Classification, Depeng Liang, Yongdong Zhang (2016)
- Character-Aware Neural Language Models, Yoon Kim, Yacine Jernite, David Sontag, Alexander M. Rush (2015)
- more paper-implementations on the way ...
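As a small illustration of the building block in the first paper of the list, Kim (2014) slides filters over windows of word vectors and keeps each filter's maximum response (max-over-time pooling). The sketch below is a plain NumPy forward pass, not the Keras/TensorFlow code of this project; the function name, the array shapes, and the choice of ReLU as the nonlinearity are assumptions made for the example.

```python
import numpy as np


def conv_max_over_time(embedded, filters, bias):
    """Convolve filters over a sentence and max-pool over time.

    embedded: (seq_len, emb_dim) word vectors of one sentence
    filters:  (n_filters, width, emb_dim) convolution filters
    bias:     (n_filters,) one bias per filter
    returns:  (n_filters,) one pooled feature per filter
    """
    seq_len, _ = embedded.shape
    _, width, _ = filters.shape
    if seq_len < width:
        raise ValueError("sentence is shorter than the filter width")
    # All windows of `width` consecutive word vectors: (n_windows, width, emb_dim).
    windows = np.stack([embedded[i:i + width] for i in range(seq_len - width + 1)])
    # Dot product of every window with every filter: (n_windows, n_filters).
    feature_map = np.tensordot(windows, filters, axes=([1, 2], [1, 2])) + bias
    feature_map = np.maximum(feature_map, 0.0)  # ReLU (assumed nonlinearity)
    # Max-over-time pooling keeps the strongest response of each filter.
    return feature_map.max(axis=0)


# Toy input: a 4-word sentence with 3-dimensional embeddings and two width-2 filters.
sentence = np.arange(12, dtype=float).reshape(4, 3)
toy_filters = np.stack([np.ones((2, 3)), -np.ones((2, 3))])
pooled = conv_max_over_time(sentence, toy_filters, np.zeros(2))
```

In a full model, the pooled features from filters of several widths would be concatenated and passed to a dense softmax layer.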