- Motivation: Efficiency
  - Provides a lower-dimensional semantic representation of the news article.
- Motivation: Interpretability
  - Provides topic-based features.
(1) We provide interpretability by incorporating Bayesian topic modelling, inferring topic compositions of news articles as added features for classification.
(2) Our model works in the data-scarce scenario where only textual content is available.
(3) We keep our model efficient by coupling a deep architecture (a VAE) to LDA.
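The coupling of (1) and (3) amounts to concatenating the VAE's $n_f$ latent features with the $K$ LDA topic proportions as classifier input. A minimal NumPy sketch (the array sizes and random inputs are hypothetical placeholders, not the paper's actual values):

```python
import numpy as np

rng = np.random.default_rng(0)

n_f, K = 32, 10                       # hypothetical latent size and topic count
z = rng.standard_normal(n_f)          # VAE encoder latents for one article
theta = rng.dirichlet(np.ones(K))     # LDA topic composition (sums to 1)

# Classifier input: semantic latents plus interpretable topic features
features = np.concatenate([z, theta])
assert features.shape == (n_f + K,)
```

The topic proportions `theta` are the interpretable part: each entry is the inferred weight of one topic in the article.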
Notation:

- $\mathcal{D}$: dataset
- $\mathcal{D}_{tr}$: training set
- $\mathcal{D}_{te}$: test set
- $N$: number of samples, indexed by $i$
- $V$: the vocabulary set detected by word2vec
- $n_f$: number of latent features obtained from the encoder
- $w$: word2vec dimension
- $L = \max\{l_i : i = 1, \dots, N\}$: maximum sample length
- $l_i$: length of sample $i$ (number of words)
- $t_i^{(j)}$: word $j$ in sample $i$
- $\lambda_1$: regularization parameter ($= 0.05$)
- $\lambda_2$: regularization parameter ($= 0.3$)
- $K$: number of topics
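To make the sequence notation concrete, a toy sketch (hypothetical word indices) of padding samples of length $l_i$ to the common length $L = \max\{l_i\}$ before the embedding layer:

```python
import numpy as np

# Toy word-index sequences for N = 3 samples (hypothetical values)
samples = [[4, 7, 2], [9, 1], [3, 5, 8, 6]]
L = max(len(s) for s in samples)  # L = max{l_i} = 4

# Zero-pad every sample to length L
padded = np.zeros((len(samples), L), dtype=int)
for i, s in enumerate(samples):
    padded[i, :len(s)] = s

print(padded.shape)  # (3, 4)
```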
VAE encoder:

| Layer | Output Shape | Param # | Other Setting |
|---|---|---|---|
| Input | [(None, …)] | 0 | |
| Embedding | (None, …) | … | Non-trainable (word2vec) |
| Bi. LSTM | (None, …) | … | activation='tanh' |
| Bi. LSTM | (None, …) | … | activation='tanh' |
| Dense | (None, …) | … | activation='tanh' |
| Dense (…) | (None, …) | … | activation='tanh' |
| Sampling | (None, …) | 0 | |
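The final `Sampling` layer carries no parameters, which is consistent with the standard VAE reparameterization trick. A minimal NumPy sketch, assuming (not stated in the table) that the two dense heads output the mean and log-variance of the latent Gaussian:

```python
import numpy as np

rng = np.random.default_rng(0)

def sampling(z_mean, z_log_var):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(z_mean.shape)
    return z_mean + np.exp(0.5 * z_log_var) * eps

n_f = 32                       # hypothetical latent size
z_mean = np.zeros(n_f)
z_log_var = np.zeros(n_f)      # log-variance 0, i.e. sigma = 1
z = sampling(z_mean, z_log_var)
assert z.shape == (n_f,)
```

Drawing `eps` outside the network keeps the sampling step differentiable with respect to `z_mean` and `z_log_var`, which is why the layer itself has zero trainable parameters.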
VAE decoder:

| Layer | Output Shape | Param # | Other Setting |
|---|---|---|---|
| Input | [(None, …)] | 0 | |
| Dense | (None, …) | … | activation='tanh' |
| Repeat Vector | (None, …) | 0 | |
| LSTM | (None, …) | … | activation='tanh' |
| LSTM | (None, …) | … | activation='tanh' |
| Time Dist. | (None, …) | … | activation='softmax' |
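The `Repeat Vector` layer (0 parameters) tiles the latent vector once per time step so the LSTM decoder can emit one softmax word distribution at each of the $L$ positions. A NumPy sketch with hypothetical sizes:

```python
import numpy as np

n_f, L = 32, 100          # hypothetical latent size and sequence length
z = np.arange(n_f, dtype=float)

# RepeatVector: (n_f,) -> (L, n_f), one identical copy per decoder time step
repeated = np.tile(z, (L, 1))
assert repeated.shape == (L, n_f)
assert np.all(repeated[0] == repeated[-1])
```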
Classifier:

| Layer | Output Shape | Param # | Other Setting |
|---|---|---|---|
| Input | [(None, …)] | 0 | |
| Dense | (None, …) | … | activation='tanh' |
| Dense | (None, …) | … | activation='tanh' |
| Output | (None, 1) | … | activation='sigmoid' |
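The classifier head reduces to two tanh layers followed by a single sigmoid unit for binary (fake/real) prediction. A minimal forward-pass sketch; the layer widths and random weights are hypothetical stand-ins for the trained values:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

d_in, h1, h2 = 42, 64, 32   # hypothetical widths (e.g. n_f + K input features)
W1, b1 = rng.standard_normal((h1, d_in)) * 0.1, np.zeros(h1)
W2, b2 = rng.standard_normal((h2, h1)) * 0.1, np.zeros(h2)
W3, b3 = rng.standard_normal((1, h2)) * 0.1, np.zeros(1)

x = rng.standard_normal(d_in)    # one feature vector
h = np.tanh(W1 @ x + b1)         # Dense, activation='tanh'
h = np.tanh(W2 @ h + b2)         # Dense, activation='tanh'
p = sigmoid(W3 @ h + b3)         # Output, activation='sigmoid'
assert 0.0 < p[0] < 1.0          # probability of the positive class
```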