Interpretable Word Embedding Classifiers

Introduction

Word embeddings can be incredibly useful as features in classification models. They are dense, low-dimensional representations of words, created in such a way that words used in similar contexts receive similar mathematical representations. Unlike bag-of-words approaches, word embeddings can capture semantic relationships between words, and they are fairly lightweight compared to embeddings from more complex models like ELMo and BERT.
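For instance, once the GloVe vectors loaded later in this notebook are available, two sports terms score a noticeably higher cosine similarity than a sports term paired with a politics term (a minimal illustration; the exact values depend on the embedding used):

glove_vectors.similarity('rugby', 'tennis')      # relatively high: both are sports
glove_vectors.similarity('rugby', 'parliament')  # relatively low: unrelated domains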

Despite their usefulness and strong performance, word embeddings can be difficult to use in classification tasks where it is vital that researchers can explain the data-generating process, not just provide an accurate prediction. The dimensions of a word embedding are not in themselves interpretable, and researchers therefore often shy away from using them.

In this notebook I will present a simple framework for adding interpretability to classifier models based on word embeddings: a process for working our way back from the predictions of a classifier that takes word embeddings as input to the individual words or terms the model seems to be using to discriminate between your classes.

The approach can fairly simply be summarised as follows (a code sketch follows the list):

  • Train a classifier that takes document-level embeddings as input.
  • Create a vocabulary matrix with individual word embeddings for all the words found in your corpus.
  • Run the vocabulary matrix through the trained classifier and pick out the individual words the model is most confident belong to each class.
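In code, the skeleton of the approach looks roughly like this (a sketch only; the placeholder names doc_matrix, labels, vocab and vocab_matrix are all built for real in the sections below):

from sklearn.linear_model import LogisticRegression  # any probabilistic classifier will do

clf = LogisticRegression()
clf.fit(doc_matrix, labels)                         # 1. train on document-level embeddings
word_probs = clf.predict_proba(vocab_matrix)[:, 0]  # 2.-3. score every vocabulary word
ranked = sorted(zip(vocab, word_probs), key=lambda t: t[1])  # most discriminative words sit at the extremes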

In the parts that follow I will demonstrate how this can be done by training a classifier to distinguish whether a newspaper headline comes from a politics article or a sports article.

Get News Headlines from The Guardian API

First, we need to get our data. The Guardian has a really good API for this purpose, so we can pull all politics and sports articles from the Guardian between two arbitrary points in time, say March and May 2018.

from GuardianAPI import GuardianAPI
import pandas as pd
gapi = GuardianAPI()
api_key: ········
# Pull all sport and politics articles published between 1 March and 30 May 2018
sports_json = gapi.get_content({'section':'sport', 'from-date':'2018-03-01', 'to-date':'2018-05-30', 'code':'uk', 'page-size':50})
politics_json = gapi.get_content({'section':'politics', 'from-date':'2018-03-01', 'to-date':'2018-05-30', 'code':'uk', 'page-size':50})
df = pd.DataFrame(sports_json + politics_json)
df.sample(5)
| | id | type | sectionId | sectionName | webPublicationDate | webTitle | webUrl | apiUrl | isHosted | pillarId | pillarName |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 401 | sport/2018/may/06/justify-kentucky-derby-mende... | article | sport | Sport | 2018-05-06T00:08:06Z | Justify reigns in the rain at Kentucky Derby a... | https://www.theguardian.com/sport/2018/may/06/... | https://content.guardianapis.com/sport/2018/ma... | False | pillar/sport | Sport |
| 26 | sport/2018/may/28/marco-trungelliti-road-trip-... | article | sport | Sport | 2018-05-28T21:13:10Z | Marco Trungelliti makes 500-mile family road t... | https://www.theguardian.com/sport/2018/may/28/... | https://content.guardianapis.com/sport/2018/ma... | False | pillar/sport | Sport |
| 2207 | politics/2018/mar/11/labour-mps-should-not-app... | article | politics | Politics | 2018-03-11T11:15:13Z | Labour MPs should not appear on Russia Today, ... | https://www.theguardian.com/politics/2018/mar/... | https://content.guardianapis.com/politics/2018... | False | pillar/news | News |
| 2205 | politics/2018/mar/11/scottish-labour-conferenc... | article | politics | Politics | 2018-03-11T15:43:53Z | Scottish Labour conference backs Corbyn on Brexit | https://www.theguardian.com/politics/2018/mar/... | https://content.guardianapis.com/politics/2018... | False | pillar/news | News |
| 316 | sport/2018/may/11/guildford-extend-record-run-... | article | sport | Sport | 2018-05-11T18:28:33Z | Chess: Guildford extend their record unbeaten ... | https://www.theguardian.com/sport/2018/may/11/... | https://content.guardianapis.com/sport/2018/ma... | False | pillar/sport | Sport |

Pre-Trained Word Embeddings

We now need to get pre-trained word embeddings from gensim's downloader API, and then create an embedding that represents each headline by taking the average of the embeddings of all the words used in that headline.

# Import dependencies
import gensim.downloader as api
import re
import numpy as np
from nltk.corpus import stopwords
# Get pre-trained GloVe embeddings (300 dimensions)
glove_vectors = api.load('glove-wiki-gigaword-300')
# Get a set of English stopwords
StopWords = set(stopwords.words('english'))
def text_preprocess(text, word_vectors, StopWords):
    tmp_txt = text.lower() # lowercase all text
    tmp_txt = re.sub(r'\!', ' ! ', tmp_txt) # pad each exclamation mark with spaces so it becomes its own token
    tmp_txt = re.sub(r'\?', ' ? ', tmp_txt) # pad each question mark with spaces so it becomes its own token
    if StopWords:
        tokens = [w for w in tmp_txt.split() if w in word_vectors.vocab and w not in StopWords]
    else:
        tokens = [w for w in tmp_txt.split() if w in word_vectors.vocab]
    return tokens
def doc_vector(tokens, word_vectors):
    # Average the embeddings of all tokens into one document-level vector
    if not tokens: # guard against headlines where no tokens survive preprocessing
        return np.zeros(word_vectors.vector_size)
    return np.mean(np.array([word_vectors[word] for word in tokens]), axis = 0)
def concat_texts(corpus, word_vectors):
    # Stack the document vectors into one (n_documents x n_dimensions) matrix
    out_matrix = np.empty((len(corpus), word_vectors.vector_size), float)
    for row, txt in enumerate(corpus):
        out_matrix[row] = doc_vector(txt, word_vectors)
    return out_matrix
headlines_processed = [text_preprocess(x, glove_vectors, StopWords) for x in df.webTitle]
headline_matrix = concat_texts(headlines_processed, glove_vectors)
headline_matrix.shape
(2293, 300)

Split data into Training and Test Set

# Call dependency
from sklearn.model_selection import train_test_split
# Split data into train and test
X_train, X_test, y_train, y_test = train_test_split(headline_matrix, 
                                                    df.sectionId, 
                                                    test_size=0.2, 
                                                    random_state=123)
print('Training Examples: ', X_train.shape, "\n"
      'Test Examples: ', X_test.shape)
Training Examples:  (1834, 300) 
Test Examples:  (459, 300)

Get Vocabulary Matrix

We then need a matrix of the embeddings of the individual words used in our corpus, which we can use to work our way back from a model prediction to a set of terms that explains those predictions. We can build it by tokenising all the headlines in our corpus, collecting the tokens into a set, and looking up the embedding for each one.

# Collect every unique in-vocabulary word used across the headlines
vocab = set()
for headline in df.webTitle:
    vocab.update([w.lower() for w in headline.split() if w.lower() in glove_vectors.vocab])
vocab = list(vocab)
# Look up the embedding for each vocabulary word
vocab_matrix = np.array([glove_vectors[w] for w in vocab])

Train Models

from sklearn.metrics import classification_report
import matplotlib.pyplot as plt

Support Vector Machine

A support vector machine attempts to find a line or, more generally, a hyperplane that separates the data points of each class with the largest possible margin.

One way to understand what our model is picking up on is to calculate the distance of each word in our vocabulary to the decision plane. Words that are far from the decision plane are presumably more discriminative, and should therefore give us a clue as to what the model is picking up on. Conversely, words that are very close to the separating line should be non-discriminative: either general vocabulary or terms typically used in both domains.

We can get the signed distance of each word from the decision boundary by dividing the decision-function value of its embedding by the L2 norm of the coefficient vector: distance(x) = f(x) / ||w||₂, where f(x) = w·x + b is the SVM decision function and w is svm.coef_.

Below is a two-dimensional illustration of what we expect to see, but the intuition generalises to higher dimensions.

[Figure: word vectors on either side of an SVM decision boundary, with discriminative words lying far from the boundary]

Train and Fit Model

from sklearn.svm import SVC
svm = SVC(kernel = 'linear', C = 1, class_weight='balanced')
svm.fit(X_train, y_train)
SVC(C=1, break_ties=False, cache_size=200, class_weight='balanced', coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='scale', kernel='linear',
    max_iter=-1, probability=False, random_state=None, shrinking=True,
    tol=0.001, verbose=False)

Predict on unseen data

y_pred = svm.predict(X_test)
print(classification_report(y_test, y_pred))
              precision    recall  f1-score   support

    politics       0.91      0.97      0.94       158
       sport       0.98      0.95      0.97       301

    accuracy                           0.96       459
   macro avg       0.95      0.96      0.95       459
weighted avg       0.96      0.96      0.96       459

Calculate distance from each word to the decision boundary

dist = svm.decision_function(vocab_matrix) / np.linalg.norm(svm.coef_)
distance_df = pd.DataFrame({'word':vocab, 'distance':dist})
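
Note that the sign convention follows the (alphabetical) order of svm.classes_: positive distances fall on the 'sport' side of the boundary and negative distances on the 'politics' side. A quick check:

svm.classes_  # array(['politics', 'sport'], dtype=object)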

Sort Vocabulary Words by distance to the decision boundary

distance_df.sort_values('distance', ascending = False).head(10)
word distance
2582 sevens 2.561792
82 a-league 2.385136
2350 rugby 2.384202
1203 nrl 2.268408
1135 paralympic 2.252786
213 championship 2.213799
2177 cycling 2.199589
4190 tennis 2.159167
3435 tournament 2.144038
755 squad 2.125921

We can clearly see that the words furthest from the decision boundary on the positive side are all sports-related words like rugby, paralympic and tournament. This suggests that these words, and words similar to them, are particularly important for the model when recognising a headline as a sports headline.

Looking at the words furthest from the decision boundary on the negative side, these are, as expected, political terms like minister, labour and parliament. Again, this tells us that our model focuses on terms similar to these when deciding whether to classify a headline as a politics headline.

distance_df.sort_values('distance', ascending = True).head(10)
word distance
803 minister -2.222333
3697 labour -2.092312
1654 parliament -2.006268
2021 liberal -1.884801
4103 tory -1.868131
508 party -1.847066
2063 conservatives -1.814579
2584 chancellor -1.798007
5290 tories -1.710996
5216 conservative -1.624608

Finally, we can look at words that are close to the decision boundary. Here we find, as hypothesised, a set of words that are either associated with neither politics nor sports, like credentials or barrel, or associated with both, like appeal or affected.

distance_df.loc[distance_df.distance.between(-0.01, 0.01)].head(10)
word distance
18 launch 0.001324
117 appeal -0.008476
145 vital 0.009059
195 bale -0.002664
303 affected -0.005008
329 mcnicholl -0.009333
354 credentials 0.001708
473 barrel -0.005074
518 anything 0.007034
571 aims 0.007861
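
The same machinery also lets us explain a single prediction: scoring the tokens of one headline individually shows which words pulled it towards its class. Below is a minimal sketch reusing the helpers defined earlier:

# Score each token of one headline against the trained SVM
headline = df.webTitle.iloc[0]
tokens = text_preprocess(headline, glove_vectors, StopWords)
token_vecs = np.array([glove_vectors[w] for w in tokens])
token_dist = svm.decision_function(token_vecs) / np.linalg.norm(svm.coef_)
# Negative distances pull towards 'politics', positive towards 'sport'
for word, d in sorted(zip(tokens, token_dist), key=lambda t: t[1]):
    print(f'{word:>15} {d:+.3f}')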

Logistic Regression

We can carry out a similar analysis with a logistic regression. However, rather than calculating the distance of each vocabulary word from the decision boundary, we predict the class probability of each word.

[Figure: illustration of class probabilities assigned to words by a logistic regression model]
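
Under the hood, a binary sklearn LogisticRegression computes this probability as the sigmoid of the same kind of linear score the SVM uses, so once lr has been trained below, the following identity should hold (a quick sanity-check sketch):

# predict_proba[:, 1] is the sigmoid of the linear score w.x + b
p = 1 / (1 + np.exp(-(vocab_matrix @ lr.coef_.ravel() + lr.intercept_[0])))
np.allclose(p, lr.predict_proba(vocab_matrix)[:, 1])  # expected: True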

Train Model

# Dependency
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression()
lr.fit(X_train, y_train)
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=100,
                   multi_class='auto', n_jobs=None, penalty='l2',
                   random_state=None, solver='lbfgs', tol=0.0001, verbose=0,
                   warm_start=False)
y_pred = lr.predict(X_test)
print(classification_report(y_test, y_pred))

Get probability score for each vocabulary word

prob_df = pd.DataFrame({'word':vocab, 'probability':lr.predict_proba(vocab_matrix)[:,0]})
prob_df.sort_values('probability', ascending = True).head(10)
word probability
2582 sevens 0.000000e+00
3435 tournament 2.220446e-16
213 championship 2.220446e-16
2350 rugby 8.881784e-16
82 a-league 1.998401e-15
4410 finals 1.998401e-15
3602 champions 3.530509e-14
1203 nrl 3.819167e-14
4190 tennis 4.796163e-14
755 squad 7.638334e-14

Again we see near-identical results to those from the support vector machine: all terms with a low probability of being a politics headline are sports terms.

Among the words given a high probability of coming from a politics article, we see the same political terms popping up as with the SVM.

prob_df.sort_values('probability', ascending = False).head(10)
word probability
803 minister 1.0
3697 labour 1.0
1654 parliament 1.0
2021 liberal 1.0
508 party 1.0
2063 conservatives 1.0
4103 tory 1.0
1034 parliamentary 1.0
5290 tories 1.0
435 prime 1.0

Random Forests

As you may have realised by now, this approach works for any classifier that can output a probability score, be it a simple decision tree, a naive Bayes model or a deep neural network.
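
To make this concrete, the whole interpretation step can be wrapped in a small, model-agnostic helper (a sketch; top_terms is a name I am introducing here, not part of any library):

def top_terms(model, vocab, vocab_matrix, class_index=0, n=10):
    # Rank vocabulary words by the model's predicted probability for one class
    probs = model.predict_proba(vocab_matrix)[:, class_index]
    order = np.argsort(probs)[::-1][:n]
    return [(vocab[i], probs[i]) for i in order]

For example, top_terms(rf, vocab, vocab_matrix, class_index=0) reproduces the politics ranking shown below; the class order follows model.classes_.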

As a final demonstration, I will run the task through a Random Forest Classifier.

from sklearn.ensemble import RandomForestClassifier
rf = RandomForestClassifier(criterion = 'entropy', max_depth = 15)
rf.fit(X_train, y_train)
RandomForestClassifier(bootstrap=True, ccp_alpha=0.0, class_weight=None,
                       criterion='entropy', max_depth=15, max_features='auto',
                       max_leaf_nodes=None, max_samples=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, n_estimators=100,
                       n_jobs=None, oob_score=False, random_state=None,
                       verbose=0, warm_start=False)
y_pred = rf.predict(X_test)
print(classification_report(y_test, y_pred))
              precision    recall  f1-score   support

    politics       0.91      0.87      0.89       158
       sport       0.93      0.95      0.94       301

    accuracy                           0.92       459
   macro avg       0.92      0.91      0.91       459
weighted avg       0.92      0.92      0.92       459

tree_df = pd.DataFrame({'word':vocab, 'probability':rf.predict_proba(vocab_matrix)[:,0]})
tree_df.sort_values('probability', ascending = False).head(10)
word probability
3697 labour 0.93
2892 labor 0.92
1654 parliament 0.90
106 lawmakers 0.90
2028 political 0.90
5216 conservative 0.88
508 party 0.87
2021 liberal 0.87
2063 conservatives 0.86
634 reform 0.86

tree_df.sort_values('probability', ascending = True).head(10)
word probability
213 championship 0.07
2372 games 0.07
4049 matches 0.07
4410 finals 0.07
2020 scoring 0.09
3202 beating 0.09
2485 softball 0.12
2247 beat 0.12
3764 thrilling 0.12
3974 prix 0.13

Conclusion

In this notebook I have proposed and demonstrated a simple approach for adding interpretability to any classifier that takes word embeddings as features. This simple approach allows you to sense-check that what your classifier picks up on is reasonable and, at least to some extent, to explain the data-generating process.

Although this approach is not directly applicable to regression problems, you can do something fairly similar... but that is for another post.