This is an open source implementation of the nonlinear mapping between embedding sets used in this paper:
- D Newman-Griffis and A Zirikly, "Embedding Transfer for Low-Resource Medical Named Entity Recognition: A Case Study on Patient Mobility". In Proceedings of BioNLP 2018, 2018.
The included `demo.sh` script will download two small sets of embeddings, learn a demonstration mapping between them, and calculate changes in nearest neighbors.
External:
- Tensorflow (we used version 1.3.1)
- NumPy

Internal: frozen copies of all internal dependencies are included in the `lib` directory.
This implementation learns a nonlinear mapping function from a source set of embeddings to a target set, based on shared keys (pivots). The embeddings do not have to be of the same dimensionality, but must have keys in common.
The process follows three steps:
Pivot terms used in the mapping process may be selected from the set of keys present in both the source and target embeddings in one of two ways:
- Frequent keys: the top N keys by frequency in the target corpus are used as pivots.
- Random/all keys: a random subset of N shared keys (or all shared keys, if N is unspecified) is used as pivots.
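The two selection modes above can be sketched as follows (a minimal illustration, not the repository's actual API; all names here are hypothetical):

```python
import random

def select_pivots(source_vocab, target_freqs, n=None, method="frequent", seed=42):
    """Select pivot keys shared between the source and target embeddings.

    source_vocab: set of keys present in the source embeddings
    target_freqs: dict mapping target keys to their frequency in the target corpus
    (illustrative names, not the repository's actual interface)
    """
    shared = [k for k in target_freqs if k in source_vocab]
    if method == "frequent":
        # top-n shared keys by frequency in the target corpus
        shared.sort(key=lambda k: target_freqs[k], reverse=True)
        return shared[:n]
    # random subset of n shared keys, or all shared keys if n is unspecified
    if n is None:
        return shared
    rng = random.Random(seed)
    return rng.sample(shared, n)
```

Note that only keys present in *both* vocabularies are eligible, regardless of mode.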
Pivot terms are divided into k folds. For each fold, a nonlinear projection is learned as follows:
- Construct a feed-forward DNN, taking source embeddings as input and generating output of the same size as target embeddings. Model parameters include:
  - Number of layers
  - Activation function (tanh or ReLU)
  - Dimensionality of hidden layers (by default, same as target embedding size)
- Use minibatch gradient descent to train over each shared key in the training set
  - Loss function is batch-wise MSE between output embeddings and reference target embeddings
  - Optimization with Adam
- After each epoch (one pass over all shared keys in the training set), evaluate MSE on the held-out set
- When held-out MSE stops decreasing, stop training and revert to the previous best model parameters
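The per-fold training loop above can be sketched as follows. This is a simplified NumPy illustration (a single tanh hidden layer trained with plain minibatch gradient descent standing in for the repository's TensorFlow model with Adam; all names are illustrative):

```python
import numpy as np

def train_fold(X, Y, hidden=None, lr=0.01, batch=32, max_epochs=200, seed=0):
    """Sketch of one fold's training: X holds source embeddings for the
    fold's pivot keys (n x d_src), Y the matching target embeddings
    (n x d_trg). Returns the parameters with the best held-out MSE."""
    rng = np.random.default_rng(seed)
    n, d_src = X.shape
    d_trg = Y.shape[1]
    h = hidden or d_trg  # default: hidden size = target embedding size
    # split off a held-out set of pivots for early stopping
    idx = rng.permutation(n)
    cut = max(1, n // 10)
    dev, trn = idx[:cut], idx[cut:]
    W1 = rng.normal(0, 0.1, (d_src, h)); b1 = np.zeros(h)
    W2 = rng.normal(0, 0.1, (h, d_trg)); b2 = np.zeros(d_trg)

    def forward(A):
        H = np.tanh(A @ W1 + b1)
        return H, H @ W2 + b2

    best = (np.inf, None)
    for epoch in range(max_epochs):
        rng.shuffle(trn)
        for s in range(0, len(trn), batch):
            b_idx = trn[s:s + batch]
            Xb, Yb = X[b_idx], Y[b_idx]
            H, out = forward(Xb)
            err = out - Yb                    # gradient of batch MSE w.r.t. out
            gW2 = H.T @ err / len(b_idx)      # (up to a constant factor)
            gb2 = err.mean(axis=0)
            dH = (err @ W2.T) * (1 - H ** 2)  # backprop through tanh
            gW1 = Xb.T @ dH / len(b_idx)
            gb1 = dH.mean(axis=0)
            W1 -= lr * gW1; b1 -= lr * gb1
            W2 -= lr * gW2; b2 -= lr * gb2
        # after each epoch, evaluate MSE on the held-out pivots
        _, dev_out = forward(X[dev])
        mse = ((dev_out - Y[dev]) ** 2).mean()
        if mse < best[0]:
            best = (mse, (W1.copy(), b1.copy(), W2.copy(), b2.copy()))
        else:
            break  # held-out MSE stopped decreasing: keep the previous best
    return best[1]
```

The early-stopping pattern (snapshot parameters on every improvement, stop and revert on the first regression) is the key piece; the real implementation differs in optimizer and layer count.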
Getting the final projection of source embeddings into target embedding space is a two-step process:
- Take the projection function learned for each trained fold and project all source embeddings
- Average all k projections to yield final projection of source embeddings
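The two steps above amount to running every fold's model over the full source matrix and averaging, which might look like this (illustrative names; `apply_fn` stands for whatever runs one fold's projection):

```python
import numpy as np

def project_all(source_matrix, fold_models, apply_fn):
    """Average the k per-fold projections of the full source embedding matrix.

    fold_models: the k trained per-fold models
    apply_fn(model, X): applies one model's projection to X
    (hypothetical interface, for illustration only)
    """
    projections = [apply_fn(m, source_matrix) for m in fold_models]
    return np.mean(projections, axis=0)
```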
This repository also includes the code used to calculate changes in nearest neighbors after the learned mapping is applied, in the `nn-analysis` directory.
- `nearest_neighbors.py`: Tensorflow implementation of nearest neighbor calculation by cosine distance
- `nn_changes.py`: script to calculate how often nearest neighbors change after the mapping is learned
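The essence of both scripts can be sketched in NumPy (the repository does the neighbor search in Tensorflow; the change-rate formula here is an assumption about what "how often neighbors change" means, not the scripts' exact output):

```python
import numpy as np

def nearest_neighbors(E, k=5):
    """For each row of the (vocab x dim) matrix E, return the indices of
    its k nearest rows by cosine similarity, excluding the row itself."""
    normed = E / np.linalg.norm(E, axis=1, keepdims=True)
    sims = normed @ normed.T
    np.fill_diagonal(sims, -np.inf)  # exclude self-similarity
    return np.argsort(-sims, axis=1)[:, :k]

def neighbor_change_rate(E_before, E_after, k=5):
    """Mean fraction of each key's k nearest neighbors that differ
    between the original and mapped embeddings (illustrative metric)."""
    nn_b = nearest_neighbors(E_before, k)
    nn_a = nearest_neighbors(E_after, k)
    changed = [len(set(b) - set(a)) / k for b, a in zip(nn_b, nn_a)]
    return float(np.mean(changed))
```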
If you use this software in your own work, please cite the following paper:
@inproceedings{Newman-Griffis2018BioNLP,
author = {Newman-Griffis, Denis and Zirikly, Ayah},
title = {Embedding Transfer for Low-Resource Medical Named Entity Recognition: A Case Study on Patient Mobility},
booktitle = {Proceedings of BioNLP 2018},
year = {2018}
}