/hyperbolic

NLP models in hyperbolic space

Primary LanguagePython

hyperbolic

NLP models in hyperbolic space

this is a collection of work i've done to investigate the use of hyperbolic space for NLP tasks. I'm building on top of @dpressel 's baseline project.

I'm using word embeddings generated from Leimeister (2018)

Overall Learnings


  • learning the softmax translation as shown by Ganea 2018 is required.
  • double precision is needed to successfully learn.
  • I found relu activations to be helpful in sequential tagging.
  • A manifold abstraction would be helpful, as the code gets quite messy and doesn't lend itself to changing manifolds easily.

Tagging


  • basic tagger in the Poincare Ball, leveraging Ganea 2018

  • the hyperboloid RNN has a vanishing gradient and needs more exploration.

  • bidirectional models in the hyperboloid need more investigation. I thought I would have seen a larger gain from this similar to euclidean space.

Classification


I used Ganea's work to apply it to the AG news dataset, but was never certain it worked correctly.

references

https://arxiv.org/pdf/1805.09112.pdf
https://arxiv.org/pdf/1805.08207.pdf
https://arxiv.org/pdf/1806.03417.pdf

https://github.com/lateral/minkowski/blob/master/python/hyperboloid_helpers/manifold.py
https://github.com/geomstats/geomstats/blob/master/geomstats/riemannian_metric.py
https://github.com/dpressel/baseline