This repo contains code and data for the CoDE model, presented as a poster at the Vector Semantics for Discourse and Dialogue workshop of the 2019 IWCS conference.
Over the years, a small number of authors have tackled the non-trivial problem of encoding syntactic information in distributional representations by injecting dependency-relation knowledge directly into word embeddings. Although such representations should offer a clear advantage for complex structures, such as phrases and sentences, these models have been evaluated mainly on word-word similarity benchmarks or within rich neural architectures. Outside the embeddings domain, the APT model has proved an effective resource for modelling compositionality via syntactic contextualization. In this work, we present a novel model, built on top of GloVe, that reduces APT representations to low-dimensional, dense, dependency-based vectors which retain APT-like composition abilities. We then present a detailed investigation of the nature of these representations, as well as of their usefulness and contribution to semantic composition.
The CoDE vectors used for the experiments in the poster can be downloaded here
CoDE is a GloVe-based method, built on a Keras implementation. Future versions of the model will be hosted elsewhere.
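Below is a minimal sketch of how the downloadable vectors could be loaded and used, assuming they are stored in the standard GloVe text format (one word followed by whitespace-separated floats per line). The filename `code_vectors.txt` is hypothetical, and the additive composition shown is only an illustrative stand-in; the model's actual APT-like composition operator is described in the paper.

```python
import numpy as np

def load_vectors(path):
    """Load word vectors from a GloVe-style text file:
    one word plus whitespace-separated floats per line."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical filename for the downloaded CoDE vectors.
vecs = load_vectors("code_vectors.txt")

# Naive additive composition of a modifier and its head; CoDE's actual
# composition via syntactic contextualization may differ (see the paper).
phrase = vecs["black"] + vecs["cat"]
print(cosine(phrase, vecs["kitten"]))
```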