
A framework for bootstrapping natural language understanding in novel, data-poor domains.

Primary Language: Python

RIPPED

Recursive Intent Propagation using Pretrained Embedding Distances

RIPPED is a framework for bootstrapping natural language understanding in data-poor domains. It uses distances between pretrained sentence embeddings to propagate the few available labels through the unlabeled data. This provides much higher accuracy in challenging classification domains, in particular those that have many classes, are full of domain-specific language, or contain fewer than 10 labeled examples per class.
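The general idea of propagating labels through embedding space can be illustrated with a minimal sketch. This is not the code from this repository: the function names `cosine_distance` and `propagate_labels`, the `max_distance` threshold, and the nearest-neighbour acceptance rule are all assumptions made for illustration, and the toy 2-D vectors stand in for real pretrained sentence embeddings.

```python
import math


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def propagate_labels(embeddings, labels, max_distance=0.1, max_rounds=10):
    """Hypothetical propagation loop; None marks an unlabeled example.

    In each round, every unlabeled example copies the label of its nearest
    labeled example if that neighbour is within max_distance. Newly labeled
    examples act as seeds in the next round, so labels spread step by step.
    """
    labels = list(labels)  # work on a copy, leave the input untouched
    for _ in range(max_rounds):
        labeled = [i for i, y in enumerate(labels) if y is not None]
        if not labeled:
            break
        updates = {}
        for i, y in enumerate(labels):
            if y is not None:
                continue
            nearest = min(
                labeled,
                key=lambda j: cosine_distance(embeddings[i], embeddings[j]),
            )
            if cosine_distance(embeddings[i], embeddings[nearest]) <= max_distance:
                updates[i] = labels[nearest]
        if not updates:
            break  # nothing close enough to any labeled example
        for i, y in updates.items():
            labels[i] = y
    return labels


# Toy data: unit vectors at the given angles stand in for sentence embeddings.
angles = [0, 20, 40, 120, 135, 270]
embeddings = [(math.cos(math.radians(a)), math.sin(math.radians(a))) for a in angles]
seed_labels = [0, None, None, 1, None, None]  # one labeled example per class

result = propagate_labels(embeddings, seed_labels, max_distance=0.1)
print(result)  # [0, 0, 0, 1, 1, None]
```

In this toy run the example at 40 degrees is too far from the class-0 seed to be labeled in the first round, but it picks up label 0 in the second round through the example at 20 degrees. The example at 270 degrees is far from everything and stays unlabeled, which is the role the distance threshold plays in this sketch.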

This repository contains the code used in writing my honors thesis.

Complete README COMING SOON...

Dependencies

Download datasets

Quick Use

Visualisation