Fuzzy Set Embedding (FUSE) is a general representation learning framework for fuzzy sets, which generalize crisp sets and can be used to model semantic concepts or objects with an inherent notion of volume and ambiguity. We divide the project into several stages, each targeting a different application of FUSE for comparison against traditional vector and geometric (box-based) embeddings. We also develop the theory behind the measure-theoretic approximation of fuzzy sets, from a general non-parametric simple approximation to a parametric, distribution-based approximation (Gumbel distribution).
Under the `src` directory, we include code for different applications:
- Taxonomy Expansion, under the `taxonomy` subdirectory.
- Knowledge Graph Embedding, under the `knowledge_graph` subdirectory.
- Language Modeling and Word Embedding, under the `langauge_modeling` subdirectory.
This part corresponds to our ACL submission: FUSE: Measure-Theoretic Fuzzy Set Embedding for Taxonomy Expansion. The dataset follows Box-Taxo, and the original data can be obtained from SemEval-2016 Task 13: Taxonomy Extraction Evaluation. Processed data, obtained from STEAM, is also provided under the `data` folder.
For knowledge graph embedding, our code is partly based on Fuzzy-QE and BetaE. The dataset should be downloaded from here and placed under the `data` folder. We are still updating this folder to add more baseline models and to improve the existing model.
The directory structure should look like `[PROJECT_DIR]/data/NELL-betae/train-queries.pkl`.
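As a quick sanity check of the layout above, the following minimal sketch builds the expected path and loads the query file. The helper names `queries_path` and `load_queries` are hypothetical and not part of this repository; the sketch only assumes the file is a standard Python pickle, as its `.pkl` extension suggests.

```python
import pickle
from pathlib import Path


def queries_path(project_dir, dataset="NELL-betae", filename="train-queries.pkl"):
    """Build the expected location: [PROJECT_DIR]/data/<dataset>/<filename>.

    Hypothetical helper for illustration; not part of the repository.
    """
    return Path(project_dir) / "data" / dataset / filename


def load_queries(project_dir, dataset="NELL-betae"):
    """Load the training queries, failing early if the layout is wrong."""
    path = queries_path(project_dir, dataset)
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected dataset file at {path}; "
            "download the dataset and place it under the data folder."
        )
    # Assumption: the file is a plain pickle. Only unpickle trusted files.
    with path.open("rb") as f:
        return pickle.load(f)
```

Calling `load_queries(".")` from the project root should return the unpickled object if the dataset is in place, and raise a `FileNotFoundError` with the expected path otherwise.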
This part corresponds to our planned NeurIPS submission, Measure-Theoretic Representation of Fuzzy Sets, which covers a more in-depth experimental and theoretical development of FUSE for various applications in neuro-symbolic reasoning. This part is currently under construction.