JLUtangchuan/text2shape

Learn embedding for text and shape.

text2shape

This project is based on this paper.

Setup

Dataset

The dataset can be found here. Download the following files and add them to the data folder:

Text Descriptions
Solid Voxelizations: 32 Resolution

Dependencies

pynrrd 0.4.2:

pip install pynrrd

spacy 2.3.0:

pip install spacy
python3 -m spacy download en_core_web_sm

language-tool-python 2.2.3:

pip install language-tool-python

Getting started

Preprocessing for descriptions

remove descriptions with more than max_length words (default: 96 words)
preprocess each description so that every word/symbol is separated by a space
fill the vocabulary with the words that appear more than twice

python3 preprocessing/run_preprocessing.py data/captions.tablechair.csv data/full_preprocessed.captions.csv data/full_voc.csv

python3 preprocessing/run_preprocessing_primitives.py data/primitives.v2/ "shape" data/vic_primitives primitives_voc.csv
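
The three preprocessing steps listed above can be sketched in a few lines of standard-library Python. This is only an illustration, not the code in preprocessing/run_preprocessing.py: the function names tokenize and preprocess, the lowercasing, the regex-based tokenization, and counting punctuation symbols toward max_length are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(description):
    """Split a description so every word and every symbol is its own token.

    Lowercasing is an assumption of this sketch, not something the README states.
    """
    return re.findall(r"[a-z0-9]+|[^\sa-z0-9]", description.lower())


def preprocess(descriptions, max_length=96, min_count=3):
    """Filter, tokenize and build a vocabulary (hypothetical helper).

    - drops descriptions with more than max_length tokens
    - joins tokens with single spaces
    - keeps words that appear more than twice (count >= min_count)
    """
    kept = []
    for text in descriptions:
        tokens = tokenize(text)
        if len(tokens) > max_length:
            continue  # description is too long, skip it
        kept.append(" ".join(tokens))

    # Count tokens only over the descriptions that survived the length filter.
    counts = Counter(tok for line in kept for tok in line.split())
    vocabulary = sorted(word for word, n in counts.items() if n >= min_count)
    return kept, vocabulary


if __name__ == "__main__":
    captions = ["A round table.", "A round chair.", "A square table!"]
    cleaned, vocab = preprocess(captions)
    print(cleaned)
    print(vocab)
```

Building the vocabulary only from the descriptions that pass the length filter keeps it consistent with the captions that are actually written out.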

Learning embeddings

set configuration in config/cfg.yaml

python3 train.py config/cfg.yaml

Retrievals

define which retrievals to run, along with any further settings, in config/cfg_retrieval.yaml
possible retrievals:
text 2 text (t2t)
text 2 shape (t2s)
shape 2 text (s2t)
shape 2 shape (s2s)

python3 retrieval.py config/cfg_retrieval.yaml
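
All four retrieval modes come down to the same operation: embed the query (a text or a shape), then rank the candidate embeddings of the target modality by similarity. The following is a minimal sketch of that idea, assuming cosine similarity as the metric; the README does not say which metric retrieval.py uses, and the names cosine_similarity, retrieve, and the example embeddings are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, candidates, k=5):
    """Return the names of the k candidates most similar to the query.

    query:      embedding of a text or a shape
    candidates: dict mapping a name to its embedding (texts for t2t/s2t,
                shapes for t2s/s2s)
    """
    scored = [(cosine_similarity(query, vec), name) for name, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # most similar first
    return [name for _, name in scored[:k]]


if __name__ == "__main__":
    # Made-up 2-D shape embeddings, for illustration only.
    shapes = {
        "chair_01": [1.0, 0.0],
        "table_01": [0.0, 1.0],
        "chair_02": [0.9, 0.1],
    }
    text_query = [1.0, 0.0]  # stand-in for an embedded description
    print(retrieve(text_query, shapes, k=2))
```

Because text and shape embeddings live in one shared space, switching between t2t, t2s, s2t, and s2s only changes which embeddings are passed as query and candidates.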

Run t-SNE

set configuration in config/cfg_tsne.yaml

python3 t-SNE.py config/cfg_tsne.yaml

The result is saved in the results folder as tsne.png.