Renderers are Good Zero-Shot Representation Learners: Exploring Diffusion Latents for Metric Learning

Code for the paper "Renderers are Good Zero-Shot Representation Learners: Exploring Diffusion Latents for Metric Learning", joint work with David Shustin.

Code

Notebooks for training and for retrieval evaluation (plus reproduction of key figures) are available at train.ipynb and run.ipynb, respectively. The training code relies on SimCLR/, our adaptation of a PyTorch implementation of SimCLR, which takes in multiple embeddings from different views of a scene instead of transformations of single images.
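As a rough illustration of the view-based setup described above, the sketch below draws one positive pair per scene from two different views, rather than from two augmentations of one image. It is not the code in SimCLR/: the function name `sample_view_pairs`, the dictionary layout, and the toy embeddings are all assumptions made for this example.

```python
import random

def sample_view_pairs(scene_to_views, rng=None):
    """Return one (scene_id, anchor, positive) triple per scene.

    scene_to_views maps a scene id to a list of per-view embeddings.
    This layout is hypothetical and only serves the example.
    """
    rng = rng or random.Random()
    pairs = []
    for scene_id, views in scene_to_views.items():
        if len(views) < 2:
            continue  # a positive pair needs two different views
        # Two distinct views of the same scene form the positive pair.
        anchor, positive = rng.sample(views, 2)
        pairs.append((scene_id, anchor, positive))
    return pairs

# Toy stand-in: 3 scenes with 4 views each, 2-dimensional "embeddings".
toy = {s: [[float(s), float(v)] for v in range(4)] for s in range(3)}
pairs = sample_view_pairs(toy, rng=random.Random(0))
```

In a contrastive objective, views from other scenes in the same batch would then act as negatives for each pair.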

Data

For convenience, we uploaded precomputed EfficientNet and Shap-E embeddings for the dataset of 300 scenes (20 images per scene) to Google Drive; the database of embeddings can be found here.
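With embeddings grouped by scene as above, a retrieval check can be sketched as follows. This is a minimal example assuming cosine similarity and a recall@1 criterion (the nearest other embedding should come from the same scene); the metric actually used in run.ipynb may differ, and the names `cosine`, `recall_at_1`, and the toy data are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def recall_at_1(embeddings, scene_ids):
    """Fraction of queries whose nearest other embedding shares their scene."""
    hits = 0
    for i, query in enumerate(embeddings):
        # Search every embedding except the query itself.
        best_j = max(
            (j for j in range(len(embeddings)) if j != i),
            key=lambda j: cosine(query, embeddings[j]),
        )
        hits += scene_ids[best_j] == scene_ids[i]
    return hits / len(embeddings)

# Toy stand-in for the real database: 2 scenes with 2 views each.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_scene_ids = [0, 0, 1, 1]
score = recall_at_1(toy_embeddings, toy_scene_ids)
```

The brute-force search is quadratic in the number of embeddings, which is fine for a dataset of this size (300 scenes with 20 images each).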

To replicate these embeddings, the original ShapeNet SRN Cars dataset can be found here (maintained by the authors of PixelNeRF). The code for EfficientNet is available in torchvision, while the code for Shap-E is available here (maintained by OpenAI), where this notebook is particularly helpful.
