This is the code for the paper Evolving Interpretable Visual Classifiers with Large Language Models by Mia Chiquier, Utkarsh Mall and Carl Vondrick.
Clone recursively:

```bash
git clone --recurse-submodules https://github.com/mchiquier/llm-mutate.git
```

After cloning:

```bash
cd llm-mutate
export PATH=/usr/local/cuda/bin:$PATH
bash setup.sh  # This may take a while. Make sure the llm-mutate environment is active.
```
- Clone this repository with its submodules.
- Install the dependencies. See Dependencies.
- Install vLLM. See vLLM.
```bash
git clone --recurse-submodules https://github.com/mchiquier/llm-mutate.git
```
First, create a conda environment using `setup_env.sh`. To do so, `cd` into the `llm-mutate` directory and run:

```bash
export PATH=/usr/local/cuda/bin:$PATH
bash setup_env.sh
conda activate llmmutate
```
Launch a server that hosts the LLM with vLLM in a tmux session:

```bash
python -m vllm.entrypoints.openai.api_server --model meta-llama/Llama-2-70b-chat-hf --tensor-parallel-size 8 --chat-template ./examples/template_chatml.jinja --trust-remote-code
```
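Once the server is up, you can sanity-check it with a quick query. This is only a minimal sketch assuming the default port 8000 of vLLM's OpenAI-compatible API and the `openai>=1.0` Python client; the repository's own code may query the server differently.

```python
# Minimal sanity check against the vLLM OpenAI-compatible server.
# Assumes the default host/port (localhost:8000); not part of the repository.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="meta-llama/Llama-2-70b-chat-hf",
    messages=[{"role": "user", "content": "Describe visual attributes of a lichen."}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```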
Download the iNaturalist dataset (specifically, 'train' and 'validation') from https://github.com/visipedia/inat_comp/tree/master/2021. Then update the path to the parent folder of `train` and `val` in `config.py`: `self.dataset_path = YOUR_PATH`.
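For reference, a minimal sketch of what that line in `config.py` could look like; only the `dataset_path` attribute comes from this README, and the surrounding class structure is assumed for illustration:

```python
# config.py (sketch) -- only dataset_path is taken from this README;
# the class name and layout are assumptions for illustration.
class Config:
    def __init__(self):
        # parent folder containing the iNaturalist train/ and val/ directories
        self.dataset_path = "/path/to/inat2021/"
```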
Download the two KikiBouba datasets here:
- KikiBouba_v1: https://drive.google.com/file/d/1-a4FRS9N1DLf3_YIYq8150zN1uilapft/view?usp=sharing
- KikiBouba_v2: https://drive.google.com/file/d/17ibF3tzFiZrMb9ZnpYlLEh-xmWkPJpNH/view?usp=sharing

Remember to update the `dataset_path` in the `config.py` file.
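If you prefer to download from the command line, one possible approach uses the third-party `gdown` package (`pip install gdown`); this is not part of the repository, and the output filenames below are assumptions:

```python
# Illustrative download helper using gdown (third-party, not in this repo).
# The file IDs come from the Drive links above; output names are assumed.
import gdown

files = {
    "KikiBouba_v1.zip": "1-a4FRS9N1DLf3_YIYq8150zN1uilapft",
    "KikiBouba_v2.zip": "17ibF3tzFiZrMb9ZnpYlLEh-xmWkPJpNH",
}
for name, file_id in files.items():
    gdown.download(f"https://drive.google.com/uc?id={file_id}", name, quiet=False)
```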
Running this will automatically launch both pre-training and joint training on the iNaturalist dataset for the Lichen synset. To change the synset, modify the config file. See the paper for an explanation of pre-training and joint training.

```bash
python src/llm-mutate.py
```
To run LLM-Mutate on your own dataset, update the dataset path in the `config.py` file and make sure it points to a folder with the following structure:
```
dataset-root/
│
├── train/
│   ├── class 1/
│   │   ├── img1.png
│   │   ├── img2.png
│   │   ...
├── val/
│   ├── class 1/
│   │   ├── img1.png
│   │   ├── img2.png
```
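If it helps, here is a small, purely illustrative check (not part of the repository) that verifies a folder matches this layout before you point `config.py` at it:

```python
# Quick structural check for the expected dataset layout (illustrative only).
import os
import sys

def check_dataset(root: str) -> None:
    for split in ("train", "val"):
        split_dir = os.path.join(root, split)
        if not os.path.isdir(split_dir):
            sys.exit(f"Missing split directory: {split_dir}")
        classes = [d for d in os.listdir(split_dir)
                   if os.path.isdir(os.path.join(split_dir, d))]
        if not classes:
            sys.exit(f"No class folders found under {split_dir}")
        print(f"{split}: {len(classes)} class folders")

check_dataset("/path/to/dataset-root")  # replace with your dataset path
```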
NOTE: By default, we only run on the first 5 classes in your folder structure for time efficiency. If you'd like to change this, modify the `num_classes` attribute in `config.py`.

```bash
python src/llm-mutate.py
```
Specify which experiment you'd like to evaluate via the `experiment` attribute in the `config.py` file. You can pick from: `zero_shot`, `clip_scientific`, `clip_common`, `cbd_scientific`, `cbd_common`, `ours`. Note that for the KikiBouba datasets there is no difference between `scientific` and `common`, as there is only one class name per class.
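As a sketch, selecting an experiment might look like the following in `config.py`; the `experiment` attribute name comes from this README, while the surrounding structure is assumed:

```python
# config.py (sketch) -- experiment attribute from this README; layout assumed.
class Config:
    def __init__(self):
        # one of: zero_shot, clip_scientific, clip_common,
        #         cbd_scientific, cbd_common, ours
        self.experiment = "ours"
```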
```bash
python src/inference.py
```
If you use this code, please consider citing the paper as:
```bibtex
@article{chiquier2024evolving,
  title={Evolving Interpretable Visual Classifiers with Large Language Models},
  author={Chiquier, Mia and Mall, Utkarsh and Vondrick, Carl},
  journal={arXiv preprint arXiv:2404.09941},
  year={2024}
}
```