/geometry-of-truth-replication

replication of geometry of truth

This is a replication of geometry-of-truth by Samuel Marks and Max Tegmark. Here is the original code and paper.

In addition to replicating the main results for llama-1/2, this code also works for mistral.

Getting started

Run ./setup.sh (about 5 minutes to fetch the right version of PyTorch, etc.).
Download the Hugging Face transformer models you want to use.
Open replication.ipynb. Edit model_path and model_names as needed.
Get the activations for the model of interest (a one-time step; about 30 minutes to do all datasets).
Then run the rest of the code.
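
Before running the notebook, it can help to check that the model directories exist and whether activations have already been extracted, since that step only needs to happen once. The sketch below is one way to do that with the standard library. It is not code from this repository: the function names, the assumption that each entry in model_names is a subdirectory of model_path, and the acts_root/model/dataset folder layout are all hypothetical and should be adjusted to match what replication.ipynb actually uses.

```python
from pathlib import Path


def resolve_model_dirs(model_path, model_names):
    """Map each model name to its expected local directory.

    Assumption: every model lives in a subdirectory of model_path
    named after the model. The notebook may lay things out differently.
    """
    root = Path(model_path)
    return {name: root / name for name in model_names}


def missing_models(model_path, model_names):
    """Return the model names whose directory does not exist yet."""
    dirs = resolve_model_dirs(model_path, model_names)
    return [name for name, d in dirs.items() if not d.is_dir()]


def activations_cached(acts_root, model_name, dataset):
    """Return True if the activation folder for this pair is non-empty.

    Assumption: activations are stored under
    acts_root/<model_name>/<dataset>/ (a hypothetical layout).
    """
    folder = Path(acts_root) / model_name / dataset
    return folder.is_dir() and any(folder.iterdir())
```

For example, missing_models(model_path, model_names) lists the models that still need to be downloaded, and activations_cached(...) can be used to skip the roughly 30-minute extraction step for datasets that are already done.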

Models

Models used here were quantized to 4 bits by TheBloke using AutoGPTQ. Quantization did not affect the results. Code was run on a Paperspace A6000, but you could use a smaller machine. These are the models I tested:
llama-1
llama-2
mistral
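
As a rough guide to how small a machine might do, the weight storage of a quantized model can be estimated from its parameter count and bits per weight. The sketch below is back-of-the-envelope arithmetic only. The 7-billion parameter count in the usage lines is an assumption (this README does not state which model sizes were tested), and the estimate ignores quantization metadata such as scales and zero points, as well as activations and the KV cache.

```python
def quantized_weight_gib(n_params, bits_per_weight=4):
    """Approximate weight storage in GiB.

    Ignores quantization metadata (scales, zero points), activations,
    and KV cache, so real memory use will be higher.
    """
    bytes_total = n_params * bits_per_weight / 8
    return bytes_total / 2**30


# Assumed example size: a 7-billion-parameter model (not stated in the README).
n_params = 7_000_000_000
print(f"4-bit:  {quantized_weight_gib(n_params, 4):.1f} GiB")
print(f"16-bit: {quantized_weight_gib(n_params, 16):.1f} GiB")
```

Under these assumptions the 4-bit weights take about a quarter of the space of 16-bit weights, which is consistent with the note that a smaller GPU than the A6000 could be enough.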