Pinned Repositories
sae-rm
Using SAEs to interpret Reward Models (RMs)
sparse_coding
Optimal-Policies-Tend-To-Seek-Power
Code for the paper "Optimal Policies Tend To Seek Power"
alignment-research-dataset
A dataset of alignment research and code to reproduce it
STFT_wifi_physical_fingerprint
white-box
Tools for understanding how transformer predictions are built layer-by-layer
conditionalGaussionRecreation
scrape-lesswrong
dictionary_learning
gpt-2
Fork of nshepperd's fork of OpenAI's gpt-2
loganriggs's Repositories
loganriggs/sae-circuits
loganriggs/dictionary_learning
loganriggs/sae-rm
Using SAEs to interpret Reward Models (RMs)
loganriggs/sparse_coding
loganriggs/neuron-interpretability
loganriggs/resume
My latest resume
loganriggs/STFT_wifi_physical_fingerprint
loganriggs/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
loganriggs/white-box
Tools for understanding how transformer predictions are built layer-by-layer
loganriggs/alignment-research-dataset
A dataset of alignment research and code to reproduce it
loganriggs/scrape-lesswrong
loganriggs/Optimal-Policies-Tend-To-Seek-Power
Code for the paper "Optimal Policies Tend To Seek Power"
loganriggs/minimal_module
loganriggs/papers
loganriggs/gpt-2
Fork of nshepperd's fork of OpenAI's gpt-2
loganriggs/conditionalGaussionRecreation
loganriggs/zero_shot_learning
loganriggs/light_game