dtch1997
Mechanistic interpretability researcher. Interested in interpreting multimodal foundation models.
dtch1997's Stars
ollama/ollama
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
pdm-project/pdm
A modern Python package and dependency manager supporting the latest PEP standards
PAIR-code/lit
The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.
explosion/sense2vec
🦆 Contextually-keyed word vectors
kyegomez/BitNet
Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch
pfnet/pfrl
PFRL: a PyTorch-based deep reinforcement learning library
octo-models/octo
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
vikashplus/robohive
A unified framework for robot learning
danijar/crafter
Benchmarking the Spectrum of Agent Capabilities
mees/calvin
CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
facebookresearch/r3m
Pre-training Reusable Representations for Robotic Manipulation Using Diverse Human Video Data
davidbau/baukit
p-lambda/incontext-learning
Experiments and code to generate the GINC small-scale in-context learning dataset from "An Explanation for In-context Learning as Implicit Bayesian Inference"
conglu1997/v-d4rl
Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations
nrimsky/CAA
Steering Llama 2 with Contrastive Activation Addition
KihoPark/linear_rep_geometry
roeehendel/icl_task_vectors
younggyoseo/apv
FLAIROx/jaxirl
Contains JAX implementation of algorithms for inverse reinforcement learning
steering-vectors/steering-vectors
Steering vectors for transformer language models in PyTorch / Hugging Face
RLAgent/factor-world
Decomposing the Generalization Gap in Imitation Learning for Visual Robotic Manipulation (2023)
EleutherAI/elk-generalization
Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from easy questions to hard
tdmpc2/tdmpc2-eval
Evaluation of TD-MPC2.
kevinzakka/dm_env_wrappers
Standalone library of frequently-used wrappers for dm_env environments.
etaoxing/kitchen-shift
KitchenShift: Evaluating Zero-Shot Generalization of Imitation-Based Policy Learning Under Domain Shifts
wusche1/CAA_hallucination
Public repository for code and results of parts of "Steering Llama 2 via Contrastive Activation Addition" by Rimsky, Gabrieli, Schulz et al.
etaoxing/domain-shift-benchmark
Joshuaclymer/GENIES
Generalization Analogies: A Testbed for Generalizing AI Oversight to Hard-To-Measure Domains
oelin/context-free-planning
Finding feasible solutions to planning problems using generative context-free grammars.
ethanluoyc/jam
Jam - JAX models