LuhanMikaelson

Pinned Repositories

self-ablating-transformers
A self-modeling transformer whose auxiliary output head produces an ablation mask that is applied to the model itself in a second forward pass
Language:Jupyter Notebook31
emili
EMILI (Emotionally Intelligent Listener) adds emotion tags sourced from video to your OpenAI API calls
Language:Python26
ARENA_3.0
Language:Jupyter Notebook00
Deception-RepE
Language:Jupyter Notebook00
emotion-tune
SPAR Summer 2024: Improving RLHF with Emotion-based Feedback
Language:Python00
self-ablating-transformers
A self-modeling transformer whose auxiliary output head produces an ablation mask that is applied to the model itself in a second forward pass
Language:Jupyter Notebook00
self-modeling-ResNet_CIFAR
Code for replicating the experiments reported in "Unexpected Benefits of Self-Modeling in Neural Systems"
Language:Python00
werewolf-bench
Benchmarking AI deception with the One Night: Ultimate Werewolf game
Language:Jupyter Notebook10
spinningup
An educational resource to help anyone learn deep reinforcement learning.
Language:Python10.4k 230 2892.3k
WashBench
Summarization Relevance Benchmark for Large Language Models
Language:Jupyter Notebook20

LuhanMikaelson's Repositories

LuhanMikaelson/werewolf-bench
Benchmarking AI deception with the One Night: Ultimate Werewolf game
Language:Jupyter Notebook10
LuhanMikaelson/ARENA_3.0
Language:Jupyter Notebook00
LuhanMikaelson/Deception-RepE
Language:Jupyter Notebook00
LuhanMikaelson/emotion-tune
SPAR Summer 2024: Improving RLHF with Emotion-based Feedback
Language:Python00
LuhanMikaelson/self-ablating-transformers
A self-modeling transformer whose auxiliary output head produces an ablation mask that is applied to the model itself in a second forward pass
Language:Jupyter Notebook00
LuhanMikaelson/self-modeling-ResNet_CIFAR
Code for replicating the experiments reported in "Unexpected Benefits of Self-Modeling in Neural Systems"
Language:Python00