apartresearch

Artificial intelligence will change the world. Our mission is to ensure this happens safely and to the benefit of everyone.

apartresearch's Repositories

apartresearch/interpretability-starter
🧠 Starter templates for doing interpretability research
64 0 01
apartresearch/specificityplus
👩‍💻 Code for the ACL paper "Detecting Edit Failures in LLMs: An Improved Specificity Benchmark"
Language: Python
20 2 243
apartresearch/Neuron2Graph
Tools for exploring Transformer neuron behaviour, including input pruning and diversification.
Language: Jupyter Notebook
19 2 15
apartresearch/Integer_Addition
✱ Understanding the underlying learning dynamics of simple tasks in Transformer networks
Language: Jupyter Notebook
14 2 21
apartresearch/readingwhatwecan
📚📚📚📚📚📚📚📚📚 Reading everything
Language: CSS
13 0 03
apartresearch/deepdecipher
🦠 DeepDecipher: An open source API to MLP neurons
Language: Rust
9 2 1010
apartresearch/aisafetyideas
💡 The web app CI/CD for aisafetyideas.com
Language: Svelte
8 0 353
apartresearch/ai-psychology-starter
Code templates to get started as an AI psychologist
Language: Jupyter Notebook
5 0 00
apartresearch/evaluations-starter
How to get started in evaluations and demonstrations research for dangerous capabilities
5 2 21
apartresearch/mechanisticinterpretability
A repository for awesome resources in mechanistic interpretability
5 0 00
apartresearch/Research-Augmentation-Hackbook
Language: Python
5 1 00
apartresearch/3cb
3cb: Catastrophic Cyber Capabilities Benchmarking of Large Language Models
Language: Python
4 4 1
apartresearch/AIS-cost-effectiveness
Cost-effectiveness models, tools, and results for various AI safety field-building programs.
Language: Python
2 0 00
apartresearch/Interpreting-Learned-Feedback-Patterns
✱ Interpreting learned feedback patterns in large language models
Language: Jupyter Notebook
2 0 71
apartresearch/othelloscope
Interpretability Hackathon 2.0 entry
Language: Jupyter Notebook
2 0 30
apartresearch/scheduling-widget
📆 Showcases specific times in local time zones
Language: HTML
2 0 00
apartresearch/hackathon-utils
😎 Code to run hackathons efficiently
Language: HTML
1 1 0
apartresearch/ICML2024MI
🌍 Website for NeurIPS2023MI
Language: CSS
1 1 02
apartresearch/n2g
Tools for exploring Transformer neuron behaviour, including input pruning and diversification.
Language: Jupyter Notebook
1 0 00
apartresearch/paper-website
🌍 Website template for academic papers
Language: JavaScript
1 0 01
apartresearch/scale-llm-24
🌍 Website for the Scaling Laws workshop
Language: CSS
1 0 0
apartresearch/seqcont_circuits
✱ Interpreting how similar sequence continuation tasks share internal representations ✱
Language: Jupyter Notebook
1 0 10
apartresearch/task-standard
🚨 METR Task Standard fork for the Code Red Hackathon
Language: TypeScript
1 0 0
apartresearch/Verified_addition
Language: Jupyter Notebook
1 2 00
apartresearch/.github
apartresearch/Apart-Evals
apartresearch/GPT-4-Chat-UI
GPT-4 frontend with open source Next.js template.
Language: JavaScript
0 0
apartresearch/open
🌍 Repository to update our open data
0 0
apartresearch/team-sync-lab
Language: TypeScript
apartresearch/town_hall_avatar
Uses ChatGPT to simulate a townhall discussion between avatars
Language: Python
0 0