liondw

Designer interested in AI safety communication. Working on the Signal-Alignment project to create educational resources for the AI alignment community.

Sydney

Pinned Repositories

Signal-Alignment
An initiative to create concise and widely shareable educational resources, infographics, and animated explainers on the latest contributions to the community AI alignment effort. Boosting the signal and moving the community towards finding and building solutions.
19 0
HeuristicImperatives
Reduce suffering, increase prosperity, increase understanding. A proposed framework to address the Control Problem.
Language: Python 0 0
RLHI
Reinforcement Learning with Heuristic Imperatives - Finetuning LLMs for Post-Conventional Moral Intuition
Language: Python 0 0
HeuristicImperatives
Reduce suffering, increase prosperity, increase understanding. A proposed framework to address the Control Problem.
Language: Python 13925
RLHI
Reinforcement Learning with Heuristic Imperatives - Finetuning LLMs for Post-Conventional Moral Intuition
Language: Python 6220

liondw's Repositories

liondw/Signal-Alignment
An initiative to create concise and widely shareable educational resources, infographics, and animated explainers on the latest contributions to the community AI alignment effort. Boosting the signal and moving the community towards finding and building solutions.
19
liondw/RLHI
Reinforcement Learning with Heuristic Imperatives - Finetuning LLMs for Post-Conventional Moral Intuition
liondw/HeuristicImperatives
Reduce suffering, increase prosperity, increase understanding. A proposed framework to address the Control Problem.
