ai-fail-safe

Pinned Repositories

gene-drive
a project to ensure that all child processes created by an agent "inherit" the agent's safety controls
honeypot
a project to detect environment tampering on the part of an agent
life-span
a project to ensure an artificial agent will eventually reach the end of its existence
mulligan
a library designed to shut down an agent exhibiting unexpected behavior, providing a potential "mulligan" to human civilization; IN CASE OF FAILURE, DO NOT JUST REMOVE THIS CONSTRAINT AND START IT BACK UP AGAIN
safe-reward
a prototype for an AI safety library that allows an agent to maximize its reward by solving a puzzle, in order to prevent the worst-case outcomes of perverse instantiation
Language: Python
