aisafety
There are 11 repositories under the aisafety topic.
tigerlab-ai/tiger
Open Source LLM toolkit to build trustworthy LLM applications. TigerArmor (AI safety), TigerRAG (embedding, RAG), TigerTune (fine-tuning)
trendmicro/ais
Toolkit for research purposes in AIS. See the website for the paper.
metadriverse/cat
[CoRL'23] Adversarial Training for Safe End-to-End Driving
riceissa/aiwatch
Website to track people, organizations, and products (tools, websites, etc.) in AI safety
ZiyueWang25/llm-security-challenge
Can Large Language Models Solve Security Challenges? We test LLMs' ability to interact with and break out of shell environments using the OverTheWire wargames environment, showing the models' surprising ability to carry out action-oriented cyber exploits in shell environments.
kkhetarpal/ais
Common repository for our readings and discussions
kkhetarpal/safe_a2oc_delib
Safe Option Critic: Learning Safe Options in the A2OC Architecture
endlessloop2/UC-AI-Thinkathon-2023
Winning entry for the UC Chile AI Safety Thinkathon 2023. Coauthor @mon-b
Pearljam66/Machine-Learning-Resources
An organized repository of essential machine learning resources, including tutorials, papers, books, and tools, each with corresponding links for easy access.
SasankYadati/mech-interp
Where I learn and explore mechanistic interpretability of transformers