aisafety

There are 11 repositories under the aisafety topic.

tigerlab-ai/tiger
Open Source LLM toolkit to build trustworthy LLM applications. TigerArmor (AI safety), TigerRAG (embedding, RAG), TigerTune (fine-tuning)
Language: Jupyter Notebook | 388 11 926
trendmicro/ais
Toolkit for research purposes in AIS. See the website for the paper.
Language: Python | 94 20 1455
metadriverse/cat
[CoRL'23] Adversarial Training for Safe End-to-End Driving
Language: Python | 49 1 53
riceissa/aiwatch
Website to track people, organizations, and products (tools, websites, etc.) in AI safety
Language: HTML | 21 4 1517
ZiyueWang25/llm-security-challenge
Can Large Language Models Solve Security Challenges? We test LLMs' ability to interact with and break out of shell environments using the OverTheWire wargames environment, showing the models' surprising ability to carry out action-oriented cyber exploits in shell environments.
Language: Python | 10 1 05
kkhetarpal/ais
Common repository for our readings and discussions
6 8 00
kkhetarpal/safe_a2oc_delib
Safe Option Critic: Learning Safe Options in the A2OC Architecture
Language: Python | 6 3 00
endlessloop2/UC-AI-Thinkathon-2023
Winning entry for the UC Chile AI Safety Thinkathon 2023. Coauthor @mon-b
Language: R | 4 2 00
Pearljam66/Machine-Learning-Resources
An organized repository of essential machine learning resources, including tutorials, papers, books, and tools, each with corresponding links for easy access.
1 1 0
SasankYadati/mech-interp
where I learn and explore mechanistic interpretability of transformers
Language: Jupyter Notebook | 0 2 00
tractatus7/tractatus7.github.io
blog
