llm-security
There are 70 repositories under llm-security topic.
pathwaycom/llm-app
Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳Docker-friendly.⚡Always in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.
Giskard-AI/giskard
🐢 Open-Source Evaluation & Testing for AI & LLM systems
NVIDIA/garak
the LLM vulnerability scanner
verazuo/jailbreak_llms
[CCS'24] A dataset consists of 15,140 ChatGPT prompts from Reddit, Discord, websites, and open-source datasets (including 1,405 jailbreak prompts).
protectai/llm-guard
The Security Toolkit for LLM Interactions
msoedov/agentic_security
Agentic LLM Vulnerability Scanner / AI red teaming kit
mariocandela/beelzebub
A secure low code honeypot framework, leveraging AI for System Virtualization.
EasyJailbreak/EasyJailbreak
An easy-to-use Python framework to generate adversarial jailbreak prompts.
chawins/llm-sp
Papers and resources related to the security and privacy of LLMs 🤖
deadbits/vigil-llm
⚡ Vigil ⚡ Detect prompt injections, jailbreaks, and other potentially risky Large Language Model (LLM) inputs
R3DRUN3/sploitcraft
🏴☠️ Hacking Guides, Demos and Proof-of-Concepts 🥷
liu00222/Open-Prompt-Injection
This repository provides implementation to formalize and benchmark Prompt Injection attacks and defenses
phantasmlabs/phantasm
Toolkits to create a human-in-the-loop approval layer to monitor and guide AI agents workflow in real-time.
yevh/TaaC-AI
AI-driven Threat modeling-as-a-Code (TaaC-AI)
ZenGuard-AI/fast-llm-security-guardrails
The fastest && easiest LLM security guardrails for AI Agents and applications.
arekusandr/last_layer
Ultra-fast, low latency LLM prompt injection/jailbreak detection ⛓️
raga-ai-hub/raga-llm-hub
Framework for LLM evaluation, guardrails and security
lakeraai/pint-benchmark
A benchmark for prompt injection detection systems.
pdparchitect/llm-hacking-database
This repository contains various attack against Large Language Models.
llm-platform-security/SecGPT
SecGPT: An execution isolation architecture for LLM-based systems
microsoft/BIPIA
A benchmark for evaluating the robustness of LLMs and defenses to indirect prompt injection attacks.
NaniDAO/ie
intents engine
azminewasi/Awesome-LLMs-ICLR-24
It is a comprehensive resource hub compiling all LLM papers accepted at the International Conference on Learning Representations (ICLR) in 2024.
briland/LLM-security-and-privacy
LLM security and privacy
RomiconEZ/llamator
Framework for testing vulnerabilities of large language models (LLM).
sinanw/llm-security-prompt-injection
This project investigates the security of large language models by performing binary classification of a set of input prompts to discover malicious prompts. Several approaches have been analyzed using classical ML algorithms, a trained LLM model, and a fine-tuned LLM model.
SEC-CAFE/handbook
安全手册,企业安全实践、攻防与安全研究知识库
leondz/lm_risk_cards
Risks and targets for assessing LLMs & LLM vulnerabilities
LostOxygen/llm-confidentiality
Whispers in the Machine: Confidentiality in LLM-integrated Systems
llm-platform-security/chatgpt-plugin-eval
LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins
TrustAI-laboratory/Learn-Prompt-Hacking
This is The most comprehensive prompt hacking course available, which record our progress on a prompt engineering and prompt hacking course.
google/litmus
Litmus is a comprehensive LLM testing and evaluation tool designed for GenAI Application Development. It provides a robust platform with a user-friendly UI for streamlining the process of building and assessing the performance of your LLM-powered applications.
dapurv5/awesome-red-teaming-llms
Repository accompanying the paper https://arxiv.org/abs/2407.14937
lakeraai/chainguard
Guard your LangChain applications against prompt injection with Lakera ChainGuard.
levitation-opensource/Manipulative-Expression-Recognition
MER is a software that identifies and highlights manipulative communication in text from human conversations and AI-generated responses. MER benchmarks language models for manipulative expressions, fostering development of transparency and safety in AI. It also supports manipulation victims by detecting manipulative patterns in human communication.
jiangnanboy/llm_security
利用分类法和敏感词检测法对生成式大模型的输入和输出内容进行安全检测,尽早识别风险内容。The input and output contents of generative large model are checked by classification method and sensitive word detection method to identify content risk as early as possible.