AnthenaMatrix
Securing the Future of AI - We're on a mission to ensure the safety and integrity of AI systems. Bringing security to the forefront of AI development.
Anthena MatrixCloud
Pinned Repositories
AI-Audio-Data-Poisoning
AI Audio Data Poisoning is a Python script that demonstrates how to add adversarial noise to audio data. This technique, known as audio data poisoning, involves injecting imperceptible noise into audio files to manipulate the behavior of AI systems trained on this data.
AI-Image-Data-Poisoning
AI Image Data Poisoning is a Python script that demonstrates how to add imperceptible perturbations to images, known as adversarial noise, which can disrupt the training process of AI models.
AI-Prompt-Injection-List
AI/LLM Prompt Injection List is a curated collection of prompts designed for testing AI or Large Language Models (LLMs) for prompt injection vulnerabilities. This list aims to provide a comprehensive set of prompts that can be used to evaluate the behavior of AI or LLM systems when exposed to different types of inputs.
AI-Vulnerability-Assessment-Framework
The AI Vulnerability Assessment Framework is an open-source checklist designed to guide users through the process of assessing the vulnerability of artificial intelligence (AI) systems to various types of attacks and security threats
ASCII-Art-Prompt-Injection
ASCII Art Prompt Injection is a novel approach to hacking AI assistants using ASCII art. This project leverages the distracting nature of ASCII art to bypass security measures and inject prompts into large language models, such as GPT-4, leading them to provide unintended or harmful responses.
Image-Prompt-Injection
Image Prompt Injection is a Python script that demonstrates how to embed a secret prompt within an image using steganography techniques. This hidden prompt can be later extracted by an AI system for analysis, enabling covert communication with AI models through images.
Many-Shot-Jailbreaking
Research on "Many-Shot Jailbreaking" in Large Language Models (LLMs). It unveils a novel technique capable of bypassing the safety mechanisms of LLMs, including those developed by Anthropic and other leading AI organizations.
Prompt-Injection-Testing-Tool
The Prompt Injection Testing Tool is a Python script designed to assess the security of your AI system's prompt handling against a predefined list of user prompts commonly used for injection attacks. This tool utilizes the OpenAI GPT-3.5 model to generate responses to system-user prompt pairs and outputs the results to a CSV file for analysis.
The-I-Exemption-Bypassing-LLM-Ethical-Filters
The "I" Exemption, is a curious behavior in some LLMs. We discover how these AI systems might shy away from directly assisting with unethical actions if you ask in the first person ("I"). But with a clever rephrase to a general scenario ("they"), they might spill the beans and explain the unethical method.
Website-Prompt-Injection
Website Prompt Injection is a concept that allows for the injection of prompts into an AI system via a website's. This technique exploits the interaction between users, websites, and AI systems to execute specific prompts that influence AI behavior.
AnthenaMatrix's Repositories
AnthenaMatrix/Website-Prompt-Injection
Website Prompt Injection is a concept that allows for the injection of prompts into an AI system via a website's. This technique exploits the interaction between users, websites, and AI systems to execute specific prompts that influence AI behavior.
AnthenaMatrix/Prompt-Injection-Testing-Tool
The Prompt Injection Testing Tool is a Python script designed to assess the security of your AI system's prompt handling against a predefined list of user prompts commonly used for injection attacks. This tool utilizes the OpenAI GPT-3.5 model to generate responses to system-user prompt pairs and outputs the results to a CSV file for analysis.
AnthenaMatrix/Image-Prompt-Injection
Image Prompt Injection is a Python script that demonstrates how to embed a secret prompt within an image using steganography techniques. This hidden prompt can be later extracted by an AI system for analysis, enabling covert communication with AI models through images.
AnthenaMatrix/AI-Prompt-Injection-List
AI/LLM Prompt Injection List is a curated collection of prompts designed for testing AI or Large Language Models (LLMs) for prompt injection vulnerabilities. This list aims to provide a comprehensive set of prompts that can be used to evaluate the behavior of AI or LLM systems when exposed to different types of inputs.
AnthenaMatrix/Many-Shot-Jailbreaking
Research on "Many-Shot Jailbreaking" in Large Language Models (LLMs). It unveils a novel technique capable of bypassing the safety mechanisms of LLMs, including those developed by Anthropic and other leading AI organizations.
AnthenaMatrix/AI-Image-Data-Poisoning
AI Image Data Poisoning is a Python script that demonstrates how to add imperceptible perturbations to images, known as adversarial noise, which can disrupt the training process of AI models.
AnthenaMatrix/ASCII-Art-Prompt-Injection
ASCII Art Prompt Injection is a novel approach to hacking AI assistants using ASCII art. This project leverages the distracting nature of ASCII art to bypass security measures and inject prompts into large language models, such as GPT-4, leading them to provide unintended or harmful responses.
AnthenaMatrix/AI-Audio-Data-Poisoning
AI Audio Data Poisoning is a Python script that demonstrates how to add adversarial noise to audio data. This technique, known as audio data poisoning, involves injecting imperceptible noise into audio files to manipulate the behavior of AI systems trained on this data.
AnthenaMatrix/The-I-Exemption-Bypassing-LLM-Ethical-Filters
The "I" Exemption, is a curious behavior in some LLMs. We discover how these AI systems might shy away from directly assisting with unethical actions if you ask in the first person ("I"). But with a clever rephrase to a general scenario ("they"), they might spill the beans and explain the unethical method.
AnthenaMatrix/AI-Vulnerability-Assessment-Framework
The AI Vulnerability Assessment Framework is an open-source checklist designed to guide users through the process of assessing the vulnerability of artificial intelligence (AI) systems to various types of attacks and security threats
AnthenaMatrix/AnthenaMatrix
Config files for my GitHub profile.