/malicious-prompt-detection

Detecting malicious prompts that are used to exploit large language models (LLMs), using supervised machine learning classifiers.

Primary Language: Python

Malicious Prompt Detection: A machine learning approach to safeguard LLMs