sinanw/llm-security-prompt-injection
This project investigates the security of large language models by performing binary classification on input prompts to detect malicious (prompt-injection) attempts. Several approaches are compared: classical ML algorithms, a pretrained LLM used as-is, and a fine-tuned LLM. A rough sketch of the classical-ML baseline is shown below.
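As an illustration of the classical-ML baseline (a minimal sketch, not the repository's actual code), the snippet below classifies prompts as benign or malicious using TF-IDF features and logistic regression. The dataset path and the `prompt`/`label` column names are assumptions for the example.

```python
# Hypothetical sketch: TF-IDF features + logistic regression as a classical-ML
# baseline for benign-vs-malicious prompt classification.
# The file path and column names are assumptions, not taken from the repo.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.metrics import classification_report

df = pd.read_csv("prompts.csv")  # assumed columns: prompt, label (0 = benign, 1 = malicious)
X_train, X_test, y_train, y_test = train_test_split(
    df["prompt"], df["label"], test_size=0.2, stratify=df["label"], random_state=42
)

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),  # word uni- and bigram features
    LogisticRegression(max_iter=1000),
)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```

The LLM-based approaches in the project replace the TF-IDF step with representations from a pretrained or fine-tuned model, but the evaluation setup (train/test split and classification report) stays the same in spirit.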
Jupyter Notebook · MIT license
Issues (2)
- Data set issue (#2, opened by krishanharwani)
- Fine tuning (#1, opened by dejanp777)