SachinVashisth

Pinned Repositories

TrickLLM
This repository contains the code for the paper "Tricking LLMs into Disobedience: Formalizing, Analyzing, and Detecting Jailbreaks" by Abhinav Rao, Sachin Vashishta*, Atharva Naik*, Somak Aditya, and Monojit Choudhury, accepted at LREC-COLING 2024.
Language: Jupyter Notebook 6 2 02
ABP
Language: Python 1 1 21
linc
🔗 LINC: Logical Inference via Neurosymbolic Computation [EMNLP2023]
Language: Jupyter Notebook 55 3 27
PromptAttack
An LLM can Fool Itself: A Prompt-Based Adversarial Attack (ICLR 2024)
Language: Python 51 3 48
guidance
A guidance language for controlling large language models.
Language: Jupyter Notebook 19.2k 118 5491.1k
multi-armed-bandit
Play with the solutions to the multi-armed-bandit problem.
Language: Python 397 13 295
id-multi-label-hate-speech-and-abusive-language-detection
The Dataset for Multi Label Hate Speech and Abusive Language Detection in Indonesian Twitter
Language: TeX 62 1 328
constraint_enforcing_reward
Code for the paper "A Constraint-Enforcing Reward for Adversarial Attacks on Text Classifiers"
Language: Python 0 0
metaseq
Repo for external large-scale work
Language: Python 0 0
SachinVashisth.github.io
Language: HTML 0 0

SachinVashisth's Repositories

SachinVashisth/metaseq
Repo for external large-scale work
Language: Python 0 0
SachinVashisth/SachinVashisth.github.io
Language: HTML 0 0