yawen-d
Concordia AI; MPhil in Machine Learning at Cambridge; Former Visiting Research Student at @HumanCompatibleAI, UC Berkeley.
yawen-d's Stars
normster/llm_rules
RuLES: a benchmark for evaluating rule-following in language models
aiverify-foundation/moonshot-data
Contains all assets needed to run with the Moonshot Library (connectors, datasets, and metrics)
aiverify-foundation/moonshot-ui
Web UI for Moonshot
IS2Lab/S-Eval
S-Eval: Automatic and Adaptive Test Generation for Benchmarking Safety Evaluation of Large Language Models
kevinyaobytedance/llm_eval
LLM evaluation.
EleutherAI/cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
GraySwanAI/nanoGCG
A fast + lightweight implementation of the GCG algorithm in PyTorch
RZFan525/Awesome-ScalingLaws
A curated list of awesome resources dedicated to Scaling Laws for LLMs
princeton-nlp/tree-of-thought-llm
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
karpathy/LLM101n
LLM101n: Let's build a Storyteller
aiverify-foundation/moonshot
Moonshot - A simple and modular tool to evaluate and red-team any LLM application.
lakeraai/pint-benchmark
A benchmark for prompt injection detection systems.
MetaGLM/zhipuai-sdk-python-v4
prompt-security/ps-fuzz
Make your GenAI apps safe & secure: test & harden your system prompt
EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
ydyjya/Awesome-LLM-Safety
A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights into the safety implications, challenges, and advancements surrounding these powerful models.
haizelabs/redteaming-resistance-benchmark
METR/task-standard
METR Task Standard
adityatelange/hugo-PaperMod
A fast, clean, responsive Hugo theme.
UKGovernmentBEIS/inspect_ai
Inspect: A framework for large language model evaluations
OpenSafetyLab/SALAD-BENCH
[ACL 2024] SALAD benchmark & MD-Judge
ChiangE/Sophon
The implementation of Sophon
princeton-nlp/SWE-agent
SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4 or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges.
AI-secure/DecodingTrust
A Comprehensive Assessment of Trustworthiness in GPT Models
centerforaisafety/HarmBench
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
alexandrasouly/strongreject
Repository for the "StrongREJECT for Empty Jailbreaks" paper
open-compass/opencompass
OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama 3, Mistral, InternLM2, GPT-4, Llama 2, Qwen, GLM, Claude, etc.) over 100+ datasets.
BeenKim/BeenKim.github.io
website
Xianjun-Yang/Awesome_papers_on_LLMs_detection
The latest papers on detection of LLM-generated text and code