wellido's Stars
IS2Lab/S-Eval
S-Eval: Automatic and Adaptive Test Generation for Benchmarking Safety Evaluation of Large Language Models
hkust-nlp/deita
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
Lordog/R-Judge
R-Judge: Benchmarking Safety Risk Awareness for LLM Agents
eseckel/ai-for-grant-writing
A curated list of resources for using LLMs to develop more competitive grant applications.
mdrafiqulrabin/tnpa-generalizability
IST'21 & SANER'22: Semantic-Preserving Program Transformations
zhiyuanhubj/UoT
[NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
thunlp/OpenAttack
An Open-Source Package for Textual Adversarial Attack.
FudanSELab/Agent4SE-Paper-List
Repository for the paper "Large Language Model-Based Agents for Software Engineering: A Survey".
ZJU-ACES-ISE/ChatUniTest
eth-sri/llm-quantization-attack
lafeat/apbench
APBench: A Unified Availability Poisoning Attack and Defenses Benchmark (TMLR 08/2024)
eth-sri/SafeCoder
SunflowerPKU/ICSE22_SC_Data
wagner-group/active-learning
Continuous Learning for Android Malware Detection (USENIX Security 2023)
null1024-ws/Poisoning-Attack-on-Code-Completion-Models
Paper "An LLM-Assisted Easy-to-Trigger Poisoning Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection"
ultralytics/ultralytics
Ultralytics YOLO11 🚀
google-deepmind/icml2024-roundtrip-correctness
All-Hands-AI/OpenHands
🙌 OpenHands: Code Less, Make More
Jingkang50/OpenOOD
Benchmarking Generalized Out-of-Distribution Detection
EachSheep/ShortcutsBench
ShortcutsBench: A Large-Scale Real-World Benchmark for API-Based Agents
OpenAutoCoder/Agentless
Agentless🐱: an agentless approach to automatically solve software development problems
testingautomated-usi/selforacle
The code of our paper "Misbehaviour Prediction for Autonomous Driving Systems", including our improved Udacity simulator
ast-fortiss-tum/misbehaviour-prediction-with-uncertainty-quantification
Codebase of the MSc thesis by Ruben Grewal "Uncertainty Quantification for Failure Prediction in Autonomous Driving Systems" and replication package of the paper "Predicting Safety Misbehaviours in Autonomous Driving Systems using Uncertainty Quantification" (ICST 2024).
microsoft/rho
Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.
bigcode-project/bigcodebench
BigCodeBench: Benchmarking Code Generation Towards AGI
MCEVAL/McEval
smartyfh/LLM-Uncertainty-Bench
Benchmarking LLMs via Uncertainty Quantification
THU-MIG/torch-model-compression
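An automated toolkit for analyzing and modifying the structure of PyTorch models, including a model compression algorithm library that analyzes model structure automatically.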
xingjianleng/autoeval_baselines
This repository includes various baseline techniques for the label-free model evaluation task of the VDU2023 competition.