wellido

homeless

wellido's Stars

IS2Lab/S-Eval
S-Eval: Automatic and Adaptive Test Generation for Benchmarking Safety Evaluation of Large Language Models
323
hkust-nlp/deita
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
Language:Python47427
Lordog/R-Judge
R-Judge: Benchmarking Safety Risk Awareness for LLM Agents
Language:Python586
eseckel/ai-for-grant-writing
A curated list of resources for using LLMs to develop more competitive grant applications.
Language: Python, 1.9k stars, 254 forks
mdrafiqulrabin/tnpa-generalizability
IST'21 & SANER'22: Semantic-Preserving Program Transformations
Language:Java317
zhiyuanhubj/UoT
[NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models
Language:Python643
hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
4k stars, 215 forks
thunlp/OpenAttack
An Open-Source Package for Textual Adversarial Attack.
Language:Python682124
FudanSELab/Agent4SE-Paper-List
Repository for the paper "Large Language Model-Based Agents for Software Engineering: A Survey".
25219
ZJU-ACES-ISE/ChatUniTest
6917
eth-sri/llm-quantization-attack
Language:Python101
lafeat/apbench
APBench: A Unified Availability Poisoning Attack and Defenses Benchmark (TMLR 08/2024)
Language:Python252
eth-sri/SafeCoder
Language:Python254
SunflowerPKU/ICSE22_SC_Data
Language:Python71
wagner-group/active-learning
Continuous Learning for Android Malware Detection (USENIX Security 2023)
Language:Python5615
null1024-ws/Poisoning-Attack-on-Code-Completion-Models
Paper "An LLM-Assisted Easy-to-Trigger Poisoning Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection"
Language:Python6
ultralytics/ultralytics
Ultralytics YOLO11 🚀
Language: Python, 29.4k stars, 5.8k forks
google-deepmind/icml2024-roundtrip-correctness
Language:Python82
All-Hands-AI/OpenHands
🙌 OpenHands: Code Less, Make More
Language: Python, 32.6k stars, 3.7k forks
Jingkang50/OpenOOD
Benchmarking Generalized Out-of-Distribution Detection
Language:Python842107
EachSheep/ShortcutsBench
ShortcutsBench: A Large-Scale Real-World Benchmark for API-Based Agents
Language:Python72
OpenAutoCoder/Agentless
Agentless🐱: an agentless approach to automatically solve software development problems
Language:Python67777
testingautomated-usi/selforacle
The code of our paper "Misbehaviour Prediction for Autonomous Driving Systems", including our improved Udacity simulator
Language:Python2112
ast-fortiss-tum/misbehaviour-prediction-with-uncertainty-quantification
Codebase of the MSc thesis by Ruben Grewal "Uncertainty Quantification for Failure Prediction in Autonomous Driving Systems" and replication package of the paper "Predicting Safety Misbehaviours in Autonomous Driving Systems using Uncertainty Quantification" (ICST 2024).
Language:Jupyter Notebook11
microsoft/rho
Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.
29611
bigcode-project/bigcodebench
BigCodeBench: Benchmarking Code Generation Towards AGI
Language:Python19122
MCEVAL/McEval
Language:Python201
smartyfh/LLM-Uncertainty-Bench
Benchmarking LLMs via Uncertainty Quantification
Language:Python2097
THU-MIG/torch-model-compression
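An automated toolkit for analyzing and modifying the structure of PyTorch models, including a library of model compression algorithms that analyze model structure automatically.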
Language:Python23640
xingjianleng/autoeval_baselines
This repository includes various baseline techniques for label-free model evaluation task for the VDU2023 competition.
Language:Python191
