Pinned Repositories
FFT
Benchmark for LLM Harmlessness Evaluation with Factuality, Fairness and Toxicity
assessing_safety_realign
edit_knwoledge
Scenarios-aware Commonsense Correction via Instance-level Knowledge Injection
FGDILP
Fine-Grained Detoxification via Instance-Level Prefixes for Large Language Models (accepted by Neurocomputing)
safety_realignment
A safety realignment framework via subspace-oriented model fusion for large language models (accepted by KBS)
xinykou's Repositories
xinykou/safety_realignment
A safety realignment framework via subspace-oriented model fusion for large language models (accepted by KBS)
xinykou/assessing_safety_realign
xinykou/edit_knwoledge
Scenarios-aware Commonsense Correction via Instance-level Knowledge Injection
xinykou/FGDILP
Fine-Grained Detoxification via Instance-Level Prefixes for Large Language Models (accepted by Neurocomputing)