Pinned Repositories
edit_knwoledge
Scenarios-aware Commonsense Correction via Instance-level Knowledge Injection
FGDILP
Fine-Grained Detoxification via Instance-Level Prefixes for Large Language Models (accepted by Neurocomputing)
NLSR
NLSR: Neuron-Level Safety Realignment of Large Language Models Against Harmful Fine-Tuning (accepted by AAAI 2025)
safety_realignment
A safety realignment framework via subspace-oriented model fusion for large language models (accepted by KBS)
xinykou's Repositories
xinykou/safety_realignment
A safety realignment framework via subspace-oriented model fusion for large language models (accepted by KBS)
xinykou/NLSR
NLSR: Neuron-Level Safety Realignment of Large Language Models Against Harmful Fine-Tuning (accepted by AAAI 2025)
xinykou/Against_Jailbreak
xinykou/edit_knwoledge
Scenarios-aware Commonsense Correction via Instance-level Knowledge Injection
xinykou/FGDILP
Fine-Grained Detoxification via Instance-Level Prefixes for Large Language Models (accepted by Neurocomputing)