Yangyang0906C's Stars
Significant-Gravitas/AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
windingwind/zotero-pdf-translate
Translate PDFs, EPUBs, webpages, metadata, annotations, and notes into the target language. Supports 20+ translation services.
huggingface/alignment-handbook
Robust recipes to align language models with human and AI preferences
wdndev/llm_interview_note
Notes on the knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers
OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
google-research/football
Check out the new game server:
oxwhirl/pymarl
Python Multi-Agent Reinforcement Learning framework
multimodal-art-projection/MAP-NEO
RLHFlow/Online-RLHF
A recipe for online RLHF and online iterative DPO.
shariqiqbal2810/REFIL
Code for "Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning" ICML 2021
JiwenJ/Awesome-RL
A curated list of RL resources
yinyueqin/relative-preference-optimization
Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts