paraGONG

paraGONG's Stars

Zjh-819/LLMDataHub
A quick guide (especially) for trending instruction finetuning datasets
2.7k stars, 175 forks
allenai/reward-bench
RewardBench: the first evaluation tool for reward models.
Language:Python47456
OpenLMLab/MOSS-RLHF
Secrets of RLHF in Large Language Models Part I: PPO
Language: Python, 1.3k stars, 101 forks
yfzhang114/Generalization-Causality
Reading notes on a wide range of research on domain generalization, domain adaptation, causality, robustness, prompts, optimization, and generative models
1.2k stars, 102 forks
princeton-nlp/LESS
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
Language:Jupyter Notebook39539
MathFoundationRL/Book-Mathematical-Foundation-of-Reinforcement-Learning
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
Language: MATLAB, 4.3k stars, 561 forks
RLHFlow/RLHF-Reward-Modeling
Recipes to train reward model for RLHF.
Language: Python, 1.1k stars, 76 forks
RLHFlow/Online-RLHF
A recipe for online RLHF and online iterative DPO.
Language:Python47151
XayahSuSuSu/Latex-HNUThesisTemplate
LaTeX template for Hunan University undergraduate theses (broad sciences category)
Language:TeX22
louieworth/awesome-rlhf
An index of algorithms for reinforcement learning from human feedback (RLHF)
892
zepingyu0512/awesome-llm-understanding-mechanism
awesome papers in LLM interpretability
37111
openai/transformer-debugger
Language: Python, 4.1k stars, 241 forks
wangshusen/DRL
Deep Reinforcement Learning
3.5k stars, 593 forks
openai/weak-to-strong
Language: Python, 2.5k stars, 310 forks
Improbable-AI/curiosity_redteam
Official implementation of ICLR'24 paper, "Curiosity-driven Red Teaming for Large Language Models" (https://openreview.net/pdf?id=4KqkizXgXU)
Language:Jupyter Notebook6711
microsoft/DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Language: Python, 1.9k stars, 176 forks
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python, 33.1k stars, 5k forks
OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Language: Python, 3.5k stars, 331 forks
thunlp/UltraChat
Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)
Language: Python, 2.3k stars, 117 forks
OFA-Sys/InsTag
InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning
2347
langchain-ai/langchain
🦜🔗 Build context-aware reasoning applications
Language: Jupyter Notebook, 97.6k stars, 15.8k forks
ZigeW/data_management_LLM
Collection of training data management explorations for large language models
29929
tatsu-lab/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
Language: Python, 29.7k stars, 4.1k forks
ydyjya/Awesome-LLM-Safety
A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights into the safety implications, challenges, and advancements surrounding these powerful models.
1.1k stars, 56 forks
project-baize/baize-chatbot
Let ChatGPT teach your own chatbot in hours with a single GPU!
Language: Python, 3.2k stars, 287 forks
PAIR-code/lit
The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.
Language: TypeScript, 3.5k stars, 357 forks
ridiculouz/LLMaAA
The official repository for paper "LLMaAA: Making Large Language Models as Active Annotators"
Language:Python373
openai/openai-python
The official Python library for the OpenAI API
Language: Python, 23.8k stars, 3.4k forks
dsdanielpark/Bard-API
The unofficial Python package that returns Google Bard responses using a cookie value.
Language: Python, 5.3k stars, 526 forks
baichuan-inc/Baichuan2
A series of large language models developed by Baichuan Intelligent Technology
Language: Python, 4.1k stars, 297 forks