LanShanPi

Pinned Repositories

RLCD
Reproduction of "RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment"
Language: Python | 65 8 44
datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
Language: Python | 19.4k 277 3k 2.7k
LM
00
mutil-gpu-inference
Language: Python | 0 1 00
Reinforcement-learning
This repository provides the classic DQN algorithm and uses the 'CartPole-v0' environment in Gym for testing.
Language: Python | 1 1 00
LAMM
[NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents
Language: Python | 305 9 4417
MedicalGPT
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains large medical language models, implementing incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
Language: Python | 3.5k 40 398512
fastllm
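A pure C++ cross-platform LLM acceleration library that can be called from Python. ChatGLM-6B-class models can reach 10,000+ tokens/s on a single card. Supports GLM, LLaMA, and MOSS base models and runs smoothly on mobile devices.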
Language: C++ | 3.4k 43 365348

LanShanPi's Repositories

LanShanPi/Reinforcement-learning
This repository provides the classic DQN algorithm and uses the 'CartPole-v0' environment in Gym for testing.
Language: Python | 1 1 00
LanShanPi/LM
00
LanShanPi/mutil-gpu-inference
Language: Python | 0 1 00