SsmallSong
Undergraduate student at RUC majoring in AI and Fintech. Developed a 2.4B-parameter LLM pre-trained from scratch. WeChat: song156054017555
Renmin University of China, Beijing
Pinned Repositories
YuLan-Mini
A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details.
alignment-handbook
Robust recipes to align language models with human and AI preferences
FlashRAG
⚡FlashRAG: A Python Toolkit for Efficient RAG Research
Leetcode-Solution-All
1,000 accessible, high-quality LeetCode write-ups: animated solutions, pattern analysis, and shared templates
LESS
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
OpenCoder-llm
openrlhf
ProLong
Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"
RAGEN
RAGEN is the first open-source reproduction of DeepSeek-R1 on AGENT training.
TRL_FT
This repository is used for studying reinforcement learning and fine-tuning large models.
SsmallSong's Repositories
SsmallSong/TRL_FT
This repository is used for studying reinforcement learning and fine-tuning large models.
SsmallSong/alignment-handbook
Robust recipes to align language models with human and AI preferences
SsmallSong/FlashRAG
⚡FlashRAG: A Python Toolkit for Efficient RAG Research
SsmallSong/Leetcode-Solution-All
1,000 accessible, high-quality LeetCode write-ups: animated solutions, pattern analysis, and shared templates
SsmallSong/LESS
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
SsmallSong/OpenCoder-llm
SsmallSong/openrlhf
SsmallSong/ProLong
Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"
SsmallSong/RAGEN
RAGEN is the first open-source reproduction of DeepSeek-R1 on AGENT training.
SsmallSong/simpo
SsmallSong/test
SsmallSong/ResearchFigure
Some example codes for drawing figures in research paper
SsmallSong/resume
An elegant \LaTeX\ résumé template. Mainland China mirror: https://gods.coding.net/p/resume/git
SsmallSong/ruc_gsai_rl
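Final project assignment for the undergraduate course "Reinforcement Learning" at the Gaoling School of Artificial Intelligence, Renmin University of China. The project is to train a reinforcement learning agent for Chinese Standard Mahjong.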
SsmallSong/sd-webui-controlnet
WebUI extension for ControlNet
SsmallSong/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs