Yangjinluan

Student at Zhejiang University

Yangjinluan's Stars

hkust-nlp/simpleRL-reason
Simple RL training for reasoning
Language: Python · 3.4k 33 64249
NovaSky-AI/SkyThought
Sky-T1: Train your own O1 preview model within $450
Language: Python · 3.2k 41 51320
RUCAIBox/Slow_Thinking_with_LLMs
A series of technical reports on Slow Thinking with LLMs
Language: Python · 621 10 3634
Zhen-Tan-dmml/LLM4Annotation
517 5 321
bruno686/Awesome-RL-based-LLM-Reasoning
Awesome RL-based LLM Reasoning
399 6 020
pengr/LLM-Synthetic-Data
Real-time updated, fine-grained reading list on LLM-synthetic-data.🔥
246 5 020
RLHFlow/Self-rewarding-reasoning-LLM
Recipes to train the self-rewarding reasoning LLMs.
Language: Python · 2089
horseee/CoT-Valve
CoT-Valve: Length-Compressible Chain-of-Thought Tuning
59 2 24
NineAbyss/S2R
This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"
Language: Python · 552
thu-ml/STAIR
Official codebase for "STAIR: Improving Safety Alignment with Introspective Reasoning"
Language: Python · 31 3 01
pgasawa/BARE
Combining Base and Instruction-Tuned Language Models for Better Synthetic Data Generation
Language: Python · 26 4 03
tanganke/peta
Code for paper "Parameter Efficient Multi-task Model Fusion with Partial Linearization"
Language: Jupyter Notebook · 20 2 31
uservan/ThinkPO
Language: Python · 172
AI45Lab/DEAN
Language: Python · 10 1 01
tanganke/opcm
Language: Python · 101
tanganke/pareto_set_learning
Code for paper "Towards Efficient Pareto Set Approximation via Weight-Ensembling Mixture of Experts"
Language: Python · 7 1 00
PKU-Alignment/llms-resist-alignment
Repo for paper "Language Models Resist Alignment"
Language: Python · 6 2 00
tanganke/point_cloud_viewer
Simple OpenGL program to visualize point clouds.
Language: C++ · 6 1 00
zhaoy777/AFICE
Aligning Large Language Models for Faithful Integrity Against Opposing Argument
Language: Python · 6 1 0
tanganke/introduction-to-factorio
An introductory book on Factorio.
3 0 0
tanganke/MathematicaCppProgramming
Examples and tutorials for calling C/C++ from Wolfram Language (Mathematica)
Language: C · 2 1 00
tanganke/pytorch_classification
Language: Python · 2 1 00
tanganke/pyutils2
Personal Python toolkit: https://anke-pyutils.readthedocs.io/en/latest/
Language: Python · 2 1 0
tanganke/Awesome-Model-Merging
A curated list of Model Merging methods.
1 0 0
tanganke/Awesome-Model-Merging-Methods-Theories-Applications
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. arXiv:2408.07666.
1 0 0
tanganke/oh-my-bash
A delightful community-driven framework for managing your bash configuration, and an auto-update tool that makes it easy to keep up with the latest updates from the community.
Language: Shell · 1 0 0
tanganke/pytorch_optimizer
Collections of optimizers, LR schedulers, and loss functions in PyTorch
Language: Python · 1 0 0
tanganke/wildcat-beamer-template
Modified LaTeX template
Language: TeX · 1 1 0
Yangjinluan/DAM
Code for "Mitigating the Backdoor Effect for Multi-Task Model Merging via Safety-Aware Subspace" (ICLR 2025)
Language: Jupyter Notebook · 10
Yangjinluan/HEI
Code for "Leveraging Invariant Principle for Heterophilic Graph Structure Distribution Shifts" (WWW 2025)
Language: Python · 10
