mansicer

Doing machine learning research at @LAMDA-NJU and @LAMDA-RL

Nanjing University

mansicer's Stars

stevenilsen123/mac-keyboard-behavior-in-windows
This AutoHotkey script gets you all the macOS keyboard shortcuts you love, in Windows!
Language:AutoHotkey26015
ntu-nail/CE7455
Language:Jupyter Notebook318
RLHFlow/RLHF-Reward-Modeling
Recipes to train reward model for RLHF.
Language: Python, Stars: 1.1k, Forks: 80
memoavatar/memo
Memory-Guided Diffusion for Expressive Talking Video Generation
Language:Python66262
GreatX3/Playable-Game-Generation
An open-source lightweight game generation paradigm. It includes everything from data processing to model architecture design and playability-based evaluation methods. The game runs at 20 FPS on a single consumer-grade graphics card (RTX-2060) while maintaining high playability.
Language:Jupyter Notebook632
FoundationVision/VAR
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
Language: Jupyter Notebook, Stars: 6.4k, Forks: 418
jquesnelle/yarn
YaRN: Efficient Context Window Extension of Large Language Models
Language: Python, Stars: 1.4k, Forks: 118
hughbzhang/o1_inference_scaling_laws
Replicating O1 inference-time scaling laws
Language:Python703
volcengine/verl
veRL: Volcano Engine Reinforcement Learning for LLM
Language:Python68454
unslothai/unsloth
Finetune Llama 3.3, Mistral, Phi-4, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
Language: Python, Stars: 20.7k, Forks: 1.5k
OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Language: Python, Stars: 3.8k, Forks: 362
meta-llama/llama-cookbook
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using the Llama model family and how to use the models on various provider services.
Language: Jupyter Notebook, Stars: 15.9k, Forks: 2.3k
joeljang/RLPHF
Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging
Language:Python988
confident-ai/deepeval
The LLM Evaluation Framework
Language: Python, Stars: 4.4k, Forks: 360
allwefantasy/byzer-llm
Easy, fast, and cheap pretraining, fine-tuning, and serving for everyone
Language:Python27240
gkamradt/LLMTest_NeedleInAHaystack
Doing simple retrieval from LLM models at various context lengths to measure accuracy
Language: Jupyter Notebook, Stars: 1.7k, Forks: 181
open-compass/opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
Language: Python, Stars: 4.5k, Forks: 478
QwenLM/Qwen-Agent
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
Language: Python, Stars: 5.3k, Forks: 449
TradeMaster-NTU/TradeMaster
TradeMaster is an open-source platform for quantitative trading empowered by reinforcement learning
Language: Jupyter Notebook, Stars: 1.5k, Forks: 305
huggingface/lighteval
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
Language:Python966119
EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
Language: Python, Stars: 7.5k, Forks: 2k
plandex-ai/plandex
AI-driven development in your terminal. Designed for large, real-world tasks.
Language: Go, Stars: 11k, Forks: 761
LAMDA-RL/ReDA
The implementation of the AAMAS'24 paper "Disentangling Policy from Offline Task Representation Learning via Adversarial Data Augmentation"
Language:Python3
liyang619/COLE-Platform
Overcooked human-AI experiment platform
Language:Python324
samjia2000/HSP
This is a repository for Hidden-utility Self-Play.
Language:JavaScript262
vvbbnn00/WARP-Clash-API
This project lets you use Cloudflare WARP+ through a subscription and automatically acquires traffic.
Language: Python, Stars: 8.6k, Forks: 1.2k
yangjianxin1/Firefly
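Firefly: a training tool for large models, with support for training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models
Language: Python, Stars: 6k, Forks: 537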
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Language: Python, Stars: 38.2k, Forks: 4.7k
AffordableGenerativeAgents/Affordable-Generative-Agents
Language:Python466
DefTruth/CUDA-Learn-Notes
📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
Language: Cuda, Stars: 2k, Forks: 207
