KJLdefeated

Computer science student at National Yang Ming Chiao Tung University, interested in Reinforcement Learning.

NYCU

KJLdefeated's Stars

jiadong5/ECE408_FA23_UIUC
My GitHub repo for UIUC ECE408 Applied Parallel Programming, mainly focused on CUDA programming and algorithm implementation.
Language: Cuda 3
microsoft/graphrag
A modular graph-based Retrieval-Augmented Generation (RAG) system
Language: Python 6.7k 517
leo811121/UIUC-CS-483-Parallel-Programming
Language:Cuda151
talkingwallace/ChatGPT-Paper-Reader
This repo offers a simple interface that helps you read and summarize research papers in PDF format. You can ask questions after reading. The interface is built on the OpenAI API and uses the GPT-3.5-turbo model.
Language:Python720110
haarnoja/sac
Soft Actor-Critic
Language:Python939231
ZiyiZhang27/tdpo
[ICML 2024] Code for the paper "Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases"
Language: Python 5
hiwonjoon/IROS2021_SORS
Language:Python104
AI4Finance-Foundation/RLSolver
Solvers for NP-hard and NP-complete problems with an emphasis on high-performance GPU computing.
Language:Python12131
openai/safety-gym
Tools for accelerating safe exploration research.
Language:Python485133
sfujim/TD3
Author's PyTorch implementation of TD3 for OpenAI gym tasks
Language: Python 1.6k 434
alexrame/rewardedsoups
Rewarded soups official implementation
Language:HTML414
goldmansachs/gs-quant
Python toolkit for quantitative finance
Language: Jupyter Notebook 6.1k 775
THUDM/ImageReward
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
Language: Python 1k 54
kristery/Awesome-Imitation-Learning
A curated list of awesome imitation learning resources and publications
49360
Kaixhin/imitation-learning
Imitation learning algorithms
Language:Python40639
Shentao-YANG/Dense_Reward_T2I
Source code for "A Dense Reward View on Aligning Text-to-Image Diffusion with Preference" (ICML'24).
Language:Python18
NVlabs/DoRA
[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation
Language:Python41420
clu0/unet.cu
UNet diffusion model in pure CUDA
Language:Cuda50021
artidoro/qlora
QLoRA: Efficient Finetuning of Quantized LLMs
Language: Jupyter Notebook 9.7k 796
modelscope/DiffSynth-Studio
Enjoy the magic of Diffusion models!
Language: Python 5.7k 517
x35f/alpha2
Pseudocode and algorithms for the paper "Alpha$^2$: Discovering Logical Formulaic Alphas using Deep Reinforcement Learning"
Language:Python7017
karpathy/LLM101n
LLM101n: Let's build a Storyteller
14.7k 698
lucidrains/multimodal-dit-pytorch
Implementation of a multimodal diffusion transformer in Pytorch
90
SalesforceAIResearch/DiffusionDPO
Code for "Diffusion Model Alignment Using Direct Preference Optimization"
Language:Python17517
RLHFlow/Online-RLHF
A recipe for online RLHF.
Language:Python31439
digital-nomad-cheng/ECE408_Applied_Parallel_Programming
CUDA solutions for the lab assignments in the UIUC-ECE408 Applied Parallel Programming course.
Language:C++71
mistralai/mistral-finetune
Language: Python 2.4k 163
togethercomputer/MoA
Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models
Language: Python 1.9k 252
ridgerchu/matmulfreellm
Implementation for MatMul-free LM.
Language: Python 2.6k 151
hibana2077/wisper_ui
Language: Python 2
