Jaewoopudding

@kaist-silab

Jaewoopudding's Stars

meta-llama/codellama
Inference code for CodeLlama models
Language:Python16k 185 1961.9k
alshedivat/al-folio
A beautiful, simple, clean, and responsive Jekyll theme for academics
Language:HTML11k 23 56611.2k
huggingface/trl
Train transformer language models with reinforcement learning.
Language:Python9.9k 74 1.2k1.2k
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Language:C++8.5k 95 1.9k967
opendilab/awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
3.4k 61 3208
takuseno/d3rlpy
An offline deep reinforcement learning library
Language:Python1.3k 28 342238
juncongmoo/chatllama
ChatLLaMA 📢 Open source implementation for LLaMA-based ChatGPT runnable in a single GPU. 15x faster training process than ChatGPT
Language:Python1.2k 20 8138
gorisanson/pikachu-volleyball
Pikachu Volleyball implemented into JavaScript by reverse engineering the original game
Language:JavaScript981 10 10114
ej0cl6/deep-active-learning
Deep Active Learning
Language:Python806 15 15184
jxzhangjhu/Awesome-LLM-Uncertainty-Reliability-Robustness
Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models
657 25 245
CleanDiffuserTeam/CleanDiffuser
CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making
Language:Jupyter Notebook358 1 1330
jannerm/ddpo
Code for the paper "Training Diffusion Models with Reinforcement Learning"
Language:Python334 7 1125
Samimust/predictive-maintenance
Data Wrangling, EDA, Feature Engineering, Model Selection, Regression, Binary and Multi-class Classification (Python, scikit-learn)
Language:Jupyter Notebook248 12 0148
facebookresearch/online-dt
Online Decision Transformer
Language:Python236 5 833
IBM/rl-testbed-for-energyplus
Reinforcement Learning Testbed for Power Consumption Optimization using EnergyPlus
Language:Python188 22 6777
prosysscience/JSSEnv
An OpenAi Gym environment for the Job Shop Scheduling problem.
Language:Python187 6 1855
EmptyJackson/policy-guided-diffusion
Official implementation of the RLC 2024 paper "Policy-Guided Diffusion"
Language:Python118 2 57
IBM/iot-predictive-analytics
Method for Predicting failures in Equipment using Sensor data. Sensors mounted on devices like IoT devices, Automated manufacturing like Robot arms, Process monitoring and Control equipment etc., collect and transmit data on a continuous basis which is Time stamped.
Language:Jupyter Notebook105 18 572
conglu1997/v-d4rl
Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations
Language:Python94 4 179
nttcslab/msm-mae
Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representations
Language:Jupyter Notebook87 7 68
Dragon-Zhuang/BPPO
Author's Pytorch implementation of ICLR2023 paper Behavior Proximal Policy Optimization (BPPO).
Language:Python72 3 65
Junyoungpark/Pytorch-AWAC
A PyTorch implementation of Advantage weighted Actor-Critic (AWAC)
Language:Jupyter Notebook52 2 39
qingshi9974/PPO-pytorch-Mujoco
Implements the PPO algorithm on MuJoCo environments such as Ant-v2, Humanoid-v2, Hopper-v2, and HalfCheetah-v2.
Language:Python50 2 63
datvodinh/ppo-transformer
A Reinforcement Learning Project using PPO + Transformer
Language:Jupyter Notebook31 2 12
reinforcement-learning-kr/rl-montezuma
State-of-the-art deep RL algorithms for Montezuma's Revenge
25 9 21
GRAAL-Research/OfflineRLReadingGroup
Offline Reinforcement Learning Reading Group
24 10 04
dbsxodud-11/ls_gfn
Official Code for Local Search GFlowNets (ICLR 2024 Spotlight)
Language:Python15 1 00
Jaewoopudding/GTA
Official codebase for GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning.
Language:Python13 2 12
beanie00/Decision-ConvFormer
[ICLR 2024 Spotlight] Code for the paper "Decision ConvFormer: Local Filtering in MetaFormer is Sufficient for Decision Making"
Language:Python8 1 51
umkiyoung/EXplorativeDT
RESEARCH
Language:Python1 1 00
