qiaowenchuan

qiaowenchuan's Stars

labmlai/annotated_deep_learning_paper_implementations
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
Language:Python57.6k 461 1325.9k
microsoft/LightGBM
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
Language:C++16.8k 435 3.4k3.8k
DLR-RM/rl-baselines3-zoo
A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
Language:Python2.2k 24 255525
Pyomo/pyomo
An object-oriented algebraic modeling language in Python for structured optimization problems.
Language:Python2.1k 62 1.4k526
openai/requests-for-research
A living collection of deep learning problems
Language:HTML1.7k 379 10606
PKU-Alignment/omnisafe
JMLR: OmniSafe is an infrastructural framework for accelerating SafeRL research.
Language:Python963 40 105132
ericyangyu/PPO-for-Beginners
A simple and well styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8.
Language:Python826 12 9120
google-deepmind/funsearch
Language:Jupyter Notebook750 20 6135
vwxyzjn/ppo-implementation-details
The source code for the blog post "The 37 Implementation Details of Proximal Policy Optimization"
Language:Python666 3 6100
marload/DeepRL-TensorFlow2
🐋 Simple implementations of various popular Deep Reinforcement Learning algorithms using TensorFlow2
Language:Python604 19 8141
Mcompetitions/M5-methods
Data, Benchmarks, and methods submitted to the M5 forecasting competition
Language:Jupyter Notebook591 48 14237
PKU-Alignment/Safe-Policy-Optimization
NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms
Language:Python334 8 1146
twni2016/pomdp-baselines
Simple (but often Strong) Baselines for POMDPs in PyTorch, ICML 2022
Language:Python310 5 1042
ml-jku/baselines-rudder
RUDDER for ATARI games with delayed rewards in OpenAI Baselines package
Language:Python267 16 040
Stable-Baselines-Team/rl-colab-notebooks
Colab notebooks part of the documentation of Stable Baselines reinforcement learning library
Language:Jupyter Notebook211 7 639
yuxiaowww/BDCI-2018-Supply-Chain-Demand-Forecast
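Rank 1 in the preliminary round, Rank 1 in the second round: 2018 CCF Big Data and Computing Intelligence Contest (BDCI), supply chain demand forecasting. Miracccccccle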
Language:Python175 10 177
MarcoMeter/recurrent-ppo-truncated-bptt
Baseline implementation of recurrent PPO using truncated BPTT
Language:Jupyter Notebook131 4 1117
hwiberg/OptiCL
An end-to-end framework for mixed-integer optimization with data-driven learned constraints.
Language:Jupyter Notebook117 6 019
LaunchpadAI/space-bandits
Language:Jupyter Notebook102 9 1930
twni2016/Memory-RL
When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment, NeurIPS 2023 (oral)
Language:Python60 2 35
akjayant/PPO_Lagrangian_PyTorch
Implementation of PPO Lagrangian in PyTorch
Language:Python35 2 110
widmi/rudder-a-practical-tutorial
A practical step-by-step guide to applying RUDDER
Language:Jupyter Notebook34 4 014
venktesh22/ExpressLanes_Deep-RL
Language:Python26 3 05
ml-jku/rudder-demonstration-code
Code for demonstration example-task in RUDDER blog
Language:Python22 5 111
JorenGijsbrechts/DRL_A3C_inventory
Language:Python19 1 16
bramdemoor-BE/Reward-shaping-to-improve-the-performance-of-DRL-in-inventory-management
Link to paper: https://www.ssrn.com/abstract=3804655
Language:Python13 1 04
shaneg1507/data-analytics-in-supply-chain
An analysis, with a focus on demand forecasting, of transactional data associated with over 2.5 million customers and 31,868 SKUs over the month of March in 2018 from JD.com, one of China’s largest retailers.
Language:Jupyter Notebook13 1 21
ggrani/JSSP_actor-critic_Agasucci_Monaci_Grani
Code from the paper "An actor-critic algorithm with policy gradients to solve the job shop scheduling problem using deep double recurrent agents."
Language:Python9 1 03
LinghengMeng/Multistep-DDPG
The implementation of Multistep-DDPG and Mixed-Multistep-DDPG
Language:Python9 2 01
DRL-OM/DRL-assortment
Language:Jupyter Notebook6 1 01
