qiaowenchuan's Stars
labmlai/annotated_deep_learning_paper_implementations
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
microsoft/LightGBM
A fast, distributed, high-performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
DLR-RM/rl-baselines3-zoo
A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
Pyomo/pyomo
An object-oriented algebraic modeling language in Python for structured optimization problems.
openai/requests-for-research
A living collection of deep learning problems
PKU-Alignment/omnisafe
JMLR: OmniSafe is an infrastructural framework for accelerating SafeRL research.
ericyangyu/PPO-for-Beginners
A simple and well-styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8.
google-deepmind/funsearch
vwxyzjn/ppo-implementation-details
The source code for the blog post "The 37 Implementation Details of Proximal Policy Optimization"
marload/DeepRL-TensorFlow2
🐋 Simple implementations of various popular Deep Reinforcement Learning algorithms using TensorFlow2
Mcompetitions/M5-methods
Data, benchmarks, and methods submitted to the M5 forecasting competition
PKU-Alignment/Safe-Policy-Optimization
NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms
twni2016/pomdp-baselines
Simple (but often Strong) Baselines for POMDPs in PyTorch, ICML 2022
ml-jku/baselines-rudder
RUDDER for ATARI games with delayed rewards in OpenAI Baselines package
Stable-Baselines-Team/rl-colab-notebooks
Colab notebooks that are part of the documentation of the Stable Baselines reinforcement learning library
yuxiaowww/BDCI-2018-Supply-Chain-Demand-Forecast
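Rank 1 in the preliminary round and Rank 1 in the second round of the 2018 CCF Big Data & Computing Intelligence Contest (BDCI), supply chain demand forecasting. Miracccccccle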
MarcoMeter/recurrent-ppo-truncated-bptt
Baseline implementation of recurrent PPO using truncated BPTT
hwiberg/OptiCL
An end-to-end framework for mixed-integer optimization with data-driven learned constraints.
LaunchpadAI/space-bandits
twni2016/Memory-RL
When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment, NeurIPS 2023 (oral)
akjayant/PPO_Lagrangian_PyTorch
Implementation of PPO Lagrangian in PyTorch
widmi/rudder-a-practical-tutorial
A practical step-by-step guide to applying RUDDER
venktesh22/ExpressLanes_Deep-RL
ml-jku/rudder-demonstration-code
Code for the demonstration example task in the RUDDER blog
JorenGijsbrechts/DRL_A3C_inventory
bramdemoor-BE/Reward-shaping-to-improve-the-performance-of-DRL-in-inventory-management
Link to paper: https://www.ssrn.com/abstract=3804655
shaneg1507/data-analytics-in-supply-chain
An analysis, with a focus on demand forecasting, of transactional data associated with over 2.5 million customers and 31,868 SKUs over the month of March in 2018 from JD.com, one of China’s largest retailers.
ggrani/JSSP_actor-critic_Agasucci_Monaci_Grani
Code from the paper "An actor-critic algorithm with policy gradients to solve the job shop scheduling problem using deep double recurrent agents."
LinghengMeng/Multistep-DDPG
The implementation of Multistep-DDPG and Mixed-Multistep-DDPG
DRL-OM/DRL-assortment