QiJing47512138's Stars
MISTCARRYYOU/FPPO4taskschedulngincloudedge
Lightweight code for the IJPR paper "Federated Deep Reinforcement Learning for Dynamic Job Scheduling in Cloud-edge Collaborative Manufacturing Systems"
Yunhui1998/Gymjsp
A benchmark that makes it easy to apply reinforcement learning to job-shop scheduling
zcaicaros/L2D
Official implementation of paper "Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning"
davidtw0320/Resources-Allocation-in-The-Edge-Computing-Environment-Using-Reinforcement-Learning
Simulates the interaction between edge servers and users with a clear graphical interface, and implements continuous control with Deep Deterministic Policy Gradient (DDPG) to determine resource allocation (offload targets, computational resources, migration bandwidth) on the edge servers
revenol/LyDROO
Lyapunov-guided Deep Reinforcement Learning for Stable Online Computation Offloading in Mobile-Edge Computing Networks
Derfei/MergingandScheduling
Code for the paper titled "Task Merging and Scheduling for Deep Learning Applications in Mobile Edge Computing"
Aihong-Sun/FJSP_AGV-Machine_Instances
Instances of the integrated AGV and machine scheduling problem in the flexible job shop
stanfordnlp/wge
Workflow-Guided Exploration: sample-efficient RL agent for web tasks
palash-s/Scheduling-algo-DQN-AI
The agent's goal is to maximize the total expected reward over all possible trajectories. Even though the state and action spaces are finite, the number of possible trajectories is huge, which motivates the use of reinforcement learning [1]. This can be expressed as the iterative Q-learning update proposed by Watkins [2]:

Q(S_t, A_t) = Q(S_t, A_t) + α[r_{t+1} + γ max_{A_{t+1}} Q(S_{t+1}, A_{t+1}) − Q(S_t, A_t)]  (2)

where Q(S_t, A_t) on the left is the updated Q-value (expected reward) of executing action A_t in state S_t; r_{t+1} + γ max_{A_{t+1}} Q(S_{t+1}, A_{t+1}) is the predicted target Q-value, with r_{t+1} the reward received on the transition from state S_t to state S_{t+1}; α is the learning rate; and max_{A_{t+1}} Q(S_{t+1}, A_{t+1}) is the maximum Q-value over all possible next actions A_{t+1}. In a DQN, a deep neural network is used to predict the Q-values.
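The Watkins update above can be sketched in tabular form before any neural network is involved. The toy environment below (a small chain where moving right eventually earns a reward) and all hyperparameters are illustrative assumptions, not taken from the repository:

```python
import numpy as np

# A minimal sketch of Watkins' Q-learning update described above.
# The environment, grid size, and hyperparameters are illustrative assumptions.

n_states, n_actions = 5, 2
alpha, gamma = 0.1, 0.9          # learning rate and discount factor
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

def step(s, a):
    """Toy deterministic environment: action 1 moves right, action 0 stays.
    Reaching the last state yields reward 1 and ends the episode."""
    s_next = min(s + a, n_states - 1)
    r = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, r

for episode in range(200):
    s = 0
    while s != n_states - 1:
        # epsilon-greedy action selection
        a = int(rng.integers(n_actions)) if rng.random() < 0.1 else int(np.argmax(Q[s]))
        s_next, r = step(s, a)
        # Q(S_t, A_t) += alpha * [r_{t+1} + gamma * max_a Q(S_{t+1}, a) - Q(S_t, A_t)]
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
```

A DQN replaces the table `Q` with a neural network that maps a state to Q-values for all actions, trained by minimizing the squared difference between the predicted and target Q-values in the bracketed term.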
weixians/JobSchedule
Implementation of paper "Research on Adaptive Job Shop Scheduling Problems Based on Dueling Double DQN"
RK0731/openrl
A general-purpose reinforcement learning research framework
RK0731/Deep-reinforcement-learning-for-dynamic-scheduling-of-a-flexible-job-shop
IJPR paper: Deep Reinforcement Learning for dynamic scheduling of a flexible job shop
stephan-who/DRL_to_DFJSP
This repository reproduces the paper "Dynamic scheduling for flexible job shop with new job insertions by deep reinforcement learning"
jjjj0458/Deep-Reinforcement-Learning-for-Solving-Job-Shop-Scheduling-Problems
Deep reinforcement learning for solving job-shop scheduling problems (JSSPs)