Sabriab's Stars
cheadrian/snake-reinforcement-learning
Example of reinforcement learning on a snake game with Ray RLlib, PyGame, and Gymnasium
diego-vicente/som-tsp
Solving the Traveling Salesman Problem using Self-Organizing Maps
upb-lea/reinforcement_learning_course_materials
Lecture notes, tutorial tasks with solutions, and online videos for the reinforcement learning course hosted by Paderborn University
onlyacat/Capacitated-Arc-Routing-Problem-CARP-
Rintarooo/VRP_DRL_MHA
"Attention, Learn to Solve Routing Problems!"[Kool+, 2019], Capacitated Vehicle Routing Problem solver
Roberto09/Dynamic-Attention-Model-for-VRP---Pytorch
Implementation for the paper "A Deep Reinforcement Learning Algorithm Using Dynamic Attention Model for Vehicle Routing Problems".
JaswanthBadvelu/Reinforcement-Learning-CVRP
Dynamic Attention Encoder-Decoder model to learn and design heuristics to solve capacitated vehicle routing problems
krishnaik06/Pytorch-Tutorial
echofist/AM-VRP
My implementation of the Capacitated Vehicle Routing Problem solver from the paper "Attention, Learn to Solve Routing Problems!"
wpwei/attention_to_route
A third-party implementation of "Attention, Learn to Solve Routing Problems!"
alexeypustynnikov/AM-VRP
TF2 implementation of the paper "Attention, Learn to Solve Routing Problems!" (arXiv:1803.08475).
wouterkool/attention-learn-to-route
Attention based model for learning to solve different routing problems
industrial-ucn/jupyter-examples
palash-s/Scheduling-algo-DQN-AI
The agent's goal is to maximize the total expected reward over all possible trajectories. Even though the state and action spaces are finite, the number of possible trajectories is still huge, which motivates the use of reinforcement learning [1]. This can be expressed as the iterative update of the deep Q-network, building on the Q-learning rule proposed by Watkins [2]:

Q(S_t, A_t) = Q(S_t, A_t) + α [ r_{t+1} + γ max_{A_{t+1}} Q(S_{t+1}, A_{t+1}) − Q(S_t, A_t) ]   (2)

where Q(S_t, A_t) on the left is the Q-value (expected reward) being updated for executing action A_t in state S_t; r_{t+1} + γ max_{A_{t+1}} Q(S_{t+1}, A_{t+1}) is the predicted target Q-value, with r_{t+1} the reward received for executing action A_t in state S_t and transitioning to state S_{t+1}; α is the learning rate; and max_{A_{t+1}} Q(S_{t+1}, A_{t+1}) is the maximum Q-value over all possible actions A_{t+1}. In a DQN, a deep neural network is used to predict the Q-values.
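Below is a minimal PyTorch sketch of how the update in equation (2) is typically realized in a DQN: the network predicts Q(S_t, A_t), the target is r_{t+1} + γ max Q(S_{t+1}, ·), and a regression loss pulls the prediction toward the target (the learning rate α is absorbed into the optimizer). The network architecture, dimensions, and hyperparameters are illustrative assumptions, not taken from the repository.

```python
import torch
import torch.nn as nn

# Illustrative dimensions and hyperparameters (assumptions, not from the repository).
STATE_DIM, N_ACTIONS, GAMMA, LR = 8, 4, 0.99, 1e-3

# Small fully connected Q-network: maps a state to one Q-value per action.
q_net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))
optimizer = torch.optim.Adam(q_net.parameters(), lr=LR)

def dqn_update(state, action, reward, next_state, done):
    """One gradient step on the DQN objective derived from equation (2)."""
    # Q(S_t, A_t): pick the Q-value of the action actually taken.
    q_sa = q_net(state).gather(1, action.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # max over A_{t+1} of Q(S_{t+1}, A_{t+1}).
        max_next_q = q_net(next_state).max(dim=1).values
        # Target: r_{t+1} + γ max Q(S_{t+1}, ·); zero the bootstrap term at episode end.
        target = reward + GAMMA * max_next_q * (1.0 - done)
    # Regression loss drives Q(S_t, A_t) toward the target.
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy batch of transitions, purely for illustration.
batch = 32
state = torch.randn(batch, STATE_DIM)
action = torch.randint(0, N_ACTIONS, (batch,))
reward = torch.randn(batch)
next_state = torch.randn(batch, STATE_DIM)
done = torch.zeros(batch)
dqn_update(state, action, reward, next_state, done)
```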
dastratakos/Optimized-Task-Scheduling
Q-learning and Value Iteration machine learning project to provide an optimized tennis racquet priority schedule for a local tennis shop.
spaceVStab/Discrete-Event-Simulator
Job Scheduling Simulator for Reinforcement Learning Models
sohaibafifi/vrp-gpu
VRPTW on GPU
maximelhrg/comsw4995-fall19-pointer-tsp
Project developed for a deep learning course at Columbia University's CS Department, aiming to provide a deep learning approach to the Traveling Salesman Problem based on ConvNets and a Pointer Network architecture.
yining043/TSP-improve
An improvement-based deep reinforcement learning algorithm for solving the TSP, presented in the paper https://arxiv.org/abs/1912.05784v2.
akashsrikanth2310/Using-Deep-Reinforcement-Learning-with-Graph-Embedding-to-solve-the-Travelling-Salesman-Problem
Using Deep Reinforcement Learning with Graph Embedding to solve the Travelling Salesman Problem
OptMLGroup/VRP-RL
Reinforcement Learning for Solving the Vehicle Routing Problem
YvanYin/VNL_Monocular_Depth_Prediction
Monocular Depth Prediction
jakeret/tf_unet
Generic U-Net TensorFlow implementation for image segmentation