(For more details on the model description, see Intelligent Transportation System.md.)
Deep Reinforcement Learning for Intelligent Transportation Systems NIPS 2018 Workshop MLITS
The state transition principle is shown in the graph:
States representation:
- "0": green light for direction 1 and hence red light for direction 2;
- "1": green light for direction 2 and hence red light for direction 1;
- "2": yellow light for direction 1 and hence red light for direction 2;
- "3": yellow light for direction 2 and hence red light for direction 1.
Actions representation (note: opposite to the convention in the paper):
- "0": change state;
- "1": keep on.
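Under this convention, the light states form a cycle 0 → 2 → 1 → 3 → 0 (green, then yellow, then green for the other direction). A minimal sketch, assuming the "change" action always advances one step along this cycle (the exact rule is in the transition graph):

```python
# Sketch of the light-state cycle; the cyclic order is our reading of the
# transition graph, not copied from the repo's code.
CHANGE, KEEP = 0, 1
CYCLE = [0, 2, 1, 3]  # green1 -> yellow1 -> green2 -> yellow2 -> green1

def next_state(state, action):
    """Return the next light state; 'keep on' leaves the state unchanged."""
    if action == KEEP:
        return state
    return CYCLE[(CYCLE.index(state) + 1) % len(CYCLE)]
```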
To run this repo, you need Python 3.5.
Generally, use `python xxx.py --train` for training and `python xxx.py --test` for testing.
Running the code in the different categories:
- `./1.one_way_two_queue`: `python light_constr.py --train/test`
- `./2.two_intersections(linear)`: `python lights.py --train/test`
- `./3.grid_square_network`: `python lights.py --train`
- `./4.multithread_for_grid`: `python lights.py --train/test`
- `./5.one_agent_for_each_intersection`: `python lights_re.py --train/test`
Generally, use `python run.py --alg=ddpg --num_timesteps=xxx --train` for training, `python run.py --alg=ddpg --num_timesteps=xxx --test` for testing, and `python run.py --alg=ddpg --num_timesteps=xxx --retrain` for retraining from the last saved checkpoint.
For `./6.ddpg_for_single`, `./7.ddpg_for_linear`, and `./8.ddpg_for_grid`:
`python run.py --alg=ddpg --num_timesteps=1e4 --train/test/retrain`
This model is a single intersection with only two unidirectional roads and no left or right turns. The numbers of cars on the two roads are denoted as Q_1 and Q_2, respectively. The state of the traffic light is denoted by S, which can be in one of the following four states:
- "0": green light for road Q_1, and hence red light for road Q_2;
- "1": red light for road Q_1, and hence green light for road Q_1;
- "2": yellow light for Q_1, and red light for road Q_2;
- "3": red light for road Q_1, and yellow light for road Q_2;
The transition between states, i.e., the action in the RL algorithm, can be:
- "0": change;
- "1": keep on.
According to general transportation principles, the state transition of traffic lights can only proceed in one direction (a fixed cycle) under our definition of light states above. The trained RL agent takes the tuple [Q_1, Q_2, S] as input and generates the action choice for the traffic light.
Code in `./1.one_way_two_queue`.
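To illustrate the agent interface, here is a hypothetical epsilon-greedy sketch that maps the [Q_1, Q_2, S] tuple to an action. The repo trains a DQN; the tabular Q-values and names below are placeholders, not the repo's implementation:

```python
import random
from collections import defaultdict

# Hypothetical tabular stand-in for the DQN agent: the state is the tuple
# (Q_1, Q_2, S); the actions are 0 (change) and 1 (keep on).
q_values = defaultdict(lambda: [0.0, 0.0])

def choose_action(q1, q2, s, epsilon=0.1):
    """Epsilon-greedy action selection over the two light actions."""
    if random.random() < epsilon:
        return random.randint(0, 1)  # explore
    vals = q_values[(q1, q2, s)]
    return 0 if vals[0] >= vals[1] else 1  # exploit the greedy action
```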
The linear network model combines multiple single intersections in a line, as shown in the following graph. Note that we do not model the outgoing roads, which are denoted by dashed lines.
The color of each light is 'green', 'red', or 'yellow'. A black rectangle represents a car arriving at the periphery of the road network. The numbers indicate the number of cars on each road. If a light is 'green', the number of cars on that road decreases by the number of passing cars after the transition. If there is a black rectangle, the number of cars on the corresponding road increases by one after the transition. The upper image is the state before the transition; the lower image is the state after the transition.
Code in `./2.two_intersections(linear)`.
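The per-road update described above can be sketched as follows (the function and parameter names are ours; the actual passing counts are defined in the environment code):

```python
def step_road(cars, green, arrival, passing=1):
    """One transition for a single road: a green light releases up to
    `passing` cars; an arrival (the black rectangle) adds one car
    entering from the periphery of the network."""
    if green:
        cars = max(cars - passing, 0)
    if arrival:
        cars += 1
    return cars
```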
Code in `./3.grid_square_network`.
Multi-threading is applied to accelerate the training process.
Code in `./4.multithread_for_grid`.
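A generic sketch of what multi-threaded collection might look like (an assumption about the structure, not the repo's exact code): worker threads step independent environment copies and push transitions into a shared queue for the learner.

```python
import threading
import queue

def collect(env_steps, n_steps=10):
    """Run one thread per environment; each pushes (obs, a, r, obs_) tuples
    into a shared thread-safe queue, then all transitions are drained."""
    transitions = queue.Queue()

    def worker(env_step):
        for _ in range(n_steps):
            transitions.put(env_step())

    threads = [threading.Thread(target=worker, args=(f,)) for f in env_steps]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [transitions.get() for _ in range(transitions.qsize())]
```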
A single agent is used for every intersection (instead of one agent for the whole road network); the input of the agent comes from one intersection at a time. All intersections share the same agent. At every step, the agent stores a [obs, a, r, obs_] tuple for each intersection, either sharing the same overall reward (lights.py) or keeping a separate reward for each intersection (lights_re.py).
Code in `./5.one_agent_for_each_intersection`.
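The storage scheme can be sketched like this (the names are ours; the shared-reward branch corresponds to lights.py and the per-intersection branch to lights_re.py):

```python
def store_transitions(buffer, per_intersection, shared_reward=None):
    """Append one (obs, a, r, obs_) tuple per intersection to the shared
    replay buffer. With shared_reward set, every intersection records the
    overall reward; otherwise each keeps its own local reward."""
    for obs, a, r, obs_ in per_intersection:
        reward = shared_reward if shared_reward is not None else r
        buffer.append((obs, a, reward, obs_))
    return buffer
```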
The basic environments are similar to those for DQN, with the addition of a main/branch road distinction. In all three settings, the main road is direction 2 and the branch road is direction 1, with larger coming and passing rates on main roads than on branch roads. Another difference of the DDPG environments from the DQN version is that the numbers of cars on the roads (coming, queueing, passing) take more realistic values like 16 or 8 instead of 0 or 1.
Code in `./6.ddpg_for_single`.
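The main/branch asymmetry could be encoded as in the following sketch (the rates here are illustrative placeholders, not the repo's values):

```python
import random

# Illustrative rates only: the main road (direction 2) has a larger coming
# rate and passing count than the branch road (direction 1).
RATES = {"branch": {"come": 0.3, "pass": 8},
         "main": {"come": 0.7, "pass": 16}}

def arrival(road, rng=random):
    """Sample whether a car arrives on the given road this step."""
    return 1 if rng.random() < RATES[road]["come"] else 0
```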
Testing on a 10*1 linear network.
Code in `./7.ddpg_for_linear`.
Testing on a 10*5 grid network.
Code in `./8.ddpg_for_grid`.
If you use this repository for any project, please cite this paper:
@article{liu2018deep,
  title={Deep Reinforcement Learning for Intelligent Transportation Systems},
  author={Liu, Xiao-Yang and Ding, Zihan and Borst, Sem and Walid, Anwar},
  journal={arXiv preprint arXiv:1812.00979},
  year={2018}
}