daddabarba/NHRL
An adaptive algorithm designed to abstract temporally extended actions online, without requiring any background information beyond a Markovian description of the environment. Several reinforcement learning algorithms were embedded in a hierarchy of policies, including n-step Q-learning, Expected Sarsa, LSTM neural networks (for Q-value learning), DeepMind's Deep Q-learning architecture, and simultaneous off-policy training of all abstract actions.
Python
Issues
- 0
  update state only in call. For neural hierarchies override __call__ with super and then update
  #48 opened
- 0
  Reset state at the start of training
  #47 opened
- 0
  Inefficient training in TD class
  #33 opened
- 0
  add pickle in the readme file
  #32 opened
- 0
  add limit to hierarchy
  #30 opened
- 0
  SD returning inf in data
  #26 opened
- 0
  in README add package versions
  #25 opened
- 0
  add arguments to loading script
  #19 opened
- 2
  add arguments to testing script
  #16 opened
- 1
  make n experiments consecutively
  #15 opened
- 2
  make matplotlib optional
  #12 opened
- 2
  add common session mechanism when rs>1
  #10 opened
- 0
  implement batch RL
  #9 opened
- 0
  implement Boltzmann exploration
  #8 opened
- 0
  separate agent and Q-learning
  #3 opened
- 0
  colors are not modular
  #1 opened