daddabarba/NHRL

An adaptive algorithm intended to abstract temporally extended actions online, without any background information beyond a Markovian description of the environment. Several reinforcement learning algorithms were embedded in a hierarchy of policies, including n-step Q-learning, Expected Sarsa, LSTM neural networks (for Q-value learning), DeepMind's Deep Q-learning architecture, and simultaneous off-policy training of all abstract actions.
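
As a rough illustration of one of the algorithms named above, here is a minimal tabular sketch of an Expected Sarsa update under an epsilon-greedy policy. It is not taken from this repository: the function names (`epsilon_greedy_probs`, `expected_sarsa_update`), the nested-list Q-table, and the default values for `alpha`, `gamma`, and `epsilon` are all assumptions made for the example.

```python
# Hypothetical sketch of tabular Expected Sarsa; not the repository's code.

def epsilon_greedy_probs(q_values, epsilon):
    """Action probabilities of an epsilon-greedy policy over q_values."""
    n = len(q_values)
    greedy = max(range(n), key=lambda i: q_values[i])
    probs = [epsilon / n] * n          # exploration mass, spread uniformly
    probs[greedy] += 1.0 - epsilon     # remaining mass on the greedy action
    return probs


def expected_sarsa_update(Q, s, a, r, s_next,
                          alpha=0.1, gamma=0.9, epsilon=0.1, done=False):
    """Update Q[s][a] in place and return the new value.

    Q is assumed to be a list of lists indexed as Q[state][action].
    """
    if done:
        target = r                     # no bootstrap from a terminal state
    else:
        probs = epsilon_greedy_probs(Q[s_next], epsilon)
        # Expectation of the next state's Q-values under the policy
        expected_value = sum(p * q for p, q in zip(probs, Q[s_next]))
        target = r + gamma * expected_value
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]


# Usage: two states, two actions
Q = [[0.0, 0.0], [1.0, 0.0]]
expected_sarsa_update(Q, s=0, a=0, r=1.0, s_next=1,
                      alpha=0.5, gamma=0.9, epsilon=0.2)
```

Unlike Q-learning, which bootstraps from the maximum next-state value, this update bootstraps from the policy-weighted average, which reduces the variance introduced by sampling the next action.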

Python

Issues

update state only in call. For neural hierarchies override __call__ with super and then update
#48 opened 6 years ago
0
Reset state at the start of training
#47 opened 6 years ago
0
Need to change deprecated LSTM methods in usage (qLearningAgent class)
#46 opened 6 years ago
0
Adding methods to copy network and add extra output neuron to PyTorch based LSTM class
#45 opened 6 years ago
0
Inefficient training in TD class
#33 opened 6 years ago
0
add pickle to the README file
#32 opened 6 years ago
0
add limit to hierarchy
#30 opened 6 years ago
0
remember that when you increase the number of actions, the previous action is also given as input, so the input size changes too
#29 opened 6 years ago
0
add delay after abstracting task (before abstracting again)
#28 opened 6 years ago
1
matplotlib not optional, used in testing to output plot
#27 opened 6 years ago
0
SD returning inf in data
#26 opened 6 years ago
0
in README add package versions
#25 opened 6 years ago
0
in agent loader, loop cannot select end-goal
#24 opened 6 years ago
0
close session and clear variables in destructor
#23 opened 6 years ago
0
add state space SD online update for A.A.s and firing sequence for hierarchy
#22 opened 6 years ago
0
add, for QL class add action and add policy
#21 opened 6 years ago
0
Generate folder and files after testing, so that there is no need to delete the folder if test is blocked
#20 opened 6 years ago
0
add arguments to loading script
#19 opened 6 years ago
0
Transition history should be managed by qAgent (for easy use in hierarchy)
#18 opened 6 years ago
0
if rs>1 but given reward is double, then use same reward for all the policies
#17 opened 6 years ago
0
add arguments to testing script
#16 opened 6 years ago
2
make n experiments consecutively
#15 opened 6 years ago
1
make agent parameters readable from and exportable to json file
#14 opened 6 years ago
0
Implement chain learning from a.a. on the same level
#13 opened 6 years ago
0
make matplotlib optional
#12 opened 6 years ago
2
add different variable scope for each lstm network
#11 opened 6 years ago
1
add common session mechanism when rs>1
#10 opened 6 years ago
2
implement batch RL
#9 opened 6 years ago
0
implement Boltzmann exploration
#8 opened 6 years ago
0
Perception should also include previous action
#7 opened 6 years ago
0
returning single element in problem and goal state when list is length one
#6 opened 6 years ago
0
Plotting belief state not necessary anymore
#5 opened 6 years ago
1
Interest update dependent on reward settings
#4 opened 6 years ago
0
separate agent and q learning
#3 opened 7 years ago
0
a legend must be implemented (also modular)
#2 opened 7 years ago
1
colors are not modular
#1 opened 7 years ago
0