yandexdataschool/AgentNet

Deep Reinforcement Learning library for humans

Python · NOASSERTION

Issues

Original DQN Example
#103 opened 7 years ago by ehknight
1
Destination GpuArray is not contiguous
#96 opened 8 years ago by kashif
5
attention tests
#99 opened 8 years ago by justheuristic
1
policy_estimators param is weird
#93 opened 8 years ago by pshvechikov
3
example: Learning to be kind
#48 opened 8 years ago by justheuristic
1
Targetnet of layer on top of LSTMCell results in deepcopy error
#94 opened 8 years ago by pshvechikov
3
deprecate preprocess_observation
#90 opened 8 years ago by justheuristic
1
batch_size parameter is weird
#95 opened 8 years ago by pshvechikov
1
Support both Theano (Lasagne; Keras) and Tensorflow (Keras) backend
#92 opened 8 years ago by Omrigan
1
BaseResolver returns int64
#91 opened 8 years ago by justheuristic
1
Vectorized environment
#89 opened 8 years ago by justheuristic
0
DPG refactor and demo
#86 opened 8 years ago by justheuristic
1
example: Qlearning with normalized advantage functions
#81 opened 8 years ago by justheuristic
1
grad dtypes mismatch in some rare case
#83 opened 8 years ago by justheuristic
2
Deprecation list
#68 opened 8 years ago by justheuristic
0
better weights management for memory layers
#84 opened 8 years ago by justheuristic
0
AgentNet recurrence won't compile if batch_size = 1 and unroll_scan=False and at least one input is a single-element vector.
#79 opened 8 years ago by justheuristic
2
canonicalize LSTM
#80 opened 8 years ago by justheuristic
3
Brief outline of modules
#77 opened 8 years ago by arogozhnikov
3
Hierarchical MDP as a demo?
#71 opened 9 years ago by justheuristic
0
Counter, switch, everyK (see if it works and saves time)
#49 opened 9 years ago by justheuristic
0
Minimal initial example.
#53 opened 9 years ago by arogozhnikov
5
Automated tests on convergence
#54 opened 9 years ago by arogozhnikov
3
KSfinder experiment setup
#32 opened 9 years ago by justheuristic
1
Dockerfile aka "makeitwork"
#69 opened 9 years ago by justheuristic
0
TupleLayer refactor
#55 opened 9 years ago by justheuristic
1
TODOs
#58 opened 9 years ago by justheuristic
0
Continuous action space policy gradient
#50 opened 9 years ago by justheuristic
1
Add py3 to container
#52 opened 9 years ago by justheuristic
0
Continuous/ndimensional action support
#25 opened 9 years ago by justheuristic
2
Recurrence internal refactor as Lasagne layer
#47 opened 9 years ago by justheuristic
0
Environment interface with Lasagne layers
#46 opened 9 years ago by justheuristic
1
Environment model agent training (a.k.a learning curiosity)
#31 opened 9 years ago by justheuristic
1
Adversarial architecture
#35 opened 9 years ago by justheuristic
1
Release preparations
#34 opened 9 years ago by justheuristic
1
Forced category predictions
#40 opened 9 years ago by justheuristic
1
persistence: support unnamed layers and layers with same names
#29 opened 9 years ago by justheuristic
5
Allow environments that work outside theano + make tutorial
#37 opened 9 years ago by justheuristic
1
Window memory
#44 opened 9 years ago by justheuristic
1
Getting published
#33 opened 9 years ago by justheuristic
2
Store initial hidden values with SessionPool and SessionBatch
#43 opened 9 years ago by justheuristic
1
Dialogs demo stand
#41 opened 9 years ago by justheuristic
1
could not download data to installation folder
#42 opened 9 years ago by justheuristic
2
Remove "Qvalues"-related names from where inappropriate
#38 opened 9 years ago by justheuristic
3
A3c a.k.a. Actor-Critic method
#36 opened 9 years ago by justheuristic
2
Learning refactor
#39 opened 9 years ago by justheuristic
1
K-step reinforcement learning
#27 opened 9 years ago by justheuristic
4
Reinforcement Learning Comparison
#28 opened 9 years ago by justheuristic
2
Session printing broken
#30 opened 9 years ago by justheuristic
1
Implement SARSA and compare with Q-learning
#26 opened 9 years ago by justheuristic
2