ikostrikov/pytorch-a3c

PyTorch implementation of Asynchronous Advantage Actor-Critic (A3C) from "Asynchronous Methods for Deep Reinforcement Learning".

Python · MIT License
Issues
- Where to see the result? (#80, opened by xbdeng, 0 comments)
- Can't work on Ubuntu 16.04 (#46, opened by caozhenxiang-kouji, 0 comments)
- Can you provide the Python, PyTorch, NumPy and other versions used in the project? (#77, opened by LongLongLongWayToGo, 1 comment)
- Stuck in 'p.join()' (#74, opened by RickWangww, 0 comments)
- Why do we reverse rewards? (#72, opened by npitsillos, 1 comment)
- How to choose an action (#70, opened by obitoquilt, 6 comments)
- NotImplementedError (#66, opened by ebasatemesgen, 0 comments)
- Question in train.py (#69, opened by verystrongjoe, 0 comments)
- The `while True` loop in the train function? (#65, opened by machanic, 0 comments)
- Reward smoothing (#63, opened by WangChen100, 1 comment)
- Multi-processing or multi-threading (#64, opened by lingzhang0319, 0 comments)
- Gradient sharing problem (#59, opened by vergilus, 1 comment)
- How to understand ensure_shared_grads? (#55, opened by luochao1024, 1 comment)
- Does an LSTM cell really make sense in A3C? (#58, opened by WonderSeven, 8 comments)
- Can't work on PyTorch 0.4.0 (#52, opened by jiakai0419, 1 comment)
- Why is the convergence on Pong so fast? (#56, opened by Omegastick, 0 comments)
- action_space.n and action sampling (#53, opened by bionick87, 0 comments)
- Atari environment decision choice (#50, opened by choinker, 6 comments)
- Is ensure_shared_grads still required? (#49, opened by edbeeching, 1 comment)
- Environment observation normalization (#48, opened by yhcao6, 0 comments)
- No frame stacking? (#51, opened by lweitkamp, 0 comments)
- Works better with 80x80 images (#43, opened by ShaniGam, 2 comments)
- GPU version of the A3C algorithm? (#44, opened by bearpaw, 4 comments)
- Mixture of model prediction and update (#42, opened by dohnala, 10 comments)
- Running with PyTorch 0.2.0 (#40, opened by ShaniGam, 3 comments)
- Where does the initializer come from? (#41, opened by zhengsx, 2 comments)
- File "main.py", line 55: TypeError: sum received an invalid combination of arguments (#38, opened by happykayy, 12 comments)
- I cannot train with your recent pytorch-a3c (#36, opened by aizawatkm, 1 comment)
- Question about using GAE (#34, opened by andrewliao11, 1 comment)
- Error when rendering (#32, opened by ShaniGam, 5 comments)
- Question about the hyperparameters (#27, opened by onlytailei, 1 comment)
- Loss backward (#28, opened by Tord-Zhang, 1 comment)
- LSTM vs FF (#26, opened by ShaniGam, 0 comments)
- Question about the policy loss calculation? (#31, opened by hyparxis, 1 comment)
- Why is the policy loss negative? (#30, opened by xuehy)