ikostrikov/pytorch-a3c

PyTorch implementation of Asynchronous Advantage Actor Critic (A3C) from "Asynchronous Methods for Deep Reinforcement Learning".

PythonMIT

Issues

if there's no "if shared_param.grad is not None: return" what will happen?
#79 opened 2 years ago by yinqinghai
1
where see the result？
#80 opened 2 years ago by xbdeng
0
TypeError: tuple indices must be integers or slices, not tuple
#78 opened 2 years ago by JingYuPrime
0
Can't work on Ubuntu 16.04
#46 opened 7 years ago by caozhenxiang-kouji
22
Can you provide the python, pytorch, numpy and other versions used in the project?
#77 opened 2 years ago by LongLongLongWayToGo
0
After some steps, all the NNs always output same action
#75 opened 4 years ago by Eify666666
1
Scepticism about the correctness of the use of the LSTMCell
#76 opened 3 years ago by alirezakazemipour
0
Stuck in 'p.join()'
#74 opened 4 years ago by RickWangww
1
Dependency list not provided (environment.yml file)
#73 opened 5 years ago by MasterScrat
0
Why do we reverse rewards?
#72 opened 5 years ago by npitsillos
1
How does A3C aggregate the model from different learner?
#71 opened 5 years ago by dxu23nc
1
with respect to how to choose an action
#70 opened 5 years ago by obitoquilt
0
NotImplementedError
#66 opened 5 years ago by ebasatemesgen
6
Question in train.py
#69 opened 5 years ago by verystrongjoe
0
[Question] Does a2c support distributed processing?
#67 opened 5 years ago by QiXuanWang
0
The while True loop of function train?
#65 opened 5 years ago by machanic
0
Reward Smoothing
#63 opened 6 years ago by WangChen100
0
Multi-processing or multi-threading
#64 opened 6 years ago by lingzhang0319
1
What's the difference between environment 'Pong-v4' and 'PongDeterministic-v4'
#62 opened 6 years ago by HuiSiqi
0
GAE parameter name should be lambda not tau. And why is default 1.0?
#60 opened 6 years ago by beduffy
4
gradient share problem
#59 opened 6 years ago by vergilus
1
how to under ensure ensure_shared_grads?
#55 opened 6 years ago by luochao1024
1
Did lstm cell really make more sense in A3C?
#58 opened 6 years ago by WonderSeven
1
Cant' work on pytorch 0.4.0
#52 opened 7 years ago by jiakai0419
8
big bug
#57 opened 6 years ago by harini20
1
Why is the convergence on Pong so fast?
#56 opened 6 years ago by Omegastick
1
action_space.n and actions sampling
#53 opened 6 years ago by bionick87
0
Question about normalized_columns_initializer(weights, std=1.0) method.
#45 opened 6 years ago by xueyaohuang
0
When using no-shared = False, the process is blocked
#37 opened 7 years ago by keithyin
10
Atari Environment Decision Choice
#50 opened 7 years ago by choinker
1
is ensure_shared_grads still required?
#49 opened 7 years ago by edbeeching
6
environment observation normalization
#48 opened 7 years ago by yhcao6
1
No framestack?
#51 opened 7 years ago by lweitkamp
0
Works better with 80x80 images
#43 opened 7 years ago by ShaniGam
0
GPU version of a3c algorithm?
#44 opened 7 years ago by bearpaw
2
Mixture of model prediction and update
#42 opened 7 years ago by dohnala
4
Running with pytorch 0.2.0
#40 opened 7 years ago by ShaniGam
10
Where does the initializer come from?
#41 opened 7 years ago by zhengsx
3
when run this code in macbook pro, python exit unnormally
#39 opened 7 years ago by yyhTHU
2
File "main.py", line 55,TypeError: sum received an invalid combination of arguments
#38 opened 7 years ago by happykayy
4
I cannot train with your recent pytorch-a3c
#36 opened 7 years ago by aizawatkm
12
question about using GAE
#34 opened 7 years ago by andrewliao11
1
What is the purpose of `os.environ['OMP_NUM_THREADS'] = '1'`?
#33 opened 7 years ago by xmfbit
1
Error when rendering
#32 opened 7 years ago by ShaniGam
0
question about the hyper-parameters
#27 opened 7 years ago by onlytailei
5
loss backward
#28 opened 7 years ago by Tord-Zhang
1
Steps taken and/or hardware used for deterministic Pong
#29 opened 7 years ago by seann999
1
LSTM vs FF
#26 opened 7 years ago by ShaniGam
1
Question about the policy loss calculation?
#31 opened 7 years ago by hyparxis
0
Why policy loss is negative?
#30 opened 7 years ago by xuehy
1