Issues
Quickstart issue
#266 opened by nikkujensen - 0
How to implement Meta-RL algorithm like MAML
#265 opened by Katsumi-N - 1
Validation of QT-Opt
#264 opened by leinxx - 5
Executed run ppo.py, `RuntimeError: CUDA error: device-side assert triggered` occurred.
#261 opened by hosokawa-taiji - 9
Executed run_r2d2_sac.py, `BrokenPipeError: [Errno 32] Broken pipe` occurred.
#260 opened by hosokawa-taiji - 8
Example script of DDPG seems incorrect
#254 opened by caprest - 1
max_ent in example/run_sac.py is not used
#255 opened by caprest - 4
Some refactoring to files in pols directory
#252 opened by tadashiK - 3
Can't use CNN?
#248 opened by yumion - 0
Implement DQN
#244 opened by takerfume - 1
Write DIAYN into `Implemented Algorithms` in Readme
#237 opened by takerfume - 1
CEMDeterminsticsSAVFunc takes too much batch_size.
#221 opened by iory - 4
Empty rows in progress.csv
#225 opened by swdr1904 - 3
Make document pages
#213 opened by takerfume - 1
Complete missing docstrings
#228 opened by takerfume - 2
Support gym.spaces.Dict
#216 opened by iory - 4
Fix options for ppo and trpo with rnn
#192 opened by takerfume - 0
Use default device manager in PyTorch
#214 opened by rarilurelo - 0
Add option of bptt's length
#209 opened by rarilurelo - 2
DataParallel option in PPO causes error
#202 opened by takerfume - 1
Inappropriate mean in loss_functional with rnn
#118 opened by rarilurelo - 0
Add names of implemented algorithms to readme
#152 opened by rarilurelo - 1
Managing number of steps in a batch
#119 opened by rarilurelo - 1
Implementation of R2D2
#111 opened by rarilurelo - 1
Allocate Traj's tensor to cpu
#121 opened by rarilurelo - 1
Faster sampling in random batch
#179 opened by rarilurelo - 1
Add N-distill
#176 opened by pwuethri - 1
Example code does not run anymore
#180 opened by pwuethri - 1
Data parallel on CEMDeteminisiticSAVfunc
#172 opened by rarilurelo - 2
`lf.likelihood` seems to be log-likelihood
#168 opened by rarilurelo - 2
Test for new algorithm
#139 opened by rarilurelo - 0
More general hs (hidden state)
#167 opened by rarilurelo - 0
Remove pds (probability distributions) class and incorporate it into pol (policy) class.
#166 opened by rarilurelo - 2
log_std referenced before assignment
#161 opened by jtoyama4 - 1
Write meanings of args
#133 opened by takerfume - 2
Testing policy distillation
#155 opened by pwuethri - 0
Add Explanation about Imitation Learning
#142 opened by takerfume - 0
Script for taking movies of learned policy
#132 opened by takerfume - 1
Use cpu_mode in sampling phase
#143 opened by rarilurelo - 1
Adversarial Inverse Reinforcement Learning
#122 opened by takerfume - 0
Variational Discriminator Bottleneck
#129 opened by takerfume - 0
Learning Self-Imitating Diverse Policies
#125 opened by takerfume - 1