DeepX-inc/machina

Control section: Deep Reinforcement Learning framework

Python, MIT

Issues

Quickstart issue
#266 opened 3 years ago by nikkujensen
0
How to implement Meta-RL algorithm like MAML
#265 opened 4 years ago by Katsumi-N
0
Validation of QT-Opt
#264 opened 5 years ago by leinxx
1
Executed run ppo.py, RuntimeError: CUDA error: device-side assert triggered occurred.
#261 opened 5 years ago by hosokawa-taiji
5
Executed run_r2d2_sac.py, `BrokenPipeError: [Errno 32] Broken pipe` occurred.
#260 opened 5 years ago by hosokawa-taiji
9
Executed run_ppo.py, RuntimeError: size mismatch occurred.
#258 opened 5 years ago by hosokawa-taiji
8
Example script of DDPG seems incorrect
#254 opened 5 years ago by caprest
2
max_ent in example/run_sac.py is not used
#255 opened 5 years ago by caprest
1
Some refactoring to files in pols directory
#252 opened 5 years ago by tadashiK
4
Can't use CNN?
#248 opened 6 years ago by yumion
3
Add info about which type of action space each algorithm supports in README
#245 opened 6 years ago by takerfume
0
Implement DQN
#244 opened 6 years ago by takerfume
0
CEMDeterminsticsSAVFunc doesn't support gym.spaces.Discrete
#242 opened 6 years ago by yumion
1
Write DIAYN into `Implemented Algorithms` in Readme
#237 opened 6 years ago by takerfume
0
Rename ob_space and ac_space to observation_space and action_space
#230 opened 6 years ago by takerfume
1
CEMDeterminsticsSAVFunc takes too much batch_size.
#221 opened 6 years ago by iory
2
Empty rows in progress.csv
#225 opened 6 years ago by swdr1904
4
Make document pages
#213 opened 6 years ago by takerfume
3
Complete missing docstrings
#228 opened 6 years ago by takerfume
1
Support gym.spaces.Dict
#216 opened 6 years ago by iory
2
Fix options for ppo and trpo with rnn
#192 opened 6 years ago by takerfume
4
Use default device manager in PyTorch
#214 opened 6 years ago by rarilurelo
0
Add option of bptt's length
#209 opened 6 years ago by rarilurelo
0
DataParallel option in PPO causes error
#202 opened 6 years ago by takerfume
2
Inappropriate mean in loss_functional with rnn
#118 opened 6 years ago by rarilurelo
1
Add names of implemented algorithms to readme
#152 opened 6 years ago by rarilurelo
0
Managing number of steps in a batch
#119 opened 6 years ago by rarilurelo
1
Implementation of R2D2
#111 opened 6 years ago by rarilurelo
1
Allocate Traj's tensor to cpu
#121 opened 6 years ago by rarilurelo
1
TD3
#193 opened 6 years ago by takerfume
1
Faster sampling in random batch
#179 opened 6 years ago by rarilurelo
1
Add N-distill
#176 opened 6 years ago by pwuethri
1
Example code does not run anymore
#180 opened 6 years ago by pwuethri
1
Data parallel on CEMDeteminisiticSAVfunc
#172 opened 6 years ago by rarilurelo
1
`lf.likelihood` seems to be log-likelihood
#168 opened 6 years ago by rarilurelo
2
Test for new algorithm
#139 opened 6 years ago by rarilurelo
2
More general hs (hidden state)
#167 opened 6 years ago by rarilurelo
0
Remove the pds (probability distributions) class and incorporate it into the pol (policy) class.
#166 opened 6 years ago by rarilurelo
0
log_std referenced before assignment
#161 opened 6 years ago by jtoyama4
2
Write meanings of args
#133 opened 6 years ago by takerfume
1
Testing policy distillation
#155 opened 6 years ago by pwuethri
2
Add Explanation about Imitation Learning
#142 opened 6 years ago by takerfume
0
Script for taking movies of learned policy
#132 opened 6 years ago by takerfume
0
Use cpu_mode in sampling phase
#143 opened 6 years ago by rarilurelo
1
QT-Opt
#130 opened 6 years ago by takerfume
1
Adversarial Inverse Reinforcement Learning
#122 opened 6 years ago by takerfume
1
Variational Discriminator Bottleneck
#129 opened 6 years ago by takerfume
0
Learning Self-Imitating Diverse Policies
#125 opened 6 years ago by takerfume
0
Wrapper environment in which observation includes reward
#114 opened 6 years ago by rarilurelo
1
Wrapper environment in which observation includes action
#113 opened 6 years ago by rarilurelo
1