thu-ml/tianshou

An elegant PyTorch deep reinforcement learning library.

Python · MIT

Pinned issues

Adding Hyperparameter Optimisation (HPO)

#978 opened 7 months ago by bordeauxred

Open · 2

Clearer separation between the trainer and the algorithm and refactoring of policy classes

#1034 opened 4 months ago by maxhuettenrauch

Open · 1

Issues

Poetry update switches torch from the CUDA build (2.0.1+cu118) to the CPU build (2.1.1) by default on Windows
#1145 opened 7 days ago by coolermzb3
5
ModuleNotFoundError: No module named 'tianshou.highlevel'
#1149 opened 3 days ago by luweiagi
1
ImportError: cannot import name 'Self' from 'typing' (/root/miniconda3/lib/python3.10/typing.py)
#1148 opened 3 days ago by luweiagi
1
[question] Why does Tianshou use a replay buffer in on-policy RL algorithms?
#1147 opened 6 days ago by maguro27
1
Document effects of the relations between buffer size, num workers and episode length
#1143 opened 9 days ago by MischaPanch
0
How can I keep action sampling within the range specified by my environment when using onpolicy_trainer?
#1142 opened 10 days ago by lidaken
6
Revisit `Launcher` for starting multiple experiments
#1121 opened 11 days ago by MischaPanch
1
Extend benchmark with mujoco v4 envs
#1140 opened 12 days ago by MischaPanch
0
Does Tianshou truly support MARL out of the box?
#1137 opened 13 days ago by Legendorik
1
Change log is chaotic and partly uninformative
#1129 opened 19 days ago by opcode81
2
Some issues regarding configuration parameters
#1119 opened 14 days ago by yshichseu
5
Potential confusion about where start timesteps are collected in HL interfaces
#1135 opened 14 days ago by MischaPanch
4
Use Altair inside a notebook to display benchmark results
#1136 opened 14 days ago by MischaPanch
0
How to run RL on multiple nodes in a cluster
#1133 opened 17 days ago by HYB777
1
Adjust locations of setting the policy in train/eval mode
#1122 opened 25 days ago by maxhuettenrauch
1
Batch: remove `is_empty`
#1108 opened a month ago by MischaPanch
24
Buffer: fix discrepancy in slicing order
#1090 opened 2 months ago by MischaPanch
7
Glad you agree with me on this ^^. I'm not sure whether anywhere in the code the retrieval of the slice with empty values is used. For me it's fine to completely remove it, however, many tests will need to be adjusted, as now many of them rely on this somehow weird retrieval mechanism.
#1120 opened a month ago by MischaPanch
0
Chinese document pages return 404
#1078 opened 2 months ago by H-xie
5
Replicating results in collect random operations through seed setting
#1083 opened a month ago by Gemirobot
3
AttributeError: 'PPOPolicy' object has no attribute 'set_eps'
#1101 opened a month ago by prologua
2
Provide a devcontainer, base GH actions off it
#1118 opened a month ago by MischaPanch
0
Add the non-in-place counterpart of `Batch.to_torch`
#1116 opened a month ago by dantp-ai
0
Batch: don't create new objects on getitem
#1086 opened a month ago by MischaPanch
9
Batch: don't just strip off empty entries when creating batches
#1089 opened 2 months ago by MischaPanch
5
Batch: don't just set 0 when elements have None entries
#1088 opened 2 months ago by MischaPanch
8
UnboundLocalError: cannot access local variable 'obs_space_dtype' in atari_wrapper.py
#1111 opened a month ago by zhuyuanyang
1
Use Atari-5 for future benchmarking of discrete RL
#1110 opened a month ago by nuance1979
1
Should we use torch.compile?
#1114 opened a month ago by MischaPanch
2
Should we use the new schedule-free optimizer?
#1115 opened a month ago by MischaPanch
1
Revisit "warm-up" phase in examples
#1112 opened a month ago by MischaPanch
0
/test/continuous/test_ppo.py TypeError on torch.distributions
#1104 opened a month ago by nado5
3
Batch: deprecate setattr
#1085 opened 2 months ago by MischaPanch
1
Batch: only allow entries with the same length
#1087 opened 2 months ago by MischaPanch
3
Missing Link
#1099 opened a month ago by DarkTechPirate
5
Don't pass envpool envs where vectorenvs are needed
#1096 opened 2 months ago by MischaPanch
0
Reduce duplication between examples/atari/atari_network and examples/vizdoom/network
#1092 opened 2 months ago by MischaPanch
1
Support Dict observation spaces
#1065 opened 3 months ago by MischaPanch
7
Re-examine the whole state story for RNNs
#1095 opened 2 months ago by MischaPanch
0
Re-examine the need of utils.net.common.DataParallelNet
#1094 opened 2 months ago by MischaPanch
0
Fix docstring in BranchingNet
#1093 opened 2 months ago by MischaPanch
0
Better interfaces and names for Actor, Critic, Net and other classes
#1091 opened 2 months ago by MischaPanch
0
How to monitor the episode/epoch return/length in Tianshou?
#1082 opened 2 months ago by PingH129
1
data recording and saving method
#1079 opened 2 months ago by Xiong5Heng
4
Typing annotations of step from MyTestEnv are incompatible with its current subclass gym.Env because it can generate non-scalar rewards.
#1080 opened 2 months ago by dantp-ai
0
How to convert Batch into ndarray/tensor
#1064 opened 2 months ago by qmpzzpmq
5
Revisit and maybe optimize Collectors
#1069 opened 3 months ago by MischaPanch
0
Question: Is Recurrent net supported for FQF
#1075 opened 2 months ago by edoust
0
Inquiry about version 0.5.1 and version recommendation
#1073 opened 2 months ago by H-xie
2
Two-dimensional input action in DDPG
#1070 opened 2 months ago by chenyi8920
3