kzl/decision-transformer
Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
Python · MIT
Issues
MultiDiscrete Action Space
#71 opened by hamzaali98 - 2
Where are the datasets?
#74 opened by wilhem - 0
compilation error (MuJoCo)
#72 opened by NagarajJ111 - 1
Gsutil Error
#60 opened by ChaselWan - 2
TypeError on trajectory_gpt2.py
#31 opened by dljzx - 11
Error when loading fixed replay buffer
#34 opened by dgjung0220 - 1
Is there a code for the shortest path search?
#48 opened by geon0325 - 4
No registered env with id: halfcheetah-medium-v2
#44 opened by boykac - 0
Image Data
#65 opened by AsadMir10 - 1
how to print accurate in this code?
#63 opened by YuZheng23 - 0
torch.stack bug
#62 opened by ChaselWan - 0
Questions about dataset preprocessing
#55 opened by typoverflow - 1
[IDEA] Code for dataset generation
#54 opened by JustinS6626 - 0
Bug in state and action prediction
#53 opened by ezhang7423 - 2
Padding / attention_mask questions
#36 opened by DaveyBiggers - 0
The setting of final token
#49 opened by IpadLi - 4
Atari results
#46 opened by TongZhangTHU - 1
MemoryError: Unable to allocate 6.57 GiB for an array with shape (7056000000,) and data type uint8
#45 opened by leeruibin - 1
Can you add reacher dataset
#29 opened by jsun57 - 0
I am wondering if the shortest path case is included in the code? thanks
#43 opened by bradley-code-again - 0
The results on Mujoco reported in paper might be heavily influenced by env version
#42 opened by linprophet - 0
More Training Information on Reacher
#40 opened by ivo-1 - 0
Some problems after reading the paper and code
#37 opened by lylwy - 1
About parameters in code
#32 opened by lylwy - 1
about graph experiment
#28 opened by ysymyth - 1
batch sampling: only last tokens?
#27 opened by Howuhh - 2
Understanding use of rewards
#24 opened by jeweinb - 1
State and Return preds input
#25 opened by backpropper - 2
Question: is it possible to use the same Decision Transformer for new training trajectories generation?
#26 opened by danielgafni - 2
Return-to-go conditioning on Atari
#23 opened by geekyutao - 3
Regarding atari breakout results
#21 opened by dido1998 - 1
Timesteps Shape
#19 opened by MrShininnnnn - 1