kzl/decision-transformer

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

Python · MIT

Issues

Any reason for using different models for gym and atari experiments?
#47 opened 2 years ago by GilgameshD
2
Potential bug: Attention mask allows access to future tokens?
#30 opened 2 years ago by donthomasitos
5
position embeddings do not vary between various time steps
#64 opened 2 months ago by udaymallappa
1
MultiDiscrete Action Space
#71 opened 2 months ago by hamzaali98
1
What should return to go be during inferencing?
#70 opened 2 months ago by FinAminToastCrunch
2
Question about the output of the decision transformer
#67 opened 2 months ago by Pulsar110
3
AttributeError: 'GPT2Config' object has no attribute 'n_ctx'
#56 opened a year ago by YTL7
6
Where are the datasets?
#74 opened 3 months ago by wilhem
1
compilation error (MuJoCo)
#72 opened 3 months ago by NagarajJ111
0
Gsutil Error
#60 opened a year ago by ChaselWan
1
TypeError on trajectory_gpt2.py
#31 opened 3 years ago by dljzx
2
Error when loading fixed replay buffer
#34 opened 2 years ago by dgjung0220
11
Is there a code for the shortest path search?
#48 opened 2 years ago by geon0325
1
No registered env with id: halfcheetah-medium-v2
#44 opened 2 years ago by boykac
4
Image Data
#65 opened 9 months ago by AsadMir10
0
how to print accurate in this code?
#63 opened a year ago by YuZheng23
2
torch.stack bug
#62 opened a year ago by ChaselWan
0
why sample from multinomial distribution during evaluation in Atari?
#61 opened a year ago by sallyqiansun
0
Problem Creating Deterministic Action Selection
#39 opened 2 years ago by carsonsmith87
1
Questions about dataset preprocessing
#55 opened a year ago by typoverflow
0
[IDEA] Code for dataset generation
#54 opened a year ago by JustinS6626
1
Bug in state and action prediction
#53 opened a year ago by ezhang7423
0
Global position embedding and timesteps look wrong in atari
#51 opened 2 years ago by nzw0301
2
Why the padding is different for state, action, reward?
#50 opened 2 years ago by CeyaoZhang
0
Padding / attention_mask questions
#36 opened 2 years ago by DaveyBiggers
2
The setting of final token
#49 opened 2 years ago by IpadLi
0
Atari results
#46 opened 2 years ago by TongZhangTHU
4
MemoryError: Unable to allocate 6.57 GiB for an array with shape (7056000000,) and data type uint8
#45 opened 2 years ago by leeruibin
1
Can you add reacher dataset
#29 opened 3 years ago by jsun57
1
I am wondering if the shortest path case is included in the code? thanks
#43 opened 2 years ago by bradley-code-again
0
The results on Mujoco reported in paper might be heavily influenced by env version
#42 opened 2 years ago by linprophet
0
Padding tokens represented differently in different parts of the code
#41 opened 2 years ago by asmadotgh
0
More Training Information on Reacher
#40 opened 2 years ago by ivo-1
0
Confusion over shape of returns_to_go in get_batch
#38 opened 2 years ago by DaveyBiggers
0
Some problems after reading the paper and code
#37 opened 2 years ago by lylwy
0
Where are the weights for the trained models?
#35 opened 2 years ago by simoninithomas
1
About parameters in code
#32 opened 2 years ago by lylwy
1
Minor bug that removes best performing trajectory in gym experiments
#33 opened 2 years ago by micahcarroll
1
about graph experiment
#28 opened 3 years ago by ysymyth
0
batch sampling: only last tokens?
#27 opened 3 years ago by Howuhh
1
understanding use of rewards
#24 opened 3 years ago by jeweinb
2
State and Return preds input
#25 opened 3 years ago by backpropper
1
Question: is it possible to use the same Decision Transformer for new training trajectories generation?
#26 opened 3 years ago by danielgafni
2
Return-to-go conditioning on Atari
#23 opened 3 years ago by geekyutao
2
aligning action embeddings to other embeddings at line 237
#20 opened 3 years ago by loct824
3
Possible misalignment in calculating rtg in Atari
#22 opened 3 years ago by geekyutao
1
Regarding atari breakout results
#21 opened 3 years ago by dido1998
1
difference between two GPT models used in this repo?
#17 opened 3 years ago by ChenDRAG
1
Timesteps Shape
#19 opened 3 years ago by MrShininnnnn
1
how to get the score of an expert policy and some other details
#16 opened 3 years ago by TianhongDai
1