jbloomAus/DecisionTransformerInterpretability
Interpreting how transformers simulate agents performing RL tasks
Jupyter Notebook · MIT License
Issues
- Over resource limits on Streamlit Cloud (#110, opened by subratpp, 1 comment)
- Over resource limits on Streamlit Cloud (#109, opened by eggsyntax, 0 comments)
- Over resource limits on Streamlit Cloud (#108, opened by mycpuorg, 0 comments)
- Over resource limits on Streamlit Cloud (#107, opened by hamzaali98, 3 comments)
- Cuda cannot be disabled (#106, opened by jackmiller2003, 9 comments)
- Folding Layer Norm in Model Loading (#71, opened by jbloomAus, 1 comment)
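Issue #71 concerns folding LayerNorm into the model's weights at load time. As a hedged sketch of the general technique (illustrative shapes and values, not the repo's actual loading code), a LayerNorm's learned scale and bias can be absorbed into whatever linear layer reads from it:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
x = rng.normal(size=(4, d))                           # a batch of residual-stream vectors
gamma, beta = rng.normal(size=d), rng.normal(size=d)  # LayerNorm affine params
W, b = rng.normal(size=(d, d)), rng.normal(size=d)    # the following linear layer

def normalize(x):
    """The non-affine part of LayerNorm: center and scale to unit std."""
    return (x - x.mean(-1, keepdims=True)) / x.std(-1, keepdims=True)

# Original computation: affine LayerNorm, then linear.
y_orig = (normalize(x) * gamma + beta) @ W + b

# Folded computation: plain normalization, then a linear layer whose
# weights and bias have absorbed gamma and beta.
W_fold = gamma[:, None] * W
b_fold = beta @ W + b
y_fold = normalize(x) @ W_fold + b_fold

print(np.allclose(y_orig, y_fold))  # prints True
```

Folding leaves the model's function unchanged while making the linear weights directly interpretable, since the affine parameters no longer sit between the normalization and the matrix.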
- Complete Embedding visualizations (#78, opened by jbloomAus, 0 comments)
- Complete QK/OV Circuit visualizations (#81, opened by jbloomAus, 0 comments)
- Fix Ablation Tool (#80, opened by jbloomAus, 2 comments)
- Write Up Analysis of Memory Env Solution (#40, opened by jbloomAus, 2 comments)
- Write a post before EAG London (#74, opened by jbloomAus, 0 comments)
- Reverse Logit Lens (#77, opened by jbloomAus, 3 comments)
- Mega Card: Improve Analysis App in various ways to facilitate better interpretability analysis of the new models (#44, opened by jbloomAus, 1 comment)
- Expand analytical AVEC (#75, opened by jbloomAus, 2 comments)
- Implement AVEC in the interpretability app (#72, opened by jbloomAus, 0 comments)
- Streamlit app requires mujoco installation (#73, opened by DalasNoin, 1 comment)
- SVD Decomp / Explore ways to use dimensionality reduction to quickly understand what heads are doing (#69, opened by jbloomAus, 1 comment)
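Issue #69 proposes SVD as a quick lens on what attention heads do. A minimal sketch of the idea, using a random stand-in for a head's OV matrix rather than any trained weights from this repo:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_head = 32, 8

# Stand-ins for a single head's value and output projections.
W_V = rng.normal(size=(d_model, d_head))
W_O = rng.normal(size=(d_head, d_model))
W_OV = W_V @ W_O  # the head's full residual-to-residual OV map

U, S, Vt = np.linalg.svd(W_OV)

# W_OV has rank at most d_head, so only the first d_head singular
# values are non-negligible; the corresponding singular vectors give
# the directions the head reads from and writes to.
print(S[:d_head].round(2))
print(S[d_head:].max())  # numerically ~0
```

On trained weights, a sharply decaying spectrum within those `d_head` values would suggest the head effectively uses even fewer directions than its nominal dimension.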
- Improve history panel in streamlit app (#68, opened by jbloomAus, 1 comment)
- Facelift of the RTG Scan in the streamlit app (#67, opened by jbloomAus, 2 comments)
- Train a BC on PCT traj = 1 with two different agents mixed in and see if we can tell which one it thinks it is (#66, opened by jbloomAus, 0 comments)
- Explore Improvements to DT Training Procedure (#53, opened by jbloomAus, 1 comment)
- Do an experiment where you turn off the weighted random sampler and/or visualize the sampling probability distribution (#56, opened by jbloomAus, 0 comments)
- Write a check to look at layer weight norms at initialization on the architecture, maybe visualize in a bar chart (#63, opened by jbloomAus, 6 comments)
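Issue #63 asks for a check of per-layer weight norms at initialization. A minimal sketch under assumed, illustrative parameter names (the real check would iterate over the actual model's state dict):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a freshly initialized model's parameter dict; the names
# and shapes are hypothetical, not taken from the repo.
params = {
    "embed.W_E": rng.normal(0.0, 0.02, size=(64, 32)),
    "blocks.0.attn.W_Q": rng.normal(0.0, 0.02, size=(32, 32)),
    "blocks.0.mlp.W_in": rng.normal(0.0, 0.02, size=(32, 128)),
}

def weight_norms(param_dict):
    """Frobenius norm of each weight matrix."""
    return {name: float(np.linalg.norm(w)) for name, w in param_dict.items()}

norms = weight_norms(params)
for name, value in norms.items():
    print(f"{name}: {value:.4f}")
# Feeding `norms` to a bar chart (e.g. matplotlib's plt.bar) makes any
# layer initialized at an unusual scale easy to spot.
```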
- Investigate the effect of Dropout / Stochastic Depth on model training/interpretability (#58, opened by jbloomAus, 1 comment)
- Train a model using layer norm pre to see if this helps formation of calibrated, performant memory env agents (#52, opened by jbloomAus, 2 comments)
- Investigate the effects of training on data sampled using the different strategies created in #46 (#47, opened by jbloomAus, 1 comment)
- Write a Rollout Sampling Utility for PPO Agents and add features that affect the generated distribution (#46, opened by jbloomAus, 0 comments)
- Upgrade Collect Demonstrations Workflow (#51, opened by jbloomAus, 0 comments)
- Investigate whether anyone else does this, or just experiment with fine-tuning PPO models without entropy at the end of training to remove entropy-optimising behaviors (#45, opened by jbloomAus, 1 comment)
- Set padded RTG in training data to be true RTG until masking is implemented correctly (#43, opened by jbloomAus, 0 comments)
- Update the app to also work with BC models (#42, opened by jbloomAus, 0 comments)
- Add checkpoints during Offline Training (#39, opened by jbloomAus, 1 comment)
- Update PPO checkpoints code to upload each checkpoint in real time rather than all at the end of the workflow (#35, opened by jbloomAus)