Integration with TransformerLens?
glerzing opened this issue · 0 comments
glerzing commented
I discovered LIT through an issue on TransformerLens, a Python library for mechanistic interpretability that seeks to analyze, at a low level, how transformers work. It uses techniques such as activation patching, which help analyze the causal relationship between each layer and the output, and notably how much each attention head contributes to the result.
Do you think there is potential in integrating TransformerLens into LIT (or just re-implementing these techniques), or, the other way around, integrating LIT into TransformerLens to give it a GUI?
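To make the idea concrete, here is a minimal toy sketch of activation patching in plain Python. It does not use the TransformerLens or LIT APIs: the two "heads", the `run` function, and `patching_effects` are hypothetical names made up for illustration, and the "model" is just a sum of two simple functions. The model is run on a clean input and a corrupted input, then each head's clean activation is patched into the corrupted run to see how much of the output gap it restores.

```python
# Toy illustration of activation patching. This is NOT TransformerLens or LIT
# code; the heads and helper names below are hypothetical.

def head_a(x):
    # Hypothetical "attention head" that only reads the first input feature.
    return 2.0 * x[0]

def head_b(x):
    # Hypothetical "attention head" that only reads the second input feature.
    return 0.5 * x[1]

HEADS = {"head_a": head_a, "head_b": head_b}

def run(x, patch=None):
    """Run the toy model. `patch` maps a head name to an activation to force."""
    patch = patch or {}
    acts = {}
    for name, fn in HEADS.items():
        # Use the patched activation if one is given, else compute it normally.
        acts[name] = patch[name] if name in patch else fn(x)
    # The toy model's output is simply the sum of the head activations.
    return sum(acts.values()), acts

def patching_effects(clean_x, corrupted_x):
    """For each head, return the fraction of the clean-vs-corrupted output gap
    restored by patching that head's clean activation into the corrupted run.
    Assumes the clean and corrupted outputs differ (otherwise the gap is zero)."""
    clean_out, clean_acts = run(clean_x)
    corrupted_out, _ = run(corrupted_x)
    gap = clean_out - corrupted_out
    effects = {}
    for name in HEADS:
        patched_out, _ = run(corrupted_x, patch={name: clean_acts[name]})
        effects[name] = (patched_out - corrupted_out) / gap
    return effects

effects = patching_effects(clean_x=[1.0, 1.0], corrupted_x=[0.0, 0.0])
print(effects)
```

In this toy setup, `head_a` restores most of the gap and `head_b` the rest, which is the kind of per-head attribution a GUI could display. In a real transformer the heads interact through later layers, so the per-head effects would not generally add up this neatly.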
I might consider helping with the development if I have the time.