[Improvement] get_activating_examples is going to be a bottleneck
Closed this issue · 0 comments
SrGonao commented
This function is going to be a bottleneck (I had something similar in my code).
https://github.com/EleutherAI/sae-auto-interp/blob/9751cb25f22824ec544d2718a3bc4a8e246c326f/sae_auto_interp/features/features.py#L210
I'm not sure about the `l_ctx` and `r_ctx` part, but looping over the unique examples is very slow. I wrote this function, which builds a list of all the unique sentences:
https://github.com/EleutherAI/sae-auto-interp/blob/9751cb25f22824ec544d2718a3bc4a8e246c326f/sae_auto_interp/features/features.py#L152
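To illustrate the general idea of avoiding a per-example loop, here is a minimal sketch. It is not the code behind either link above. It assumes the activation records are available as three parallel arrays (sentence index, token position, activation value), and the name `group_by_sentence` is hypothetical. Instead of building one boolean mask per unique sentence, it sorts once and splits at the group boundaries:

```python
import numpy as np


def group_by_sentence(sentence_idx, token_pos, activations):
    """Group activation records by sentence with a single sort.

    Hypothetical helper: the three inputs are assumed to be parallel 1-D
    arrays, one entry per activating token.
    """
    sentence_idx = np.asarray(sentence_idx)
    token_pos = np.asarray(token_pos)
    activations = np.asarray(activations)

    # Nothing to group: return empty results instead of one empty group.
    if sentence_idx.size == 0:
        return sentence_idx, [], []

    # A stable sort keeps records of the same sentence in their original order.
    order = np.argsort(sentence_idx, kind="stable")
    sorted_idx = sentence_idx[order]

    # `starts` holds the first position of each unique sentence in sorted order.
    unique_sentences, starts = np.unique(sorted_idx, return_index=True)

    # Split at every group boundary (the first start is always 0, so skip it).
    pos_groups = np.split(token_pos[order], starts[1:])
    act_groups = np.split(activations[order], starts[1:])
    return unique_sentences, pos_groups, act_groups
```

The cost is dominated by one sort over all records, rather than one full pass over the records for every unique sentence. The same sort-and-split approach should carry over to tensors, although I have not checked it against the repository code.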
I think it would be better to move the trimming of the sentences somewhere else in the code. It seems like something we will want to experiment with.
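As a rough sketch of what a separate trimming step could look like: this assumes `l_ctx` and `r_ctx` mean the number of tokens kept to the left and right of the activating token, which I have not confirmed, and the function name `trim_example` is hypothetical.

```python
def trim_example(tokens, peak_pos, l_ctx, r_ctx):
    """Return the window of `tokens` around `peak_pos`.

    Assumption: l_ctx / r_ctx are the number of tokens kept on the left and
    right of the activating token. The window is clamped to the sentence.
    """
    start = max(0, peak_pos - l_ctx)
    end = min(len(tokens), peak_pos + r_ctx + 1)
    return tokens[start:end]
```

Keeping this as its own function would let us try different window sizes without touching the code that collects the examples.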