pomonam/kronfluence

Question about per-token influence for time series transformer models


Thank you for open sourcing your amazing work.

I tried to compute per-token influences using Hugging Face's PatchTST model. However, I encountered the following error:

RuntimeError: The model does not support token-wise score computation. Set compute_per_module_scores=True or compute_per_token_scores=False to avoid this error.

This is triggered by the DIMENSION_NOT_MATCH_ERROR_MSG in this file. However, it does not seem to be a dimensionality issue. I was wondering if it is because we don't need to use a tokenizer for time series.

My question is: Does the per-token influence apply exclusively to language transformer models? How can I make it work for time series transformer models like PatchTST? Thank you.

Thank you for reporting the issue! Have you tried setting `compute_per_module_scores=True`? While per-token influence computation can be applied to other transformer models (e.g., for binary classification), enabling this flag is important in such cases. This is because some modules have a token dimension while others do not, and aggregating all scores causes a dimensionality mismatch (e.g., adding matrices of dimension `query_size x train_size x token_size` and `query_size x train_size`). If this also causes an error, it would be great if you could share a small toy script to reproduce it; I can take a look at it later.
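For reference, a minimal sketch of that setup. It assumes influence factors have already been fitted under `factors_name`, and that `model`, `task`, `query_dataset`, and `train_dataset` come from your own pipeline; all names below are illustrative:

```python
from kronfluence.analyzer import Analyzer
from kronfluence.arguments import ScoreArguments

# Keep token-level scores, but keep each module's scores separate so that
# modules with and without a token dimension are never summed together.
score_args = ScoreArguments(
    compute_per_token_scores=True,
    compute_per_module_scores=True,
)

analyzer = Analyzer(analysis_name="patchtst_influence", model=model, task=task)
analyzer.compute_pairwise_scores(
    scores_name="per_token_scores",
    factors_name="ekfac_factors",  # factors fitted beforehand
    score_args=score_args,
    query_dataset=query_dataset,
    train_dataset=train_dataset,
    per_device_query_batch_size=8,
)

# With per-module scores enabled, the result is kept per module
# instead of being aggregated into a single tensor.
scores = analyzer.load_pairwise_scores(scores_name="per_token_scores")
```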

This has solved my issue. Thank you!

That is great to hear! You can optimize the code a bit by specifying only the modules that share the same token dimension in `Task` and turning off `compute_per_module_scores` (especially if the context length is large). But please feel free to reopen the issue if speed becomes a bottleneck.
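A hedged sketch of that optimization, using `Task.get_influence_tracked_modules` to restrict tracking. The PatchTST module names and `NUM_LAYERS` below are assumptions for illustration; verify them against `model.named_modules()` for your checkpoint:

```python
from typing import Any, List, Optional

import torch
import torch.nn as nn
from kronfluence.task import Task

NUM_LAYERS = 3  # set to the encoder depth of your PatchTST checkpoint


class PatchTSTTask(Task):
    def compute_train_loss(
        self, batch: Any, model: nn.Module, sample: bool = False
    ) -> torch.Tensor:
        ...  # reuse your existing loss computation here

    def compute_measurement(self, batch: Any, model: nn.Module) -> torch.Tensor:
        ...  # reuse your existing measurement here

    def get_influence_tracked_modules(self) -> Optional[List[str]]:
        # Track only the encoder attention projections, which all share the
        # patch/token dimension; the paths below are illustrative.
        return [
            f"model.encoder.layers.{i}.self_attn.{proj}"
            for i in range(NUM_LAYERS)
            for proj in ("q_proj", "k_proj", "v_proj", "out_proj")
        ]
```

With the tracked modules restricted this way, every module shares the same token dimension, so `compute_per_module_scores` can stay off and the scores can be aggregated without a mismatch.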