Add Autoquant Cache

Question

Opened this issue 22 days ago · 1 comment

Summary

Today, whenever a user runs autoquant, the AutoQuantCache is populated with dtype and related information for the Linear layers seen within an arbitrary torch.nn.Module. This cache is not persistent, so it is rebuilt from scratch in every new process. We should add a way to persist the benchmarking information across runs.

Details

We likely want a paradigm similar to inductor's: store the cache in /tmp/torchaoautoquant_{user}.
Provide a mechanism for overriding the save location and for controlling whether the cache is used at all.
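
To make the two points above concrete, here is a minimal sketch of what a persistent cache could look like. It is not torchao code: the environment variable names, the file name, and the helper functions are all hypothetical, and it assumes the cache can be represented as a JSON-serializable dict with string keys (the real AutoQuantCache keys may need to be converted first).

```python
import getpass
import json
import os
import tempfile

# Hypothetical environment variable names; these are NOT existing torchao settings.
CACHE_DIR_ENV = "TORCHAO_AUTOQUANT_CACHE_DIR"
CACHE_DISABLE_ENV = "TORCHAO_AUTOQUANT_DISABLE_CACHE"


def cache_enabled():
    """The on-disk cache is used unless the disable variable is set to "1"."""
    return os.environ.get(CACHE_DISABLE_ENV, "0") != "1"


def cache_dir():
    """Return the override directory if set, else a per-user directory under the temp dir."""
    override = os.environ.get(CACHE_DIR_ENV)
    if override:
        return override
    return os.path.join(tempfile.gettempdir(), f"torchaoautoquant_{getpass.getuser()}")


def cache_path():
    # Hypothetical file name for the serialized cache.
    return os.path.join(cache_dir(), "autoquant_cache.json")


def load_cache():
    """Load the persisted cache, returning an empty dict if disabled, missing, or corrupt."""
    if not cache_enabled():
        return {}
    try:
        with open(cache_path()) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def save_cache(cache):
    """Persist the cache, writing to a temp file first so readers never see a partial file."""
    if not cache_enabled():
        return
    os.makedirs(cache_dir(), exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir(), suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(cache, f)
    os.replace(tmp_path, cache_path())
```

In this sketch, autoquant would call load_cache() once before benchmarking and save_cache() after new entries are added. Writing through a temporary file and os.replace is one way to keep concurrent runs from reading a half-written cache; file locking would be an alternative.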

Answer 1 · 2024-09-09T20:40:47.000Z

There's some code for that within the autotuning setup: https://github.com/pytorch/ao/tree/e1039abac7f429a8d7f489d047d9b34d6ac6afe2/torchao/kernel