google/aqt

Store quantized weights and calibration scales in the checkpoint.

Closed · 1 comment

This is needed for inference performance, since transferring weights from memory is usually the limiting factor.
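To make the request concrete, here is a minimal sketch of the idea in plain Python. It is not AQT's API: the helper names (`quantize_per_row`, `dequantize`) and the checkpoint keys (`qvalue`, `scale`) are hypothetical, and symmetric per-row int8 quantization is assumed only for illustration. The point is that the checkpoint holds the integer values plus the scales needed to interpret them, so the float weights never need to be loaded at inference time.

```python
import pickle
from array import array


def quantize_per_row(weights, bits=8):
    """Symmetric per-row quantization: returns (int8 rows, float32 scales)."""
    qmax = 2 ** (bits - 1) - 1  # 127 for int8
    q_rows, scales = [], []
    for row in weights:
        max_abs = max(abs(w) for w in row)
        # Guard against an all-zero row, where any scale works.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        q_rows.append(array("b", (round(w / scale) for w in row)))
        scales.append(scale)
    return q_rows, array("f", scales)


def dequantize(q_rows, scales):
    """Recover approximate float weights from int8 values and scales."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]


# Hypothetical float weight matrix (2 rows x 4 columns).
weights = [[0.5, -1.0, 0.25, 0.0], [2.0, -0.5, 1.5, -2.0]]
q_rows, scales = quantize_per_row(weights)

# Hypothetical checkpoint layout: quantized values and their scales only.
checkpoint = {
    "qvalue": [row.tobytes() for row in q_rows],
    "scale": scales.tobytes(),
}
blob = pickle.dumps(checkpoint)

# Restore without ever touching the original float weights.
restored = pickle.loads(blob)
restored_rows = []
for raw in restored["qvalue"]:
    row = array("b")
    row.frombytes(raw)
    restored_rows.append(row)
restored_scales = array("f")
restored_scales.frombytes(restored["scale"])
approx = dequantize(restored_rows, restored_scales)

# Weight payload size: float32 storage versus int8 storage.
float_bytes = sum(len(array("f", row).tobytes()) for row in weights)
quant_bytes = sum(len(raw) for raw in restored["qvalue"])
```

In this sketch the int8 payload is a quarter the size of the float32 one, plus one scale per row, which is the reduction in weight traffic the issue is asking for.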

This is now fixed.