microsoft/Moonlit

Support for Compresso pruned weights removal


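Currently, after merging the pruning masks and LoRA weights, the LLaMA-7B model size increases from 15GB to 26GB instead of shrinking. Please add support for removing the pruned weights from the model, so that the merged model only stores the weights that are actually kept.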
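
To make the request concrete, here is a minimal sketch of what "removing pruned weights" could look like for one pair of layers. It is an illustration only, not Compresso's actual API: the function name `remove_pruned_hidden_units`, the mask layout (one 0/1 entry per hidden unit), and the use of plain Python lists in place of tensors are all assumptions.

```python
# Hypothetical sketch, NOT Compresso's API. Plain lists stand in for tensors.

def matvec(w, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def remove_pruned_hidden_units(w_up, w_down, mask):
    """Physically drop hidden units whose mask entry is 0.

    w_up:   [hidden][in]  -> rows of pruned units are removed
    w_down: [out][hidden] -> columns of pruned units are removed
    """
    keep = [i for i, m in enumerate(mask) if m != 0]
    small_up = [w_up[i] for i in keep]
    small_down = [[row[i] for i in keep] for row in w_down]
    return small_up, small_down

w_up = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # 3 hidden units, 2 inputs
w_down = [[1.0, 0.5, 2.0], [0.0, 1.0, 1.0]]   # 2 outputs, 3 hidden units
mask = [1, 0, 1]                               # hidden unit 1 is pruned
x = [1.0, -1.0]

# Masked dense model: pruned units are zeroed but still stored.
h_masked = [m * h for m, h in zip(mask, matvec(w_up, x))]
y_masked = matvec(w_down, h_masked)

# Compact model: pruned units are removed, so the matrices are smaller.
small_up, small_down = remove_pruned_hidden_units(w_up, w_down, mask)
y_small = matvec(small_down, matvec(small_up, x))
```

In this toy case the compact matrices give the same output as the masked ones while storing fewer weights, which is the behaviour I would expect from the merged model.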