Loading the model with quantized weights twice corrupts the model


To reproduce:

Call the load weights function twice and run the model; you get NaNs. This does not happen with normal fp16/32 weights.


graph.openStore(sdxl_model_path) {
  $0.read("unet", model: unet, codec: [.q6p, .q8p, .jit, .ezm7])
}

graph.openStore(sdxl_model_path) {
  $0.read("unet", model: unet, codec: [.q6p, .q8p, .jit, .ezm7])
}

What could be the possible problem and solution?
Thanks

liuliu commented

Probably because, unlike normal weights, which we allocate on the nnc side and just read the blob into, jit weights are allocated on the s4nnc side: https://github.com/liuliu/s4nnc/blob/main/nnc/Store.swift#L2053

A workaround would be to create a new model whenever you need to load the weights; otherwise we need to look into why this behavior (possible memory corruption) happens and how to fix it.
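
A minimal sketch of that workaround, assuming the same DynamicGraph/Model setup as in the snippet above; buildUNet() is a hypothetical factory standing in for however the model is actually constructed in the caller's code:

// Workaround sketch: construct a fresh model before each quantized load so the
// jit-decoded weights always land in newly allocated storage.
// buildUNet() is hypothetical; substitute the real model constructor.
func loadQuantizedUNet(graph: DynamicGraph, path: String) -> Model {
  let unet = buildUNet()
  graph.openStore(path) {
    $0.read("unet", model: unet, codec: [.q6p, .q8p, .jit, .ezm7])
  }
  return unet
}

// Each call returns an independently constructed and loaded model instead of
// re-reading into the same instance.
let unet1 = loadQuantizedUNet(graph: graph, path: sdxl_model_path)
let unet2 = loadQuantizedUNet(graph: graph, path: sdxl_model_path)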

Okay thanks

liuliu commented

The limited case is fixed in 53f737c

The reason it is limited is that if a weight of the same name is quantized differently (for example, once in q6p and another time in q8p), it will still produce NaNs.
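
To spell out what that leaves unsafe, a hedged sketch (other_model_path below is purely illustrative): re-reading the same weight from the same store is now handled, but reading the same weight name from a store where it is quantized differently still is not, so rebuild the model before doing that.

// Covered by the fix above: re-reading the same weight from the same store,
// where it is quantized the same way each time.
graph.openStore(sdxl_model_path) {
  $0.read("unet", model: unet, codec: [.q6p, .q8p, .jit, .ezm7])
}
graph.openStore(sdxl_model_path) {
  $0.read("unet", model: unet, codec: [.q6p, .q8p, .jit, .ezm7])
}

// Still risky per the comment above: the same weight name read from a store
// where it is quantized differently (say q8p instead of q6p). Construct a
// fresh model before a read like this. other_model_path is an illustrative name.
graph.openStore(other_model_path) {
  $0.read("unet", model: unet, codec: [.q6p, .q8p, .jit, .ezm7])
}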

Thanks