the-crypt-keeper/can-ai-code

Evaluate ibm-granite/granite-code family

the-crypt-keeper opened this issue · 2 comments

The 3B, 8B, 20B and 34B instruction-following models were just released.

3B and 8B evaluations at FP16 and NF4 are complete.

Something might be wrong with the 20B: the FP16 throws a CUDA illegal memory access error when I load it across 4 GPUs, and the NF4 performance is worse than the 8B.

Going to stop here and not bother with the 34B. If you want to try this model family, use the 8B.

Update: The 20B and 34B models are a different architecture than the 3B and 8B, which likely explains the differences I'm seeing.
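
For anyone who wants to check the architecture split themselves, here is a minimal sketch. It assumes you have already downloaded each model's `config.json` locally and that the file carries the usual Hugging Face `model_type` key. The function name `group_by_architecture` and the paths in the usage note are hypothetical, not part of this repo.

```python
import json
from collections import defaultdict
from pathlib import Path


def group_by_architecture(config_paths):
    """Group model names by the `model_type` declared in their config.json.

    `config_paths` maps a model name to the local path of its config.json.
    Returns a plain dict of {model_type: [model names]}.
    """
    groups = defaultdict(list)
    for name, path in config_paths.items():
        config = json.loads(Path(path).read_text())
        # Assumption: the config has a "model_type" key; use "unknown" if not.
        groups[config.get("model_type", "unknown")].append(name)
    return dict(groups)
```

Usage would look something like `group_by_architecture({"3b": "configs/3b/config.json", "20b": "configs/20b/config.json"})`, with the paths replaced by wherever the files actually live. Models that land in different groups are loaded through different model classes, so sharding and quantization behavior can differ between them.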