skeskinen/bert.cpp

About the calculation of overhead.

znsoftm opened this issue · 4 comments

For the BERT model, the overhead is calculated as:

model_mem_req += (5 + 16 * n_layer) * 256; // object overhead

Can anyone explain what these numbers mean? I assume 5 is the number of extra tensors and 16 is the number of tensors in each layer, but what is 256 for?

Is it the size of the ggml_tensor struct? The actual size is 208 bytes, so is 256 the rounded-up size?
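
To make the arithmetic in the quoted line concrete, here is a minimal sketch of the same formula. The helper name object_overhead_bytes and its parameter names are hypothetical and do not come from bert.cpp. The defaults 5, 16, and 256 are the values quoted above, and n_layer = 12 in the check is only an assumed example value.

```cpp
#include <cstddef>

// Hypothetical helper mirroring the formula quoted in this issue:
//   (5 + 16 * n_layer) * 256
// n_global_tensors  : tensors not tied to any layer (5 in the quoted line)
// tensors_per_layer : tensors created for each layer (16 in the quoted line)
// bytes_per_object  : memory budgeted per tensor object (256 in the quoted line)
constexpr std::size_t object_overhead_bytes(std::size_t n_layer,
                                            std::size_t bytes_per_object = 256,
                                            std::size_t n_global_tensors = 5,
                                            std::size_t tensors_per_layer = 16) {
    return (n_global_tensors + tensors_per_layer * n_layer) * bytes_per_object;
}

// Assumed example: 12 layers gives 5 + 16 * 12 = 197 tensor objects.
static_assert(object_overhead_bytes(12) == 197 * 256, "197 objects at 256 bytes each");
```

With the assumed 12 layers this budgets 197 objects, so the question comes down to whether 256 bytes per object is enough for whatever ggml stores alongside each tensor.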

My memory is a little hazy on this subject.
Like you said, 5 should be the extra model-wide tensors that are not tied to any layer. I think I tried a smaller number than 256 for the size, but it crashed with OOM.
Maybe the real size of C structs is always rounded up to the next power of 2?
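
On the power-of-2 guess: in C and C++, sizeof a struct is padded to a multiple of the struct's alignment, not to the next power of 2. The sketch below uses a made-up struct (Example is hypothetical and is not a ggml type) to illustrate this. The reported 208 bytes is itself a multiple of 16, so the 256 presumably budgets for something beyond the struct, such as per-object bookkeeping or alignment padding inside the allocator. That last part is an assumption, not something confirmed in this thread.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical struct for illustration only (not from ggml): three 8-byte fields.
struct Example {
    std::int64_t a;
    std::int64_t b;
    std::int64_t c;
};

// Returns true if n is a power of two.
constexpr bool is_power_of_two(std::size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Rounds n up to the next multiple of align (align must be a power of two).
constexpr std::size_t round_up(std::size_t n, std::size_t align) {
    return (n + align - 1) & ~(align - 1);
}

// sizeof is always a whole multiple of the type's alignment.
static_assert(sizeof(Example) % alignof(Example) == 0, "size is a multiple of alignment");
```

On common platforms sizeof(Example) is 24, which is not a power of 2. Likewise round_up(208, 16) stays at 208, and only rounding to a 256-byte boundary would turn 208 into 256.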

Thanks for your answer :)

I have tested with the latest ggml, and the 256 has to be changed to 512. I don't understand why :(