okuvshynov/slowllama

finetune.py segmentation fault

Closed this issue · 6 comments

I am trying to run finetune.py and getting a segmentation fault. Can anyone help? I am on an Apple M2 Mac mini with 24 GB of memory.

% python finetune.py 
loc("mps_transpose"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/75428952-3aa4-11ee-8b65-46d450270006/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":206:0)): error: 'anec.transpose' op Invalid configuration for the following reasons: Tensor dimensions N1D1C4096H1W32000 are not within supported range, N[1-65536]D[1-16384]C[1-65536]H[1-16384]W[1-16384].
loc("mps_matmul"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/75428952-3aa4-11ee-8b65-46d450270006/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":39:0)): error: 'anec.matmul' op Invalid configuration for the following reasons: Tensor dimensions N1D1C4096H1W32000 are not within supported range, N[1-65536]D[1-16384]C[1-65536]H[1-16384]W[1-16384].
zsh: segmentation fault  python finetune.py
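The MPS/ANE errors above actually state why the op is rejected: the log prints the failing tensor layout (N1 D1 C4096 H1 W32000, which looks like the Llama hidden size times the 32000-token vocabulary) next to the supported per-dimension ranges. A small pure-Python check of those numbers (the dictionaries just transcribe the values from the log) confirms which dimension is out of range:

```python
# Per-dimension limits reported in the 'anec.transpose' / 'anec.matmul' errors:
limits = {"N": 65536, "D": 16384, "C": 65536, "H": 16384, "W": 16384}
# Dimensions of the failing tensor, N1D1C4096H1W32000:
tensor_dims = {"N": 1, "D": 1, "C": 4096, "H": 1, "W": 32000}

violations = {k: v for k, v in tensor_dims.items() if v > limits[k]}
print(violations)  # -> {'W': 32000}
```

So the W dimension (32000) exceeds the 16384 limit, which matches a vocab-projection matmul being routed to a backend that cannot handle it.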
  1. Which version of pytorch are you using? I saw some issues with pytorch 2.0.1 (pytorch/pytorch#110975).
  2. Did you update the slowllama repo recently?

Thank you!

Thanks. In that case, should I upgrade?

  1. Yes.
% pip freeze | egrep -i 'torch|numpy|sentence|fewlines'
fewlines==0.0.9
numpy==1.25.2
sentence-transformers==2.2.2
sentencepiece==0.1.99
torch==2.0.1
torchvision==0.15.2
  1. Just a couple of hours ago.

Yes, please try 2.1.0. Also, please make sure you ran prepare_model and finetune with the same version of slowllama - as it is pretty early/experimental, there is no backwards compatibility.
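One quick way to confirm the upgrade took effect is to compare version strings programmatically rather than by eye. A minimal pure-Python sketch (the `version_tuple` helper is illustrative, not part of slowllama or torch):

```python
def version_tuple(version: str) -> tuple:
    """Parse a version like '2.1.0' (or '2.1.0+cpu') into a comparable tuple."""
    return tuple(int(part) for part in version.split("+")[0].split(".")[:3])

# The version from the pip freeze above vs. the suggested fix:
assert version_tuple("2.0.1") < version_tuple("2.1.0")

# Against the live install you would compare, e.g.:
# import torch
# assert version_tuple(torch.__version__) >= (2, 1, 0)
```

Tuple comparison handles the ordering correctly ((2, 0, 1) < (2, 1, 0)), which a plain string comparison would get wrong for versions like "2.10.0" vs "2.9.0".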

Thanks, I did and it works now! This is super. Great work! Some points I noted:

  1. Not all snapshots were written to disk; some were skipped. I will re-run and check.
  2. GPU usage during finetuning was high. However, during inference it stays fairly low, under 10%. Is this expected?
    I will try more things in the coming days. Thanks again.
  1. The save logic only writes a snapshot when the loss improves on the previous best. We can change that (https://github.com/okuvshynov/slowllama/blob/main/finetune.py#L53-L58).
  2. Yes, I put close to no effort into inference optimization. I think there are other libraries focusing on that specifically (e.g. llama.cpp).
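The save-on-improvement behavior described in point 1 can be sketched as follows (the class and parameter names are hypothetical, not the actual finetune.py code; see the linked lines in the repo for the real logic):

```python
class CheckpointSaver:
    """Writes a snapshot only when the loss improves on the best seen so far."""

    def __init__(self):
        self.best_loss = float("inf")

    def maybe_save(self, loss, save_fn, always=False):
        # save_fn is whatever actually serializes the model to disk.
        # With always=True, every snapshot is written regardless of loss.
        if always or loss < self.best_loss:
            self.best_loss = min(self.best_loss, loss)
            save_fn()
            return True
        return False
```

Passing something like `always=True` would write every snapshot, which is the change that would address the "some snapshots were skipped" observation above.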

Thanks for the clarifications; I will dive into the code too. I am not an expert at this, but I will give it my best shot. llama.cpp works great for me; however, I am unable to get finetuning to work with it on my Mac.