CUDA kernels from PEFT v0.11.0 break C++ compilation
BenjaminBossan opened this issue · 4 comments
System Info
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder
- My own task or dataset (give details below)
Reproduction
As reported to us by @danielhanchen, the new PEFT 0.11.0 release breaks llama.cpp / C++ compilation. Merely importing PEFT breaks a subsequent C++ build - presumably it's related to some import-time scripting.
Repro: PEFT 0.10.0 works: https://colab.research.google.com/drive/1vQ4_wUazxvf39wEeN6fxP58xHVaT3Mj8?usp=sharing
PEFT 0.11.0 fails, causing gcc to break after importing peft: https://colab.research.google.com/drive/1-NHOoRLISEyisuQqFgUR5L714Fe9sLij?usp=sharing
Ping @yfeng95 @Zeju1997 @YuliangXiu
Expected behavior
We may have to remove the kernels in a patch release if there is no quick solution.
I made a fork that comments out BOFT for now - https://github.com/danielhanchen/peft
And a repro which works after commenting it out: https://colab.research.google.com/drive/1Y_MdJnS73hIlR_t2DXgXCgqKVwXHPE82?usp=sharing
I manually checked every line of compiler output with the script below and tried isolating the problem:
def install_llama_cpp_blocking(use_cuda = True):
    import subprocess
    import os
    import psutil
    # https://github.com/ggerganov/llama.cpp/issues/7062
    # Weirdly GPU conversion for GGUF breaks??
    # use_cuda = "LLAMA_CUDA=1" if use_cuda else ""
    commands = [
        "git clone --recursive https://github.com/ggerganov/llama.cpp",
        "make clean -C llama.cpp",
        # https://github.com/ggerganov/llama.cpp/issues/7062
        # Weirdly GPU conversion for GGUF breaks??
        # f"{use_cuda} make all -j{psutil.cpu_count()*2} -C llama.cpp",
        f"make all -j{psutil.cpu_count()*2} -C llama.cpp",
        "pip install gguf protobuf",
    ]
    # if os.path.exists("llama.cpp"): return
    for command in commands:
        with subprocess.Popen(command, shell = True, stdout = subprocess.PIPE, stderr = subprocess.STDOUT, bufsize = 1) as sp:
            for line in sp.stdout:
                line = line.decode("utf-8", errors = "replace")
                if "undefined reference" in line:
                    raise RuntimeError("Failed compiling llama.cpp")
                # print(line, flush = True, end = "")
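The core pattern above - stream the build's merged output line by line and fail fast when a linker error marker appears - can be sketched in isolation. The `run_and_scan` helper and the `echo` commands below are illustrative stand-ins, not part of the original script:

```python
import subprocess

def run_and_scan(command, failure_marker="undefined reference"):
    """Run a shell command, stream its merged stdout/stderr line by
    line, and raise as soon as the failure marker shows up."""
    with subprocess.Popen(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    ) as sp:
        for raw in sp.stdout:
            line = raw.decode("utf-8", errors="replace")
            if failure_marker in line:
                raise RuntimeError(f"Failed: {command}")
    # The context manager waits for the process, so returncode is set here.
    return sp.returncode

# Harmless stand-ins for the make invocation:
run_and_scan("echo build ok")                       # completes normally
# run_and_scan("echo undefined reference to main")  # would raise RuntimeError
```

Scanning the stream instead of waiting for the exit code surfaces the first `undefined reference` immediately, which matters when a parallel `make -j` keeps running long after the first link failure.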
Running this Python script reproduces the error on my machine:
import os
import subprocess
from peft import PeftModelForCausalLM

os.chdir("/tmp/")
commands = [
    "git clone --recursive https://github.com/ggerganov/llama.cpp",
    "make clean -C llama.cpp",
    "make all -j4 -C llama.cpp",
    "echo $?",
]
for command in commands:
    with subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, bufsize=1) as sp:
        for line in sp.stdout:
            line = line.decode("utf-8", errors="replace")
            print(line, end="")
            if "undefined reference" in line:
                raise RuntimeError("Failed compiling llama.cpp")
    print(f"-------------- finished: {command} --------------")
print("done")
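Since the build only breaks after the import, one way to narrow down this kind of import-time side effect is to snapshot `os.environ` around the import and diff it. The `import_env_diff` helper below is an illustrative diagnostic, shown with `json` as a harmless stand-in for `peft`:

```python
import importlib
import os

def import_env_diff(module_name):
    """Import a module and report which environment variables it
    added or changed as a side effect of the import."""
    before = dict(os.environ)
    importlib.import_module(module_name)
    after = dict(os.environ)
    return {k: v for k, v in after.items() if before.get(k) != v}

# Importing json has no environment side effects, so the diff is empty.
# Running the same check with "peft" would reveal any variables the
# import mutates before the compiler is invoked.
print(import_env_diff("json"))  # prints {}
```

Environment variables inherited by the `make` subprocess are a prime suspect here, since the compile fails in a child process that shares the Python process's environment.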
Commenting out these lines seems to fix it for me:
peft/src/peft/tuners/boft/layer.py
Lines 34 to 35 in ae1ae20