huggingface/peft

CUDA kernels from PEFT v0.11.0 break C++ compilation

BenjaminBossan opened this issue · 4 comments

System Info

Who can help?

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

As reported to us by @danielhanchen

the new PEFT 0.11.0 release is breaking llama.cpp / C++ compilation. Simply importing PEFT breaks C++ compilation; presumably it's related to some scripting.
Repro: PEFT 0.10.0 works: https://colab.research.google.com/drive/1vQ4_wUazxvf39wEeN6fxP58xHVaT3Mj8?usp=sharing
PEFT 0.11.0 fails, causing gcc to break after importing peft: https://colab.research.google.com/drive/1-NHOoRLISEyisuQqFgUR5L714Fe9sLij?usp=sharing

Ping @yfeng95 @Zeju1997 @YuliangXiu

Expected behavior

We may have to remove the kernels in a patch release if there is no quick solution.

I made a fork that comments out BOFT for now: https://github.com/danielhanchen/peft

And a repro which worked after commenting it out: https://colab.research.google.com/drive/1Y_MdJnS73hIlR_t2DXgXCgqKVwXHPE82?usp=sharing

I manually added the helper below and tried isolating the problem:

def install_llama_cpp_blocking(use_cuda=True):
    import subprocess
    import os
    import psutil
    # https://github.com/ggerganov/llama.cpp/issues/7062
    # Weirdly GPU conversion for GGUF breaks??
    # use_cuda = "LLAMA_CUDA=1" if use_cuda else ""

    commands = [
        "git clone --recursive https://github.com/ggerganov/llama.cpp",
        "make clean -C llama.cpp",
        # https://github.com/ggerganov/llama.cpp/issues/7062
        # Weirdly GPU conversion for GGUF breaks??
        # f"{use_cuda} make all -j{psutil.cpu_count()*2} -C llama.cpp",
        f"make all -j{psutil.cpu_count()*2} -C llama.cpp",
        "pip install gguf protobuf",
    ]
    # if os.path.exists("llama.cpp"): return

    for command in commands:
        with subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, bufsize=1) as sp:
            for line in sp.stdout:
                line = line.decode("utf-8", errors="replace")
                if "undefined reference" in line:
                    raise RuntimeError("Failed compiling llama.cpp")
                # print(line, flush=True, end="")
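As a side note, scanning stdout for "undefined reference" misses failures that print other messages. A sketch of a stricter variant (my own helper, not part of the repro) that also checks the process exit code:

```python
import subprocess
import sys


def run_checked(command: str) -> str:
    """Run a shell command; raise if it exits non-zero or prints a linker error."""
    result = subprocess.run(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,  # decode output as text instead of bytes
    )
    if result.returncode != 0 or "undefined reference" in result.stdout:
        raise RuntimeError(f"Command failed (exit {result.returncode}): {command}")
    return result.stdout


# Usage (hypothetical): run_checked("make all -C llama.cpp")
```

This catches compile errors that abort `make` without ever reaching the link step, which a pure stdout scan would silently pass over.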

Running this Python script reproduces the error on my machine:

import os
import subprocess
from peft import PeftModelForCausalLM

os.chdir("/tmp/")

commands = [
    "git clone --recursive https://github.com/ggerganov/llama.cpp",
    "make clean -C llama.cpp",
    "make all -j4 -C llama.cpp",
]

for command in commands:
    with subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, bufsize=1) as sp:
        for line in sp.stdout:
            line = line.decode("utf-8", errors="replace")
            print(line, end="")
            if "undefined reference" in line:
                raise RuntimeError("Failed compiling llama.cpp")
    # Popen's context manager waits for the process, so returncode is set here.
    # (A separate "echo $?" command would run in a fresh shell and always print 0.)
    print(f"-------------- finished: {command} (exit code {sp.returncode}) --------------")
print("done")

Commenting out these lines seems to fix it for me:

os.environ["CC"] = "gcc"
os.environ["CXX"] = "gcc"