unslothai/unsloth

Conda installation detailed instructions

NasonZ opened this issue · 30 comments

NasonZ commented

I'm trying to follow the instructions for installing unsloth in a conda environment, but conda gets stuck when running the install command.

I've tried running it twice. Both times it got stuck solving the environment, and I stopped it after 30 minutes.

$ conda install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y
Collecting package metadata (current_repodata.json): \ WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.7.1.*, but conda is ignoring the .* and treating it as 1.7.1
done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): - WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.8.0.*, but conda is ignoring the .* and treating it as 1.8.0
WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.9.0.*, but conda is ignoring the .* and treating it as 1.9.0
WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.6.0.*, but conda is ignoring the .* and treating it as 1.6.0
done
Solving environment: | 

Additional system info:

$ nvidia-smi
Mon Jan  8 20:28:55 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A10G                    Off | 00000000:00:1E.0 Off |                    0 |
|  0%   28C    P8              16W / 300W |      4MiB / 23028MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

Do you have mamba?

Maybe try mamba install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y

Mamba can help when conda takes too long to solve the environment.

NasonZ commented

No, I have a miniconda/anaconda which was installed via oobabooga.

(base) ubuntu@awsec2:~$ conda activate model_train_env
(model_train_env) ubuntu@awsec2:~$ mamba install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y
Command 'mamba' not found, did you mean:
  command 'samba' from deb samba (2:4.15.13+dfsg-0ubuntu1.5)
Try: sudo apt install <deb name>

@NasonZ Hmm, another approach is to install the packages one by one and leave out pytorch:

conda install cudatoolkit xformers bitsandbytes -c nvidia -c xformers -c conda-forge

NasonZ commented

TLDR:

These are the steps I took to get my unsloth conda env working

$ conda create --name <your_unsloth_env> python=<3.10/3.9>

$ conda install pytorch torchvision torchaudio pytorch-cuda=<12.1/11.8> -c pytorch -c nvidia

$ conda install xformers -c xformers -y

$ pip install bitsandbytes

$ pip install "unsloth[conda] @ git+https://github.com/unslothai/unsloth.git"

So I tried installing one by one, which raised a few issues that I was able to work around.

  1. xformers needs Python 3.9 or 3.10 (I had 3.11, as the readme.md didn't specify which Python version was needed).
(model_train_env) ubuntu@awsec2:~/dmyzer/dmyzer-data-generator$ conda install xformers -c xformers -y        
Collecting package metadata (current_repodata.json): done                                                             
Solving environment: failed with initial frozen solve. Retrying with flexible solve.                                  
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.           
Collecting package metadata (repodata.json): done                                                                     
Solving environment: failed with initial frozen solve. Retrying with flexible solve.                                  
Solving environment: -                                                                                                
Found conflicts! Looking for incompatible packages.                                                                   
This can take several minutes.  Press CTRL-C to abort.                                                                
failed                                                                                                                
                                                                                                                      
UnsatisfiableError: The following specifications were found                                                           
to be incompatible with the existing python installation in your environment:                                         
                      
Specifications:

  - xformers -> python[version='>=3.10,<3.11.0a0|>=3.9,<3.10.0a0']

Your python: python=3.11

If python is on the left-most side of the chain, that's the version you've asked for.
When python appears to the right, that indicates that the thing on the left is somehow
not available for the python version you are constrained to. Note that conda will not
change your python version to a different minor version unless you explicitly specify
that.

The following specifications were found to be incompatible with your system:

  - feature:/linux-64::__cuda==12.2=0
  - feature:/linux-64::__glibc==2.35=0
  - feature:|@/linux-64::__glibc==2.35=0
  - python=3.11 -> libgcc-ng[version='>=11.2.0'] -> __glibc[version='>=2.17']
  - xformers -> pytorch=2.0.1 -> __cuda[version='>=11.8']

Your installed version is: 2.35

  2. Installing cudatoolkit separately led to issues when installing pytorch afterwards. cudatoolkit is installed by pytorch-cuda, so specifying it separately was redundant in my case.

  3. Installing bitsandbytes via conda install bitsandbytes -c conda-forge -y led to the same frozen solve issue outlined originally. Installing via conda install conda-forge::bitsandbytes also didn't work: bitsandbytes threw a load of errors when running from unsloth import FastLanguageModel. I eventually got it running with the method mentioned in the bitsandbytes repo: pip install bitsandbytes.

I verified that my environment was working by running the TinyLlama notebook.
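
The Python constraint reported by the solver in this comment (xformers -> python >=3.9,<3.11) can be checked before installing anything. What follows is a minimal sketch, assuming the 3.9/3.10 range from this thread still applies to the xformers build being installed; the function name is made up for illustration and is not part of unsloth or xformers.

```python
import sys

# Bounds taken from the UnsatisfiableError quoted in this thread. Newer
# xformers builds may support other versions, so treat them as an assumption.
MIN_PY = (3, 9)
MAX_PY_EXCLUSIVE = (3, 11)


def python_ok_for_xformers(version=None):
    """Return True if (major, minor) falls inside the range from the thread."""
    if version is None:
        version = sys.version_info
    major_minor = (version[0], version[1])
    return MIN_PY <= major_minor < MAX_PY_EXCLUSIVE


if __name__ == "__main__":
    # Report the interpreter of the currently active environment.
    print(sys.version.split()[0], python_ok_for_xformers())
```

Running it inside the activated conda env shows whether that env would hit the same conflict as the Python 3.11 environment above.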

Oh my! Thanks so so much for the detailed instructions - I'll be pinning this if you don't mind :) Glad it finally worked!!

NasonZ commented

No worries, happy to help others get on board with what looks to be a really useful package :)

Hi there, I'm still getting the following error when running (unsloth) (base) ubuntu@ip-172-31-34-94:~$ mamba install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y

Could not solve for environment specs
The following packages are incompatible
└─ xformers is installable with the potential options
   ├─ xformers [0.0.16|0.0.17|...|0.0.24] would require
   │  └─ python >=3.10,<3.11.0a0 , which can be installed;
   ├─ xformers [0.0.16|0.0.17|...|0.0.24] would require
   │  └─ python >=3.9,<3.10.0a0 , which can be installed;
   └─ xformers [0.0.16|0.0.20|0.0.21] conflicts with any installable versions previously reported.

I checked that I have CUDA 12.1 installed.

I ran into this error:

Exception has occurred: RuntimeError

        CUDA Setup failed despite GPU being available. Please run the following command to get more information:

        python -m bitsandbytes

And used this combination of the approaches listed above to get things working:

conda create --name unsloth_env python=3.10
conda activate unsloth_env
mamba install xformers pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y
pip install bitsandbytes
pip install "unsloth[conda] @ git+https://github.com/unslothai/unsloth.git"

Just a heads up to anyone who is trying to install the package in a miniconda env and getting errors in the xformers installation because of conflicts: conda now installs pytorch==2.2.1, which is not compatible with xformers. You need to pin the pytorch version to 2.2.0 for the installation to work properly.

This is what I used:

conda create --name unsloth_env python=3.10
conda activate unsloth_env
conda install pytorch==2.2.0 cudatoolkit torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
conda install xformers -c xformers
pip install bitsandbytes
pip install "unsloth[conda] @ git+https://github.com/unslothai/unsloth.git"
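
To confirm that the pin took effect, the installed torch version can be compared against 2.2.0. This is only a sketch: it assumes the install exposes package metadata under the name torch, and the helper names are hypothetical.

```python
from importlib import metadata


def base_version(version):
    """Drop a local build suffix such as '+cu121' from a version string."""
    return version.split("+", 1)[0]


def is_pinned(installed, wanted="2.2.0"):
    """Return True if the installed version equals the pin from the thread."""
    return base_version(installed) == wanted


def installed_torch_version():
    """Return the installed torch version string, or None if it is absent."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_torch_version()
    if found is None:
        print("torch is not installed in this environment")
    else:
        print(found, "matches pin" if is_pinned(found) else "does not match pin")
```

If the check reports 2.2.1, the solver ignored or overrode the pin, and the xformers conflict described above is the likely result.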

@felipepenhorate Thanks for the quick fix, just encountered this issue with conda.

@felipepenhorate Yes thanks so much! I'll update the readme!!

Also, because of the triton package requirements, this only works on Linux systems (without compiling your own triton workaround 😬). You can train on Linux and then deploy on other systems using regular Hugging Face workflows. Thanks @danielhanchen!
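
A quick way to see whether the current interpreter is on a platform where this applies is to look at sys.platform, which starts with "linux" on native Linux and inside WSL. A small sketch with a hypothetical helper name:

```python
import sys


def triton_platform_ok(platform_name=None):
    """Return True on Linux (which includes WSL), per the note in this thread."""
    if platform_name is None:
        platform_name = sys.platform
    return platform_name.startswith("linux")


if __name__ == "__main__":
    print(sys.platform, triton_platform_ok())
```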

tmp/tmpmemclhbv/main.c: In function ‘list_to_cuuint64_array’:
/tmp/tmpmemclhbv/main.c:354:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
for (Py_ssize_t i = 0; i < len; i++) {
^
/tmp/tmpmemclhbv/main.c:354:3: note: use option -std=c99 or -std=gnu99 to compile your code
/tmp/tmpmemclhbv/main.c: In function ‘list_to_cuuint32_array’:
/tmp/tmpmemclhbv/main.c:365:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
for (Py_ssize_t i = 0; i < len; i++) {

subprocess.CalledProcessError: Command '['/usr/bin/gcc', '/tmp/tmporgwe35u/main.c', '-O3', '-I/miniconda3/envs/LLM/lib/python3.10/site-packages/triton/common/../third_party/cuda/include', '-I/miniconda3/envs/LLM/include/python3.10', '-I/tmp/tmporgwe35u', '-shared', '-fPIC', '-lcuda', '-o', '/tmp/tmporgwe35u/cuda_utils.cpython-310-x86_64-linux-gnu.so', '-L/lib64', '-L/lib', '-L/lib64', '-L/lib']' returned non-zero exit status 1.
I'm getting this error after trying every type of unsloth env setup, and I'm stuck on it.

Oh maybe outdated gcc?

My gcc version is:
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
Is that the actual problem?

@felipepenhorate Ye I think that's wayyy too old!!
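
For context, GCC 5 changed the default C dialect to gnu11, while 4.x releases default to gnu89/gnu90, which rejects a declaration inside a for-loop initializer. That matches the "only allowed in C99 mode" error above. The sketch below parses the first line of gcc --version and checks for major version 5 or later; the threshold is inferred from that default change and is not an official Triton requirement, and the function names are made up.

```python
import re
import subprocess

# GCC 5 is the first release whose default C dialect accepts
# `for (Py_ssize_t i = 0; ...)` without an explicit -std flag.
MIN_GCC_MAJOR = 5


def gcc_major(version_line):
    """Extract the major version from the first line of `gcc --version`."""
    match = re.search(r"\)\s+(\d+)\.", version_line)
    return int(match.group(1)) if match else None


def gcc_new_enough(version_line):
    """Return True if the parsed major version is at least MIN_GCC_MAJOR."""
    major = gcc_major(version_line)
    return major is not None and major >= MIN_GCC_MAJOR


if __name__ == "__main__":
    # Raises FileNotFoundError if gcc is not on PATH.
    out = subprocess.run(["gcc", "--version"], capture_output=True, text=True)
    first_line = out.stdout.splitlines()[0] if out.stdout else ""
    print(first_line, "->", gcc_new_enough(first_line))
```

With the version string posted above (4.8.5), gcc_new_enough returns False.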

from triton.common.build import libcuda_dirs

ModuleNotFoundError: No module named 'triton.common'

@Lipapaldl Is your Triton version 3.0.0?
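
One way to answer that without triggering the traceback is to test whether the submodule can be located at all. The question above suggests the module layout differs in Triton 3.0.0, but this thread does not confirm it, so the sketch only reports what is present in the current environment. The helper name is hypothetical.

```python
import importlib.util


def module_available(dotted_name):
    """Return True if the module can be located (parent packages get imported)."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. `triton` itself) is missing.
        return False


if __name__ == "__main__":
    for name in ("triton", "triton.common"):
        print(name, module_available(name))
```

If triton is found but triton.common is not, the installed Triton does not ship the module that the failing import expects.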

I ran pip install xformers==0.0.24 to keep my torch version, as the latest xformers requires torch==2.3.0, and conda install xformers -c xformers doesn't seem to work anymore.

I'm planning to write a better guide for conda installs in the near future

Oh my god. I'm trying hard to use unsloth locally, but it's a pain. I followed the conda instructions but was forced to downgrade xformers. I've tried the version printed in the error as well as yours, but the conflict that triggered the error is still there. I'm done.

@WasamiKirua
Are you running Windows? If so, you may need to run it in WSL 2 instead; that's what eventually worked for me.

conda create --name unsloth_env python=3.10
conda activate unsloth_env
conda install pytorch==2.2.0 cudatoolkit torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install xformers==0.0.24
pip install bitsandbytes
pip install "unsloth[conda] @ git+https://github.com/unslothai/unsloth.git"

It works for me (Linux)

Apologies for the Conda issues - I know it can be painful sometimes. Another option is to copy and paste our Kaggle install instructions from here: https://www.kaggle.com/danielhanchen/kaggle-gemma2-9b-unsloth-notebook which might work.

On Windows this works only on WSL?

In my experience, yes. There would be unresolvable dependencies otherwise.

Thanks @ArrangingFear56, I'll try it in WSL later. Currently fine-tuning in Colab.

I did make the installation somewhat better in https://github.com/unslothai/unsloth?tab=readme-ov-file#-installation-instructions! Hope this makes things better!

@NasonZ Thanks Thanks Thanks