Attempting to pass model params to ExLlama on startup causes an AttributeError
InconsolableCellist opened this issue · 2 comments
Summary
It appears that self.model_config is None in ExLlama's class.py (https://github.com/0cc4m/KoboldAI/blob/exllama/modeling/inference_models/exllama/class.py#L423), yet it is assumed to exist when that code is reached by passing --model_parameters on startup.
Additionally, play.sh mishandles JSON that contains spaces, because the unquoted $* lets the shell re-split the argument on IFS, and spaces are exactly how passing help to --model_parameters tells you to format your JSON.
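Distilled to a standalone sketch (not the project's code), the failure is the usual attribute access on None:

class FakeBackend:
    def __init__(self):
        # Mirrors the ExLlama backend when launched from the CLI: model_config
        # is only ever set by is_valid(), which this code path never calls.
        self.model_config = None

    def set_input_parameters(self, parameters):
        # Same shape as class.py line 423; raises
        # AttributeError: 'NoneType' object has no attribute 'device_map'
        self.model_config.device_map.layers = []

FakeBackend().set_input_parameters({"0_Layers": 35, "1_Layers": 45})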
Steps to reproduce:
- Start KoboldAI with something like the following, which attempts to set model_parameters:
./play.sh --host --model airoboros-l2-70b-gpt4-2.0 --model_backend ExLlama --model_parameters "{'0_Layers':35,'1_Layers':45,'model_ctx':4096}"
Actual Behavior:
Output:
Colab Check: False, TPU: False
INIT | OK | KAI Horde Models
## Warning: this project requires Python 3.9 or higher.
INFO | __main__:<module>:680 - We loaded the following model backends:
Huggingface GPTQ
KoboldAI Old Colab Method
KoboldAI API
Huggingface
Horde
Read Only
OpenAI
ExLlama
Basic Huggingface
GooseAI
INFO | __main__:general_startup:1395 - Running on Repo: https://github.com/0cc4m/KoboldAI.git Branch: exllama
MESSAGE | Welcome to KoboldAI!
MESSAGE | You have selected the following Model: airoboros-l2-70b-gpt4-2.0
ERROR | __main__:<module>:10948 - An error has been caught in function '<module>', process 'MainProcess' (12311), thread 'MainThread' (140613286643520):
Traceback (most recent call last):
> File "aiserver.py", line 10948, in <module>
run()
└ <function run at 0x7fe259d69ca0>
File "aiserver.py", line 10849, in run
command_line_backend = general_startup()
└ <function general_startup at 0x7fe25a321dc0>
File "aiserver.py", line 1634, in general_startup
model_backends[args.model_backend].set_input_parameters(arg_parameters)
│ │ │ └ {'0_Layers': 35, '1_Layers': 45, 'model_ctx': 4096, 'max_ctx': 2048, 'compress_emb': 1, 'ntk_alpha': 1, 'id': 'airoboros-l2-7...
│ │ └ 'ExLlama'
│ └ Namespace(apikey=None, aria2_port=None, cacheonly=False, colab=False, configname=None, cpu=False, customsettings=None, f=None...
└ {'Huggingface GPTQ': <modeling.inference_models.gptq_hf_torch.class.model_backend object at 0x7fe22525b2b0>, 'KoboldAI Old Co...
File "/home/.../KoboldAI-Client-llama/modeling/inference_models/exllama/class.py", line 423, in set_input_parameters
self.model_config.device_map.layers = []
│ └ None
└ <modeling.inference_models.exllama.class.model_backend object at 0x7fe22b4f31c0>
AttributeError: 'NoneType' object has no attribute 'device_map'
Expected Behavior:
The model parameters can be set at startup
Environment:
$ git remote -v
origin https://github.com/0cc4m/KoboldAI.git (fetch)
origin https://github.com/0cc4m/KoboldAI.git (push)
$ git status
On branch exllama
Your branch is up to date with 'origin/exllama'
commit 973aea12ea079e9c5de1e418b848a0407da7eab7 (HEAD -> exllama, origin/exllama)
Author: 0cc4m <picard12@live.de>
Date: Sun Jul 23 22:07:34 2023 +0200
Only import big python modules for GPTQ once they get used
Additionally, the following change should be made in play.sh:
$ git diff play.sh
diff --git a/play.sh b/play.sh
index 8ce7b781..3e88ae28 100755
--- a/play.sh
+++ b/play.sh
@@ -3,4 +3,4 @@ export PYTHONNOUSERSITE=1
if [ ! -f "runtime/envs/koboldai/bin/python" ]; then
./install_requirements.sh cuda
fi
-bin/micromamba run -r runtime -n koboldai python aiserver.py $*
+bin/micromamba run -r runtime -n koboldai python aiserver.py "$@"
This lets you pass the model params as JSON with spaces between the key/value pairs, which is exactly how --model_parameters help instructs you to format them:
$ ./play.sh --host --model airoboros-l2-70b-gpt4-2.0 --model_backend ExLlama --model_parameters help
...
INFO | __main__:general_startup:1395 - Running on Repo: https://github.com/0cc4m/KoboldAI.git Branch: exllama
MESSAGE | Welcome to KoboldAI!
MESSAGE | You have selected the following Model: airoboros-l2-70b-gpt4-2.0
ERROR | __main__:general_startup:1627 - Please pass through the parameters as a json like "{'[ID]': '[Value]', '[ID2]': '[Value]'}" using --model_parameters (required parameters shown below)
ERROR | __main__:general_startup:1628 - Parameters (ID: Default Value (Help Text)): 0_Layers: [None] (The number of layers to put on NVIDIA GeForce RTX 3090.)
1_Layers: [0] (The number of layers to put on NVIDIA GeForce RTX 3090.)
max_ctx: 2048 (The maximum context size the model supports)
compress_emb: 1 (If the model requires compressed embeddings, set them here)
ntk_alpha: 1 (NTK alpha value)
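To illustrate the difference: with the unquoted $*, the shell re-splits the quoted JSON argument on its spaces before Python ever sees it, so argparse only receives the first fragment. A standalone sketch of the effect (the argv lists and the quote-swapping helper are illustrative, not KoboldAI's actual parsing code):

import json

# Roughly what sys.argv ends up containing in each case (made-up values).
argv_with_star = ["--model_parameters", "{'0_Layers':", "35,", "'1_Layers':", "45}"]
argv_with_at = ["--model_parameters", "{'0_Layers': 35, '1_Layers': 45}"]

def read_model_parameters(argv):
    # Toy stand-in for the real argument handling: take the value after the
    # flag and parse the single-quoted "JSON" by swapping quote characters.
    value = argv[argv.index("--model_parameters") + 1]
    return json.loads(value.replace("'", '"'))

print(read_model_parameters(argv_with_at))    # {'0_Layers': 35, '1_Layers': 45}
print(read_model_parameters(argv_with_star))  # raises json.JSONDecodeError: value is truncated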
I've faced the same situation, but I was able to get it working without using the model_backend and model path parameters. Instead, I opened the model manually through the web interface menu, and of course selected the ExLlama backend there.
Nvm, I figured out a way. The issue with self.model_config.device_map.layers is that model_config is None because it was never initialized on this code path. The only place it gets initialized is the is_valid() function in the KoboldAI\modeling\inference_models\exllama\class.py file, and is_valid() is only called when a user opens the model through the web interface menu.
To fix this, I made a small change to the get_requested_parameters() function in the class.py file, adding the following lines at the very beginning:

if not self.model_config:
    self.model_config = ExLlamaConfig(os.path.join(model_path, "config.json"))
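As a standalone sketch of that guard (the import path and the wrapper function are my own framing for illustration; in the actual fix the two guarded lines simply go at the top of get_requested_parameters()):

import os

# Assumed import path; class.py may import ExLlamaConfig differently.
from exllama.model import ExLlamaConfig

def ensure_model_config(backend, model_path):
    # Lazily build the config from the model folder's config.json, mirroring
    # what is_valid() does, so the CLI startup path also gets a model_config.
    if not backend.model_config:
        backend.model_config = ExLlamaConfig(os.path.join(model_path, "config.json"))
    return backend.model_config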
However, it turns out there was one more thing that needed to be changed within the same function. I removed the square brackets from

"default": [layer_count if i == 0 else 0]

and changed it to

"default": layer_count if i == 0 else 0

This is needed because set_input_parameters(), still in class.py, loops over those values with:

for i, l in enumerate(layers):
    if l > 0:

and with the brackets each l would be a one-element list instead of an integer.
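A tiny standalone illustration of why the brackets matter (the layer counts here are made-up values):

# With "default": [layer_count if i == 0 else 0] each entry is a one-element
# list, and comparing a list with an int fails outright:
layers = [[35], [0]]
try:
    for i, l in enumerate(layers):
        if l > 0:
            print(f"GPU {i}: {l} layers")
except TypeError as err:
    print(err)  # '>' not supported between instances of 'list' and 'int'

# With "default": layer_count if i == 0 else 0 the entries are plain integers
# and the loop behaves the way set_input_parameters() expects:
layers = [35, 0]
for i, l in enumerate(layers):
    if l > 0:
        print(f"GPU {i}: {l} layers")  # GPU 0: 35 layers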