arcee-ai/mergekit

Merging LoRA adapters: "TypeError: object of type 'NoneType' has no len()"

anika-ilieva opened this issue · 2 comments

Hi, I am trying to merge different LoRA adapters. Specifically, I want to merge only the adapters themselves, which do not contain the base model. Whenever the adapters are used, they are loaded on top of the "unsloth/Phi-3-mini-4k-instruct" base model. I have been experimenting with different .yml configurations, with and without slices:

Example yml using slices
slices:
  - sources:
      - model: HU-Berlin-ML-Internal/opiniongpt-phi3-middle_east
        layer_range: [0, -1]
        weight: 0.3
      - model: HU-Berlin-ML-Internal/opiniongpt-phi3-men
        layer_range: [0, -1]
        weight: 0.25
merge_method: linear
base_model: unsloth/Phi-3-mini-4k-instruct

Example yml without slices
models:
  - model: HU-Berlin-ML-Internal/opiniongpt-phi3-middle_east
    parameters:
      weight: 0.3
      density: 0.9
  - model: HU-Berlin-ML-Internal/opiniongpt-phi3-men
    parameters:
      weight: 0.25
      density: 0.9
merge_method: della_linear
base_model: unsloth/Phi-3-mini-4k-instruct

Both lead to the following error message when I run mergekit-yaml examples/lora_test.yml ./merged --lazy-unpickle --allow-crimes:

Traceback (most recent call last):
  File "/Users/anikailieva/anaconda3/bin/mergekit-yaml", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/anikailieva/anaconda3/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anikailieva/anaconda3/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/anikailieva/anaconda3/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anikailieva/anaconda3/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/anikailieva/Documents/mergekit/mergekit/options.py", line 82, in wrapper
    f(*args, **kwargs)
  File "/Users/anikailieva/Documents/mergekit/mergekit/scripts/run_yaml.py", line 47, in main
    run_merge(
  File "/Users/anikailieva/Documents/mergekit/mergekit/merge.py", line 50, in run_merge
    model_arch_info = [
                      ^
  File "/Users/anikailieva/Documents/mergekit/mergekit/merge.py", line 51, in <listcomp>
    get_architecture_info(m.config(trust_remote_code=options.trust_remote_code))
  File "/Users/anikailieva/Documents/mergekit/mergekit/architecture.py", line 359, in get_architecture_info
    if len(config.architectures) != 1:
       ^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: object of type 'NoneType' has no len()

From this I understand that the way I wrote the YAML file makes mergekit assume that "opiniongpt-phi3-middle_east" and "opiniongpt-phi3-men" are full models whose config declares an architecture. However, I am currently only interested in merging the adapters themselves; I would later load the merged adapter on top of the base model. Could you tell me whether this is possible and, if so, what the YAML file should look like in my case?
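To make my reading of the traceback concrete, here is a minimal sketch that reproduces the same TypeError without mergekit. It is not mergekit's code: `check_architectures` is a hypothetical stand-in for the `len(config.architectures)` check shown in the traceback, the configs are plain `SimpleNamespace` objects, and the architecture name `Phi3ForCausalLM` is my assumption for the base model. It also assumes the adapter-only repos end up with `architectures` set to `None`, which is what the error suggests.

```python
from types import SimpleNamespace


def check_architectures(config):
    """Hypothetical stand-in for the check in the traceback."""
    # Same expression as in the traceback: fails if `architectures` is None.
    if len(config.architectures) != 1:
        raise RuntimeError("expected exactly one architecture in config")
    return config.architectures[0]


# A full model config lists its architecture (name assumed for illustration).
full_model_config = SimpleNamespace(architectures=["Phi3ForCausalLM"])

# Assumed shape of what gets loaded for an adapter-only repo: no architecture.
adapter_only_config = SimpleNamespace(architectures=None)

print(check_architectures(full_model_config))

try:
    check_architectures(adapter_only_config)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

If this reading is right, the failure happens before any merging starts, simply because each entry under `model:` is treated as a full model.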

Thank you so much in advance!