ndif-team/nnsight

Trouble loading GPTBigCode models

Closed this issue · 3 comments

I'm happy to try to debug this, but in case the error is obvious to an nnsight hacker, here is the error I'm getting.

This is the model:

https://huggingface.co/bigcode/gpt_bigcode-santacoder

This is the code that raises the error below. I am able to load Pythia as shown in the tutorial (a sketch of that working load follows the failing snippet).

from nnsight import LanguageModel

model = LanguageModel("bigcode/gpt_bigcode-santacoder", device_map='cuda:0')
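
For comparison, a minimal sketch of the Pythia load that works in the same environment (the exact model id here is an assumption based on the tutorial):

from nnsight import LanguageModel

# Loads fine in the same environment; the model id is assumed from the
# nnsight tutorial, and any Pythia checkpoint should behave the same.
model = LanguageModel("EleutherAI/pythia-70m", device_map='cuda:0')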

Error (from the gpt_bigcode load above):

/work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/transformers/utils/hub.py:123: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
  warnings.warn(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/transformers/utils/import_utils.py:1382, in _LazyModule._get_module(self, module_name)
   1381 try:
-> 1382     return importlib.import_module("." + module_name, self.__name__)
   1383 except Exception as e:

File ~/miniconda3/lib/python3.11/importlib/__init__.py:126, in import_module(name, package)
    125         level += 1
--> 126 return _bootstrap._gcd_import(name[level:], package, level)

File <frozen importlib._bootstrap>:1204, in _gcd_import(name, package, level)

File <frozen importlib._bootstrap>:1176, in _find_and_load(name, import_)

File <frozen importlib._bootstrap>:1147, in _find_and_load_unlocked(name, import_)

File <frozen importlib._bootstrap>:690, in _load_unlocked(spec)

File <frozen importlib._bootstrap_external>:940, in exec_module(self, module)

File <frozen importlib._bootstrap>:241, in _call_with_frames_removed(f, *args, **kwds)

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/transformers/models/gpt_bigcode/modeling_gpt_bigcode.py:64
     60 # Fused kernels
     61 # Use separate functions for each case because conditionals prevent kernel fusion.
     62 # TODO: Could have better fused kernels depending on scaling, dropout and head mask.
     63 #  Is it doable without writing 32 functions?
---> 64 @torch.jit.script
     65 def upcast_masked_softmax(
     66     x: torch.Tensor, mask: torch.Tensor, mask_value: torch.Tensor, scale: float, softmax_dtype: torch.dtype
     67 ):
     68     input_dtype = x.dtype

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/torch/jit/_script.py:1381, in script(obj, optimize, _frames_up, _rcb, example_inputs)
   1380     _rcb = _jit_internal.createResolutionCallbackFromClosure(obj)
-> 1381 fn = torch._C._jit_script_compile(
   1382     qualified_name, ast, _rcb, get_default_args(obj)
   1383 )
   1384 # Forward docstrings

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/torch/jit/_recursive.py:1010, in try_compile_fn(fn, loc)
   1009 rcb = _jit_internal.createResolutionCallbackFromClosure(fn)
-> 1010 return torch.jit.script(fn, _rcb=rcb)

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/torch/jit/_script.py:1378, in script(obj, optimize, _frames_up, _rcb, example_inputs)
   1377     return maybe_already_compiled_fn
-> 1378 ast = get_jit_def(obj, obj.__name__)
   1379 if _rcb is None:

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/torch/jit/frontend.py:331, in get_jit_def(fn, def_name, self_name, is_classmethod)
    317 """
    318 Build a JIT AST (TreeView) from the given function.
    319 
   (...)
    329     self_name: If this function is a method, what the type name of `self` is.
    330 """
--> 331 parsed_def = parse_def(fn) if not isinstance(fn, _ParsedDef) else fn
    332 type_line = torch.jit.annotations.get_type_line(parsed_def.source)

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/torch/_sources.py:120, in parse_def(fn)
    119 def parse_def(fn):
--> 120     sourcelines, file_lineno, filename = get_source_lines_and_file(
    121         fn, ErrorReport.call_stack()
    122     )
    123     sourcelines = normalize_source_lines(sourcelines)

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/torch/_sources.py:23, in get_source_lines_and_file(obj, error_msg)
     22     filename = inspect.getsourcefile(obj)
---> 23     sourcelines, file_lineno = inspect.getsourcelines(obj)
     24 except OSError as e:

File ~/miniconda3/lib/python3.11/inspect.py:1244, in getsourcelines(object)
   1243 object = unwrap(object)
-> 1244 lines, lnum = findsource(object)
   1246 if istraceback(object):

File ~/miniconda3/lib/python3.11/inspect.py:1063, in findsource(object)
   1056 """Return the entire source file and starting line number for an object.
   1057 
   1058 The argument may be a module, class, method, function, traceback, frame,
   1059 or code object.  The source code is returned as a list of all the lines
   1060 in the file and the line number indexes a line in that list.  An OSError
   1061 is raised if the source code cannot be retrieved."""
-> 1063 file = getsourcefile(object)
   1064 if file:
   1065     # Invalidate cache if needed.

File ~/miniconda3/lib/python3.11/inspect.py:940, in getsourcefile(object)
    937 """Return the filename that can be used to locate an object's source.
    938 Return None if no way can be identified to get the source.
    939 """
--> 940 filename = getfile(object)
    941 all_bytecode_suffixes = importlib.machinery.DEBUG_BYTECODE_SUFFIXES[:]

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/torch/package/package_importer.py:696, in _patched_getfile(object)
    695         return _package_imported_modules[object.__module__].__file__
--> 696 return _orig_getfile(object)

File ~/miniconda3/lib/python3.11/inspect.py:920, in getfile(object)
    919     return object.co_filename
--> 920 raise TypeError('module, class, method, function, traceback, frame, or '
    921                 'code object was expected, got {}'.format(
    922                 type(object).__name__))

TypeError: module, class, method, function, traceback, frame, or code object was expected, got builtin_function_or_method

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
Cell In[1], line 3
      1 from nnsight import LanguageModel
----> 3 model = LanguageModel("bigcode/gpt_bigcode-santacoder", device_map='cuda:0')

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/nnsight/models/LanguageModel.py:46, in LanguageModel.__init__(self, tokenizer, automodel, *args, **kwargs)
     43 self.local_model: PreTrainedModel = None
     44 self.automodel = automodel if not isinstance(automodel, str) else getattr(modeling_auto, automodel)
---> 46 super().__init__(*args, **kwargs)

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/nnsight/models/AbstractModel.py:104, in AbstractModel.__init__(self, repoid_path_model, dispatch, alter, *args, **kwargs)
     99                 self.meta_model: Module = Module.wrap(
    100                     copy.deepcopy(self.local_model).to("meta")
    101                 )
    102         else:
    103             self.meta_model: Module = Module.wrap(
--> 104                 self._load_meta(self.repoid_path_clsname, *args, **kwargs).to(
    105                     "meta"
    106                 )
    107             )
    109 # Wrap all modules in our Module class.
    110 for name, module in self.meta_model.named_children():

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/nnsight/models/LanguageModel.py:59, in LanguageModel._load_meta(self, repoid_or_path, *args, **kwargs)
     54 self.tokenizer = AutoTokenizer.from_pretrained(
     55     repoid_or_path, config=self.config, padding_side="left"
     56 )
     57 self.tokenizer.pad_token = self.tokenizer.eos_token
---> 59 return self.automodel.from_config(self.config, trust_remote_code=True)

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py:440, in _BaseAutoModelClass.from_config(cls, config, **kwargs)
    438     return model_class._from_config(config, **kwargs)
    439 elif type(config) in cls._model_mapping.keys():
--> 440     model_class = _get_model_class(config, cls._model_mapping)
    441     return model_class._from_config(config, **kwargs)
    443 raise ValueError(
    444     f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
    445     f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
    446 )

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py:387, in _get_model_class(config, model_mapping)
    386 def _get_model_class(config, model_mapping):
--> 387     supported_models = model_mapping[type(config)]
    388     if not isinstance(supported_models, (list, tuple)):
    389         return supported_models

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py:740, in _LazyAutoMapping.__getitem__(self, key)
    738 if model_type in self._model_mapping:
    739     model_name = self._model_mapping[model_type]
--> 740     return self._load_attr_from_module(model_type, model_name)
    742 # Maybe there was several model types associated with this config.
    743 model_types = [k for k, v in self._config_mapping.items() if v == key.__name__]

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py:754, in _LazyAutoMapping._load_attr_from_module(self, model_type, attr)
    752 if module_name not in self._modules:
    753     self._modules[module_name] = importlib.import_module(f".{module_name}", "transformers.models")
--> 754 return getattribute_from_module(self._modules[module_name], attr)

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py:698, in getattribute_from_module(module, attr)
    696 if isinstance(attr, tuple):
    697     return tuple(getattribute_from_module(module, a) for a in attr)
--> 698 if hasattr(module, attr):
    699     return getattr(module, attr)
    700 # Some of the mappings have entries model_type -> object of another model type. In that case we try to grab the
    701 # object at the top level.

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/transformers/utils/import_utils.py:1372, in _LazyModule.__getattr__(self, name)
   1370     value = self._get_module(name)
   1371 elif name in self._class_to_module.keys():
-> 1372     module = self._get_module(self._class_to_module[name])
   1373     value = getattr(module, name)
   1374 else:

File /work/arjunguha-research-group/arjun/venvs/jan2024/lib/python3.11/site-packages/transformers/utils/import_utils.py:1384, in _LazyModule._get_module(self, module_name)
   1382     return importlib.import_module("." + module_name, self.__name__)
   1383 except Exception as e:
-> 1384     raise RuntimeError(
   1385         f"Failed to import {self.__name__}.{module_name} because of the following error (look up to see its"
   1386         f" traceback):\n{e}"
   1387     ) from e

RuntimeError: Failed to import transformers.models.gpt_bigcode.modeling_gpt_bigcode because of the following error (look up to see its traceback):
module, class, method, function, traceback, frame, or code object was expected, got builtin_function_or_method
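
One way to narrow this down (a diagnostic sketch, not something already tried above): the failure happens while transformers lazily imports the GPTBigCode modeling module, at the module-level @torch.jit.script decorator. Importing that module in a fresh interpreter, without nnsight, would show whether the module is broken on its own or only under nnsight:

# Diagnostic sketch: run in a fresh interpreter with nnsight NOT imported.
# This triggers the same module-level @torch.jit.script compilation that
# fails in the traceback above.
from transformers.models.gpt_bigcode import modeling_gpt_bigcode
print("modeling_gpt_bigcode imported OK")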

@arjunguha What version of transformers are you using? I see this in the model card:

Model Summary
This is the same model as SantaCoder but it can be loaded with transformers >=4.28.1 to use the GPTBigCode architecture. We refer the reader to the SantaCoder model page for full documentation about this model.

main: Uses the gpt_bigcode model. Requires the bigcode fork of transformers.
main_custom: Packaged with its modeling code. Requires transformers>=4.27. Alternatively, it can run on older versions by setting the configuration parameter activation_function = "gelu_pytorch_tanh".
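
You can check the installed version with a one-liner (plain transformers, nothing nnsight-specific):

import transformers
print(transformers.__version__)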

I'm using transformers 4.36.2, so the main branch and not the main_custom version that packages its own modeling code.

SantaCoder seems to work in the current release of nnsight.

from nnsight import LanguageModel

model = LanguageModel("bigcode/gpt_bigcode-santacoder", device_map='cuda:0')

with model.trace("# the following python function computes the sqrt"):
    test = model.transformer.h[0].attn.output.save()
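
After the with block exits, the saved proxy holds the actual activation. A minimal sketch of reading it back, assuming this nnsight release exposes saved values via .value and that the attention module returns the usual Hugging Face tuple with the attention output first:

# Sketch: assumes saved proxies expose .value in this nnsight release.
# GPTBigCode's attention forward returns a tuple; hidden states come first.
attn_out = test.value
print(attn_out[0].shape)  # (batch, seq_len, hidden_size)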

Closing this issue.