VikParuchuri/surya

async/await load_models() doesn't load the models correctly


https://github.com/VikParuchuri/surya/blob/8cd024dd9411cd6ad4b0e0bc875eec32f84d387d/surya/model/recognition/decoder.py#L99C45-L99C56

  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
     return self._call_impl(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1562, in _call_impl
     return forward_call(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/usr/local/lib/python3.11/dist-packages/surya/model/recognition/decoder.py", line 308, in forward
     hidden_states = self.moe(hidden_states, langs)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
     return self._call_impl(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1562, in _call_impl
     return forward_call(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/usr/local/lib/python3.11/dist-packages/surya/model/recognition/decoder.py", line 99, in forward
     expert_layer = self.experts[str(expert_lang.item())]
                    ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/container.py", line 463, in __getitem__
     return self._modules[key]
            ~~~~~~~~~~~~~^^^^^
 KeyError: '65555'

I don't know if the cause is the same, but I was getting an identical error. For me it resulted from calling load_recognition_model() with langs = ["en"].

The problem is that, while two-letter language codes are what the langs parameter expects in most places (that I've noticed), load_recognition_model() requires the lang_tokens output of _, lang_tokens = _tokenize("", get_unique_langs(lang_list)) (see, e.g., lines 54-55 in ocr_text.py).
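Roughly, the difference looks like the sketch below. The import paths are my guesses for the version of surya I had installed, so treat them as assumptions rather than the actual API surface and adjust them to your install.

# Workaround sketch -- import paths are assumptions, adjust to your install.
from surya.model.recognition.model import load_model as load_recognition_model
from surya.model.recognition.tokenizer import _tokenize
from surya.input.langs import get_unique_langs

lang_list = ["en"]

# What I was doing (loads, but the decoder later fails with KeyError: '65555'):
# rec_model = load_recognition_model(langs=lang_list)

# What ocr_text.py does: turn the two-letter codes into language token ids first,
# so the MoE expert keys match what the decoder looks up at inference time.
_, lang_tokens = _tokenize("", get_unique_langs(lang_list))
rec_model = load_recognition_model(langs=lang_tokens)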

When called with lang_list = ["en"], lang_tokens will be 65555, i.e. LANGUAGE_MAP["en"] + TOKEN_OFFSET + TOTAL_TOKENS as defined in model/recognition/config.py.
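For completeness, this is where the 65555 comes from; again, the import path is an assumption based on the file mentioned above.

# Assumed import path for model/recognition/config.py in the surya package.
from surya.model.recognition.config import LANGUAGE_MAP, TOKEN_OFFSET, TOTAL_TOKENS

# The token id the decoder looks up for "en":
token_id = LANGUAGE_MAP["en"] + TOKEN_OFFSET + TOTAL_TOKENS  # 65555 in the version I tested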

This may or may not be the cause of #128 as well.

I was the poster of #128. I ran into this problem with marker version 0.2.13, could not solve it, and left it at that.

Today I hit the same problem again with marker version 0.2.13. After updating to 0.2.17, it worked.