replicate/replicate-python

Model setup failed

tzktz opened this issue · 6 comments

I have run the model successfully with cog predict locally, but when I push the model to Replicate it fails during model setup.
What is the reason for that? Can you explain? Thanks in advance!
@zeke @mattt

Hi @tzktz. Can you share any information to help us understand what broke?

If you navigate to the model version you pushed on replicate.com, you should see a "Setup Logs" tab. If you share the logs of a failed setup run, that'd give us something to go off of.

Screenshot 2024-02-20 at 06 25 36


replicate/cog#1541

Hi @mattt.

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/cog/server/worker.py", line 185, in _setup
    run_setup(self._predictor)
  File "/usr/local/lib/python3.10/site-packages/cog/predictor.py", line 70, in run_setup
    predictor.setup()
  File "/src/predict.py", line 67, in setup
    self.face_enhancer = gfpgan.GFPGANer(model_path='weights/GFPGANv1.4.pth', upscale=1)
  File "/usr/local/lib/python3.10/site-packages/gfpgan/utils.py", line 79, in __init__
    self.face_helper = FaceRestoreHelper(
  File "/usr/local/lib/python3.10/site-packages/facexlib/utils/face_restoration_helper.py", line 99, in __init__
    self.face_det = init_detection_model(det_model, half=False, device=self.device, model_rootpath=model_rootpath)
  File "/usr/local/lib/python3.10/site-packages/facexlib/detection/__init__.py", line 22, in init_detection_model
    load_net = torch.load(model_path, map_location=lambda storage, loc: storage)
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 815, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.10/site-packages/torch/serialization.py", line 1033, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
EOFError: Ran out of input
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/cog/server/runner.py", line 317, in setup
    for event in worker.setup():
  File "/usr/local/lib/python3.10/site-packages/cog/server/worker.py", line 126, in _wait
    raise FatalWorkerException(raise_on_error + ": " + done.error_detail)
cog.server.exceptions.FatalWorkerException: Predictor errored during setup: Ran out of input

That's the error I got from the setup logs.

@tzktz It looks like the model is attempting to unpickle a file and getting EOFError: Ran out of input. This can happen if the file is empty or missing.
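One way to surface that failure mode with a clearer message is to check the weights file before anything tries to unpickle it. This is a minimal sketch, not part of cog or gfpgan; the `check_weights` helper and the call site in `setup()` are assumptions for illustration:

```python
import os

def check_weights(path: str) -> int:
    """Fail fast if a weights file is missing or empty; return its size in bytes.

    An empty file is exactly what produces `EOFError: Ran out of input`
    when torch.load later tries to unpickle it.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(f"weights file not found: {path}")
    size = os.path.getsize(path)
    if size == 0:
        raise EOFError(f"weights file is empty (truncated download?): {path}")
    return size

# In setup(), you could call this before constructing the model, e.g.:
#     check_weights("weights/GFPGANv1.4.pth")
```

Calling this at the top of `setup()` turns a cryptic unpickling error into a message naming the exact file that is missing or truncated.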

@tzktz Downloading files from the internet at setup time can fail for a number of reasons. The error you shared is consistent with one of those files failing to download. Why not cache those in the image at build time?
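One way to do that is a small download script run once at build time (for example from a build step in cog.yaml), so the image ships with the weights already on disk. This is a sketch under assumptions: the script name, the `WEIGHTS` mapping, and the URL are all placeholders, not anything from the actual model:

```python
# download_weights.py -- intended to run once at image build time so setup()
# never touches the network. The URL below is a placeholder; point it at
# wherever your weights are actually hosted.
import os
import urllib.request

WEIGHTS = {
    "weights/GFPGANv1.4.pth": "https://example.com/GFPGANv1.4.pth",  # placeholder
}

def fetch(dest: str, url: str) -> None:
    """Download `url` to `dest`, skipping files already cached, and reject empty results."""
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    if os.path.isfile(dest) and os.path.getsize(dest) > 0:
        return  # already cached in the image layer
    urllib.request.urlretrieve(url, dest)
    if os.path.getsize(dest) == 0:
        raise EOFError(f"downloaded file is empty: {dest}")

if __name__ == "__main__":
    for dest, url in WEIGHTS.items():
        fetch(dest, url)
```

With the weights baked into the image, setup becomes deterministic: a bad download fails the build (where you can see and retry it) instead of failing every cold boot.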

@tzktz Sorry, I can't help you with that. You can try asking on Discord if you aren't able to get it working.

Since this is the repo for the Python client, and this is a problem specifically with a model, I'm going to go ahead and close this issue. Let me know if you hit any other snags with the client library 😄