jina-ai/jina

`StableLM` example from the homepage doesn't work properly.

codetalker7 opened this issue · 14 comments

I was going through the small example on the homepage of the docs, and it gives me a weird error:

WARNI… gateway@6246 Getting endpoints failed: failed to connect to all           [12/09/23 07:58:35]
       addresses. Waiting for another trial                                                         
WARNI… gateway@6246 Getting endpoints failed: failed to connect to all           [12/09/23 07:59:16]
       addresses. Waiting for another trial                                                         
WARNI… gateway@6246 Getting endpoints failed: failed to connect to all           [12/09/23 08:03:15]
       addresses. Waiting for another trial                                                         
WARNI… gateway@6166 <jina.orchestrate.pods.Pod object at 0x7e03081072e0> timeout [12/09/23 08:08:30]
       after waiting for 600000ms, if your executor takes time to load, you may                     
       increase --timeout-ready                                                                     
WARNI… gateway@6246 Getting endpoints failed: failed to connect to all           [12/09/23 08:11:47]
       addresses. Waiting for another trial                                                         
INFO   gateway@6246 start server bound to 0.0.0.0:12345                          [12/09/23 08:11:48]
Traceback (most recent call last):
  File "/content/deployment.py", line 6, in <module>
    with dep:
  File "/usr/local/lib/python3.10/dist-packages/jina/orchestrate/orchestrator.py", line 14, in __enter__
    return self.start()
  File "/usr/local/lib/python3.10/dist-packages/jina/orchestrate/deployments/__init__.py", line 1157, in start
    self._wait_until_all_ready()
  File "/usr/local/lib/python3.10/dist-packages/jina/orchestrate/deployments/__init__.py", line 1095, in _wait_until_all_ready
    asyncio.get_event_loop().run_until_complete(wait_for_ready_coro)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/dist-packages/jina/orchestrate/deployments/__init__.py", line 1212, in async_wait_start_success
    await asyncio.gather(*coros)
  File "/usr/local/lib/python3.10/dist-packages/jina/orchestrate/pods/__init__.py", line 221, in async_wait_start_success
    self._fail_start_timeout(_timeout)
  File "/usr/local/lib/python3.10/dist-packages/jina/orchestrate/pods/__init__.py", line 140, in _fail_start_timeout
    raise TimeoutError(
TimeoutError: jina.orchestrate.pods.Pod:gateway can not be initialized after 600000.0ms

Just for reference, here's the code for the executor.py and deployment.py scripts:

executor.py:

from jina import Executor, requests
from docarray import DocList, BaseDoc

from transformers import pipeline


class Prompt(BaseDoc):
    text: str


class Generation(BaseDoc):
    prompt: str
    text: str


class StableLM(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.generator = pipeline(
            'text-generation', model='stabilityai/stablelm-base-alpha-3b'
        )

    @requests
    def generate(self, docs: DocList[Prompt], **kwargs) -> DocList[Generation]:
        generations = DocList[Generation]()
        prompts = docs.text
        llm_outputs = self.generator(prompts)
        for prompt, output in zip(prompts, llm_outputs):
            generations.append(Generation(prompt=prompt, text=output))
        return generations

deployment.py:

from jina import Deployment
from executor import StableLM

dep = Deployment(uses=StableLM, timeout_ready=-1, port=12345)

with dep:
    dep.block()

And I'm running the deployment script simply by doing:

python3 deployment.py

Am I missing something or does this example need to be updated?

It seems that your model is being downloaded. Can you run this in a separate script and then try again?

from transformers import pipeline
pipeline(
            'text-generation', model='stabilityai/stablelm-base-alpha-3b'
        )

Hi @JoanFM. I tried running the new script, and it downloaded the model just fine. But the same error still persists.

Here is the output of the download script:

config.json: 100%
708/708 [00:00<00:00, 35.6kB/s]
pytorch_model.bin.index.json: 100%
21.1k/21.1k [00:00<00:00, 1.07MB/s]
Downloading shards: 100%
2/2 [05:30<00:00, 154.68s/it]
pytorch_model-00001-of-00002.bin: 100%
10.2G/10.2G [03:45<00:00, 44.1MB/s]
pytorch_model-00002-of-00002.bin: 100%
4.66G/4.66G [01:44<00:00, 41.3MB/s]

And here is the output of python3 deployment.py, which again leads to the same error:

WARNI… gateway@2969 Getting endpoints failed: failed to connect to all           [12/09/23 09:28:00]
       addresses. Waiting for another trial                                                         
WARNI… gateway@2969 Getting endpoints failed: failed to connect to all           [12/09/23 09:28:39]
       addresses. Waiting for another trial                                                         
WARNI… gateway@2969 Getting endpoints failed: failed to connect to all           [12/09/23 09:33:11]
       addresses. Waiting for another trial                                                         
WARNI… gateway@2889 <jina.orchestrate.pods.Pod object at 0x7e068c21bfa0> timeout [12/09/23 09:37:55]
       after waiting for 600000ms, if your executor takes time to load, you may                     
       increase --timeout-ready                                                                     
WARNI… gateway@2969 Getting endpoints failed: failed to connect to all           [12/09/23 09:41:08]
       addresses. Waiting for another trial                                                         
INFO   gateway@2969 start server bound to 0.0.0.0:12345                          [12/09/23 09:41:09]
Traceback (most recent call last):
  File "/content/deployment.py", line 6, in <module>
    with dep:
  File "/usr/local/lib/python3.10/dist-packages/jina/orchestrate/orchestrator.py", line 14, in __enter__
    return self.start()
  File "/usr/local/lib/python3.10/dist-packages/jina/orchestrate/deployments/__init__.py", line 1157, in start
    self._wait_until_all_ready()
  File "/usr/local/lib/python3.10/dist-packages/jina/orchestrate/deployments/__init__.py", line 1095, in _wait_until_all_ready
    asyncio.get_event_loop().run_until_complete(wait_for_ready_coro)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/dist-packages/jina/orchestrate/deployments/__init__.py", line 1212, in async_wait_start_success
    await asyncio.gather(*coros)
  File "/usr/local/lib/python3.10/dist-packages/jina/orchestrate/pods/__init__.py", line 221, in async_wait_start_success
    self._fail_start_timeout(_timeout)
  File "/usr/local/lib/python3.10/dist-packages/jina/orchestrate/pods/__init__.py", line 140, in _fail_start_timeout
    raise TimeoutError(
TimeoutError: jina.orchestrate.pods.Pod:gateway can not be initialized after 600000.0ms
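For a sense of scale, the two shards in the download output above add up to roughly 14.9 GB on disk. A minimal sketch, assuming a Linux machine with /proc/meminfo and treating the on-disk size as a rough lower bound on the RAM needed to load the model (both are assumptions, not something stated in the thread), compares that figure with the memory currently available. The helper names are hypothetical.

```python
import re

# Shard sizes as printed by the download progress bars above.
# Reading "G" as decimal gigabytes is an assumption.
SHARD_SIZES_GB = [10.2, 4.66]


def parse_mem_available_kib(meminfo_text: str):
    """Return the MemAvailable value (in KiB) from /proc/meminfo text, or None."""
    match = re.search(r"^MemAvailable:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def fits_in_memory(required_gb: float, available_kib: int) -> bool:
    """Rough check: is the available RAM at least as large as the checkpoint?"""
    available_gb = available_kib * 1024 / 1e9
    return available_gb >= required_gb
```

On Linux, passing the contents of /proc/meminfo to parse_mem_available_kib and the result to fits_in_memory(sum(SHARD_SIZES_GB), ...) gives a quick yes/no before starting the Deployment. A "no" only suggests memory pressure; it does not prove that is why the startup timed out.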

What are the Jina and docarray versions that you have installed?

@JoanFM, I just installed jina from pip, so should be the most recent PyPI version. Here's the output of python3 -m pip show jina docarray:

Name: jina
Version: 3.23.1
Summary: Multimodal AI services & pipelines with cloud-native stack: gRPC, Kubernetes, Docker, OpenTelemetry, Prometheus, Jaeger, etc.
Home-page: https://github.com/jina-ai/jina/
Author: Jina AI
Author-email: hello@jina.ai
License: Apache 2.0
Location: /usr/local/lib/python3.10/dist-packages
Requires: aiofiles, aiohttp, docarray, docker, fastapi, filelock, grpcio, grpcio-health-checking, grpcio-reflection, jcloud, jina-hubble-sdk, numpy, opentelemetry-api, opentelemetry-exporter-otlp, opentelemetry-exporter-otlp-proto-grpc, opentelemetry-exporter-prometheus, opentelemetry-instrumentation-aiohttp-client, opentelemetry-instrumentation-fastapi, opentelemetry-instrumentation-grpc, opentelemetry-sdk, packaging, pathspec, prometheus-client, protobuf, pydantic, python-multipart, pyyaml, requests, urllib3, uvicorn, uvloop, websockets
Required-by: 
---
Name: docarray
Version: 0.39.1
Summary: The data structure for multimodal data
Home-page: https://docs.docarray.org/
Author: DocArray
Author-email: 
License: Apache 2.0
Location: /usr/local/lib/python3.10/dist-packages
Requires: numpy, orjson, pydantic, rich, types-requests, typing-inspect
Required-by: jina

Can you run with the JINA_LOG_LEVEL=DEBUG environment variable set?

Hi @JoanFM, sure, here's the output of JINA_LOG_LEVEL=DEBUG python3 -m deployment

DEBUG  executor-replica-set@136297 Waiting for ReplicaSet to start successfully                                                                            [12/10/23 18:15:02]
DEBUG  executor/rep-0@136310 Setting signal handlers                                                                                                       [12/10/23 18:15:02]
DEBUG  executor/rep-0@136310 Signal handlers already set
DEBUG  gateway@136311 Setting signal handlers                                                                                                              [12/10/23 18:15:02]
DEBUG  gateway@136311 Signal handlers already set
DEBUG  gateway@136311 adding connection for deployment executor/heads/0 to grpc://0.0.0.0:63378                                                            [12/10/23 18:15:02]
DEBUG  gateway@136311 create_connection connection for executor to grpc://0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to grpc://0.0.0.0:63378
DEBUG  gateway@136311 connection for deployment executor/heads/0 to grpc://0.0.0.0:63378 added
DEBUG  gateway@136311 Setting up GRPC server
DEBUG  gateway@136311 Get all endpoints from TopologyGraph
DEBUG  gateway@136311 Getting Endpoints data from executor
DEBUG  gateway@136311 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212302.342733487","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212302.342731834…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 1th time.
DEBUG  gateway@136311 resetting connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:15:03]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212303.342263328","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212303.342262457…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 2th time.
DEBUG  gateway@136311 resetting connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:15:05]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212305.177361743","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212305.177360250…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 3th time.
DEBUG  gateway@136311 resetting connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:15:07]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212307.272224650","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212307.272223268…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 4th time.
DEBUG  gateway@136311 gRPC call for executor failed, retries exhausted
DEBUG  gateway@136311 resetting connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to 0.0.0.0:63378
WARNI… gateway@136311 Getting endpoints failed: failed to connect to all addresses. Waiting for another trial
DEBUG  gateway@136311 Getting Endpoints data from executor                                                                                                 [12/10/23 18:15:08]
DEBUG  gateway@134873 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:15:08]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212308.594345522","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212308.594344169…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 2th time.
DEBUG  gateway@134873 resetting connection for executor to 0.0.0.0:58784
DEBUG  gateway@134873 create_connection connection for executor to 0.0.0.0:58784
DEBUG  gateway@136311 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:15:11]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212311.258205136","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212311.258203573…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 1th time.
DEBUG  gateway@136311 resetting connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:15:17]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212317.184591800","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212317.184590177…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 2th time.
DEBUG  gateway@136311 resetting connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:15:26]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212326.060864501","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212326.060863649…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 3th time.
DEBUG  gateway@136311 resetting connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:15:40]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212340.367356535","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212340.367354442…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 4th time.
DEBUG  gateway@136311 gRPC call for executor failed, retries exhausted
DEBUG  gateway@136311 resetting connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to 0.0.0.0:63378
WARNI… gateway@136311 Getting endpoints failed: failed to connect to all addresses. Waiting for another trial
DEBUG  gateway@136311 Getting Endpoints data from executor                                                                                                 [12/10/23 18:15:41]
DEBUG  gateway@136311 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:16:10]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212370.642886827","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212370.642885264…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 1th time.
DEBUG  gateway@136311 resetting connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to 0.0.0.0:63378
DEBUG  gateway@134873 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:16:13]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212373.436257502","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212373.436255969…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 3th time.
DEBUG  gateway@134873 resetting connection for executor to 0.0.0.0:58784
DEBUG  gateway@134873 create_connection connection for executor to 0.0.0.0:58784
DEBUG  gateway@136311 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:16:58]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212418.856391611","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212418.856390118…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 2th time.
DEBUG  gateway@136311 resetting connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to 0.0.0.0:63378
DEBUG  gateway@134873 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:17:42]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212462.414183164","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212462.414181501…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 4th time.
DEBUG  gateway@134873 gRPC call for executor failed, retries exhausted
DEBUG  gateway@134873 resetting connection for executor to 0.0.0.0:58784
DEBUG  gateway@134873 create_connection connection for executor to 0.0.0.0:58784
WARNI… gateway@134873 Getting endpoints failed: failed to connect to all addresses. Waiting for another trial
DEBUG  gateway@134873 cancel get all endpoints                                                                                                             [12/10/23 18:17:43]
DEBUG  gateway@134873 Got all endpoints from TopologyGraph None
INFO   gateway@134873 start server bound to 0.0.0.0:12345
DEBUG  gateway@134873 server bound to 0.0.0.0:12345 started
DEBUG  gateway@134873 GRPC server setup successful
DEBUG  gateway@134873 process terminated
DEBUG  gateway@136311 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:18:15]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212495.708559314","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212495.708557841…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 3th time.
DEBUG  gateway@136311 resetting connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:19:57]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212597.124311780","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212597.124309956…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 4th time.
DEBUG  gateway@136311 gRPC call for executor failed, retries exhausted
DEBUG  gateway@136311 resetting connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to 0.0.0.0:63378
WARNI… gateway@136311 Getting endpoints failed: failed to connect to all addresses. Waiting for another trial
DEBUG  gateway@136311 Getting Endpoints data from executor                                                                                                 [12/10/23 18:19:58]
DEBUG  gateway@136311 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:22:13]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212733.848546614","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212733.848544320…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 1th time.
DEBUG  gateway@136311 resetting connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:24:08]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212848.345446446","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212848.345445024…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 2th time.
DEBUG  gateway@136311 resetting connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to 0.0.0.0:63378
WARNI… gateway@136297 <jina.orchestrate.pods.Pod object at 0x7f1e507a2790> timeout after waiting for 600000ms, if your executor takes time to load, you    [12/10/23 18:25:02]
       may increase --timeout-ready
DEBUG  gateway@136297 waiting for ready or shutdown signal from runtime
DEBUG  gateway@136297 Runtime was never started. Runtime will end gracefully on its own
DEBUG  gateway@136297 terminating the runtime process
DEBUG  gateway@136297 runtime process properly terminated
DEBUG  gateway@136297 terminated
DEBUG  gateway@136311 Received signal SIGTERM                                                                                                              [12/10/23 18:25:02]
DEBUG  gateway@136297 waiting for ready or shutdown signal from runtime
DEBUG  gateway@136297 shutdown is already set. Runtime will end gracefully on its own
DEBUG  gateway@136297 terminating the runtime process
DEBUG  gateway@136297 runtime process properly terminated
DEBUG  gateway@136297 terminated
DEBUG  executor/rep-0@136297 waiting for ready or shutdown signal from runtime                                                                             [12/10/23 18:25:02]
DEBUG  gateway@136311 Received signal SIGTERM
DEBUG  executor/rep-0@136297 Runtime was never started. Runtime will end gracefully on its own
DEBUG  executor/rep-0@136297 terminating the runtime process
DEBUG  executor/rep-0@136297 runtime process properly terminated
DEBUG  executor/rep-0@136297 terminated
DEBUG  executor/rep-0@136297 joining the process
DEBUG  executor/rep-0@136297 successfully joined the process
DEBUG  gateway@136297 joining the process
DEBUG  gateway@136311 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:25:50]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702212950.244549389","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702212950.244547996…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 3th time.
DEBUG  gateway@136311 resetting connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 gRPC call to executor for EndpointDiscovery errored, with error <AioRpcError of RPC that terminated with:                            [12/10/23 18:27:29]
               status = StatusCode.UNAVAILABLE
               details = "failed to connect to all addresses"
               debug_error_string = "{"created":"@1702213049.752896158","description":"Failed to pick
       subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3260,"referenced_errors":[{"created":"@1702213049.752894715…
       to connect to all addresses","file":"src/core/lib/transport/error_utils.cc","file_line":167,"grpc_status":14}]}"
       > and for the 4th time.
DEBUG  gateway@136311 gRPC call for executor failed, retries exhausted
DEBUG  gateway@136311 resetting connection for executor to 0.0.0.0:63378
DEBUG  gateway@136311 create_connection connection for executor to 0.0.0.0:63378
WARNI… gateway@136311 Getting endpoints failed: failed to connect to all addresses. Waiting for another trial
DEBUG  gateway@136311 cancel get all endpoints                                                                                                             [12/10/23 18:27:30]
DEBUG  gateway@136311 Got all endpoints from TopologyGraph None
INFO   gateway@136311 start server bound to 0.0.0.0:12345
DEBUG  gateway@136311 server bound to 0.0.0.0:12345 started
DEBUG  gateway@136311 GRPC server setup successful
DEBUG  gateway@136311 process terminated
DEBUG  gateway@136297 successfully joined the process                                                                                                      [12/10/23 18:27:30]
DEBUG  gateway@136297 joining the process
DEBUG  gateway@136297 successfully joined the process
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/codetalker7/jinaAI/getting_started/deployment.py", line 6, in <module>
    with dep:
  File "/home/codetalker7/jinaAI/venv/lib/python3.8/site-packages/jina/orchestrate/orchestrator.py", line 14, in __enter__
    return self.start()
  File "/home/codetalker7/jinaAI/venv/lib/python3.8/site-packages/jina/orchestrate/deployments/__init__.py", line 1157, in start
    self._wait_until_all_ready()
  File "/home/codetalker7/jinaAI/venv/lib/python3.8/site-packages/jina/orchestrate/deployments/__init__.py", line 1095, in _wait_until_all_ready
    asyncio.get_event_loop().run_until_complete(wait_for_ready_coro)
  File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/home/codetalker7/jinaAI/venv/lib/python3.8/site-packages/jina/orchestrate/deployments/__init__.py", line 1212, in async_wait_start_success
    await asyncio.gather(*coros)
  File "/home/codetalker7/jinaAI/venv/lib/python3.8/site-packages/jina/orchestrate/pods/__init__.py", line 221, in async_wait_start_success
    self._fail_start_timeout(_timeout)
  File "/home/codetalker7/jinaAI/venv/lib/python3.8/site-packages/jina/orchestrate/pods/__init__.py", line 140, in _fail_start_timeout
    raise TimeoutError(
TimeoutError: jina.orchestrate.pods.Pod:gateway can not be initialized after 600000.0ms
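The DEBUG log above is long, and it interleaves lines from two gateway PIDs (136311 and 134873) that retry against different executor ports (63378 and 58784). The second PID is possibly a leftover from an earlier run, though that is a guess. A small sketch like the following, using only the standard library and two regular expressions written against the line formats seen in this log, tallies the failed EndpointDiscovery attempts per gateway process. The function name is hypothetical.

```python
import re
from collections import Counter

# Start of a failed endpoint-discovery attempt; captures the gateway PID.
FAILURE_RE = re.compile(
    r"gateway@(\d+) gRPC call to executor for EndpointDiscovery errored"
)
# Reconnection line that follows a failure; captures the PID and target port.
RESET_RE = re.compile(
    r"gateway@(\d+) resetting connection for executor to [\d.]+:(\d+)"
)


def summarize(log_text: str):
    """Count discovery failures per gateway PID and collect the ports each PID targets."""
    failures = Counter()
    ports = {}
    for line in log_text.splitlines():
        match = FAILURE_RE.search(line)
        if match:
            failures[match.group(1)] += 1
            continue
        match = RESET_RE.search(line)
        if match:
            ports.setdefault(match.group(1), set()).add(match.group(2))
    return failures, ports
```

Running summarize on the saved log output makes it easier to see that every attempt ended in StatusCode.UNAVAILABLE, which fits an executor that never started listening.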

Just to check, can you try moving the transformers import inside the __init__ method of the Executor?

@JoanFM tried this out, but it gives me the same error. Just to be sure, here is the new code for the Executor:

from jina import Executor, requests
from docarray import DocList, BaseDoc

class Prompt(BaseDoc):
    text: str

class Generation(BaseDoc):
    prompt: str
    text: str

class StableLM(Executor):
    def __init__(self, **kwargs):
        from transformers import pipeline
        super().__init__(**kwargs)
        self.generator = pipeline(
            'text-generation', model='stabilityai/stablelm-base-alpha-3b'
        )

    @requests
    def generate(self, docs: DocList[Prompt], **kwargs) -> DocList[Generation]:
        generations = DocList[Generation]()
        prompts = docs.text
        llm_outputs = self.generator(prompts)
        for prompt, output in zip(prompts, llm_outputs):
            generations.append(Generation(prompt=prompt, text=output))
        return generations

But it still gives me the same error as before.

What version of the transformers library are you using?

@JoanFM, installed transformers from PyPI, so should be the latest version from there. Here's the version:

Name: transformers
Version: 4.35.2
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: transformers@huggingface.co
License: Apache 2.0 License
Location: /home/codetalker7/jinaAI/venv/lib/python3.8/site-packages
Requires: tokenizers, huggingface-hub, pyyaml, tqdm, packaging, regex, filelock, safetensors, numpy, requests

Do you have torch or tensorflow installed?

Yes, I have installed them both. torch version 2.1.1.

OK, I think what is happening is that you may not have enough memory, so your OS has killed the Executor service.

Can you try the following?

  • Remove the model from the cache: rm -rf ~/.cache/huggingface/hub/models--stabilityai--stablelm-base-alpha-3
  • Stop all other memory-consuming services and run the code below. Does it work?

from transformers import pipeline
model = pipeline(
    'text-generation', model='stabilityai/stablelm-base-alpha-3b'
)
print('Successfully loaded the model in memory')

When this works, I believe the example will work too.
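One way to test the "killed by the OS" hypothesis is to load the model in a child process and look at how that process ended. The following is a minimal sketch, assuming a POSIX system where a negative return code -N means the child died from signal N. The helper names are hypothetical, and a SIGKILL is only consistent with the kernel's out-of-memory killer, not proof of it.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Translate a subprocess return code into a short verdict."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, -N means the child was terminated by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        if name == "SIGKILL":
            return "killed by SIGKILL (consistent with the OOM killer, but not proof)"
        return f"terminated by {name}"
    return f"exited with status {returncode}"


def run_in_child(code: str) -> str:
    """Run a Python snippet in a separate interpreter and describe how it ended."""
    result = subprocess.run([sys.executable, "-c", code])
    return describe_exit(result.returncode)


# The same model-loading snippet as above, as a string for the child process.
LOAD_MODEL = (
    "from transformers import pipeline\n"
    "pipeline('text-generation', model='stabilityai/stablelm-base-alpha-3b')\n"
)
```

Calling run_in_child(LOAD_MODEL) would then report either "ok" or how the loader died, without taking the parent script down with it.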

@JoanFM yes, I think memory was the issue. I tried gpt2 instead of stablelm and it works out just fine. Thanks a lot for the help!
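A side note for anyone who gets the Deployment running with a smaller model: the generate method in the executor above passes each raw pipeline result to Generation(text=output). As far as I know, a text-generation pipeline returns a list of dicts with a 'generated_text' key for each prompt rather than a plain string, so that field may fail validation against text: str. This is an assumption about the transformers output format, not something confirmed in this thread. A hedged helper (hypothetical name) that normalizes the result could look like this:

```python
from typing import Any


def extract_generated_text(output: Any) -> str:
    """Pull the generated string out of one per-prompt pipeline result.

    Assumed shapes:
      - [{'generated_text': '...'}]  (a list of candidate dicts)
      - {'generated_text': '...'}    (a single dict)
      - '...'                        (already a string)
    """
    if isinstance(output, str):
        return output
    if isinstance(output, dict):
        return output["generated_text"]
    if isinstance(output, list) and output:
        # Take the first candidate when several sequences are returned.
        return extract_generated_text(output[0])
    raise ValueError(f"unexpected pipeline output: {output!r}")
```

Inside generate, the append would then read Generation(prompt=prompt, text=extract_generated_text(output)).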