google-gemini/generative-ai-python

Gemini Flash error: None Stream removed


Description of the bug:

I'm trying out the Gemini 1.5 Flash (002) API and its long-context capability. I prompt the LLM with the contents of a few (10) large PDF files. In the first interaction, I ask it to list the titles of the documents (to verify that the file contents are available and the model can read them). This appears to work fine: the titles are listed, and the total token count is reported to be about 290K.

import google.generativeai as genai

# `files` is a list of uploaded PDF handles (see the sketch below);
# the model name matches the 002 version under test.
model = genai.GenerativeModel('gemini-1.5-flash-002')

chat_session = model.start_chat(history=[])
response = await chat_session.send_message_async(
    [f'Carefully look at the {len(files)} documents provided here and list their titles.'] + files,
    stream=True,
)
async for chunk in response:
    print(chunk.text, end='')
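
For reference, `files` is a list of handles returned by the Files API, and the prompt size can be checked up front with count_tokens. A minimal sketch, assuming the PDFs sit in a local papers/ directory (the directory name is my placeholder):

import pathlib

# Upload each PDF via the Files API; genai.upload_file returns a File handle
files = [genai.upload_file(path) for path in pathlib.Path('papers').glob('*.pdf')]

# Optionally check the total prompt size up front
print(model.count_tokens(files).total_tokens)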

Next, in the same chat session, I ask it to summarize the documents, as indicated in the code below:

import asyncio
import random
import traceback

# REVIEW_PROMPT is the summarization prompt (defined elsewhere)
review = ''
max_retries = 3

while max_retries > 0:
    try:
        response = await chat_session.send_message_async(
            [REVIEW_PROMPT.strip()],
            stream=True,
        )

        async for chunk in response:
            print('.', end='')
            review += chunk.text

        print('')
        break
    except Exception as ex:
        print(f'*** An error occurred while receiving chat response: {ex}')
        max_retries -= 1
        traceback.print_exc()

        if max_retries > 0:
            wait_time = random.uniform(5, 7)
            print(f'Retrying again in {wait_time} seconds...')
            chat_session.rewind()  # Drop the failed request/response pair from the history
            await asyncio.sleep(wait_time)  # time.sleep would block the event loop

However, this invocation almost always results in the following error:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/google/api_core/grpc_helpers_async.py", line 106, in _wrapped_aiter
    async for response in self._call:  # pragma: no branch
  File "/opt/conda/lib/python3.10/site-packages/grpc/aio/_call.py", line 365, in _fetch_stream_responses
    await self._raise_for_status()
  File "/opt/conda/lib/python3.10/site-packages/grpc/aio/_call.py", line 272, in _raise_for_status
    raise _create_rpc_error(
grpc.aio._call.AioRpcError: <AioRpcError of RPC that terminated with:
	status = StatusCode.UNKNOWN
	details = "Stream removed"
	debug_error_string = "UNKNOWN:Error received from peer ipv4:<IP_ADDRESS>:443 {created_time:"2024-11-01T09:50:17.098911997+00:00", grpc_status:2, grpc_message:"Stream removed"}"
>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/tmp/ipykernel_30/2992338197.py", line 15, in <module>
    async for chunk in response:
  File "/opt/conda/lib/python3.10/site-packages/google/generativeai/types/generation_types.py", line 727, in __aiter__
    raise self._error
  File "/opt/conda/lib/python3.10/site-packages/google/generativeai/types/generation_types.py", line 736, in __aiter__
    item = await anext(self._iterator)  # type: ignore
  File "/opt/conda/lib/python3.10/site-packages/google/api_core/grpc_helpers_async.py", line 109, in _wrapped_aiter
    raise exceptions.from_grpc_error(rpc_error) from rpc_error
google.api_core.exceptions.Unknown: None Stream removed

I have also tried disabling streaming, but it throws the same error. Rewinding the chat session and retrying leads to the same error.
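
For completeness, the non-streaming variant is the same call without stream=True, reading response.text directly:

# Non-streaming variant; the "Stream removed" error occurs here as well
response = await chat_session.send_message_async([REVIEW_PROMPT.strip()])
print(response.text)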

How can I address this error and continue the chat?

Actual vs expected behavior:

The expected behavior is to receive the complete response from the model without any run-time exception.

Any other information you'd like to share?

Just to clarify, the code works on rare occasions. Also, I'm running the code on Kaggle (!pip install google-generativeai==0.8.3 grpcio-status).

Hi @barun-saha

I ran the same code on my local setup, and it worked fine. However, when I ran it on Kaggle, I encountered the same error as you. This issue may be due to network instability on Kaggle's end. If possible, try running the code on a stable network or on a local setup.
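
If Kaggle's network is indeed dropping gRPC streams, one thing that might be worth trying (a suggestion, not a verified fix, and it mainly applies to the synchronous client) is switching the SDK to the REST transport so that gRPC streaming is bypassed:

import google.generativeai as genai

# transport='rest' makes the SDK use HTTP requests instead of gRPC streams.
# API_KEY is a placeholder for your own key.
genai.configure(api_key=API_KEY, transport='rest')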

Thanks

Hi Manoj,

Thanks for your input.

I tried running the code in a Colab notebook, using the synchronous send_message method (send_message_async leads to an error on Colab, but that's a different issue). Surprisingly, the code ran without any error! I tried sending several chat messages and got responses back (with no error).
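
For reference, the synchronous call is a direct substitution:

# Synchronous streaming call; iteration becomes a plain for loop
response = chat_session.send_message([REVIEW_PROMPT.strip()], stream=True)
for chunk in response:
    print('.', end='')
    review += chunk.text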

I was curious, so I went back to Kaggle and used send_message. Unfortunately, I got the same error there with the synchronous call as well.

Therefore, I agree with your observation that this might be more of an environment-specific issue.

@barun-saha

I encountered the same error even with send_message. Kaggle handles smaller prompts well, but in your case the ~290K-token context seems to be what's causing issues.

Thanks

Marking this issue as stale since it has been open for 14 days with no activity. This issue will be closed if no further activity occurs.

This issue was closed because it has been inactive for 28 days. Please post a new issue if you need further assistance. Thanks!