assafelovic/gpt-researcher

better error handling when search retrievers return faulty or empty results

danieldekay opened this issue · 1 comments

My setup is described below, in case it matters. I am using Azure OpenAI, with Bing as the web search retriever; GPT-R is used via the web frontend.

In many searches, GPT-R hits an error in the Bing retriever and then gets stuck. I cannot reproduce it reliably; it happens at random.

Would it be possible to add a) a retry or b) a better way to ignore faulty responses from the retriever?
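To illustrate option a), here is a minimal sketch of a retry wrapper. It is not GPT-R code: the name search_with_retry, its parameters, and the idea of passing the retriever call in as a zero-argument callable are all assumptions made for illustration. A None return or a raised exception counts as a faulty response, and the caller gets an empty list once the attempts are used up.

```python
import time
from typing import Callable, List, Optional


def search_with_retry(
    search: Callable[[], Optional[List[dict]]],  # hypothetical: wraps one retriever call
    max_attempts: int = 3,
    base_delay: float = 0.5,
) -> List[dict]:
    """Call `search` until it returns a list; return [] if every attempt fails."""
    for attempt in range(max_attempts):
        try:
            results = search()
        except Exception:
            # Treat a raised error the same way as a faulty (None) response.
            results = None
        if results is not None:
            return results
        if attempt < max_attempts - 1:
            # Exponential backoff between attempts: base_delay, 2x, 4x, ...
            time.sleep(base_delay * 2 ** attempt)
    # Give up quietly so the caller can continue with the other results.
    return []
```

With a wrapper like this, a retriever that fails every time would yield an empty list instead of None, so downstream code could iterate over the result without an extra check.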

For example, this line fails if search_results is None, yet the retriever can return None.

On my local copy of the code I changed this line so that it looks like this:

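        # Preprocess the results
        if resp is None:
            return
        try:
            search_results = json.loads(resp.text)
            # Raises if "webPages" is missing or the payload is not a dict
            results = search_results["webPages"]["value"]
        except Exception:
            # Malformed or incomplete response: discard it
            return
        if search_results is None:
            return

        search_results = []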

...thereby essentially throwing away a faulty Bing result.
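A variant of the same idea, as a sketch only: the parsing step could return an empty list instead of None, so that callers never have to handle None at all. The function name parse_bing_payload is hypothetical and not part of GPT-R; the "webPages" / "value" keys are the ones used in the snippet above.

```python
import json
from typing import List, Optional


def parse_bing_payload(text: Optional[str]) -> List[dict]:
    """Return the entries under webPages.value, or [] if the payload is unusable."""
    if not text:
        return []
    try:
        payload = json.loads(text)
        values = payload["webPages"]["value"]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Invalid JSON, a missing key, or a payload that is not a dict.
        return []
    # Guard against a "value" field that is present but not a list.
    return values if isinstance(values, list) else []
```

Catching only the three expected exception types, rather than Exception, keeps unrelated bugs visible while still discarding faulty responses.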

System Information
------------------
> OS:  Linux
> OS Version:  #1 SMP Fri Mar 29 23:14:13 UTC 2024
> Python Version:  3.11.0rc1 (main, Aug 12 2022, 10:02:14) [GCC 11.2.0]

Package Information
-------------------
> langchain_core: 0.2.33
> langchain: 0.2.14
> langchain_community: 0.2.12
> langsmith: 0.1.100
> langchain_openai: 0.1.22
> langchain_text_splitters: 0.2.2
> langgraph: 0.2.5

Optional packages not installed
-------------------------------
> langserve

Other Dependencies
------------------
> aiohttp: 3.10.5
> async-timeout: Installed. No version info available.
> dataclasses-json: 0.6.7
> httpx: 0.27.0
> jsonpatch: 1.33
> langgraph-checkpoint: 1.0.3
> numpy: 1.26.4
> openai: 1.42.0
> orjson: 3.10.7
> packaging: 24.1
> pydantic: 2.8.2
> PyYAML: 6.0.2
> requests: 2.32.3
> SQLAlchemy: 2.0.32
> tenacity: 8.5.0
> tiktoken: 0.7.0
> typing-extensions: 4.12.2