aws-samples/amazon-kendra-langchain-extensions

Different results in retriever and AWS console.

Opened this issue · 2 comments

Sypek commented

Hi,
I get different results when asking the same question (e.g. "What is Amazon Sagemaker?") using:

  • AWS Console by running query on Kendra Index
  • langchain.retrievers.AmazonKendraRetriever: retriever = AmazonKendraRetriever( index_id=KENDRA_INDEX_ID, region_name=AWS_REGION)

Are there any additional settings I should add to my code to get the same results? The results from the Console are accurate, while the ones I get from the retriever are noticeably less so.

Hi, I am facing a similar issue. The results from Kendra in the AWS Console and from the retriever are not the same.

Hi,

The Kendra search console uses the Query API, which is a bit different from the Retrieve API. I found the following in the AWS docs:

Retrieve API is similar to the Query (https://docs.aws.amazon.com/kendra/latest/APIReference/API_Query.html) API. However, by default, the Query API only returns excerpt passages of up to 100 token words. With the Retrieve API, you can retrieve longer passages of up to 200 token words and up to 100 semantically relevant passages. This doesn't include question-answer or FAQ type responses from your index. The passages are text excerpts that can be semantically extracted from multiple documents and multiple parts of the same document. If in extreme cases your documents produce zero passages using the Retrieve API, you can alternatively use the Query API and its types of responses.

If you call the Query API directly instead, the results should be similar to what the console shows.
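
To see the difference on your own index, here is a minimal sketch that sends the same question to both APIs through boto3 and normalizes the responses so they can be printed side by side. The helper names (`query_excerpts`, `retrieve_passages`, `run_against_aws`) are my own and not part of any library, and the response fields are read as I understand them from the Kendra API reference: Query returns `DocumentTitle` and `DocumentExcerpt` as objects with a `Text` field, while Retrieve returns `DocumentTitle` and `Content` as plain strings. Treat it as an illustration, not a verified fix.

```python
from typing import Any, Dict, List


def query_excerpts(client: Any, index_id: str, question: str) -> List[Dict[str, str]]:
    """Call the Query API (the one the console search page uses)."""
    response = client.query(IndexId=index_id, QueryText=question)
    return [
        {
            # Query results carry a type such as ANSWER, QUESTION_ANSWER or DOCUMENT.
            "type": item.get("Type", ""),
            "title": item.get("DocumentTitle", {}).get("Text", ""),
            "text": item.get("DocumentExcerpt", {}).get("Text", ""),
        }
        for item in response.get("ResultItems", [])
    ]


def retrieve_passages(client: Any, index_id: str, question: str) -> List[Dict[str, str]]:
    """Call the Retrieve API (longer passages, no FAQ-type responses)."""
    response = client.retrieve(IndexId=index_id, QueryText=question)
    return [
        {
            "title": item.get("DocumentTitle", ""),
            "text": item.get("Content", ""),
        }
        for item in response.get("ResultItems", [])
    ]


def run_against_aws(index_id: str, region_name: str, question: str) -> None:
    """Print the results of both APIs for one question (needs AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3 installed

    kendra = boto3.client("kendra", region_name=region_name)
    print("Query API:")
    for hit in query_excerpts(kendra, index_id, question):
        print(f"  [{hit['type']}] {hit['title']}: {hit['text'][:80]}")
    print("Retrieve API:")
    for hit in retrieve_passages(kendra, index_id, question):
        print(f"  {hit['title']}: {hit['text'][:80]}")
```

For example, `run_against_aws(KENDRA_INDEX_ID, AWS_REGION, "What is Amazon Sagemaker?")` with the values from the original post. If the Query output matches what the console shows while the Retrieve output does not, that points to the API difference described above rather than to a missing retriever setting.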