raphaelsty/cherche

k param in retriever, ranker and pipeline, and documentation

fpservant opened this issue · 0 comments

The documentation at https://raphaelsty.github.io/cherche/api/compose/Pipeline/
regarding the "call" method says:

If the batch_size_ranker, or batch_size_retriever it takes precedence over the batch_size. If the k_ranker, or k_retriever it takes precedence over the k parameter.

This sentence is hard to understand and needs to be clarified. As written, it could even be read in a misleading way.

Regarding the k param, please note the following. Suppose you define a retriever (say a tfidf one) with a k param of 20, followed by a ranker with a k param of 10: you're interested in the top 10 values at the end, but want 20 values at the retriever level. A likely mistake is then to call the pipeline with a k value of 10. In that case, it appears that the retriever also uses a k value of 10, so the ranker only sees 10 candidates instead of 20.
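
To make the behaviour concrete, here is a minimal toy model of the precedence as I understand it from the doc and from what I observed. It does not use cherche's code. `resolve_k` and `effective_ks` are hypothetical helpers written only for this illustration, and the fallback to the constructor's k when nothing is passed at call time is my assumption.

```python
from typing import Optional, Tuple


def resolve_k(
    constructor_k: int,
    k: Optional[int] = None,
    k_stage: Optional[int] = None,
) -> int:
    """Toy precedence rule (an assumption, not cherche's implementation):
    stage-specific k > pipeline-level k > k given to the stage's constructor.
    """
    if k_stage is not None:
        return k_stage
    if k is not None:
        return k
    return constructor_k


def effective_ks(
    retriever_k: int,
    ranker_k: int,
    k: Optional[int] = None,
    k_retriever: Optional[int] = None,
    k_ranker: Optional[int] = None,
) -> Tuple[int, int]:
    """Return the (retriever, ranker) k values that would actually be used."""
    return (
        resolve_k(retriever_k, k=k, k_stage=k_retriever),
        resolve_k(ranker_k, k=k, k_stage=k_ranker),
    )


# Retriever built with k=20, ranker built with k=10.
# Calling the pipeline with k=10 silently shrinks the retriever to 10.
print(effective_ks(20, 10, k=10))

# Passing the stage-specific values keeps 20 candidates for the ranker.
print(effective_ks(20, 10, k_retriever=20, k_ranker=10))
```

If this reading is right, the doc could state the rule in these terms: k applies to every stage of the pipeline, unless k_retriever or k_ranker is given for that stage.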