quarkiverse/quarkus-langchain4j

Add Redis semantic cache feature

lordofthejars opened this issue · 4 comments

One thing that would be interesting to have out of the box for AI is a semantic cache of requests.

It could be used in any method, but according to a recent study, 31% of queries to an LLM can be cached (or, in other words, 31% of queries are contextually repeatable), which can significantly improve response times in GenAI apps.

I created a simple example that implements this with Redis: https://github.com/lordofthejars-ai/quarkus-langchain-examples/tree/main/semantic-cache

Do you think it might be interesting to integrate this into the Quarkus Cache system, for example as a redis-semantic-cache extension or something along those lines?

I think @iocanel and @andreadimaio were thinking of something similar.

Yes, what we have in #659 is a concept of a semantic cache.
The idea is to have something very similar to ChatMemory, so you can extend the default (in-memory) implementation with other products like Redis or something else.

Great, feel free to take a look at my example; you'll see the code to do it is not complex, apart from some configuration parameters, that's true. Calculating keys is easy, since Quarkus Cache already offers an interface to override key creation. The tricky part is the code that checks whether a request is a cache hit or a cache miss.
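For context, the hit/miss check boils down to a similarity lookup over cached embeddings rather than an exact key match. Here is a minimal, self-contained sketch in plain Java of that idea (no Redis or Quarkus APIs; all class and method names here are hypothetical illustrations, not the extension's actual API): a query embedding is compared against stored entries, and any entry above a cosine-similarity threshold counts as a hit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of a semantic cache lookup. In a real setup the
// embeddings would come from an embedding model and the entries would
// live in Redis (e.g. via vector similarity search), not in a List.
public class SemanticCache {

    record Entry(double[] embedding, String response) {}

    private final List<Entry> entries = new ArrayList<>();
    private final double threshold; // minimum similarity to count as a hit

    public SemanticCache(double threshold) {
        this.threshold = threshold;
    }

    public void put(double[] embedding, String response) {
        entries.add(new Entry(embedding, response));
    }

    // Cache hit: some stored entry is semantically close enough to the query.
    // Cache miss: empty Optional, so the caller invokes the LLM and caches the result.
    public Optional<String> lookup(double[] queryEmbedding) {
        return entries.stream()
                .filter(e -> cosineSimilarity(e.embedding(), queryEmbedding) >= threshold)
                .map(Entry::response)
                .findFirst();
    }

    static double cosineSimilarity(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

The threshold is the main tuning knob: too low and unrelated queries get stale answers, too high and you lose the 31% of repeatable queries the cache is meant to capture.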

Thanks for the input.

Closing as duplicate in light of the conversation above.