[Question] does content.url in filename for websites make sense? (I want attribution per paragraph via separate prompt)
chaelli opened this issue · 4 comments
Context / Scenario
I changed the prompt so that the LLM includes the source for each paragraph of the answer, which lets me align the response more closely with the facts for my users. When I do that, I can only tell it to reference the filename, since that is what the LLM receives in the facts part of the prompt. For websites this is always "content.url", because it is set that way in
kernel-memory/service/Core/MemoryService.cs
Line 120 in a1f280c
Question
I wonder if it would not make more sense to put the url there instead of a static string. Or at least include the url in the facts where it exists.
You should be able to swap content.url with the URL upon receiving the response; there is a property with the URL.
This only works if there is just 1 relevant source - if there are multiple, I would not know which part of the answer is based on what page. If there are multiple sources, they are all called content.url and I cannot align separate sources to separate paragraphs.
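To make the limitation concrete, here is a minimal sketch of the post-processing swap. It is not Kernel Memory code: `swap_placeholder` and the plain list of URLs are hypothetical stand-ins for whatever the response object actually exposes. The swap is only safe when exactly one distinct URL is involved, so the sketch refuses to guess otherwise.

```python
def swap_placeholder(answer: str, source_urls: list[str],
                     placeholder: str = "content.url") -> str:
    """Replace the static placeholder with the real URL when that is unambiguous.

    `source_urls` is a hypothetical list of the URLs behind the relevant sources.
    """
    # De-duplicate while keeping order: several chunks may come from one page.
    unique_urls = list(dict.fromkeys(source_urls))
    if len(unique_urls) != 1:
        # Every web source carries the same placeholder, so there is no way
        # to tell which paragraph belongs to which page.
        raise ValueError(
            f"cannot map '{placeholder}' to one of {len(unique_urls)} URLs"
        )
    return answer.replace(placeholder, unique_urls[0])


single = swap_placeholder(
    "Apple is a fruit (content.url).",
    ["https://example.com/apple"],
)
print(single)
```

With two different pages in `source_urls`, the function raises instead of picking one, which is exactly the case described above.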
fyi until I started using kernel memory, I just used a prompt like this:
Add a source reference to the end of each sentence. e.g. Apple is a fruit ([Reference page title](Reference page url)) (markdown link formatting). ...
@dluc Do you have any preference between the options:
- replace "content.url" during indexing with the real url value?
- adding the url as an additional value in the prompt?
Or none of them?
I would try the approach with the prompt; it should be easier. Changing the indexing pipeline might have an unexpected impact.
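A rough sketch of what the prompt-side approach could look like: when the facts block is assembled, label each fact with its URL if one exists and fall back to the file name otherwise. The dictionary keys (`source_name`, `source_url`, `text`) and the `==== [Source: ...]` layout are assumptions made for illustration, not Kernel Memory's actual prompt format.

```python
def format_facts(facts: list[dict]) -> str:
    """Build a facts block in which each fact carries a distinguishable source label.

    Each fact is a hypothetical dict with 'source_name', 'source_url' and 'text'.
    """
    blocks = []
    for fact in facts:
        # Prefer the real URL; fall back to the file name (e.g. for uploaded files).
        label = fact.get("source_url") or fact["source_name"]
        blocks.append(f"==== [Source: {label}]\n{fact['text']}")
    return "\n".join(blocks)


facts = [
    {"source_name": "content.url",
     "source_url": "https://example.com/apple",
     "text": "Apple is a fruit."},
    {"source_name": "notes.txt",
     "source_url": None,
     "text": "Pears ripen in autumn."},
]
print(format_facts(facts))
```

Because every web page now has its own label in the facts, a per-sentence citation instruction in the prompt can point at distinct sources, and the indexing pipeline stays untouched.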