szczyglis-dev/py-gpt

question: how to fix indexed file types?

pygpt version: 2.1.18

I noticed that in the "FILES" tab and in chat, when a LlamaIndex source is referenced, it reports the wrong file type, for example "file_type: video/mp2t".

From what I've read online about RAG, wrong metadata on a document can affect Q&A quality. Is it possible to fix the types for source code files?

From version 2.1.19, there is a new option: Settings -> Llama-index -> Custom metadata.

This option lets you define custom fields in the document metadata. They are included during file indexing and overwrite the values generated by the LlamaIndex data loaders. Create a new entry for the extension ts with the key file_type and the value your_custom_file_type_here, then re-index the files.
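
To illustrate the idea behind that setting, here is a minimal plain-Python sketch of a per-extension override applied on top of loader-generated metadata. It is not PyGPT's actual implementation: `CUSTOM_METADATA`, `apply_custom_metadata`, and the value `text/typescript` are hypothetical names and placeholders chosen for this example.

```python
from pathlib import Path

# Hypothetical override table: file extension -> metadata fields to force.
# "text/typescript" is only a placeholder value for this sketch.
CUSTOM_METADATA = {
    "ts": {"file_type": "text/typescript"},
}


def apply_custom_metadata(path, metadata, overrides=CUSTOM_METADATA):
    """Return a copy of `metadata` with per-extension overrides applied."""
    ext = Path(path).suffix.lstrip(".").lower()
    merged = dict(metadata)                 # leave the loader's dict untouched
    merged.update(overrides.get(ext, {}))   # custom fields win over loader values
    return merged


# Metadata as a loader might have produced it for a TypeScript file.
loader_metadata = {"file_name": "app.ts", "file_type": "video/mp2t"}
fixed = apply_custom_metadata("src/app.ts", loader_metadata)
print(fixed["file_type"])
```

Files whose extension has no entry in the table keep the loader's metadata unchanged, so only the types you explicitly list are rewritten.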