Add document loader
prabirshrestha commented
Add document loaders to make it easy to implement RAG.
- Add Markdown document loader
- Add HTML document loader
- Add CSV document loader
- Add file directory document loader. I highly suggest using https://opendal.apache.org here
- Add ONLYOFFICE document loader. This would let us use the conversion API to get text from doc/docx/pdf/xls/ppt/rtf files, and ONLYOFFICE can be self-hosted. https://api.onlyoffice.com/editors/conversionapi
- Add native PDF document loader so it can be embedded. https://crates.io/crates/pdfium-render
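
To make the request more concrete, here is a rough sketch of one shape the loaders listed above could share. Everything in it is an assumption for discussion, not an existing API: the `Document` struct, the `DocumentLoader` trait, and `CsvLoader` are hypothetical names. The CSV parsing is deliberately naive (it splits on commas and does not handle quoted fields), and a real loader would probably read from a file or an OpenDAL operator rather than an in-memory string.

```rust
use std::collections::HashMap;

/// Hypothetical: one unit of loaded text plus free-form metadata.
#[derive(Debug, Clone, PartialEq)]
pub struct Document {
    pub content: String,
    pub metadata: HashMap<String, String>,
}

/// Hypothetical: the common interface each loader (Markdown, HTML, CSV, ...)
/// could implement so RAG code does not care where the text came from.
pub trait DocumentLoader {
    fn load(&self) -> Result<Vec<Document>, String>;
}

/// Hypothetical CSV loader working on in-memory text.
/// Assumption: plain comma-separated values with a header row, no quoting.
pub struct CsvLoader {
    pub source: String,
    pub text: String,
}

impl DocumentLoader for CsvLoader {
    fn load(&self) -> Result<Vec<Document>, String> {
        // Skip blank lines; the first remaining line is treated as the header.
        let mut lines = self.text.lines().filter(|l| !l.trim().is_empty());
        let header: Vec<&str> = match lines.next() {
            Some(h) => h.split(',').map(str::trim).collect(),
            None => return Ok(Vec::new()),
        };

        let mut docs = Vec::new();
        for (i, line) in lines.enumerate() {
            let fields: Vec<&str> = line.split(',').map(str::trim).collect();
            if fields.len() != header.len() {
                return Err(format!(
                    "row {} has {} fields, expected {}",
                    i + 1,
                    fields.len(),
                    header.len()
                ));
            }

            // Render each row as "column: value" lines, one Document per row.
            let content = header
                .iter()
                .zip(&fields)
                .map(|(k, v)| format!("{k}: {v}"))
                .collect::<Vec<_>>()
                .join("\n");

            let mut metadata = HashMap::new();
            metadata.insert("source".to_string(), self.source.clone());
            metadata.insert("row".to_string(), (i + 1).to_string());
            docs.push(Document { content, metadata });
        }
        Ok(docs)
    }
}
```

With a trait like this, a directory loader could pick a loader per file extension and concatenate the resulting `Vec<Document>` values, and the ONLYOFFICE and PDF loaders would only differ in how they obtain the text.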