document-search

There are 39 repositories under the document-search topic.

neuml/paperai
📄 🤖 Semantic search and workflows for medical/scientific papers
Language: Python 1.4k 26 71102
redis-developer/redis-arXiv-search
Vector search demo with the arXiv paper dataset, RedisVL, HuggingFace, OpenAI, Cohere, FastAPI, React, and Redis.
Language: Python 141 6 923
robindekoster/chatgpt-custom-knowledge-chatbot
This open source chatbot project lets you create a chatbot that uses your own data to answer questions, thanks to the power of the OpenAI GPT-3.5 model.
Language: Python 123 6 435
capjamesg/jamesql
An in-memory NoSQL database implemented in Python.
Language: Python 81 0 01
kcubeterm/achoz
Search through all your personal data efficiently like web search.
Language: Python 79 3 115
poloclub/mememo
A JavaScript library that brings vector search and RAG to your browser!
Language: TypeScript 71 2 08
daac-tools/find-simdoc
Finding all pairs of similar documents time- and memory-efficiently
Language: Rust 58 3 13
neuml/cord19q
COVID-19 Open Research Dataset (CORD-19) Analysis
Language: Python 56 9 417
zayedrais/DocumentSearchEngine
Document Search Engine project with TF-IDF and the Google Universal Sentence Encoder model
Language: Jupyter Notebook 53 3 224
infinilabs/coco-app
🥥 Coco AI App - Search, Connect, Collaborate, AI-powered enterprise search, all in one space.
Language: TypeScript 46 7 114
teilomillet/raggo
A lightweight, production-ready RAG (Retrieval Augmented Generation) library in Go.
Language: Go 34 1 12
deepsense-ai/ragbits
Building blocks for rapid development of GenAI applications
Language: Python 21 1 1385
jankovicsandras/plpgsql_bm25
BM25 search implemented in PL/pgSQL
Language: Jupyter Notebook 18 1 30
infinilabs/coco-server
🥥 Coco AI Server - Search, Connect, Collaborate, AI-powered enterprise search, all in one space.
Language: Go 16 7 94
easonlai/chatbot_with_pdf_streamlit
This code example shows how to make a chatbot for semantic search over documents using Streamlit, LangChain, and various vector databases. The chatbot lets users ask questions and get answers from a document collection. The code is in Python and can be customized for different scenarios and data.
Language: Jupyter Notebook 15 3 04
kyr0/clientside-search
A highly efficient, isomorphic, full-featured, multilingual text search engine library, providing full-text search, fuzzy matching, phonetic scoring, document indexing and more, with micro JSON state hydration/dehydration in-browser and server-side.
Language: TypeScript 10 2 10
jankovicsandras/bm25opt
faster BM25 search algorithms in Python
Language: Jupyter Notebook 8 2 00
lekt9/alBERT-launcher
AI-powered file launcher and semantic search assistant. Like Spotlight/Alfred but with advanced AI capabilities for understanding context and meaning. Features local processing, privacy-first design, and seamless integration with your workflow.
Language: TypeScript 60
lethalbit/bookwurm
dead simple document index and search, nothing fancy
Language: Python 6 2 0
bent10/boox
Search anything, instantly
Language: TypeScript 5 1 21
gsidhu/buzee-releases
Public releases for Buzee
5 1 00
mdietrichstein/ir-search-engine-rust
Rust-based text search engine from scratch supporting multiple document similarity metrics (TF-IDF, BM25, BM25VA)
Language: Rust 5 2 00
opengento/magento2-document-search
This module aims to make documents searchable for customers in Magento 2.
Language: PHP 3 9 11
HarshKothari21/Natural-Language-Processing-Specialization
NLP course by DeepLearning.AI, powered by @coursera. Taught by Younes Bensouda Mourri, Instructor of AI at Stanford University, and Łukasz Kaiser, Staff Research Scientist at Google Brain.
Language: Jupyter Notebook 2 2 00
opengento/magento2-document-product-search
This module aims to make documents searchable with product keywords in Magento 2.
Language: PHP 2 9 01
Qyokizzzz/simhash
The extended version of simhash supports fingerprint extraction of documents and images.
Language: Python 2 1 00
AI-STACK-dev/Covid19-Comorbidities-NLP-WEB
COVID-19 comorbidities analysis platform based on Natural Language Processing (NLP)
Language: JavaScript 1 1 02
dileepgodithi/DistributedSearch
Distributed document search using TF-IDF algorithm.
Language: Java 1 1 00
EmirhanSyl/TheBSTSearchEngine
Mini desktop search engine with Binary Search Tree
Language: Java 1 1 0
kunjankanani/Document_Query_Search
Retrieval-Augmented Generation (RAG) is an approach that enhances pre-trained large language models (LLMs) by integrating them with external data sources. It combines the generative power of LLMs with the precision of specialized data search mechanisms.
Language: Python 1 2 01
liviobisogni/solr-ocr-indexing
Apache Solr Document Search and Indexing Analysis with OCR
Language: Java 1 2 01
sanu0711/llama-index-and-openai
An AI-powered solution for efficient document querying. It uses Llama Index for vector-based indexing and OpenAI's GPT to interpret natural language queries, providing accurate search results.
Language: Jupyter Notebook 1 1 00
shreyansh-kothari/PDF-Querying-using-TF-IDF-from-Scratch
Given a set of PDFs and a query, finds the most relevant PDF using TF-IDF. TF-IDF is implemented from scratch, without any library.
Language: Python 1 2 01
tomlin7/AI-research-assistant
Semantic document search system with pgvector and PGAI
Language: Python 1 2 01
EricSchoebel/DocSpector
Keyword finder for texts in documents of a directory (for English, see README-en.md)
Language: Python 00
SamJoeSilvano/Multi-Source-Knowledge-Retrieval-System
An end-to-end multi-source knowledge retrieval system using LangChain, FAISS, and OpenAI embeddings. This Retrieval-Augmented Generation (RAG) pipeline intelligently searches across Wikipedia, arXiv, and custom websites, optimizing source selection and delivering precise, real-time results based on query relevance.
Language: Jupyter Notebook 0 1 00
