document-search

There are 39 repositories under the document-search topic.

  • neuml/paperai

    📄 🤖 Semantic search and workflows for medical/scientific papers

    Language:Python1.4k2671102
  • redis-developer/redis-arXiv-search

    Vector search demo with the arXiv paper dataset, RedisVL, HuggingFace, OpenAI, Cohere, FastAPI, React, and Redis.

    Language:Python1416923
  • robindekoster/chatgpt-custom-knowledge-chatbot

    This open source chatbot project lets you create a chatbot that uses your own data to answer questions, thanks to the power of the OpenAI GPT-3.5 model.

    Language:Python1236435
  • capjamesg/jamesql

    An in-memory NoSQL database implemented in Python.

    Language:Python81001
  • kcubeterm/achoz

    Search through all your personal data efficiently like web search.

    Language:Python793115
  • poloclub/mememo

    A JavaScript library that brings vector search and RAG to your browser!

    Language:TypeScript71208
  • daac-tools/find-simdoc

    Finding all pairs of similar documents time- and memory-efficiently

    Language:Rust58313
  • neuml/cord19q

    COVID-19 Open Research Dataset (CORD-19) Analysis

    Language:Python569417
  • zayedrais/DocumentSearchEngine

    Document Search Engine project with TF-IDF and the Google Universal Sentence Encoder model

    Language:Jupyter Notebook533224
  • infinilabs/coco-app

    🥥 Coco AI App - Search, Connect, Collaborate, AI-powered enterprise search, all in one space.

    Language:TypeScript467114
  • teilomillet/raggo

    A lightweight, production-ready RAG (Retrieval Augmented Generation) library in Go.

    Language:Go34112
  • deepsense-ai/ragbits

    Building blocks for rapid development of GenAI applications

    Language:Python2111385
  • jankovicsandras/plpgsql_bm25

    BM25 search implemented in PL/pgSQL

    Language:Jupyter Notebook18130
  • infinilabs/coco-server

    🥥 Coco AI Server - Search, Connect, Collaborate, AI-powered enterprise search, all in one space.

    Language:Go16794
  • easonlai/chatbot_with_pdf_streamlit

    This code example shows how to make a chatbot for semantic search over documents using Streamlit, LangChain, and various vector databases. The chatbot lets users ask questions and get answers from a document collection. The code is in Python and can be customized for different scenarios and data.

    Language:Jupyter Notebook15304
  • kyr0/clientside-search

    A highly efficient, isomorphic, full-featured, multilingual text search engine library, providing full-text search, fuzzy matching, phonetic scoring, document indexing and more, with micro JSON state hydration/dehydration in-browser and server-side.

    Language:TypeScript10210
  • jankovicsandras/bm25opt

    faster BM25 search algorithms in Python

    Language:Jupyter Notebook8200
  • lekt9/alBERT-launcher

    AI-powered file launcher and semantic search assistant. Like Spotlight/Alfred but with advanced AI capabilities for understanding context and meaning. Features local processing, privacy-first design, and seamless integration with your workflow.

    Language:TypeScript60
  • lethalbit/bookwurm

    dead simple document index and search, nothing fancy

    Language:Python620
  • bent10/boox

    Search anything, instantly

    Language:TypeScript5121
  • gsidhu/buzee-releases

    Public releases for Buzee

  • mdietrichstein/ir-search-engine-rust

    Rust-based text search engine from scratch supporting multiple document similarity metrics (TF-IDF, BM25, BM25VA)

    Language:Rust5200
  • opengento/magento2-document-search

    This module aims to make documents searchable for customers in Magento 2.

    Language:PHP3911
  • HarshKothari21/Natural-Language-Processing-Specialization

    NLP course by DeepLearning.AI, powered by @coursera. Taught by Younes Bensouda Mourri, Instructor of AI at Stanford University, and Łukasz Kaiser, Staff Research Scientist at Google Brain.

    Language:Jupyter Notebook2200
  • opengento/magento2-document-product-search

    This module aims to make documents searchable with product keywords in Magento 2.

    Language:PHP2901
  • Qyokizzzz/simhash

    The extended version of simhash supports fingerprint extraction of documents and images.

    Language:Python2100
  • AI-STACK-dev/Covid19-Comorbidities-NLP-WEB

    COVID-19 comorbidities analysis platform based on Natural Language Processing (NLP)

    Language:JavaScript1102
  • dileepgodithi/DistributedSearch

    Distributed document search using TF-IDF algorithm.

    Language:Java1100
  • EmirhanSyl/TheBSTSearchEngine

    Mini desktop search engine with Binary Search Tree

    Language:Java110
  • kunjankanani/Document_Query_Search

    Retrieval-Augmented Generation (RAG) enhances pre-trained large language models (LLMs) by integrating them with external data sources, combining the generative power of LLMs with the precision of specialized data search mechanisms.

    Language:Python1201
  • liviobisogni/solr-ocr-indexing

    Apache Solr Document Search and Indexing Analysis with OCR

    Language:Java1201
  • sanu0711/llama-index-and-openai

    An AI-powered solution for efficient document querying. It uses Llama Index for vector-based indexing and OpenAI's GPT to interpret natural language queries, providing accurate search results.

    Language:Jupyter Notebook1100
  • shreyansh-kothari/PDF-Querying-using-TF-IDF-from-Scratch

    Given a set of PDFs and a query, finds the most relevant PDF using TF-IDF. TF-IDF is implemented from scratch, without any library.

    Language:Python1201
  • tomlin7/AI-research-assistant

    Semantic document search system with pgvector and PGAI

    Language:Python1201
  • EricSchoebel/DocSpector

    Keyword finder for texts in the documents of a directory (for English, see README-en.md)

    Language:Python00
  • SamJoeSilvano/Multi-Source-Knowledge-Retrieval-System

    An end-to-end multi-source knowledge retrieval system using LangChain, FAISS, and OpenAI embeddings. This Retrieval-Augmented Generation (RAG) pipeline intelligently searches across Wikipedia, arXiv, and custom websites, optimizing source selection and delivering precise, real-time results based on query relevance.

    Language:Jupyter Notebook0100