document-search

There are 38 repositories under the document-search topic.

  • neuml/paperai

    📄 🤖 Semantic search and workflows for medical/scientific papers

    Language: Python
  • redis-developer/redis-arXiv-search

    Vector search demo with the arXiv paper dataset, RedisVL, HuggingFace, OpenAI, Cohere, FastAPI, React, and Redis.

    Language: Python
  • robindekoster/chatgpt-custom-knowledge-chatbot

    This open source chatbot project lets you create a chatbot that uses your own data to answer questions, thanks to the power of the OpenAI GPT-3.5 model.

    Language: Python
  • kcubeterm/achoz

    Search through all your personal data as efficiently as you would search the web.

    Language: Python
  • capjamesg/jamesql

    An in-memory NoSQL database implemented in Python.

    Language: Python
  • poloclub/mememo

    A JavaScript library that brings vector search and RAG to your browser!

    Language: TypeScript
  • daac-tools/find-simdoc

    Finding all pairs of similar documents time- and memory-efficiently

    Language: Rust
  • neuml/cord19q

    COVID-19 Open Research Dataset (CORD-19) Analysis

    Language: Python
  • zayedrais/DocumentSearchEngine

    Document search engine project using TF-IDF and the Google Universal Sentence Encoder model

    Language: Jupyter Notebook
  • infinilabs/coco-app

    🥥 Coco AI App - search, connect, collaborate, AI-powered enterprise search, all in one place.

    Language: TypeScript
  • easonlai/chatbot_with_pdf_streamlit

    This code example shows how to make a chatbot for semantic search over documents using Streamlit, LangChain, and various vector databases. The chatbot lets users ask questions and get answers from a document collection. The code is in Python and can be customized for different scenarios and data.

    Language: Jupyter Notebook
  • teilomillet/raggo

    A lightweight, production-ready RAG (Retrieval Augmented Generation) library in Go.

    Language: Go
  • deepsense-ai/ragbits

    Building blocks for rapid development of GenAI applications

    Language: Python
  • jankovicsandras/plpgsql_bm25

    BM25 search implemented in PL/pgSQL

    Language: Jupyter Notebook
  • infinilabs/coco-server

    🥥 Coco AI Server - search, connect, collaborate, AI-powered enterprise search, all in one place.

    Language: Go
  • kyr0/clientside-search

    A highly efficient, isomorphic, full-featured, multilingual text search engine library, providing full-text search, fuzzy matching, phonetic scoring, document indexing and more, with micro JSON state hydration/dehydration in-browser and server-side.

    Language: TypeScript
  • jankovicsandras/bm25opt

    faster BM25 search algorithms in Python

    Language: Jupyter Notebook
  • lethalbit/bookwurm

    dead simple document index and search, nothing fancy

    Language: Python
  • mdietrichstein/ir-search-engine-rust

    Rust-based text search engine from scratch supporting multiple document similarity metrics (TF-IDF, BM25, BM25VA)

    Language: Rust
  • bent10/boox

    Search anything, instantly

    Language: TypeScript
  • opengento/magento2-document-search

    This module aims to make documents searchable for customers in Magento 2.

    Language: PHP
  • HarshKothari21/Natural-Language-Processing-Specialization

    NLP course by DeepLearning.AI, powered by @coursera. Taught by Younes Bensouda Mourri, Instructor of AI at Stanford University, and Łukasz Kaiser, Staff Research Scientist at Google Brain.

    Language: Jupyter Notebook
  • opengento/magento2-document-product-search

    This module aims to make documents searchable with product keywords in Magento 2.

    Language: PHP
  • Qyokizzzz/simhash

    The extended version of simhash supports fingerprint extraction of documents and images.

    Language: Python
  • AI-STACK-dev/Covid19-Comorbidities-NLP-WEB

    COVID-19 comorbidities analysis platform based on Natural Language Processing (NLP)

    Language: JavaScript
  • dileepgodithi/DistributedSearch

    Distributed document search using the TF-IDF algorithm.

    Language: Java
  • EmirhanSyl/TheBSTSearchEngine

    Mini desktop search engine built on a binary search tree

    Language: Java
  • kunjankanani/Document_Query_Search

    Retrieval-Augmented Generation (RAG) enhances pre-trained large language models (LLMs) by integrating them with external data sources. The technique combines the generative power of LLMs with the precision of specialized data search mechanisms.

    Language: Python
  • liviobisogni/solr-ocr-indexing

    Apache Solr Document Search and Indexing Analysis with OCR

    Language: Java
  • sanu0711/llama-index-and-openai

    An AI-powered solution for efficient document querying. It uses Llama Index for vector-based indexing and OpenAI's GPT to interpret natural language queries, providing accurate search results.

    Language: Jupyter Notebook
  • shreyansh-kothari/PDF-Querying-using-TF-IDF-from-Scratch

    Given a set of PDFs and a query, finds the most relevant PDF using TF-IDF. TF-IDF is implemented from scratch, without any library.

    Language: Python
  • EricSchoebel/DocSpector

    Keyword finder for text in the documents of a directory (for English, see README-en.md)

    Language: Python
  • krisluczka/OSSE

    Open Source Search Engine with built-in web/document crawler and an indexing method.

    Language: C++
  • gsidhu/buzee-releases

    Public releases for Buzee

  • SamJoeSilvano/Multi-Source-Knowledge-Retrieval-System

    An end-to-end multi-source knowledge retrieval system using LangChain, FAISS, and OpenAI embeddings. This Retrieval-Augmented Generation (RAG) pipeline intelligently searches across Wikipedia, arXiv, and custom websites, optimizing source selection and delivering precise, real-time results based on query relevance.

    Language: Jupyter Notebook
  • tomlin7/AI-research-assistant

    Semantic document search system with pgvector and PGAI

    Language: Python