wuyunfeng's Stars
facebookresearch/faiss
A library for efficient similarity search and clustering of dense vectors.
prestodb/presto
The official home of the Presto distributed SQL query engine for big data
ceph/ceph
Ceph is a distributed object, block, and file storage platform
StarRocks/starrocks
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
datafuselabs/databend
Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
delta-io/delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
greenplum-db/gpdb
Greenplum Database - Massively Parallel PostgreSQL for Analytics. An open-source massively parallel data platform for analytics, machine learning and AI.
vespa-engine/vespa
AI + Data, online. https://vespa.ai
apache/hudi
Upserts, Deletes And Incremental Processing on Big Data.
zendesk/maxwell
Maxwell's daemon, a mysql-to-json kafka producer
RoaringBitmap/RoaringBitmap
A better compressed bitset in Java: used by Apache Spark, Netflix Atlas, Apache Pinot, Tablesaw, and many others
nmslib/nmslib
Non-Metric Space Library (NMSLIB): An efficient similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.
facebookincubator/velox
A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
robcowart/elastiflow
Network flow analytics (Netflow, sFlow and IPFIX) with the Elastic Stack
looly/elasticsearch-definitive-guide-cn
Elasticsearch: The Definitive Guide, Chinese edition
RoaringBitmap/CRoaring
Roaring bitmaps in C (and C++), with SIMD (AVX2, AVX-512 and NEON) optimizations: used by Apache Doris, ClickHouse, and StarRocks
pingcap/tispark
TiSpark is built for running Apache Spark on top of TiDB/TiKV
lemire/FastPFor
The FastPFOR C++ library: Fast integer compression
opendistro-for-elasticsearch/sql
Open Distro SQL Plugin
lemire/JavaFastPFOR
A simple integer compression library in Java
guotong1988/chinese_dictionary
Synonym list, antonym list, negation word list
lior-k/fast-elasticsearch-vector-scoring
Score documents using embedding-vectors dot-product or cosine-similarity with ES Lucene engine
bells/elasticsearch-analysis-dynamic-synonym
The dynamic synonym plugin adds a synonym token filter that reloads the synonym file (local or remote) at given intervals (default 60s).
mikemccand/luceneutil
Various utility scripts for running Lucene performance tests
morfologik/morfologik-stemming
Tools for finite state automata construction and dictionary-based morphological dictionaries. Includes Polish stemming dictionary.
yonik/java_util
Small useful things for Java
wheresvic/zsearch
A high performance search engine
opensearch-project/cross-cluster-replication
Synchronize your data across multiple clusters for lower latencies and higher availability
opendistro-for-elasticsearch/asynchronous-search
Asynchronous search makes it possible for users to run queries in the background, allowing users to track the progress, and retrieve partial results as they become available.
moshebla/solr-vector-scoring
Vector Plugin for Solr: calculate dot product / cosine similarity on documents