IdoKenan's Stars
unclecode/crawl4ai
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper
apple/foundationdb
FoundationDB - the open source, distributed, transactional key-value store
ceph/ceph
Ceph is a distributed object, block, and file storage platform
temporalio/temporal
Temporal service
mage-ai/mage-ai
🧙 Build, run, and manage data pipelines for integrating and transforming data.
apache/datafusion
Apache DataFusion SQL Query Engine
paradedb/paradedb
Postgres for Search and Analytics
lancedb/lancedb
Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!
internetarchive/openlibrary
One webpage for every book ever published!
poem-web/poem
A full-featured and easy-to-use web framework with the Rust programming language.
roapi/roapi
Create full-fledged APIs for slowly moving datasets without writing a single line of code.
apache/arrow-rs
Official Rust implementation of Apache Arrow
unitycatalog/unitycatalog
Open, Multi-modal Catalog for Data & AI
Eventual-Inc/Daft
Distributed data engine for Python/SQL designed for the cloud, powered by Rust
unum-cloud/usearch
Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍
huggingface/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
TobikoData/sqlmesh
Efficient data transformation and modeling framework that is backwards compatible with dbt.
apache/datafusion-ballista
Apache DataFusion Ballista Distributed Query Engine
uwdata/arquero
Query processing and transformation of array-backed data tables.
feldera/feldera
The Feldera Incremental Computation Engine
DropbaseHQ/dropbase
Dropbase helps developers build and prototype web apps faster with AI. Dropbase is local-first and self hosted.
apache/datafusion-comet
Apache DataFusion Comet Spark Accelerator
GlareDB/glaredb
GlareDB: An analytics DBMS for distributed data
mlcommons/croissant
Croissant is a high-level format for machine learning datasets that brings together four rich layers.
paradedb/pg_analytics
DuckDB-powered data lake analytics from Postgres
matrix-profile-foundation/matrixprofile
A Python 3 library making time series data mining tasks, utilizing matrix profile algorithms, accessible to everyone.
simonw/llm-cmd
Use LLM to generate and execute commands in your shell
ryrobes/rvbbit
Reactive Data Board & Visual Flow Platform
eto-ai/rikai
Parquet-based ML data format optimized for working with unstructured data
synnada-ai/mithril
Mithril: A Modular Machine Learning Library for Model Composability