data
There are 19,453 repositories under the data topic.
datasets
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
glide-data-grid
🚀 Glide Data Grid is a no-compromise, outrageously fast React data grid with rich rendering, first-class accessibility, and full TypeScript support.
bad-data-guide
An exhaustive reference to problems seen in real-world data along with suggestions on how to resolve them.
gray-matter
Smarter YAML front matter parser, used by metalsmith, Gatsby, Netlify, Assemble, mapbox-gl, phenomic, vuejs vitepress, TinaCMS, Shopify Polaris, Ant Design, Astro, hashicorp, garden, slidev, saber, sourcegraph, and many others. Simple to use, and battle tested. Parses YAML by default but can also parse JSON Front Matter, Coffee Front Matter, TOML Front Matter, and has support for custom parsers. Please follow gray-matter's author: https://github.com/jonschlinkert
tinybase
The reactive data store for local‑first apps.
arroyo
Distributed stream processing engine in Rust
data-transfer-project
The Data Transfer Project makes it easy for platforms to build interoperable user data portability features. We are establishing a common framework, including data models and protocols, to enable direct transfer of data both into and out of participating online service providers.
react-refetch
A simple, declarative, and composable way to fetch data for React components
cognita
RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry
awesome-json-datasets
A curated list of awesome JSON datasets that don't require authentication.
TextRecognitionDataGenerator
A synthetic data generator for text recognition
docta
A Doctor for your data
memphis
Memphis.dev is a highly scalable and effortless data streaming platform
falso
All the Fake Data for All Your Real Needs 🙂
quadratic
Quadratic | Spreadsheet with Python, SQL, and AI
aresdb
A GPU-powered real-time analytics storage and query engine.
weld
High-performance runtime for data analytics applications
pandas-datareader
Extract data from a wide range of Internet sources into a pandas DataFrame.
data-diff
Compare tables within or across databases
stats
A well-tested and comprehensive Golang statistics library with no dependencies.
dlt
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
incubator-devlake
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.
scio
A Scala API for Apache Beam and Google Cloud Dataflow.
gopup
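Data interfaces: Baidu, Google, Toutiao, and Weibo indices; macroeconomic data; interest rate data; currency exchange rates; Qianlima and unicorn companies; Xinwen Lianbo transcripts; film and TV box office data; university lists; epidemic data…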
pypika
PyPika is a Python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.
graphic-walker
An open source alternative to Tableau. Embeddable visual analytics.
datasets
🎁 5,400,000+ Unsplash images made available for research and machine learning
PyFunctional
Python library for creating data pipelines with chain functional programming
DeepBI
LLM based data scientist, AI native data application. AI-driven infinite thinking redefines BI.
mito
The mitosheet package, trymito.io, and other public Mito code.
fake2db
Create custom test databases that are populated with fake data.
sketch
AI code-writing assistant that understands data content
TigerBot
TigerBot: A multi-language multi-task LLM
gobblin
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.
generatedata
A powerful, feature-rich, random test data generator.
ISO-3166-Countries-with-Regional-Codes
ISO 3166-1 country lists merged with their UN Geoscheme regional codes in ready-to-use JSON, XML, CSV data sets