data
There are 19,403 repositories under the data topic.
TanStack/query
🤖 Powerful asynchronous state management, server-state utilities and data fetching for the web. TS/JS, React Query, Solid Query, Svelte Query and Vue Query.
metabase/metabase
The simplest, fastest way to get business intelligence and analytics to everyone in your company 😋
run-llama/llama_index
LlamaIndex is a data framework for your LLM applications
SheetJS/sheetjs
📗 SheetJS Spreadsheet Data Toolkit -- New home https://git.sheetjs.com/SheetJS/sheetjs
vercel/swr
React Hooks for Data Fetching
DataExpert-io/data-engineer-handbook
This is a repo with links to everything you'd ever want to learn about data engineering
mendableai/firecrawl
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
fivethirtyeight/data
Data and code behind the articles and graphics at FiveThirtyEight
airbytehq/airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
prestodb/presto
The official home of the Presto distributed SQL query engine for big data
Sinaptik-AI/pandas-ai
Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
oxnr/awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
faker-js/faker
Generate massive amounts of fake data in the browser and Node.js
pwxcoo/chinese-xinhua
📙 Chinese Xinhua Dictionary database. Includes xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters.
apple/pkl
A configuration-as-code language with rich validation and tooling.
PRQL/prql
PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
akfamily/akshare
AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.
bchavez/Bogus
📇 A simple fake data generator for C#, F#, and VB.NET. Based on and ported from the famed faker.js.
rawgraphs/rawgraphs-app
A web interface to create custom vector-based visualizations on top of RAWGraphs core
mage-ai/mage-ai
🧙 Build, run, and manage data pipelines for integrating and transforming data.
mrdbourke/machine-learning-roadmap
A roadmap connecting many of the most important concepts in machine learning, how to learn them and what tools to use to perform them.
snowplow/snowplow
The leader in Next-Generation Customer Data Infrastructure
olifolkerd/tabulator
Interactive Tables and Data Grids for JavaScript
dformoso/machine-learning-mindmap
A mindmap summarising Machine Learning concepts, from Data Analysis to Deep Learning.
cloudquery/cloudquery
The open source high performance ELT framework powered by Apache Arrow
axa-group/Parsr
Transforms PDF, Documents and Images into Enriched Structured Data
flyteorg/flyte
Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
Countly/countly-server
Countly is a product analytics platform that helps teams track, analyze and act on their users' actions and behaviour on mobile, web and desktop applications.
airbnb/knowledge-repo
A next-generation curated knowledge sharing platform for data scientists and other technical professions.
cue-lang/cue
The home of the CUE language! Validate and define text-based and dynamic configuration
mdn/browser-compat-data
This repository contains compatibility data for Web technologies as displayed on MDN
superduper-io/superduper
Superduper: Build end-to-end AI applications and agent workflows on your existing data infrastructure and preferred tools - without migrating your data.
brianvoe/gofakeit
Random fake data generator written in Go
ckan/ckan
CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use data. It powers catalog.data.gov, open.canada.ca/data, data.humdata.org among many other sites.
lk-geimfari/mimesis
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.