kylepierce

Big fan of data

kylepierce's Stars

Velir/dbt-ga4
dbt Package for modeling raw data exported by Google Analytics 4. BigQuery support, only.
Language: SQL, 289 stars, 128 forks
googleapis/google-cloud-python
Google Cloud Client Library for Python
Language: Python, 4.7k stars, 1.5k forks
pathwaycom/llm-app
LLM App templates for Dynamic RAG. Ready to run with Docker,⚡in sync with your data sources.
3.4k stars, 191 forks
mage-ai/mage-ai
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Language: Python, 7.4k stars, 672 forks
LewisCharlesBaker/droughty
Droughty helps keep your workflow dry
Language:Python595
holistics/dbml
Database Markup Language (DBML), designed to define and document database structures
Language: JavaScript, 2.5k stars, 163 forks
fraibacas/prefect-orion
Language:Shell7220
trinodb/trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Language: Java, 9.8k stars, 2.8k forks
httpie/cli
🥧 HTTPie CLI — modern, user-friendly command-line HTTP client for the API era. JSON support, colors, sessions, downloads, plugins & more.
Language: Python, 32.7k stars, 3.7k forks
mermaid-js/mermaid
Generation of diagrams like flowcharts or sequence diagrams from text in a similar manner as markdown
Language: JavaScript, 68.6k stars, 6.1k forks
dgilland/pydash
The kitchen sink of Python utility libraries for doing "stuff" in a functional way. Based on the Lo-Dash Javascript library.
Language: Python, 1.3k stars, 87 forks
capitalone/DataProfiler
What's in your data? Extract schema, statistics and entities from datasets
Language: Python, 1.4k stars, 157 forks
macbre/sql-metadata
Uses tokenized query returned by python-sqlparse and generates query metadata
Language: Python, 761 stars, 120 forks
airbytehq/airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Language: Python, 14.7k stars, 3.8k forks
alex/nyt-2020-election-scraper
Language: HTML, 1.8k stars, 289 forks
localstack/localstack
💻 A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline
Language: Python, 52.9k stars, 3.8k forks
datastacktv/data-engineer-roadmap
Roadmap to becoming a data engineer in 2021
12.2k stars, 1.3k forks
nteract/papermill
📚 Parameterize, execute, and analyze notebooks
Language: Python, 5.7k stars, 420 forks
amundsen-io/amundsen
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
Language: Python, 4.3k stars, 948 forks
simdjson/simdjson
Parsing gigabytes of JSON per second : used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
Language: C++, 18.8k stars, 985 forks
spotify/luigi
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
Language: Python, 17.5k stars, 2.4k forks
jbesomi/texthero
Text preprocessing, representation and visualization from zero to hero.
Language: Python, 2.9k stars, 240 forks
cube-js/cube
📊 Cube — The Semantic Layer for Building Data Applications
Language: Rust, 17.5k stars, 1.7k forks
philipperemy/name-dataset
The Python library for names.
Language: Python, 800 stars, 151 forks
ricklamers/gridstudio
Grid studio is a web-based application for data science with full integration of open source data science frameworks and languages.
Language: JavaScript, 8.9k stars, 1.5k forks
pdpipe/pdpipe
Easy pipelines for pandas DataFrames.
Language:Jupyter Notebook71445
MassMove/AttackVectors
A repository to monitor attack vectors from state-backed information operations
Language:HTML39137
python-streamz/streamz
Real-time stream processing for python
Language: Python, 1.2k stars, 145 forks
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Language: Python, 15.3k stars, 1.5k forks
dagster-io/dagster
An orchestration platform for the development, production, and observation of data assets.
Language: Python, 10.7k stars, 1.3k forks
