data-transformation
There are 731 repositories under the data-transformation topic.
mahmoud/glom
☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️
hi-primus/optimus
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
bruin-data/bruin
Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.
2ndQuadrant/pglogical
Logical Replication extension for PostgreSQL 17, 16, 15, 14, 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
mattt/TransformerKit
A block-based API for NSValueTransformer, with a growing collection of useful examples.
raystack/optimus
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
SebKrantz/collapse
Advanced and Fast Data Transformation in R
microsoft/prose
Microsoft Program Synthesis using Examples SDK is a framework of technologies for the automatic generation of programs from input-output examples. This repo includes samples and sample data for the Microsoft Program Synthesis using Examples SDK.
ScriptFUSION/Porter
💄 Durable and asynchronous data imports for consuming data at scale and publishing testable SDKs.
weAIDB/awesome-data-llm
Official Repository of "LLM × DATA" Survey Paper
dbohdan/sqawk
Like awk, but with SQL and table joins
fastverse/fastverse
An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation in R
feichao93/temme
📄 Concise selector to extract JSON from HTML.
mahmoudparsian/data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
simongray/clojure-dsl-resources
A curated list of Clojure resources for dealing with domain-specific languages.
markus-wa/cq
Clojure Query: A Command-line Data Processor for JSON, YAML, EDN, XML and more
SETL-Framework/setl
A simple Spark-powered ETL framework that just works 🍺
mahmoudparsian/big-data-mapreduce-course
Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University
strengejacke/sjmisc
Data transformation and utility functions for R
jim-schwoebel/allie
🤖 An automated machine learning framework for audio, text, image, video, or .CSV files (50+ featurizers and 15+ model trainers). Python 3.6 required.
data-integrations/wrangler
Wrangler Transform: A DMD system for transforming Big Data
ToucanToco/weaverbird
A visual data pipeline builder with various backends
galliaproject/gallia-core
A schema-aware Scala library for data transformation
aws-samples/aws-dbs-refarch-datalake
Reference Architectures for Datalakes on AWS
dry-rb/dry-transformer
Data transformation toolkit
devsgnr/breadroll
breadroll 🥟 is a simple, lightweight library for data processing operations, written in TypeScript and powered by Bun.
neurons-me/all.this
All.This is a modular framework for managing and standardizing data structures, enabling seamless interaction across the neurons.me ecosystem. It transforms objects like images, text, and audio into structured formats optimized for machine learning and deep learning applications.
bhrnjica/daany
Daany - .NET DAta ANalYtics library implementing DataFrame, time series decompositions, and the BLAS and LAPACK linear algebra routines.
developerforce/DataWeaveInApex
Examples for working with DataWeave scripts from Apex.
assemblee-virtuelle/Semantic-Bus
object flow treatment, data transformation
scopashq/typestream
⚡️ Next-generation data transformation framework for TypeScript that puts developer experience first
nilportugues/php-serializer
Serialize PHP variables, including objects, in any format. Supports unserializing them too.
hopsoft/pipe_envy
Elixir style pipe operator for Ruby
cjdoris/Chevrons.jl
Your friendly >> chevron >> based syntax for piping data through multiple transformations.
nicosuave/awesome-sqlmesh
A curated list of awesome SQLMesh resources
bloomberg/pycsvw
A tool to read CSV files with CSVW metadata and transform them into other formats.