data-transformation
There are 622 repositories under the data-transformation topic.
mahmoud/glom
☄️ Python's nested data operator (and CLI), for all your declarative restructuring needs. Got data? Glom it! ☄️
hi-primus/optimus
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
2ndQuadrant/pglogical
Logical Replication extension for PostgreSQL 17, 16, 15, 14, 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
zinggAI/zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
bruin-data/bruin
Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.
mattt/TransformerKit
A block-based API for NSValueTransformer, with a growing collection of useful examples.
raystack/optimus
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
SebKrantz/collapse
Advanced and Fast Data Transformation in R
microsoft/prose
Microsoft Program Synthesis using Examples SDK is a framework of technologies for the automatic generation of programs from input-output examples. This repo includes samples and sample data for the Microsoft Program Synthesis using Examples SDK.
ScriptFUSION/Porter
💄 Durable and asynchronous data imports for consuming data at scale and publishing testable SDKs.
dbohdan/sqawk
Like awk, but with SQL and table joins
jupyter-naas/naas
Low-code Python library to safely use notebooks in production: schedule workflows, generate assets, trigger webhooks, send notifications, build pipelines, manage secrets (Cloud-only)
fastverse/fastverse
An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation in R
feichao93/temme
📄 Concise selector to extract JSON from HTML.
mahmoudparsian/data-algorithms-with-spark
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
markus-wa/cq
Clojure Query: A Command-line Data Processor for JSON, YAML, EDN, XML and more
SETL-Framework/setl
A simple Spark-powered ETL framework that just works 🍺
simongray/clojure-dsl-resources
A curated list of Clojure resources for dealing with domain-specific languages.
mahmoudparsian/big-data-mapreduce-course
Big Data Modeling, MapReduce, Spark, PySpark @ Santa Clara University
strengejacke/sjmisc
Data transformation and utility functions for R
jim-schwoebel/allie
🤖 An automated machine learning framework for audio, text, image, video, or .CSV files (50+ featurizers and 15+ model trainers). Python 3.6 required.
ToucanToco/weaverbird
A visual data pipeline builder with various backends
data-integrations/wrangler
Wrangler Transform: A DMD system for transforming Big Data
galliaproject/gallia-core
A schema-aware Scala library for data transformation
aws-samples/aws-dbs-refarch-datalake
Reference Architectures for Datalakes on AWS
dry-rb/dry-transformer
Data transformation toolkit
devsgnr/breadroll
breadroll 🥟 is a simple, lightweight library for data processing operations, written in TypeScript and powered by Bun.
neurons-me/all.this
All.This is a modular framework for managing and standardizing data structures, enabling seamless interaction across the neurons.me ecosystem. It transforms objects like images, text, and audio into structured formats optimized for machine learning and deep learning applications.
bhrnjica/daany
Daany (.NET DAta ANalYtics): a .NET library implementing DataFrame, time series decomposition, and the BLAS and LAPACK linear algebra routines.
developerforce/DataWeaveInApex
Examples for working with DataWeave scripts from Apex.
assemblee-virtuelle/Semantic-Bus
object flow treatment, data transformation
scopashq/typestream
⚡️ Next-generation data transformation framework for TypeScript that puts developer experience first
nilportugues/php-serializer
Serialize PHP variables, including objects, in any format. Supports unserializing them too.
hopsoft/pipe_envy
Elixir style pipe operator for Ruby
cjdoris/Chevrons.jl
Your friendly >> chevron >> based syntax for piping data through multiple transformations.
alexocode/babel
Data transformations made easy