Pinned Repositories
clusterless
Clusterless is a tool for scheduling decentralized, scalable, and secure data pipelines for continuously arriving data, across clouds.
subpop
A CLI for diffing datasets
tessellate
A data engineering CLI for reading and writing data to/from multiple locations across multiple formats.
bash-emr
Simple bash functions for manipulating Amazon Elastic MapReduce clusters
cascading
Cascading is a feature-rich API for defining and executing complex and fault-tolerant data processing flows locally or on a cluster.
cascading.hbase
HBase adapters for Cascading
cascading.samples
Sample applications using Cascading
riffle
Annotations and Classes for managing and executing dependent processes
mini-parsers
Small simple parsers for data cleansing or command line argument parsing
pointer-path
A declarative API for batch processing schema-less nested data types like JSON
cwensel's Repositories
cwensel/cascading
Cascading is a feature-rich API for defining and executing complex and fault-tolerant data processing flows locally or on a cluster.
cwensel/bash-emr
Simple bash functions for manipulating Amazon Elastic MapReduce clusters
cwensel/riffle
Annotations and Classes for managing and executing dependent processes
cwensel/notebook
Random notes on distributed computing and stuff.
cwensel/cascading-local
Now incorporated into Cascading 4.x
cwensel/cascading-regression
cwensel/mapdb-jcache
JCache (jsr107) provider for MapDB database engine
cwensel/bilberry
An Elasticsearch Gradle plugin for integration tests with Elasticsearch
cwensel/builder-generator-plus-v2
Builder generator plus v2 (public).
cwensel/cascading-avro
Cascading Scheme for the Apache Avro data serialization format
cwensel/cascading-docs
cwensel/cwensel.github.io
cwensel/docbook2asciidoc
XSL for transforming DocBook to AsciiDoc
cwensel/elasticsearch-hadoop
A fork that restores Cascading support
cwensel/elasticsearch-index-cloner
Simple Java tool to transfer/copy/clone an Elasticsearch index to a different cluster using the REST endpoints, migrating settings and mappings as well
cwensel/Flapi
Flapi is an API generator for Java, which generates chained APIs for improved fluency in your code.
cwensel/gradle-dplink-plugin
cwensel/jgrapht
Master repository for the JGraphT project
cwensel/jreleaser
Release projects quickly and easily with JReleaser
cwensel/kafka-unit
cwensel/linq4j
A port of LINQ (Language-Integrated Query) to Java
cwensel/macrobase
MacroBase: A Search Engine for Fast Data
cwensel/opencensus-java
A stats collection and distributed tracing framework
cwensel/parboiled
Elegant parsing in Java and Scala - lightweight, easy-to-use, powerful.
cwensel/pointer-path
cwensel/sagemaker-workshop-420
MLinProduction SageMaker workshop hosted in April 2020
cwensel/scalding
A Scala API for Cascading
cwensel/sqlline
Shell for issuing SQL to relational databases via JDBC
cwensel/vagrant-cascading-hadoop-cluster
Deploying apache-hadoop in a virtualized cluster as easy as 1-2-3.
cwensel/wardley-omnigraffle
Wardley Mapping stencils for OmniGraffle software