patduin

@HotelsDotCom @ExpediaGroup Krakow

Pinned Repositories

cascading
Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing flows locally or on a cluster.
Language: Java | 345 34 23221
circus-train
Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.
Language: Java | 86 19 6715
waggle-dance
Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
Language: Java | 268 22 10875
aws-glue-data-catalog-client-for-apache-hive-metastore
The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore-compatible metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational metadata for their data. AWS Glue provides out-of-the-box integration with Amazon EMR that enables customers to use the AWS Glue Data Catalog as an external Hive Metastore. This is an open-source implementation of the Apache Hive Metastore client on Amazon EMR clusters that uses the AWS Glue Data Catalog as an external Hive Metastore. It serves as a reference implementation for building a Hive Metastore-compatible client that connects to the AWS Glue Data Catalog. It may be ported to other Hive Metastore-compatible platforms, such as other Hadoop and Apache Spark distributions.
Language: Java | 0 0 00
cascading
Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing flows on a Hadoop cluster. See https://github.com/Cascading/cascading for the release repository.
Language: Java | 0 2 00
corc
An ORC File Scheme for the Cascading data processing platform.
Language: Java | 0 1 00
GameOfLife
Playing around and learning Scala.
Language: Scala | 1 1 00
jdeb
This library provides an Ant task and a Maven plugin to create Debian packages from Java builds in a truly cross-platform manner.
Language: Java | 0 1 00
maven-sandbox
0 1 00
plunger
A unit testing framework for the Cascading data processing platform.
Language: Java | 0 1 00

patduin's Repositories

patduin/GameOfLife
Playing around and learning Scala.
Language: Scala | 1 1 00
patduin/aws-glue-data-catalog-client-for-apache-hive-metastore
The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore-compatible metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational metadata for their data. AWS Glue provides out-of-the-box integration with Amazon EMR that enables customers to use the AWS Glue Data Catalog as an external Hive Metastore. This is an open-source implementation of the Apache Hive Metastore client on Amazon EMR clusters that uses the AWS Glue Data Catalog as an external Hive Metastore. It serves as a reference implementation for building a Hive Metastore-compatible client that connects to the AWS Glue Data Catalog. It may be ported to other Hive Metastore-compatible platforms, such as other Hadoop and Apache Spark distributions.
Language: Java | 0 0 00
patduin/cascading
Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing flows on a Hadoop cluster. See https://github.com/Cascading/cascading for the release repository.
Language: Java | 0 2 00
patduin/corc
An ORC File Scheme for the Cascading data processing platform.
Language: Java | 0 1 00
patduin/jdeb
This library provides an Ant task and a Maven plugin to create Debian packages from Java builds in a truly cross-platform manner.
Language: Java | 0 1 00
patduin/maven-sandbox
0 1 00
patduin/plunger
A unit testing framework for the Cascading data processing platform.
Language: Java | 0 1 00
