edgarRd
Currently working on Data Infrastructure and Big Data for AI. In the past, I built Graph Databases for Knowledge Graphs.
@airbnb
San Francisco, CA
Pinned Repositories
airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
iceberg
Apache Iceberg
orc
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
spark
Apache Spark - A unified analytics engine for large-scale data processing
ClipperExperiments
Repository to modify and create experiments with the Clipper query re-writer.
gitignore
A collection of useful .gitignore templates
hanoitowers-piqle
Implementation of the Hanoi Towers using the Piqle Framework
hive
Mirror of Apache Hive
mlc-suite
Machine Learning Classification Suite - Implements several machine learning algorithms for classification and accepts input files in .arff format. Implemented in Python.
reair
ReAir is a collection of easy-to-use tools for replicating tables and partitions between Hive data warehouses.
edgarRd's Repositories
edgarRd/mlc-suite
Machine Learning Classification Suite - Implements several machine learning algorithms for classification and accepts input files in .arff format. Implemented in Python.
edgarRd/hanoitowers-piqle
Implementation of the Hanoi Towers using the Piqle Framework
edgarRd/ClipperExperiments
Repository to modify and create experiments with the Clipper query re-writer.
edgarRd/gitignore
A collection of useful .gitignore templates
edgarRd/hive
Mirror of Apache Hive
edgarRd/reair
ReAir is a collection of easy-to-use tools for replicating tables and partitions between Hive data warehouses.
edgarRd/dotfiles
My dotfiles collection
edgarRd/edgarrd.github.io
Personal Webpage
edgarRd/hadoop
Mirror of Apache Hadoop
edgarRd/hadoop-cluster-docker
Run Hadoop Cluster within Docker Containers
edgarRd/iceberg
Apache Iceberg
edgarRd/iceberg-docs
Apache Iceberg Documentation Site
edgarRd/incubator-airflow
Apache Airflow (Incubating)
edgarRd/incubator-tinkerpop
Mirror of Apache TinkerPop (Incubating)
edgarRd/orc
Mirror of Apache ORC
edgarRd/presto
Distributed SQL query engine for big data
edgarRd/prezto
The configuration framework for Zsh
edgarRd/protege-ontology-client
Provides client functionality for the Protege Desktop application to connect to an OWL Ontology Server.
edgarRd/protege-ontology-server
An OWL ontology server for OWL API programs, e.g., Protege Desktop.
edgarRd/s3committer
Hadoop output committers for S3
edgarRd/spark
Apache Spark
edgarRd/spark-netflix
Netflix branches of Apache Spark