Pinned Repositories
algebird
Abstract Algebra for Scala
ambrose
A platform for visualization and real-time monitoring of data workflows
aniket486.github.com
GitHub Pages site
archaius
Library for configuration management API
arrow
Mirror of Apache Arrow
avro
Apache Avro
ksql
KSQL - a Streaming SQL Engine for Apache Kafka
mapd-core
The MapD Core database
pig
Mirror of Apache Pig
StarCluster
StarCluster is a utility for creating and managing computing clusters hosted on Amazon's Elastic Compute Cloud (EC2).
aniket486's Repositories
aniket486/avro
Apache Avro
aniket486/Big-Data-Benchmark-for-Big-Bench
Big Bench Workload Development
aniket486/bigdata-interop
Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.
aniket486/cloud-sql-jdbc-socket-factory
aniket486/datafu
Mirror of Apache DataFu
aniket486/datahub
A Generalized Metadata Search & Discovery Tool
aniket486/dataproc-initialization-actions
Scripts that run on all nodes of your cluster before the cluster starts, letting you customize the cluster
aniket486/dataprocspawner
aniket486/flink
Apache Flink
aniket486/hadoop
Mirror of Apache Hadoop
aniket486/hive
Mirror of Apache Hive
aniket486/hive-bigquery-storage-handler
Hive Storage Handler for interoperability between BigQuery and Apache Hive
aniket486/hive-testbench
aniket486/iceberg
Apache Iceberg
aniket486/incubator-druid
Apache Druid (Incubating) - Column oriented distributed data store ideal for powering interactive applications
aniket486/inverting-proxy
Reverse proxy that inverts the direction of traffic
aniket486/metabase
The simplest, fastest way to get business intelligence and analytics to everyone in your company
aniket486/metacat
aniket486/OpenLineage
An Open Standard for lineage metadata collection
aniket486/parquet-mr
Apache Parquet
aniket486/presto
Distributed SQL query engine for big data
aniket486/professional-services
Common solutions and tools developed by Google Cloud's Professional Services team
aniket486/qUtils
Utility code for miscellaneous tasks
aniket486/RemoteShuffleService
aniket486/spark-1
Apache Spark - A unified analytics engine for large-scale data processing
aniket486/spark-bigquery-connector
The connector uses the Spark SQL Data Source API to read data from Google BigQuery.
aniket486/spark-on-k8s-operator
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
aniket486/spydra
Ephemeral Hadoop clusters using Google Compute Platform
aniket486/unitycatalog
Open, Multi-modal Catalog for Data & AI
aniket486/velox
A new C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.