AurelienWa's Stars
CartoDB/analytics-toolbox-core
A set of UDFs and Procedures to extend BigQuery, Snowflake, Redshift, Postgres and Databricks with Spatial Analytics capabilities
rominirani/Google-Cloud-Shell-Tutorial-Creation
A Tutorial on how to create Google Cloud Shell tutorials
GoogleCloudPlatform/training-data-analyst
Labs and demos for courses for GCP Training (http://cloud.google.com/training).
GoogleCloudPlatform/kubernetes-bigquery-python
Example Kubernetes app that shows how to build a 'pipeline' to stream data into BigQuery. Uses Redis or Google Cloud PubSub.
alexvanboxel/airflow-gcp-examples
Repository with examples and smoke tests for the GCP Airflow operators and hooks
googleapis/google-cloud-java
Google Cloud Client Library for Java
spotify/scio
A Scala API for Apache Beam and Google Cloud Dataflow.
spotify/luigi
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, etc. It also comes with Hadoop support built in.
spotify/spydra
Ephemeral Hadoop clusters using Google Compute Platform
databricks/learning-spark
Example code from Learning Spark book
apache/hive
Apache Hive
apache/sqoop
Mirror of Apache Sqoop
apache/orc
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
apache/avro
Apache Avro is a data serialization system.
apache/zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
apache/nifi
Apache NiFi
apache/hadoop
Apache Hadoop
apache/kafka
Mirror of Apache Kafka
apache/spark
Apache Spark - A unified analytics engine for large-scale data processing