/datapipeline

Data pipeline

Primary LanguageJava

Data pipeline using Apache Beam, BigQuery and Datalab

Overview

  • Batch Data source(Database,s3, HDFS)------> DataFlow ----> ETL ------> (Data sink eg BigQuery, Database, GCS,s3)
  • Streaming Data sources(Mobile, Web etc) ------> PubSub ------> DataFlow ----> ETL ---> (Data sink eg BigQuery, Database, GCS,s3)

Prerequisites

  • Google Project with Billing