Telemetry Ingestion on Google Cloud Platform
A monorepo for documentation and implementation of the Mozilla telemetry ingestion system deployed to Google Cloud Platform (GCP).
There are currently four components:
- ingestion-edge: a simple Python service that accepts HTTP messages and delivers them to Google Cloud Pub/Sub
- ingestion-beam: a Java module defining Apache Beam jobs for streaming and batch transformations of ingested messages
- ingestion-sink: a Java application that runs in Kubernetes, reading input from Google Cloud Pub/Sub and emitting records to outputs like GCS or BigQuery
- ingestion-core: a Java module for code shared between ingestion-beam and ingestion-sink
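As a rough mental model of how messages move between the components listed above, here is a toy sketch in plain Java. It is not code from this repository: the class `IngestionFlowSketch` and its methods are hypothetical, an in-memory queue stands in for a Pub/Sub topic, a list stands in for an output such as GCS or BigQuery, and the transformations done by ingestion-beam are not modeled at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy model of the message flow; none of these names come from the repository.
public class IngestionFlowSketch {

  // Hypothetical stand-in for a Pub/Sub topic: an in-memory FIFO queue.
  private final BlockingQueue<String> topic = new LinkedBlockingQueue<>();

  // Edge role: accept a message body and hand it to the queue unchanged.
  public void edgeAccept(String body) {
    topic.add(body);
  }

  // Sink role: drain everything currently queued into an output list,
  // which stands in for a destination such as GCS or BigQuery.
  public List<String> sinkDrain() {
    List<String> output = new ArrayList<>();
    topic.drainTo(output);
    return output;
  }

  public static void main(String[] args) {
    IngestionFlowSketch flow = new IngestionFlowSketch();
    flow.edgeAccept("{\"ping\": 1}");
    flow.edgeAccept("{\"ping\": 2}");
    System.out.println(flow.sinkDrain());
  }
}
```

The point of the queue in the middle is that the accepting side and the writing side never call each other directly, which is the decoupling the real services get from Pub/Sub.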
For more information, see the documentation.
Java 11 support is a work in progress for the Beam Java SDK, so this project requires
Java 8. Maven has been configured to compile for Java 8 when using newer versions of the
JDK, but support is only guaranteed for JDK 8.
To manage multiple local JDKs, consider jenv and the jenv enable-plugin maven command.
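When switching between JDKs, it can help to confirm which one is actually running. The following is a minimal sketch, not part of this project: `JavaVersionCheck` and `isJava8` are hypothetical names, and the check relies only on the standard `java.specification.version` system property.

```java
// Hypothetical helper, not part of this repository.
public class JavaVersionCheck {

  // Java 8 reports its specification version as "1.8"; Java 9 and later
  // report a plain major number such as "11".
  public static boolean isJava8(String specVersion) {
    return "1.8".equals(specVersion);
  }

  public static void main(String[] args) {
    String version = System.getProperty("java.specification.version");
    if (isJava8(version)) {
      System.out.println("Running on Java 8 (" + version + ")");
    } else {
      System.out.println(
          "Running on Java " + version + "; only JDK 8 is guaranteed to work");
    }
  }
}
```

Compiling and running this class with the `java` found on the current path shows whether the shell has picked up the intended JDK.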