TWITTER-FLOW
This repo houses the Equity Sim Twitter analytics pipeline, written in Java for GCP Dataflow.
Usage
Set up a Maven project with the com.equitysim package
mvn archetype:generate \
-DarchetypeGroupId=org.apache.beam \
-DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
-DarchetypeVersion=2.1.0 \
-DgroupId=com.equitysim \
-DartifactId=twitter_flow \
-Dversion="0.1" \
-Dpackage=com.equitysim \
-DinteractiveMode=false
Build and run twitter_flow pipeline locally
mvn compile exec:java \
-Dexec.mainClass=com.equitysim.TwitterFlowPipeline \
-Dexec.args="--output=./output/"
Build and run twitter_flow pipeline on GCP Dataflow
mvn compile exec:java \
-Dexec.mainClass=com.equitysim.TwitterFlowPipeline \
-Dexec.args="\
--project=${PROJECT_ID} \
--stagingLocation=gs://${BUCKET_NAME}/staging \
--runner=DataflowRunner \
--output=gs://${BUCKET_NAME}/output "
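The Dataflow command above expands ${PROJECT_ID} and ${BUCKET_NAME} from the shell environment, so an unset variable would yield paths such as gs:///staging without any warning from the shell. A minimal sketch of a guard follows. The function name build_dataflow_args is hypothetical and not part of this repo; it only assembles the same four arguments shown above.

```shell
# Hypothetical helper (not part of this repo): print the value for -Dexec.args,
# or fail if either required environment variable is unset or empty.
build_dataflow_args() {
  if [ -z "${PROJECT_ID:-}" ] || [ -z "${BUCKET_NAME:-}" ]; then
    echo "PROJECT_ID and BUCKET_NAME must both be set" >&2
    return 1
  fi
  # Emit the four pipeline options on one line, separated by single spaces.
  printf '%s %s %s %s\n' \
    "--project=${PROJECT_ID}" \
    "--stagingLocation=gs://${BUCKET_NAME}/staging" \
    "--runner=DataflowRunner" \
    "--output=gs://${BUCKET_NAME}/output"
}

# Possible usage, assuming the function is defined in the current shell:
#   args="$(build_dataflow_args)" || exit 1
#   mvn compile exec:java \
#     -Dexec.mainClass=com.equitysim.TwitterFlowPipeline \
#     -Dexec.args="${args}"
```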
Build and run twitter_flow pipeline on GCP Dataflow using the pipeline.sh script
./pipeline.sh chc-admin twitter-flow fintech-tweets run
List Dataflow jobs by status (use active, terminated, or all)
gcloud beta dataflow jobs list --status=active
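The --status flag of the jobs list command takes one of active, terminated, or all. A small sketch of a wrapper that rejects any other value before gcloud is called. The name dataflow_jobs_cmd is hypothetical, and the function only prints the command rather than running it.

```shell
# Hypothetical helper (not part of this repo): print the jobs-list command for
# one status value, rejecting anything other than active, terminated, or all.
dataflow_jobs_cmd() {
  case "${1:-}" in
    active|terminated|all)
      echo "gcloud beta dataflow jobs list --status=$1"
      ;;
    *)
      echo "usage: dataflow_jobs_cmd active|terminated|all" >&2
      return 1
      ;;
  esac
}
```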
https://drive.google.com/file/d/1oFADb-ePYLYWIBxGC3eakWABBrTtZohv/view?usp=sharing