Repo for learning about Apache Flink
The DevOps setup is prepared for Kubernetes.
Pre-requisites:
- minikube (local development)
- Create Flink cluster
kubectl apply -f devops
Note:
There is only one instance of job-manager since it's assumed that k8s will restart it automatically.
However, after a restart of the job-manager, information about jobs is lost and must be restored manually.
So if failure recovery should be supported for the job-manager, a cluster of job-managers must be created using ZooKeeper.
Task managers can be auto-scaled and will be managed by the job-manager.
More details can be found in the presentation.
- Run k8s proxy
kubectl proxy
- Enter the UI
or check the URL of the job-manager service:
kubectl get service flink-jobmanager
or, for minikube running locally:
minikube service --url flink-jobmanager
Pre-requisites:
- Scala
- Sbt
To run:
sbt run
or to build:
sbt package
You can run socket-based samples as follows:
- Open a socket
nc -l 9000
- Start the job (a minimal sketch of such a job is shown below)
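For reference, a socket-based streaming job of this kind might look roughly like the sketch below. This is a minimal, illustrative example and not necessarily the repository's actual sample; the object name, host, and job name are assumptions, and the port matches the nc command above.

import org.apache.flink.streaming.api.scala._

// Minimal sketch of a socket-based streaming word count (illustrative only).
object SocketWordCount {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Read lines from the socket opened with `nc -l 9000`
    val text = env.socketTextStream("localhost", 9000)

    val counts = text
      .flatMap(_.toLowerCase.split("\\W+"))
      .filter(_.nonEmpty)
      .map((_, 1))
      .keyBy(_._1) // key by the word
      .sum(1)      // sum the per-word counts

    counts.print()
    env.execute("Socket word count")
  }
}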
You can run Kafka-based samples as follows (see the sketch after the commands below):
docker-compose up
Other commands:
kafka-avro-console-consumer --topic song-feed --bootstrap-server localhost:9092 --from-beginning
kafka-topics --list --zookeeper localhost:32181
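For context, a Kafka-based job in this setup might look roughly like the sketch below. The connector class name depends on the Flink Kafka connector version (older releases use e.g. FlinkKafkaConsumer010), and the consumer group is an assumption. Note that the song-feed topic above is read with kafka-avro-console-consumer, so a real job would use an Avro deserialization schema rather than plain strings.

import java.util.Properties

import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer

// Minimal sketch of a Kafka-based streaming job (illustrative only).
object KafkaSample {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.setProperty("bootstrap.servers", "localhost:9092")
    props.setProperty("group.id", "flink-samples") // assumed consumer group

    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Reads raw string records; the real song-feed topic is Avro-encoded,
    // so a proper DeserializationSchema would be needed there.
    val stream = env.addSource(
      new FlinkKafkaConsumer[String]("song-feed", new SimpleStringSchema(), props)
    )

    stream.print()
    env.execute("Kafka sample")
  }
}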
- Go to UI console -> Submit new job
- Upload the jar
- Run the uploaded jar with the chosen main class (e.g. com.mwronski.flink.batch.BatchWordCount)
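For reference, a batch job such as the BatchWordCount main class mentioned above might look roughly like the sketch below; this is an illustrative guess at the shape of such a job, not the repository's actual implementation.

import org.apache.flink.api.scala._

// Minimal sketch of a batch word count (the repo's BatchWordCount may differ).
object BatchWordCount {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Hard-coded input; a real job would typically read a file or other source.
    val text = env.fromElements(
      "to be or not to be",
      "that is the question"
    )

    val counts = text
      .flatMap(_.toLowerCase.split("\\W+"))
      .filter(_.nonEmpty)
      .map((_, 1))
      .groupBy(0) // group by the word
      .sum(1)     // sum the per-word counts

    counts.print() // print() also triggers execution for the batch API
  }
}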
Before automating, you might want to check the configuration and the blob.* properties (especially blob.storage.directory, the directory for storing blobs such as user JARs on the TaskManagers).
This section describes how to run a task using the REST API.
Note: all steps below can be placed in a bash script for automation purposes (an illustrative Scala equivalent is sketched at the end of this section).
- Deploy the JAR file
curl -v -F upload=@a.jar http://192.168.99.100:31273/jars/upload
- Get the ID of the uploaded JAR file
curl http://192.168.99.100:31273/jars
- Run the task using the ID of the uploaded JAR file
curl -XPOST http://192.168.99.100:31273/jars/5935d344-c7a8-44ed-898e-8a005262ecc5_a.jar/run?entry-class=com.mwronski.flink.batch.BatchWordCount
- Check job status
curl http://192.168.99.100:31273/jobs
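As noted above, these steps are usually wrapped in a bash script; purely as an illustration in the project's own language, an equivalent Scala sketch that shells out to curl might look like the following. The host, port, jar name, and the crude pattern-based ID extraction are all assumptions.

import scala.sys.process._

// Illustrative automation of the REST steps above (placeholder host/port and jar name).
object SubmitJobViaRest {
  def main(args: Array[String]): Unit = {
    val base = "http://192.168.99.100:31273"

    // 1. Upload the JAR
    Seq("curl", "-s", "-F", "upload=@a.jar", s"$base/jars/upload").!!

    // 2. Find the ID of an uploaded JAR in the JSON response
    //    (crude match for "<uuid>_<name>.jar"; a JSON library would be more robust)
    val jars = Seq("curl", "-s", s"$base/jars").!!
    val jarId = """[0-9a-f-]{36}_\S+?\.jar""".r
      .findFirstIn(jars)
      .getOrElse(sys.error("no uploaded jar found"))

    // 3. Run the job with the chosen main class
    Seq("curl", "-s", "-X", "POST",
      s"$base/jars/$jarId/run?entry-class=com.mwronski.flink.batch.BatchWordCount").!!

    // 4. Check job status
    println(Seq("curl", "-s", s"$base/jobs").!!)
  }
}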