/flink-parcel

Flink parcel for Cloudera Manager

Primary LanguagePython

Flink Parcel

This repository contains a parcel for Apache Flink.

Currently it builds for Flink 1.0.3.

Usage

Move the parcel and the checksum file to the parcel repository of your CM server.


cp parcel/FLINK-1.0.3-p0-el7.parcel* /opt/cloudera/parcel-repo

Navigate to /cmf/parcel/status on the CM WebUI by clicking Parcels.

Click on Check for new Parcels. When Flink appears Click on distribute, then activate.

Log in to one of the machines having the active Flink parcel.

Start the Flink services (JobManager - master, TaskManager - slave):


export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera/; service flink-jobmanager start
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera/; service flink-taskmanager start

Now you should be able to see the Flink WebUI on port 8081 on this host.

To run the WordCount example you can do the following:


export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera/;flink run /opt/cloudera/parcels/FLINK/usr/lib/flink/examples/batch/WordCount.jar --input /opt/cloudera/parcels/FLINK/usr/lib/flink/README.txt

It should produce the following output:


Usage: WordCount --input  --output 
Printing result to stdout. Use --output to specify output path.
06/28/2016 06:32:31	Job execution switched to status RUNNING.
06/28/2016 06:32:31	CHAIN DataSource (at main(WordCount.java:71) (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at main(WordCount.java:79)) -> Combine(SUM(1), at main(WordCount.java:79)(1/1) switched to SCHEDULED 
06/28/2016 06:32:31	CHAIN DataSource (at main(WordCount.java:71) (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at main(WordCount.java:79)) -> Combine(SUM(1), at main(WordCount.java:79)(1/1) switched to DEPLOYING 
06/28/2016 06:32:31	CHAIN DataSource (at main(WordCount.java:71) (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at main(WordCount.java:79)) -> Combine(SUM(1), at main(WordCount.java:79)(1/1) switched to RUNNING 
06/28/2016 06:32:32	Reduce (SUM(1), at main(WordCount.java:79)(1/1) switched to SCHEDULED 
06/28/2016 06:32:32	Reduce (SUM(1), at main(WordCount.java:79)(1/1) switched to DEPLOYING 
06/28/2016 06:32:32	CHAIN DataSource (at main(WordCount.java:71) (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at main(WordCount.java:79)) -> Combine(SUM(1), at main(WordCount.java:79)(1/1) switched to FINISHED 
06/28/2016 06:32:32	Reduce (SUM(1), at main(WordCount.java:79)(1/1) switched to RUNNING 
06/28/2016 06:32:32	DataSink (collect())(1/1) switched to SCHEDULED 
06/28/2016 06:32:32	DataSink (collect())(1/1) switched to DEPLOYING 
06/28/2016 06:32:32	Reduce (SUM(1), at main(WordCount.java:79)(1/1) switched to FINISHED 
06/28/2016 06:32:32	DataSink (collect())(1/1) switched to RUNNING 
06/28/2016 06:32:32	DataSink (collect())(1/1) switched to FINISHED 
06/28/2016 06:32:32	Job execution switched to status FINISHED.
(1,1)
(13,1)
(5d002,1)
(740,1)
(about,1)
(account,1)
(administration,1)
(algorithms,1)
(and,7)
(another,1)
...