streamersk

A real-time data streaming application to access, aggregate, and format events.

Architecture

Setup: Download and Start Flink

Download a binary from the downloads page. You can pick any Hadoop/Scala combination you like. If you plan to just use the local file system, any Hadoop version will work fine. Go to the download directory.

Unpack the downloaded archive.

$ cd ~/Downloads        # Go to download directory
$ tar xzf flink-*.tgz   # Unpack the downloaded archive
$ mv flink-1.9.1 ~/     # Move Flink to the home directory
$ cd ~/flink-1.9.1      # Go to the Flink directory

Start a Local Flink Cluster

$ ./bin/start-cluster.sh  # Start Flink

Check the Dispatcher’s web frontend at http://localhost:8081 and make sure everything is up and running. The web frontend should report a single available TaskManager instance.

Dispatcher: Overview

You can also verify that the system is running by checking the log files in the log directory:

$ tail log/flink-*-standalonesession-*.log
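
The same check can be scripted. The following is a minimal sketch, assuming the Flink REST API serves a cluster overview at /overview on the web frontend port and that the JSON payload carries a "taskmanagers" field; the function names are hypothetical and not part of this project.

```python
import json
from urllib.request import urlopen


def fetch_overview(base_url="http://localhost:8081"):
    """Fetch the cluster overview JSON (assumed endpoint: <base_url>/overview)."""
    with urlopen(f"{base_url}/overview", timeout=5) as response:
        return json.load(response)


def taskmanager_count(overview):
    """Return the number of TaskManagers reported in an overview payload.

    Assumes the payload has a "taskmanagers" field; falls back to 0 if absent.
    """
    return int(overview.get("taskmanagers", 0))
```

With the local cluster running, `taskmanager_count(fetch_overview())` should return 1 if the single TaskManager has registered.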

Download Kafka

https://kafka.apache.org/downloads

Start the ZooKeeper server

$ ./bin/zookeeper-server-start.sh ./config/zookeeper.properties

Start the broker

$ ./bin/kafka-server-start.sh ./config/server.properties

Create topic "mytopic"

$ ./bin/kafka-topics.sh --create --topic mytopic --zookeeper localhost:2181 --partitions 1 --replication-factor 1

Describe topic "mytopic"

$ ./bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic mytopic

Produce something into the topic (type a message and press Enter)

$ ./bin/kafka-console-producer.sh --topic mytopic --broker-list localhost:9092

Consume from the topic using the console consumer

$ ./bin/kafka-console-consumer.sh --topic mytopic --bootstrap-server localhost:9092

Clone Apache Ignite

$ git clone https://github.com/apache/ignite

Build Apache Ignite

$ cd ignite
$ mvn clean install -DskipTests

Build the Flink program (from the root of this repository):

$ mvn clean package

Submit the Flink program:

$ ~/flink-1.9.1/bin/flink run ./target/streamersk-extensions-1.0.0-SNAPSHOT.jar src/main/resources/application.properties

Produce something into the topic (type a message and press Enter)

$ ./bin/kafka-console-producer.sh --topic mytopic --broker-list localhost:9092

If the program uses print(), the counts are written to the TaskManager's .out file at the end of each time window for as long as events keep arriving, e.g.:

$ tail -f log/flink-*-taskexecutor-*.out
12314213 : 1
12314214 : 3
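
To make the "key : count" lines above easier to read, here is a small pure-Python illustration of per-key counting over fixed time windows. It is only a sketch: this README does not show the Flink job, so the tumbling-window semantics, the window size, and the function name are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count keys per fixed, non-overlapping time window.

    events: iterable of (timestamp_seconds, key) pairs.
    Returns a dict mapping each window start time to a Counter of keys.
    """
    windows = defaultdict(Counter)
    for timestamp, key in events:
        # Align the timestamp down to the start of its window.
        window_start = timestamp - (timestamp % window_seconds)
        windows[window_start][key] += 1
    return dict(windows)
```

For example, with a 5-second window, one event for key 12314213 and three events for key 12314214 arriving within the first five seconds end up in the same window with counts 1 and 3.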

Install pip & kafka-python

$ curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
$ python get-pip.py --user
$ pip install kafka-python --user

Use data_publisher.py

To run a sample data publisher, use data_publisher.py, which reads the data from item.txt and publishes it to the Kafka topic.

$ cd data
$ python data_publisher.py
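
For orientation, a publisher of this kind can be sketched as follows. This is not the actual data_publisher.py; it assumes one record per line in item.txt, the "mytopic" topic and localhost:9092 broker used earlier, and the KafkaProducer class from kafka-python. The helper names are hypothetical.

```python
def read_records(path):
    """Yield non-empty lines from the input file, stripped of whitespace."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            record = line.strip()
            if record:
                yield record


def publish(records, producer, topic="mytopic"):
    """Send each record to the topic as UTF-8 bytes; return the number sent."""
    sent = 0
    for record in records:
        producer.send(topic, record.encode("utf-8"))
        sent += 1
    # Block until all buffered messages have been handed to the broker.
    producer.flush()
    return sent


def main():
    # Imported here so the helpers above work without kafka-python installed.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    count = publish(read_records("item.txt"), producer)
    print(f"{count} records published")
```

Calling `main()` from the data directory publishes every non-empty line of item.txt; passing the producer in as an argument keeps `publish` easy to exercise with a stand-in object.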

Ignite REST service

To check the cache key values, you can use the Ignite REST service:

$ curl -X GET http://localhost:8080/ignite\?cmd\=getall\&k1\=12314213\&cacheName\=testCache

Scan cache

To list all the keys in an Ignite cache, the following REST call can be used (the URL is quoted so that the shell does not interpret the & characters):

$ curl -X GET "http://localhost:8080/ignite?cmd=qryscanexe&pageSize=10&cacheName=testCache"
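
The two REST calls above can also be issued from Python, which avoids shell quoting altogether. This is a minimal sketch using only the standard library; the helper names are hypothetical, and the base URL, commands, and parameters are the ones shown in the curl examples.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def ignite_url(cmd, base_url="http://localhost:8080/ignite", **params):
    """Build an Ignite REST URL for the given command and query parameters."""
    query = {"cmd": cmd, **params}
    return f"{base_url}?{urlencode(query)}"


def ignite_call(cmd, **params):
    """Issue the GET request and return the parsed JSON reply."""
    with urlopen(ignite_url(cmd, **params), timeout=5) as response:
        return json.load(response)
```

With Ignite running, `ignite_call("getall", k1="12314213", cacheName="testCache")` mirrors the first curl command and `ignite_call("qryscanexe", pageSize=10, cacheName="testCache")` mirrors the scan.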

Ignite Web Console

Ignite Web Console Build Instructions

1. Install MongoDB (version >=3.2.0 <=3.4.15) using the instructions from http://docs.mongodb.org/manual/installation.
2. Install Node.js (version >=8.0.0) using the installer from https://nodejs.org/en/download/current for your OS.
3. Change directory to 'modules/web-console/backend' and run "npm install --no-optional" to download the backend dependencies.
4. Change directory to 'modules/web-console/frontend' and run "npm install --no-optional" to download the frontend dependencies.
5. Build the ignite-web-agent module by following the instructions in 'modules/web-console/web-agent/README.txt'.
6. Copy ignite-web-agent-.zip from 'modules/web-console/web-agent/target' to the 'modules/web-console/backend/agent_dists' folder.
7. Unzip ignite-web-agent-.zip in 'modules/web-console/backend/agent_dists'.
8. Run './ignite-web-agent.sh' inside the ignite-web-agent- folder.

Steps 1 - 4 should be executed once.

Ignite Web Console Run In Development Mode

1. Configure MongoDB to run as a service, or in a terminal change directory to $MONGO_INSTALL_DIR/server/3.2/bin and start MongoDB by executing "mongod".
2. In a new terminal, change directory to 'modules/web-console/backend'. If the dependencies have changed, run "npm install --no-optional", then run "npm start" to start the backend.
3. In a new terminal, change directory to 'modules/web-console/frontend'. If the dependencies have changed, run "npm install --no-optional", then start webpack in development mode with "npm run dev".
4. In a browser, open http://localhost:9000

To stop Flink when you’re done, type:

$ ./bin/stop-cluster.sh
