IoT - it's the thing you want! Here's a full-stack demo.
This demo shows you how to set up a service on DCOS that
- streams tweets from the Twitter streaming API into Kafka
- processes those streams from Kafka using Spark
- stores the enriched data in Cassandra
- and makes the data easily queryable via SQL using Zeppelin
There are presentations about this demo:
- Cassandra Summit 2015 - Simplifying Streaming Analytics by Brenden Matthews, with an emphasis on data processing
- Hamburg Mesos Meetup - Deploying your Service on DCOS by Peter Kolloch, with an emphasis on deploying non-trivial services
Follow the instructions here. You will need:
- enough capacity to run all the services, which typically requires at least 5 worker nodes
- SSH access to the cluster
- internet access from inside the cluster
When you open the dashboard, follow the instructions to install the DCOS CLI.
You can either execute ./bin/base-install.sh <your DCOS cluster base URL>
or run the commands yourself.
You want to dive in deep and do everything yourself? Then point your DCOS client installation at the correct cluster and execute the commands below.
If you just set up your CLI for the first time, you can probably skip this step.
Use dcos config set core.dcos_url <your DCOS core URL>, e.g.
dcos config set core.dcos_url "http://peter-22f-elasticl-1ejv8oa4oyqw8-626125644.us-west-2.elb.amazonaws.com"
# Start DCOS services:
dcos package install marathon-lb
dcos package install cassandra
dcos package install kafka --options=kafka-options.json
# Check that Cassandra & Kafka are up
dcos cassandra connection
dcos kafka connection
- Copy etc/config_template.yml to etc/config.yml
- Create a Twitter account with API keys (see here for details)
- Insert your credentials into the configuration file
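Before running the installer, it can help to verify that the placeholder credentials were actually replaced. This is a hypothetical sketch only: the key names below are assumptions, so match them against whatever etc/config_template.yml actually defines.

```python
# Hypothetical sketch: check that the Twitter credentials in the parsed
# config were filled in. Key names are assumptions -- compare them with
# etc/config_template.yml.
REQUIRED = ["consumer_key", "consumer_secret", "access_token", "access_token_secret"]

def missing_credentials(cfg):
    """Return the credential keys that are absent, empty, or still placeholders."""
    twitter = cfg.get("twitter", {})
    return [k for k in REQUIRED
            if not twitter.get(k) or twitter[k].startswith("<")]

cfg = {"twitter": {"consumer_key": "<your key>", "consumer_secret": "abc",
                   "access_token": "def", "access_token_secret": ""}}
print(missing_credentials(cfg))  # ['consumer_key', 'access_token_secret']
```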
Execute ./bin/install.sh.
NOTE: This calls a Python 3 script that requires the yaml and jinja2 modules. On macOS, you can install them with Homebrew and pip:
brew install python3
pip3 install pyyaml
pip3 install jinja2
The install.sh script uses the ./bin/prepare-config.py script to convert YAML configuration files into JSON digestible by Marathon. It produces a Marathon group, target/demo.json, for the tweet producers and the tweet consumer, which is then sent to the Marathon REST API for deployment.
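The YAML-to-JSON step can be sketched in a few lines. This is illustrative only: the real prepare-config.py uses pyyaml and jinja2, while string.Template keeps this example stdlib-only, and the file and field names here are assumptions rather than the repo's actual templates.

```python
# Illustrative sketch of the config -> Marathon JSON step; names are
# assumptions, and the real script uses pyyaml + jinja2 instead.
import json
from string import Template

# Stand-in for values read from etc/config.yml
cfg = {"app_id": "tweet-consumer", "instances": 1}

# Stand-in for an app template (the real templates live in the repo)
app_template = Template('{"id": "/$app_id", "instances": $instances}')

app = json.loads(app_template.substitute(cfg))
group = {"id": "/iot-demo", "apps": [app]}  # analogous to target/demo.json
print(json.dumps(group, sort_keys=True))
```

The resulting group document is what gets POSTed to Marathon's REST API in one shot, so all apps deploy together.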
The prepare-config.py script supports some special processing instructions inside your YAML files to
- include other files (!include)
- use configuration values (!cfg_str, !cfg_path)
- or loop over configuration and apply a template (!map)
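To make the last two directives concrete, here is a toy illustration (not the actual implementation) of how something like !cfg_path could resolve a dotted path into the config, and how !map could expand a template once per list item. All names and structure here are assumptions.

```python
# Toy illustration of the !cfg_path and !map ideas; not the real code.
import json

config = {"twitter": {"terms": ["iot", "dcos"]}}

def cfg_path(cfg, dotted):
    """Resolve a dotted path such as 'twitter.terms' (the !cfg_path idea)."""
    node = cfg
    for key in dotted.split("."):
        node = node[key]
    return node

def expand_map(cfg, dotted, template):
    """Apply a template once per item of a config list (the !map idea)."""
    return [template.format(item=item) for item in cfg_path(cfg, dotted)]

apps = expand_map(config, "twitter.terms", "tweet-producer-{item}")
print(json.dumps(apps))  # ["tweet-producer-iot", "tweet-producer-dcos"]
```

In pyyaml, tags like these would be registered as custom constructors, but the expansion logic is essentially the above.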
Once Zeppelin is running, navigate to the UI and import the notebook from this link:
https://raw.githubusercontent.com/mesosphere/iot-demo/master/zeppelin-notebook.json
NOTE: if /service/zeppelin is showing as Disconnected (and hence can’t load the notebook), add the following labels to the zeppelin service and restart:
HAPROXY_0_VHOST = [elb hostname]
HAPROXY_GROUP = external
You can get the ELB hostname from the CCM “Public Server” link. Once Zeppelin restarts, this should allow you to use that link to reach the Zeppelin GUI in “connected” mode.
SSH into one of the masters or worker nodes in the cluster, and try cqlsh:
# Run cqlsh:
docker run -ti cassandra:2.2.5 cqlsh node-0.cassandra.mesos