The goal of this project is to implement a "News" processing pipeline composed of five Spring Boot applications: `producer-api`, `categorizer-service`, `collector-service`, `publisher-api` and `news-client`.
- `Spring Cloud Stream` to build highly scalable, event-driven applications connected by shared messaging systems;
- `Spring Cloud Schema Registry`, which supports schema evolution so that data can evolve over time; besides, it stores schema information in a textual format (typically JSON) and makes that information accessible to the applications that need it to send and receive data in binary format;
- `Spring Data Elasticsearch` to persist data in `Elasticsearch`;
- `Spring Cloud OpenFeign` to write web service clients easily;
- `Thymeleaf` as the HTML templating engine;
- `Zipkin` to visualize traces between and within applications;
- `Eureka` for service registration and discovery.
Note: the `docker-swarm-environment` repository shows how to deploy this project into a cluster of Docker Engines in swarm mode.
- `producer-api`: Spring Boot Web Java application that creates news and pushes news events to the `producer.news` topic in `Kafka`.
- `categorizer-service`: Spring Boot Web Java application that listens to news events in the `producer.news` topic in `Kafka`, categorizes them, and pushes them to the `categorizer.news` topic.
- `collector-service`: Spring Boot Web Java application that listens for news events in the `categorizer.news` topic in `Kafka`, saves them in `Elasticsearch`, and pushes the news events to the `collector.news` topic.
- `publisher-api`: Spring Boot Web Java application that reads directly from `Elasticsearch` and exposes a REST API. It doesn't listen to `Kafka`.
- `news-client`: Spring Boot Web Java application that provides a user interface to see the news. It implements a `Websocket` that consumes news events from the `collector.news` topic, so the news is updated on the fly on the main page. Besides, `news-client` communicates directly with `publisher-api` whenever a search for specific news or a news update is needed. The `Websocket` operation is shown in the short gif below: a news item is created in `producer-api` and, immediately, it is shown in `news-client`.
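For illustration, the categorization step can be thought of as keyword matching on the news title. The sketch below is purely hypothetical: the category names and matching rules are assumptions for illustration, not taken from `categorizer-service`'s actual code.

```shell
# Hypothetical sketch of a keyword-based categorization step.
# Category names (SPORT, POLITICS, OTHER) are assumptions, not the
# actual categories used by categorizer-service.
categorize() {
  title=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$title" in
    *sport*|*match*|*league*)  echo "SPORT" ;;
    *election*|*government*)   echo "POLITICS" ;;
    *)                         echo "OTHER" ;;
  esac
}

categorize "Champions League final tonight"   # prints SPORT
```

In the real service this logic runs inside a Spring Cloud Stream listener that consumes from `producer.news` and publishes the enriched event to `categorizer.news`.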
- In a terminal, make sure you are in the `spring-cloud-stream-elasticsearch` root folder.
- Run the following command to generate `NewsEvent`:

  `./mvnw clean install --projects commons-news`

  It will install `commons-news-1.0.0.jar` in your local `Maven` repository, so that it is visible to all services.
- Open a terminal and, inside the `spring-cloud-stream-elasticsearch` root folder, run `docker-compose up -d`
- Wait until all containers are Up (healthy). You can check their status by running `docker-compose ps`

Inside the `spring-cloud-stream-elasticsearch` root folder, run the following `Maven` commands in different terminals:
- `eureka-server`

  `./mvnw clean spring-boot:run --projects eureka-server`
- `producer-api`

  `./mvnw clean spring-boot:run --projects producer-api -Dspring-boot.run.jvmArguments="-Dserver.port=9080"`
- `categorizer-service`

  `./mvnw clean spring-boot:run --projects categorizer-service -Dspring-boot.run.jvmArguments="-Dserver.port=9081"`
- `collector-service`

  `./mvnw clean spring-boot:run --projects collector-service -Dspring-boot.run.jvmArguments="-Dserver.port=9082"`
- `publisher-api`

  `./mvnw clean spring-boot:run --projects publisher-api -Dspring-boot.run.jvmArguments="-Dserver.port=9083"`
- `news-client`

  `./mvnw clean spring-boot:run --projects news-client`
- In a terminal, make sure you are in the `spring-cloud-stream-elasticsearch` root folder.
- In order to build the applications' Docker images, run the following script: `./build-apps.sh`
- `producer-api`

  | Environment Variable | Description |
  |---|---|
  | `KAFKA_HOST` | Specify host of the `Kafka` message broker to use (default `localhost`) |
  | `KAFKA_PORT` | Specify port of the `Kafka` message broker to use (default `29092`) |
  | `SCHEMA_REGISTRY_HOST` | Specify host of the `Schema Registry` to use (default `localhost`) |
  | `SCHEMA_REGISTRY_PORT` | Specify port of the `Schema Registry` to use (default `8081`) |
  | `EUREKA_HOST` | Specify host of the `Eureka` service discovery to use (default `localhost`) |
  | `EUREKA_PORT` | Specify port of the `Eureka` service discovery to use (default `8761`) |
  | `ZIPKIN_HOST` | Specify host of the `Zipkin` distributed tracing system to use (default `localhost`) |
  | `ZIPKIN_PORT` | Specify port of the `Zipkin` distributed tracing system to use (default `9411`) |

- `categorizer-service`

  | Environment Variable | Description |
  |---|---|
  | `KAFKA_HOST` | Specify host of the `Kafka` message broker to use (default `localhost`) |
  | `KAFKA_PORT` | Specify port of the `Kafka` message broker to use (default `29092`) |
  | `SCHEMA_REGISTRY_HOST` | Specify host of the `Schema Registry` to use (default `localhost`) |
  | `SCHEMA_REGISTRY_PORT` | Specify port of the `Schema Registry` to use (default `8081`) |
  | `EUREKA_HOST` | Specify host of the `Eureka` service discovery to use (default `localhost`) |
  | `EUREKA_PORT` | Specify port of the `Eureka` service discovery to use (default `8761`) |
  | `ZIPKIN_HOST` | Specify host of the `Zipkin` distributed tracing system to use (default `localhost`) |
  | `ZIPKIN_PORT` | Specify port of the `Zipkin` distributed tracing system to use (default `9411`) |

- `collector-service`

  | Environment Variable | Description |
  |---|---|
  | `ELASTICSEARCH_HOST` | Specify host of the `Elasticsearch` search engine to use (default `localhost`) |
  | `ELASTICSEARCH_NODES_PORT` | Specify nodes port of the `Elasticsearch` search engine to use (default `9300`) |
  | `ELASTICSEARCH_REST_PORT` | Specify rest port of the `Elasticsearch` search engine to use (default `9200`) |
  | `KAFKA_HOST` | Specify host of the `Kafka` message broker to use (default `localhost`) |
  | `KAFKA_PORT` | Specify port of the `Kafka` message broker to use (default `29092`) |
  | `SCHEMA_REGISTRY_HOST` | Specify host of the `Schema Registry` to use (default `localhost`) |
  | `SCHEMA_REGISTRY_PORT` | Specify port of the `Schema Registry` to use (default `8081`) |
  | `EUREKA_HOST` | Specify host of the `Eureka` service discovery to use (default `localhost`) |
  | `EUREKA_PORT` | Specify port of the `Eureka` service discovery to use (default `8761`) |
  | `ZIPKIN_HOST` | Specify host of the `Zipkin` distributed tracing system to use (default `localhost`) |
  | `ZIPKIN_PORT` | Specify port of the `Zipkin` distributed tracing system to use (default `9411`) |

- `publisher-api`

  | Environment Variable | Description |
  |---|---|
  | `ELASTICSEARCH_HOST` | Specify host of the `Elasticsearch` search engine to use (default `localhost`) |
  | `ELASTICSEARCH_NODES_PORT` | Specify nodes port of the `Elasticsearch` search engine to use (default `9300`) |
  | `ELASTICSEARCH_REST_PORT` | Specify rest port of the `Elasticsearch` search engine to use (default `9200`) |
  | `EUREKA_HOST` | Specify host of the `Eureka` service discovery to use (default `localhost`) |
  | `EUREKA_PORT` | Specify port of the `Eureka` service discovery to use (default `8761`) |
  | `ZIPKIN_HOST` | Specify host of the `Zipkin` distributed tracing system to use (default `localhost`) |
  | `ZIPKIN_PORT` | Specify port of the `Zipkin` distributed tracing system to use (default `9411`) |

- `news-client`

  | Environment Variable | Description |
  |---|---|
  | `KAFKA_HOST` | Specify host of the `Kafka` message broker to use (default `localhost`) |
  | `KAFKA_PORT` | Specify port of the `Kafka` message broker to use (default `29092`) |
  | `SCHEMA_REGISTRY_HOST` | Specify host of the `Schema Registry` to use (default `localhost`) |
  | `SCHEMA_REGISTRY_PORT` | Specify port of the `Schema Registry` to use (default `8081`) |
  | `EUREKA_HOST` | Specify host of the `Eureka` service discovery to use (default `localhost`) |
  | `EUREKA_PORT` | Specify port of the `Eureka` service discovery to use (default `8761`) |
  | `ZIPKIN_HOST` | Specify host of the `Zipkin` distributed tracing system to use (default `localhost`) |
  | `ZIPKIN_PORT` | Specify port of the `Zipkin` distributed tracing system to use (default `9411`) |
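As an example, the environment variables above could be passed to a container like this. This is a sketch only: the image name, network name and service hostnames below are assumptions, so check `build-apps.sh` and `docker-compose.yml` for the actual values used by this project.

```shell
# Hypothetical run of the collector-service image with its environment
# variables set to the hostnames of the docker-compose services.
# Image name, network name and hostnames are assumptions.
docker run -d --name collector-service \
  --network spring-cloud-stream-elasticsearch_default \
  -e KAFKA_HOST=kafka -e KAFKA_PORT=9092 \
  -e SCHEMA_REGISTRY_HOST=schema-registry -e SCHEMA_REGISTRY_PORT=8081 \
  -e ELASTICSEARCH_HOST=elasticsearch \
  -e ELASTICSEARCH_REST_PORT=9200 -e ELASTICSEARCH_NODES_PORT=9300 \
  -e EUREKA_HOST=eureka-server -e EUREKA_PORT=8761 \
  -e ZIPKIN_HOST=zipkin -e ZIPKIN_PORT=9411 \
  collector-service:1.0.0
```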
| Application | URL |
|---|---|
| `producer-api` | http://localhost:9080 |
| `publisher-api` | http://localhost:9083 |
| `news-client` | http://localhost:8080 |
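Once everything is up, a news item can be created through `producer-api` and then watched arriving in `news-client`. The endpoint path and payload fields below are assumptions for illustration (consult `producer-api`'s API documentation for the real contract); port `9080` comes from the Maven run command above.

```shell
# Hypothetical request: the /api/news path and the title/text fields are
# assumptions; check producer-api's API documentation for the real contract.
curl -X POST localhost:9080/api/news \
  -H 'Content-Type: application/json' \
  -d '{"title": "Spring Cloud Stream in action", "text": "News pipeline demo"}'
```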
- Stop applications
  - If they were started with `Maven`, go to the terminals where they are running and press `Ctrl+C`
  - If they were started as Docker containers, make sure you are in the `spring-cloud-stream-elasticsearch` root folder and run the script `./stop-apps.sh`
- Stop and remove docker-compose containers, networks and volumes: `docker-compose down -v`
- `Eureka`

  `Eureka` can be accessed at http://localhost:8761
- `Kafka Topics UI`

  `Kafka Topics UI` can be accessed at http://localhost:8085
- `Zipkin`

  `Zipkin` can be accessed at http://localhost:9411

  The figure below shows an example of the complete flow a news item goes through, from `producer-api`, where the news is created, to `news-client`.
- `Kafka Manager`

  `Kafka Manager` can be accessed at http://localhost:9000

  The figure below shows the Kafka topic consumers. As we can see, the consumers are up to date, as the `lag` is `0`.
  Configuration
  - First, you must create a new cluster. Click on `Cluster` (dropdown button in the header) and then on `Add Cluster`.
  - Type the name of your cluster in the `Cluster Name` field, for example: `MyCluster`.
  - Type `zookeeper:2181` in the `Cluster Zookeeper Hosts` field.
  - Enable the checkbox `Poll consumer information (Not recommended for large # of consumers if ZK is used for offsets tracking on older Kafka versions)`.
  - Click on the `Save` button at the bottom of the page.
Schema Registry UI
Schema Registry UI
can be accessed at http://localhost:8001 -
Elasticsearch REST API
Check ES is up and running
curl localhost:9200
Check indexes in ES
curl "localhost:9200/_cat/indices?v"
Check news index mapping
curl localhost:9200/news/_mapping
Simple search
curl "localhost:9200/news/_search?pretty"
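Beyond the simple search above, Elasticsearch also accepts full Query DSL bodies. A hedged example, assuming the news documents have a `title` field (verify the actual field names against the `/news/_mapping` output above):

```shell
# Full-text search on the "news" index using a match query.
# The "title" field name is an assumption; check the index mapping first.
curl -s "localhost:9200/news/_search?pretty" \
  -H 'Content-Type: application/json' \
  -d '{"query": {"match": {"title": "spring"}}}'
```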
- Add an alias to the index: wait for this feature to be available in Spring Data Elasticsearch (https://jira.spring.io/browse/DATAES-192)
- `news-client`: bug: every time Sync is clicked, it enables the Websocket;
- `news-client`: if the Websocket is enabled/disabled, the Sync button should be disabled/enabled;
- `news-client`: implement pagination;