OpenNMS Drift deployment in Kubernetes.

This is basically the Kubernetes version of the work done here for OpenNMS Horizon 24. Instead of using discrete EC2 instances, this repository explains how to deploy essentially the same solution with Kubernetes. For learning purposes, Helm charts and operators are avoided for the main components of this solution, with the exception of the Ingress Controller and Cert-Manager; in the future, that might change to take advantage of these technologies.

Compared with the original solution, this one offers some additional features, such as Hasura, Cassandra Reaper, and Kafka Manager.
Kafka uses the hostPort feature to expose the advertised external listeners on port 9094, so that applications outside Kubernetes, like Minion, can access it. For this reason, Kafka can only be scaled up to the number of worker nodes in the Kubernetes cluster.
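As a quick sanity check before scaling, compare the desired replica count against the number of worker nodes. The StatefulSet name kafka and the opennms namespace below are assumptions based on this deployment's manifests:

# Count the schedulable nodes (depending on the platform, masters may be listed as well)
kubectl get nodes --no-headers | wc -l
# Scale Kafka, keeping the replica count at or below the number of worker nodes
kubectl scale statefulset kafka --namespace opennms --replicas=3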
Requirements:

- Install the kubectl binary. Make sure to have at least version 1.14 in order to use the kustomize integration.
- Install the kustomize binary on your machine (optional, but good to have for troubleshooting).

NOTE: Depending on the chosen platform, additional requirements might be needed. Check the respective README files for more information.
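A quick way to confirm that both binaries are available and recent enough (the output format varies between versions):

kubectl version --client
kustomize version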
Proceed with the preferred cluster technology:
- Using Kops on AWS.
- Using EKS on AWS.
- Using Google Compute Engine.
- Using Microsoft Azure.
- Using Minikube on your machine (with restrictions).
To facilitate the process, everything is done through kustomize.
To update the default settings, find the common-settings under configMapGenerator inside kustomization.yaml.

To update the default passwords, find the onms-passwords under secretGenerator inside kustomization.yaml.
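To review the outcome of those changes before applying anything, the rendered manifests can be inspected through the kustomize integration in kubectl. The command below assumes kustomization.yaml sits in the current directory; adjust the path if needed:

# Render all manifests, including the generated ConfigMap and Secret
kubectl kustomize . | less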
Each cluster technology explains how to deploy the manifests.
As part of the deployment, some complementary RBAC permissions will be added, in case there is a need for adding operators and/or administrators to the OpenNMS namespace. Check namespace.yaml for more details.
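As a reference, with the kustomize integration in kubectl, the deployment typically boils down to applying the manifests directory against the target cluster. The directory name manifests is an assumption here; follow the platform-specific README for the exact command:

kubectl apply -k manifests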
Use the following command to check whether all the resources have been created:
kubectl get all --namespace opennms
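Starting everything can take several minutes. One way to block until all the pods are ready (the timeout value is arbitrary):

kubectl wait --namespace opennms --for=condition=Ready pods --all --timeout=600s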
This deployment already contains Minions inside the opennms namespace for monitoring devices within the cluster. To have Minions outside the Kubernetes cluster, they should use the following resources to connect to OpenNMS and the dependent applications.

For AWS, using the domain aws.agalue.net, the resources should be:
- OpenNMS Core: https://onms.aws.agalue.net/opennms
- Kafka: kafka.aws.agalue.net:9094
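Before starting an external Minion, it may help to confirm that both endpoints are reachable from outside the cluster. The admin/admin credentials below are the OpenNMS defaults, and nc is only used to test the TCP port:

# Query the OpenNMS ReST API (should return a small JSON payload with the version)
curl -s -u admin:admin https://onms.aws.agalue.net/opennms/rest/info
# Verify the external Kafka listener answers on port 9094
nc -vz kafka.aws.agalue.net 9094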
For example, to start an external Minion against those resources:
export DOMAIN="aws.agalue.net"
export LOCATION="Apex"
export INSTANCE_ID="K8S" # Must match kustomization.yaml
export JAAS_CFG='org.apache.kafka.common.security.scram.ScramLoginModule required username="opennms" password="0p3nNM5";' # Must match kustomization.yaml
docker run --rm --entrypoint cat opennms/minion:25.2.1 /opt/minion/etc/system.properties > system.properties
echo "org.opennms.instance.id=${INSTANCE_ID}" >> system.properties
docker run -it --name minion \
-e MINION_ID=$LOCATION-minion-1 \
-e MINION_LOCATION=$LOCATION \
-e OPENNMS_HTTP_URL=https://onms.$DOMAIN/opennms \
-e OPENNMS_HTTP_USER=admin \
-e OPENNMS_HTTP_PASS=admin \
-e KAFKA_RPC_BOOTSTRAP_SERVERS=kafka.$DOMAIN:9094 \
-e KAFKA_RPC_AUTO_OFFSET_RESET=latest \
-e KAFKA_RPC_COMPRESSION_TYPE=gzip \
-e KAFKA_RPC_SASL_JAAS_CONFIG="$JAAS_CFG" \
-e KAFKA_RPC_SECURITY_PROTOCOL=SASL_PLAINTEXT \
-e KAFKA_RPC_SASL_MECHANISM=SCRAM-SHA-512 \
-e KAFKA_SINK_BOOTSTRAP_SERVERS=kafka.$DOMAIN:9094 \
-e KAFKA_SINK_ACKS=1 \
-e KAFKA_SINK_SASL_JAAS_CONFIG="$JAAS_CFG" \
-e KAFKA_SINK_SECURITY_PROTOCOL=SASL_PLAINTEXT \
-e KAFKA_SINK_SASL_MECHANISM=SCRAM-SHA-512 \
-p 8201:8201 \
-p 1514:1514/udp \
-p 1162:1162/udp \
-v $(pwd)/system.properties:/opt/minion/etc/system.properties \
opennms/minion:25.2.1 -f
IMPORTANT: Make sure to use the same version as OpenNMS. The above assumes a custom value for the INSTANCE_ID; make sure it matches the content of kustomization.yaml.

WARNING: Make sure to use your own Domain and Location, and use the same version tag as the OpenNMS manifests.
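Once the container is up, a quick way to verify that the Minion can reach OpenNMS and Kafka is through its Karaf shell on port 8201. The admin/admin credentials and the opennms:health-check command below are the defaults in recent Minion images; adjust them if they differ in your version:

# Watch the container logs until the Karaf shell is available
docker logs -f minion
# Then run the health check through the Karaf shell
ssh -p 8201 admin@localhost opennms:health-check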
With everything running, the following resources should be available (again, for the aws.agalue.net domain):

- OpenNMS Core: https://onms.aws.agalue.net/opennms/ (for administrative tasks)
- OpenNMS UI: https://onmsui.aws.agalue.net/opennms/ (for users/operators)
- Grafana: https://grafana.aws.agalue.net/
- Kibana: https://kibana.aws.agalue.net/ (remember to enable monitoring)
- Kafka Manager: https://kafka-manager.aws.agalue.net/ (make sure to register the cluster using zookeeper.opennms.svc.cluster.local:2181/kafka for the Cluster Zookeeper Hosts)
- Hasura GraphQL API: https://hasura.aws.agalue.net/v1alpha1/graphql
- Hasura GraphQL Console: https://hasura.aws.agalue.net/console
- Jaeger UI: https://tracing.aws.agalue.net/
- Cassandra Reaper: https://cassandra-reaper.aws.agalue.net/webui/
WARNING: Make sure to use your own Domain.
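A simple smoke test, assuming the aws.agalue.net domain and that DNS already points at the Ingress controller, is to check that each hostname answers over HTTPS:

# Print the HTTP status code returned by each ingress host
for host in onms onmsui grafana kibana kafka-manager hasura tracing cassandra-reaper; do
  echo -n "$host: "
  curl -k -s -o /dev/null -w '%{http_code}\n' "https://$host.aws.agalue.net/"
done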
Future enhancements:

- Add SSL encryption with SASL authentication for external Kafka (for Minions outside K8S/AWS). The challenge here is which FQDN will be taken into consideration for the certificates.
- Add Network Policies to control the communication between components (for example, only OpenNMS needs access to PostgreSQL and Cassandra; other components should not access those resources). A network plugin like Calico is required.
- Design a solution to manage the OpenNMS configuration files (the /opt/opennms/etc directory), or use an existing one like ksync.
- Investigate how to provide support for HorizontalPodAutoscaler for the data clusters like Cassandra, Kafka, and Elasticsearch. Check here for more information. Using operators seems more feasible in this regard, though, due to the complexities of expanding/shrinking these kinds of applications.
- Add support for Cluster Autoscaler. Check what kops offers in this regard.
- Add support for monitoring through Prometheus using the Prometheus Operator. Expose the UI (including Grafana) through the Ingress controller.
- Expose the Kubernetes Dashboard through the Ingress controller.
- Design a solution to handle scaling down Cassandra and decommissioning nodes, or investigate the existing operators.
- Explore a PostgreSQL solution like Spilo/Patroni using their Postgres Operator, to understand how to build an HA Postgres within K8s. Alternatively, consider the Crunchy Data Operator.
- Add a sidecar container on PostgreSQL using Hasura to expose the DB schema through GraphQL. If a Postgres Operator is used, Hasura can be managed through a Deployment instead.
- Explore a Kafka solution like Strimzi, an operator that supports encryption and authentication.
- Build a VPC with the additional security groups using Terraform. Then, use --vpc and --node-security-groups when calling kops create cluster, as explained here.
- Explore Helm, and potentially add support for it.