/kubequery-aggregator

Primary LanguageGoApache License 2.0Apache-2.0

kubequery Aggregator using Postgres

centralized kubequery information in Postgres database.

Architecture

kubequery

Create a centralized 'loader' that polls a collection of kubernetes clusters using kubequery and store all results in a single Postgres database using the same schema as kubequery.

Schema: https://github.com/Uptycs/kubequery/blob/master/docs/schema.md

Features

Kubequery is usually installed on specific clusters and used to make 'live' queries against the local kubernetes resources of that cluster using SQL. This is a different approach where this software and kubequery is installed on only one cluster(the aggregator) and polls other clusters(the targets) using kubequery remote-access. The results are saved in a centralized Postgres database.

Benefits:

  • Store kube resources from hundreds of clusters in a single Postgres database.

  • Almost real-time queries against hundreds of clusters since all the data is in the same Postgres database.

  • Index can be added to the PG database for improved access time.

  • Polling the clusters and saving the results in the database is fast and can be done every few minutes. Clusters can be polled in parallel for improved speed.

  • On large clusters, SQL joins can be really slow against the sqlite database used by kubequery. The Postgres tables are not virtual and can be optimized to solve this problem.

  • Columns in the kubequery schema like labels and annotations are TEXT but contain json information. These columns can be converted to Postgres JSONB for easy access.

  • Only the basic "select * from kubernetes_.." queries are executed by kubequery on the target clusters during aggregation. All complex queries (JOINS, multi-cluster) are done on the Postgres database.

  • Note: For speed, the Postgres database should be located on the aggregator cluster.

Kubequery changes

For this application, kubequery had to be modified to support remote cluster access. Currently, kubequery assumes that it is running inside the k8s cluster being probed. This change allows kubequery to connect to a remote cluster and extract resource data. See ./kubequery/ for details.

Kubequery script using parameters generated by the loader:

#!/bin/sh
# runquery <table> <token> <cluster_addr> <cluster> <uuid> <ix>

TABLE=$1
export KUBEQUERY_TOKEN=$2
export KUBEQUERY_ADDR=$3
export CLUSTER_NAME=$4
export CLUSTER_UID=$5
export CLUSTER_IX=$6

echo "select * from $TABLE;" | /opt/uptycs/bin/basequery  --flagfile=/opt/uptycs/etc/kubequery.flags  --config_path=/opt/uptycs/etc/kubequery.conf --extensions_socket=/opt/uptycs/var/kubequeryi.em$TABLE.$CLUSTER_IX  --extensions_autoload=/opt/uptycs/etc/autoload.exts  --extensions_require=kubequery  --extension_event_tables=kubernetes_events  --disable_database  --json  --disable_events=false  -S > /tmp/${CLUSTER_NAME}-$TABLE.json

Example queries

# Count namespaces and pods by Cluster

=> select cluster_name , count(distinct namespace) ns_count , count(*) pod_cnt from kubernetes_pods group by cluster_name;
 cluster_name          | ns_count | pod_cnt 
-----------------------+---------+---------
 national1             |      343 |    9636
 national2             |       43 |   30324
 national3             |       29 |    9988
 regional1             |       49 |    8342
 regional2             |       50 |    6845
 regional4             |       16 |    2943
 regional5             |       33 |    3062
 regional6             |       38 |    4489

(11 rows)

Time: 314.390 ms


# MOST POPULAR CONTAINER IMAGES
=> select c.image, count(*) pod_cnt from kubernetes_pod_containers c group by c.image order by count(*) desc limit 10;
                            image                             | pod_cnt
--------------------------------------------------------------+--------
 hub.obfusca.net/xxxxx-packager/system/system:3.5.1-1         | 18080
 hub.obfusca.net/xxxxx-arch/varnish-image:7.0.2-dev1          | 11376
 hub.obfusca.net/xxxxx-packager/system/system:3.3.2-1         | 10820
 hub.obfusca.net/library/telegraf:1.13.4                      |  5434
 registry.xxxxxxx.net/anchorfree/twemproxy:latest             |  5384
 hub.obfusca.net/rio/services/rccs-decoder:1.1.13_959         |  4217
 hub.obfusca.net/k8s-eng/xxxxx/rdei/k8s-dns-node-cache:1.21.1 |  3727
 hub.obfusca.net/xxxxx/rdei/node-exporter:v1.3.1              |  3382
 hub.obfusca.net/k8s-eng/xxxxx/rdei/sumatra:0.42.07           |  3315
 hub.obfusca.net/rio/services/bmw:1.9.0_1473                  |  3299
(10 rows)

Time: 230.233 ms
 

Installation

  • Add the postgres schema from database/ to your Postgres instance.
  • Create the custom image (uptycs/kubequery:1.1.1-remote) using the info from kubequery/
  • Create the kubernetes Deployment using the info from kubernetes/

Sample Loader Run

# concurrency=5
$ /loader -c 5 

4 Cluster START testcluster-01 - https://testcluster-01:6443 
1 Cluster START testcluster2-01 - https://testcluster2-01:6443 
2 Cluster START testcluster3-01 - https://testcluster3-01:6443 
3 Cluster START testcluster5-01 - https://testcluster5-01:6443 
0 Cluster START testcluster6-01 - https://testcluster-02:6443 
------------------------------------------------------------------------------
2 Cluster REPORT testcluster3-01 - https://testcluster3-01:6443 
------------------------------------------------------------------------------
  testcluster3-01 kubernetes_namespaces TABLE
  testcluster3-01 kubernetes_namespaces Rows=19, Duration=1.05929269s
  testcluster3-01 kubernetes_nodes TABLE
  testcluster3-01 kubernetes_nodes Rows=6, Duration=976.341347ms
  testcluster3-01 kubernetes_pods TABLE
  testcluster3-01 kubernetes_pods Rows=204, Duration=1.688286795s
  testcluster3-01 kubernetes_pod_containers TABLE
  testcluster3-01 kubernetes_pod_containers Rows=274, Duration=1.883914113s

5 Cluster START testcluster5 - https://testcluster5:6443 
------------------------------------------------------------------------------
4 Cluster REPORT testcluster-01 - https://testcluster-01:6443 
------------------------------------------------------------------------------
  testcluster-01 kubernetes_namespaces TABLE
  testcluster-01 kubernetes_namespaces Rows=33, Duration=894.902131ms
  testcluster-01 kubernetes_nodes TABLE
  testcluster-01 kubernetes_nodes Rows=60, Duration=970.209593ms
  testcluster-01 kubernetes_pods TABLE
  testcluster-01 kubernetes_pods Rows=647, Duration=2.524690614s
  testcluster-01 kubernetes_pod_containers TABLE
  testcluster-01 kubernetes_pod_containers Rows=953, Duration=3.389727493s

...