rchakode/kube-opex-analytics

No cpuUsage and memUsage metrics showing up

agupta-ionos opened this issue · 19 comments

Hi Team,

I have installed Kubernetes Opex Analytics using Helm, but I am unable to see the metrics on the dashboard.

Could anyone help?

Error:
The following items failed:

No cpuUsage metric on node X.X.X.X

Hi,
Can you check the pod logs please?

ubuntu@ecs-test:~/kube-opex-analytics$ kubectl describe po -n kube-opex-analytics
Name:               kube-opex-analytics-845675bc58-6sm5j
Namespace:          kube-opex-analytics
Priority:           0
Node:               192.168.0.214/192.168.0.214
Start Time:         Fri, 26 Jul 2019 06:51:03 +0000
Labels:             app.kubernetes.io/instance=kube-opex-analytics
                    app.kubernetes.io/name=kube-opex-analytics
                    pod-template-hash=4012316714
Annotations:        kubernetes.io/availablezone: eu-de-01
Status:             Running
IP:                 172.16.0.14
Controlled By:      ReplicaSet/kube-opex-analytics-845675bc58
Containers:
  kube-opex-analytics:
    Container ID:   docker://f0b51b2e5f3e4f7bd189321a9688bb4fe84864a8836f72131b45c8d1e1884428
    Image:          rchakode/kube-opex-analytics:0.3.1
    Image ID:       docker-pullable://rchakode/kube-opex-analytics@sha256:3aad7d7c116da0f879a7099ef93e332d53ce16ecc230712cad04cf96403e1d18
    Port:           5483/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Fri, 26 Jul 2019 06:51:07 +0000
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:http/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:http/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      KOA_BILLING_CURRENCY_SYMBOL:  $
      KOA_BILLING_HOURLY_RATE:      0
      KOA_COST_MODEL:               CUMULATIVE_RATIO
      KOA_DB_LOCATION:              /data/db
      KOA_K8S_API_ENDPOINT:         https://kubernetes.default
      KOA_K8S_API_VERIFY_SSL:       false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-opex-analytics-token-s9dd5 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  kube-opex-analytics-token-s9dd5:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kube-opex-analytics-token-s9dd5
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason                 Age                From                    Message
  ----    ------                 ----               ----                    -------
  Normal  Scheduled              69s                default-scheduler       Successfully assigned kube-opex-analytics/kube-opex-analytics-845675bc58-6sm5j to 192.168.0.214
  Normal  Pulling                67s                kubelet, 192.168.0.214  pulling image "rchakode/kube-opex-analytics:0.3.1"
  Normal  Pulled                 66s                kubelet, 192.168.0.214  Successfully pulled image "rchakode/kube-opex-analytics:0.3.1"
  Normal  SuccessfulMountVolume  65s (x2 over 69s)  kubelet, 192.168.0.214  Successfully mounted volumes for pod "kube-opex-analytics-845675bc58-6sm5j_kube-opex-analytics(bc3da2dc-af71-11e9-96d3-fa163efd36b7)"
  Normal  SuccessfulCreate       65s                kubelet, 192.168.0.214  Created container
  Normal  Started                65s                kubelet, 192.168.0.214  Started container
  Normal  Healthy                60s (x2 over 62s)  kubelet, 192.168.0.214  container docker://f0b51b2e5f3e4f7bd189321a9688bb4fe84864a8836f72131b45c8d1e1884428 in health status

joalx commented

Hi @anubhav25gupta,

To get pod logs, use kubectl logs instead of kubectl describe.

I don't know which data just got populated, but the header still shows an error: No cpuUsage metric on node X.X.X.X


joalx commented

What is your version of Kubernetes?
You seem to have an RBAC issue when accessing the metrics API:

2019-07-26 07:26:10,490 - kube-opex-analytics - ERROR - call to https://kubernetes.default/apis/metrics.k8s.io/v1beta1/nodes returned error ({"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"nodes.metrics.k8s.io is forbidden: User \"system:anonymous\" cannot list resource \"nodes\" in API group \"metrics.k8s.io\" at the cluster scope","reason":"Forbidden","details":{"group":"metrics.k8s.io","kind":"nodes"},"code":403}
)
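For readers following along: the body of that 403 is a standard Kubernetes `Status` object, and the user it names matters. `system:anonymous` is the identity the API server gives to requests that arrive without credentials, so the error typically points at a missing bearer token rather than at a role that lacks a rule. A minimal Python sketch for pulling those fields out of a logged body; the function name `summarize_forbidden` is made up for this illustration and is not part of kube-opex-analytics:

```python
import json

# Response body copied from the log line above.
SAMPLE_BODY = (
    r'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",'
    r'"message":"nodes.metrics.k8s.io is forbidden: User \"system:anonymous\" '
    r'cannot list resource \"nodes\" in API group \"metrics.k8s.io\" at the cluster scope",'
    r'"reason":"Forbidden","details":{"group":"metrics.k8s.io","kind":"nodes"},"code":403}'
)


def summarize_forbidden(status_json):
    """Extract the HTTP code, reason and rejected user from a Status body."""
    status = json.loads(status_json)
    message = status.get("message", "")

    # The API server quotes the user name in the message: User "<name>" cannot ...
    user = None
    marker = 'User "'
    start = message.find(marker)
    if start != -1:
        start += len(marker)
        end = message.find('"', start)
        if end != -1:
            user = message[start:end]

    return {
        "code": status.get("code"),
        "reason": status.get("reason"),
        "user": user,
        # True when the request was treated as unauthenticated.
        "anonymous": user == "system:anonymous",
    }


if __name__ == "__main__":
    print(summarize_forbidden(SAMPLE_BODY))
```

If `anonymous` comes back true, it is worth checking whether the service account token is actually sent with the request before changing any roles.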

The Kubernetes version is v1.11.3-r1.sp2.

It's CCE 2.0.

joalx commented

Do you use microk8s?
Either way, check that the metrics API is working. If it is, review your ClusterRole and ClusterRoleBinding to confirm that kube-opex-analytics can access the metrics API.
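One way to run that check from inside the pod is to call the metrics API directly with the pod's service account token. The sketch below is an illustration only, not kube-opex-analytics' own code: the function names are hypothetical, the endpoint and the SSL setting mirror the `KOA_K8S_API_ENDPOINT` and `KOA_K8S_API_VERIFY_SSL` values from the pod description, and the token path assumes the standard in-pod location under the mounted `serviceaccount` directory.

```python
import json
import ssl
import urllib.request

# Standard in-pod location of the service account token (assumed here).
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"
# Path that appears in the error log of this issue.
METRICS_PATH = "/apis/metrics.k8s.io/v1beta1/nodes"


def build_metrics_request(api_endpoint, token):
    """Build (but do not send) an authenticated GET for node metrics."""
    request = urllib.request.Request(api_endpoint.rstrip("/") + METRICS_PATH)
    # Without this header the API server treats the caller as anonymous.
    request.add_header("Authorization", "Bearer " + token)
    return request


def fetch_node_metrics(api_endpoint, token_path=TOKEN_PATH, verify_ssl=False):
    """Send the request and return the decoded JSON response.

    urllib raises urllib.error.HTTPError on a 403, so an RBAC problem
    surfaces as an exception carrying the Status body.
    """
    with open(token_path) as token_file:
        token = token_file.read().strip()

    context = ssl.create_default_context()
    if not verify_ssl:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE

    request = build_metrics_request(api_endpoint, token)
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_node_metrics("https://kubernetes.default")` from a shell inside the pod should either return a node metrics list or fail with a 403 that names the service account instead of `system:anonymous`, which separates "no token sent" from "token sent but not authorized".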

joalx commented

We haven't tested kube-opex-analytics with CCE. I suspect that CCE handles RBAC in its own way, or perhaps the metrics API is not enabled by default.

Please investigate in this direction.

I don't use microk8s.

I have checked the ClusterRole and ClusterRoleBindings, and everything looks normal to me.

Kindly advise me on how to enable the metrics API.

Hi,

Please check the below note about the metrics server.

I set the prometheusOperator option to enabled (true) in the Helm values.yaml file during deployment.

I am not able to apply the prometheus.yml file manually. I get the error below:

ubuntu@ecs-test:~/kube-opex-analytics/tests/prometheus$ kubectl apply -f prometheus.yml
error: error validating "prometheus.yml": error validating data: [apiVersion not set, kind not set]; if you choose to ignore these errors, turn validation off with --validate=false

prometheus.yaml file:

global:
  scrape_interval: 15s # By default, scrape targets every 15 seconds.
  evaluation_interval: 15s

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'codelab-monitor'

rule_files:
  # - "first.rules"
  # - "second.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label job=<job_name> to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    static_configs:
      - targets: ['80.158.3.8:9090']

  - job_name: 'kube-opex-analytics'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 300s

    static_configs:
      - targets: ['80.158.3.8:5483']
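A note on the validation error above: `kubectl apply` expects a Kubernetes manifest, and every manifest sets `apiVersion` and `kind` at the top level. The file here is a Prometheus server configuration, which has neither, so kubectl rejects it regardless of cluster state. A rough Python sketch of that check, assuming a conventionally indented YAML file and using hypothetical helper names (a real tool would use a YAML parser):

```python
import re


def top_level_keys(yaml_text):
    """Return keys that start in column 0, i.e. top-level mapping keys.

    This is a simplification that only looks at unindented `key:` lines.
    """
    keys = []
    for line in yaml_text.splitlines():
        match = re.match(r"^([A-Za-z_][\w-]*):", line)
        if match:
            keys.append(match.group(1))
    return keys


def missing_manifest_fields(yaml_text):
    """List which of the fields kubectl complained about are absent."""
    keys = top_level_keys(yaml_text)
    return [field for field in ("apiVersion", "kind") if field not in keys]


PROMETHEUS_CONFIG = """\
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'kube-opex-analytics'
    static_configs:
      - targets: ['80.158.3.8:5483']
"""

if __name__ == "__main__":
    print(missing_manifest_fields(PROMETHEUS_CONFIG))
```

A Prometheus configuration is normally passed to the Prometheus server itself (for example via its `--config.file` flag) rather than applied to the cluster as a resource.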
joalx commented

The problem is not with Prometheus metrics but with the Kubernetes metrics API.
kube-opex-analytics is not able to access that API to retrieve metrics, most likely because of RBAC permissions.

OK, does that mean only the RBAC settings, i.e. those in the values.yaml file, have to be changed?

joalx commented

Following up on the error found in the pod's logs, this link may help you fix the access to your Kubernetes metrics API: kubernetes-sigs/metrics-server#95

It's a similar error.

Hi,

Where do I need to paste this?

Use kubeconfig to start metrics-server. Add the following command to the container (in its YAML):
command:
  - /metrics-server
  - --kubelet-insecure-tls
  - --kubeconfig=/key/kubeconfig

--kubeconfig=/key/kubeconfig makes metrics-server use the specified kubeconfig. Make sure that /key/kubeconfig inside the container holds the kubeconfig content, for example by mounting it as a volume.

I don't understand which deployment file this should go in.

Hi all,
@anubhav25gupta these settings have to be applied to your Kubernetes installation.
In your case, with CCE, that may mean asking your cloud provider.

We don't have access to a CCE cluster to reproduce and possibly fix this issue.