kubernetes-sigs/kind

Resource Metrics API (metrics-server)

hjacobs opened this issue · 45 comments

The metrics-server should be installed out of the box to enable kubectl top .. and tools like kube-ops-view. I tried to install the metrics-server, but it did not work for me; it just logs errors like:

E0322 20:17:29.205246       1 reststorage.go:129] unable to fetch node metrics for node "kind-control-plane": no metrics known for node
E0322 20:17:29.212924       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/kube-proxy-gwhkb: no metrics known for pod
E0322 20:17:29.212948       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/weave-net-w2lsb: no metrics known for pod
E0322 20:17:29.212954       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/metrics-server-fc6d4999b-6mz4l: no metrics known for pod
E0322 20:17:29.212960       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/coredns-86c58d9df4-xl5rm: no metrics known for pod
E0322 20:17:29.212966       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/kube-scheduler-kind-control-plane: no metrics known for pod
E0322 20:17:29.212971       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/kube-controller-manager-kind-control-plane: no metrics known for pod
E0322 20:17:29.212977       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/kube-apiserver-kind-control-plane: no metrics known for pod
E0322 20:17:29.212982       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/etcd-kind-control-plane: no metrics known for pod
E0322 20:17:29.212988       1 reststorage.go:144] unable to fetch pod metrics for pod kube-system/coredns-86c58d9df4-c8nxt: no metrics known for pod

Our out of the box install is currently kubeadm + cni. I'd want to know why kubeadm doesn't configure it by default first 🤔

I really want to replace my local Minikube environment with kind and the Metrics API is essential for this to happen 😄

I would also be fine with the Metrics API being an "addon" which can be easily enabled via the kind CLI. Minikube does it like this with the old (deprecated) "heapster" name: https://github.com/kubernetes/minikube/blob/master/docs/addons.md

try the manifests from here:
https://github.com/luxas/kubeadm-workshop/tree/master/demos/monitoring

but kubeadm is not bundling metrics support, because the kubeadm cluster is a minimal viable one - i.e. no addons except kube-proxy and a DNS server.

/triage support

@neolit123 thanks for the hint, but that manifest (which runs the old v0.2.1 of metrics server btw) also did not work for me. The error from metrics server:

E0323 09:05:05.004941       1 summary.go:97] error while getting metrics summary from Kubelet kind-control-plane(172.17.0.2:10255): Get http://172.17.0.2:10255/stats/summary/: dial tcp 172.17.0.2:10255: getsockopt: connection refused
E0323 09:06:05.006613       1 summary.go:97] error while getting metrics summary from Kubelet kind-control-plane(172.17.0.2:10255): Get http://172.17.0.2:10255/stats/summary/: dial tcp 172.17.0.2:10255: getsockopt: connection refused
E0323 09:07:05.007664       1 summary.go:97] error while getting metrics summary from Kubelet kind-control-plane(172.17.0.2:10255): Get http://172.17.0.2:10255/stats/summary/: dial tcp 172.17.0.2:10255: getsockopt: connection refused
E0323 09:08:05.003351       1 summary.go:97] error while getting metrics summary from Kubelet kind-control-plane(172.17.0.2:10255): Get http://172.17.0.2:10255/stats/summary/: dial tcp 172.17.0.2:10255: getsockopt: connection refused

I'm confused on what the correct way to fix this is: apparently the old version of metrics server (v0.2.1) is still the way to go? The flags differ between the current (v0.3.1) and the old version (v0.2.1), also not sure if I need to additionally deploy cadvisor (e.g. like here)?

Hoping for someone more knowledgeable than me to help here 😄

sorry, I'm not particularly familiar with this, I'll see if I can find someone who is

@neolit123 I tried the 0.3.1 manifests before and did not succeed so far 😞

@hjacobs
please add these flags to your metrics-server deployment

    args:
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP

@bugbuilder yeah! That (adding the two args) worked with the official metrics server manifests in https://github.com/kubernetes-incubator/metrics-server/tree/master/deploy/1.8%2B 🎉

$ kubectl top nodes
NAME                 CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
kind-control-plane   374m         4%     1104Mi          6%        
$ kubectl top pod --all-namespaces
NAMESPACE     NAME                                         CPU(cores)   MEMORY(bytes)   
kube-system   coredns-86c58d9df4-cmlzl                     6m           8Mi             
kube-system   coredns-86c58d9df4-jzr2n                     6m           9Mi             
kube-system   etcd-kind-control-plane                      46m          32Mi            
kube-system   kube-apiserver-kind-control-plane            75m          357Mi           
kube-system   kube-controller-manager-kind-control-plane   89m          66Mi            
kube-system   kube-proxy-vfxds                             11m          15Mi            
kube-system   kube-scheduler-kind-control-plane            27m          12Mi            
kube-system   metrics-server-76db6db868-mp8tb              2m           13Mi            
kube-system   weave-net-srjkt                              4m           120Mi           

kube-ops-view now also works fine on kind 🎉
[Screenshot: Screenshot_2019-03-31_11-14-20]

So how can we get the Resource Metrics API deployment packaged into the default kind cluster creation (or as addon)?

@hjacobs I just created kind-dev (WIP), which will help me with addons like metallb, ingress-nginx, metrics, etc.

I created a gist for the working Metrics Server API deployment manifests, so anybody can just try it out:

kubectl apply -f https://gist.githubusercontent.com/hjacobs/69b6844ba8442fcbc2007da316499eb4/raw/5b8678ac5e11d6be45aa98ca40d17da70dcb974f/kind-metrics-server.yaml

@bugbuilder thanks, but ideally standard addons should be part of the kind CLI (like kind enable addon metrics-server or similar). This is how Minikube does it.

Yep, but in the meantime I need something to start working with Kind. #253

@hjacobs thanks for your manifest

Hello, thank you for your manifest! Does anyone know whether it is possible to create a ServiceMonitor to scrape the metrics from metrics-server? So far, when I apply your manifest and a ServiceMonitor like the following one, I get: server returned HTTP status 403 Forbidden


---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  endpoints:
  - interval: 10s
    port: https-metrics
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
  jobLabel: k8s-app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      k8s-app: metrics-server


HPA tests require the metrics server and are now in conformance. I want to see whether kubeadm should be shipping this by default, but one way or another kind needs to ship the metrics APIs :-)
/assign
cc @neolit123

@BenTheElder
my personal vote is -1 on including the metrics server addon as part of kind or kubeadm deployments by default.

@neolit123 why? We need metrics APIs to run Kubernetes tests at parity with the existing test setups, and it seems these are commonly available / depended on.

Note: we do not aim for mere conformance, but conformance is definitely the minimum bar.

definitely out of scope for kubeadm. a metrics system is not an essential dependency.
kind(er) might enable it if conformance demands it.

We need metrics APIs to run Kubernetes tests at parity with the existing test setups, and it seems these are commonly available / depended on

can you show examples?

HPA tests require the metrics server and are now in conformance,

latest comments suggest that this will be revised, possibly metrics tests will be skipped if no metrics system is present. i think that creating a mock metrics system in the framework is better but someone needs the bandwidth to work on that.

kind(er) might enable it if conformance demands it.

Conformance OR reasonable testing / user usage, including HPA / dashboard / ...

can you show examples?

The test that was promoted to conformance was already one of the default presubmit tests for Kubernetes (see also #701), eventually we should be able to run ~all of these (relatively trivial things block this currently, such as node exec mechanisms, we're working on it).

latest comments suggest that this will be revised, possibly metrics tests will be skipped if no metrics system is present.

That means we test less. A) currently skipping is banned in conformance and B) we don't want to do less testing with kind because we can't be bothered to configure something that enables a default-enabled v1 core API to function. We only want to skip tests that we can't technically accomplish.

I gather some people think this should not be a core API, but it already is and v1 / default available at this point so that seems like a moot argument.

i think that creating a mock metrics system in the framework is better but someone needs the bandwidth to work on that.

Metrics can't be in the framework as it's HPA reading them? I don't think we should HPA with fake metrics.

these are topics for sig-arch(conformance), testing commons and possibly sig-instrumentation.
i personally do not like that HPA requires a metrics system to begin with.

if we have to add the metrics server for kind(er) deployments so be it, but for kubeadm we cannot, as neither HPA nor the metrics server is essential for the "minimal viable cluster" concept.

these are topics for sig-arch(conformance), testing commons and possibly sig-instrumentation.
i personally do not like that HPA requires a metrics system to begin with.

HPA requiring a metrics system or not is at best sig-instrumentation, but realistically I don't see that changing. (also unclear that there's anything actually wrong with that).

if we have to add the metrics server for kind(er) deployments so be it, but for kubeadm we cannot, as neither HPA nor the metrics server is essential for the "minimal viable cluster" concept.

Sure, I can agree that kubeadm does not necessarily need to handle this, but I doubt we're going to significantly change HPA at this point and kind not supporting it seems less useful than kind supporting it, regardless of how individual test cases are handled.

As a kubernetes dev I should be able to legitimately test everything and as an end user I should be able to leverage common APIs when testing / developing my application.

Technically a "minimum viable cluster" for some form of "viable" (conformance?) does not need dynamic PVC either, but in practice not having it has been problematic for lots of real, otherwise relatively portable usage. We'll likely need to fix that for kind as well (though I'd expect it to also be out of scope for kubeadm).

also unclear that there's anything actually wrong with that

reading the docs it seems quite coupled to metrics, so i also don't expect this to change.

As a kubernetes dev I should be able to legitimately test everything and as an end user I should be able to leverage common APIs when testing / developing my application.

that is true. HPA on its own as a feature seems great to be part of conformance, but i don't think the metrics server addon requirement should be part of conformance. if yes, then all deployers have to enable it (if not already) - kops, kubeadm, kubespray, etc. i guess this will be decided before 1.16.

does not need dynamic PVC either

interestingly i have not seen kubeadm feature requests to enable PVC, while we do get the occasional "please enable {PSP|PDB|metric-server|dashboard}".

tentatively revisiting this during the next release. might slip to later.

If you came here because you want to deploy metrics-server to your kind cluster:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
kubectl patch deployment metrics-server -n kube-system -p '{"spec":{"template":{"spec":{"containers":[{"name":"metrics-server","args":["--cert-dir=/tmp", "--secure-port=4443", "--kubelet-insecure-tls","--kubelet-preferred-address-types=InternalIP"]}]}}}}'

This will deploy metrics-server 0.3.6 and patch the deployment to fix kubernetes-sigs/metrics-server#131. It worked for me using kind 0.7.0.

I tried the above commands suggested by @rabenhorst but they don't work for me :(

commands:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
kubectl patch deployment metrics-server -n kube-system -p '{"spec":{"template":{"spec":{"containers":[{"name":"metrics-server","args":["--cert-dir=/tmp", "--secure-port=4443", "--kubelet-insecure-tls","--kubelet-preferred-address-types=InternalIP"]}]}}}}'

logs:

I0502 13:59:19.147717       1 serving.go:312] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0502 13:59:19.620092       1 secure_serving.go:116] Serving securely on [::]:4443

After I run kubectl top nodes, these additional logs appear:

E0502 14:00:08.367258       1 reststorage.go:135] unable to fetch node metrics for node "kind-control-plane": no metrics known for node
E0502 14:00:08.367286       1 reststorage.go:135] unable to fetch node metrics for node "kind-worker": no metrics known for node
E0502 14:00:08.367292       1 reststorage.go:135] unable to fetch node metrics for node "kind-worker2": no metrics known for node
E0502 14:00:09.866095       1 reststorage.go:135] unable to fetch node metrics for node "kind-control-plane": no metrics known for node
E0502 14:00:09.866125       1 reststorage.go:135] unable to fetch node metrics for node "kind-worker": no metrics known for node
E0502 14:00:09.866130       1 reststorage.go:135] unable to fetch node metrics for node "kind-worker2": no metrics known for node

I have a simple KinD configuration:

apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
  - role: control-plane
    image: kindest/node:v1.18.0@sha256:0e20578828edd939d25eb98496a685c76c98d54084932f76069f886ec315d694
  - role: worker
    image: kindest/node:v1.18.0@sha256:0e20578828edd939d25eb98496a685c76c98d54084932f76069f886ec315d694
  - role: worker
    image: kindest/node:v1.18.0@sha256:0e20578828edd939d25eb98496a685c76c98d54084932f76069f886ec315d694

KinD version:
kind v0.7.0 go1.13.6 darwin/amd64

Just leaving this here since I (and anyone else who might be using Ansible with KinD to deploy metrics-server) will probably find it in search results again at some point: to add the two required options to the official metrics-server manifest prior to deploying it into the cluster, here are a few Ansible tasks you can use. Hacky, but I like to rely on the official upstream manifest for my testing:

---
- name: Download metrics-server manifest.
  get_url:
    url: https://github.com/kubernetes-sigs/metrics-server/releases/download/{{ metrics_server_version }}/components.yaml
    dest: /tmp/metrics-server.yaml
    mode: 0644

- name: Modify the manifest to allow insecure TLS for testing.
  lineinfile:
    path: /tmp/metrics-server.yaml
    state: present
    regexp: "^.+{{ item }}$"
    line: "          - --{{ item }}"
    insertafter: "^.+args:$"
  with_items:
    - kubelet-preferred-address-types=InternalIP
    - kubelet-insecure-tls

- name: Deploy metrics-server into the cluster.
  community.kubernetes.k8s:
    state: present
    src: /tmp/metrics-server.yaml
    wait: true
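
For anyone not using Ansible, the same manifest edit can be roughly sketched in plain Python. This is only an illustration under assumptions: the function name `add_container_args` is made up here, it inserts after the first `args:` line rather than the last, and it assumes the manifest has a single container with a block-style `args:` list.

```python
def add_container_args(manifest_text, flags):
    """Insert each flag as a list item directly after the first 'args:' line."""
    lines = manifest_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip() != "args:":
            continue
        # Reuse the indentation of the existing first list item, if there is one.
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        if nxt.lstrip().startswith("- "):
            indent = nxt[: len(nxt) - len(nxt.lstrip())]
        else:
            indent = line[: len(line) - len(line.lstrip())] + "  "
        # Skip flags that are already present so the edit can be repeated safely.
        existing = {l.strip() for l in lines}
        new_items = [f"{indent}- {flag}" for flag in flags if f"- {flag}" not in existing]
        return "\n".join(lines[: i + 1] + new_items + lines[i + 1 :]) + "\n"
    raise ValueError("no 'args:' line found in manifest")
```

The returned text could be written back to /tmp/metrics-server.yaml and applied with kubectl apply -f as usual.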

For posterity, here is the current solution with helm:

helm install ms stable/metrics-server -n kube-system --set=args={--kubelet-insecure-tls}

Sorry for necro. 🙇‍♀️

Circling back to this:

interestingly i have not seen kubeadm feature requests to enable PVC, while we do get the occasional "please enable {PSP|PDB|metric-server|dashboard}".

  • PSP is officially deprecated in the coming Kubernetes v1.21.0
  • PDB: actually not all that up on PDB, unclear on this one
  • metric-server: seems to be necessary, unless you want to use the prometheus adapter instead (which seems overkill for kind imo)
  • dashboard: we've had ~0 requests in kind, I think there are also a lot of alternatives in this space now.

PVC, on the other hand, is very commonly necessary and not going anywhere, but it is generally not difficult to deploy or is part of the cloud provider. It seems understandable that kubeadm wouldn't see demand for this, but kind absolutely did (and pretty much any other cluster-level tool, I think).

That said, I actually haven't seen that much demand yet.

that is true. HPA on its own as a feature seems great to be part of conformance, but i don't think the metrics server addon requirement should be part of conformance. if yes, then all deployers have to enable it (if not already) - kops, kubeadm, kubespray, etc. i guess this will be decided before 1.16.

To be clear: conformance merely required that some implementation must be present. There was never an attempt to make it metrics-server specific. In fact the PR was from redhat / openshift, and IIRC they ship the prometheus adapter instead of metrics-server.

AIUI kops, kubespray and most managed solutions do already provide it (though perhaps optionally). minikube has it as an "addon" built-in, but not by default.

i don't love the insecure TLS that seems to be in use so far here, but I think we should evaluate how "heavy" this is to ship by default for those that aren't using it (memory, cpu, node image size...), and at the very least get a doc up somewhere similar to the loadbalancer guide we just added.

If you want to use metrics-server 0.5, the default chart registry is not updated anymore. Here is the latest config that works (I'm using Terraform to deploy it):

resource "helm_release" "metrics-server" {
  name       = "metrics-server"
  namespace  = "kube-system"
  repository = "https://charts.bitnami.com/bitnami"
  chart      = "metrics-server"
  version    = "5.8.11"

  set {
    name  = "extraArgs.kubelet-insecure-tls"
    value = true
  }

  set {
    name  = "extraArgs.kubelet-preferred-address-types"
    value = "InternalIP"
  }

  set {
    name  = "apiService.create"
    value = true
  }
}

If someone blindly uses the patch arguments above, they may not work with the latest metrics-server release.
I have created a gist with the steps I followed to get metrics-server 0.5.0 working:
https://gist.github.com/sanketsudake/a089e691286bf2189bfedf295222bd43
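
One possible reason the earlier patch breaks on newer releases (an assumption, not confirmed in this thread) is that a strategic merge patch replaces the container's whole args list, so any flags the newer manifest ships are dropped. A minimal sketch of an alternative is to build an RFC 6902 JSON Patch that only appends flags. The helper name `append_arg_ops` and the container index 0 are illustrative assumptions, and the approach assumes the container already has an args list.

```python
import json


def append_arg_ops(flags, container_index=0):
    """Build JSON Patch 'add' operations that append flags to a container's args."""
    # The trailing "/-" means "append to the end of the array" in JSON Patch.
    path = f"/spec/template/spec/containers/{container_index}/args/-"
    return [{"op": "add", "path": path, "value": flag} for flag in flags]


# Serialize the operations so they can be pasted into a kubectl command.
patch = json.dumps(append_arg_ops(["--kubelet-insecure-tls"]))
print(patch)
```

The printed JSON could then be passed to kubectl patch deployment metrics-server -n kube-system --type=json -p '<printed JSON>'. Running it twice would append the flag twice, so it is best used once on a fresh install.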

Until this is official, here is how I installed it today using helm cli:

helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/

helm repo update

helm upgrade --install --set args={--kubelet-insecure-tls} metrics-server metrics-server/metrics-server --namespace kube-system

This installs the latest version of the chart/app from https://artifacthub.io/packages/helm/metrics-server/metrics-server

Any updates on this?

Any updates on this?

If there were, they'd be posted here to this tracking issue 🙃

This is an optional addition (these metrics are not part of conformance), and possible to install after creating the cluster.
If someone wanted to write up a guide, we could point to it from the docs, which are under site/.

The current approach is not something we want to bake in by default. It will make the node images larger and is generally non-essential. The most recent suggested approach #398 (comment) looks straightforward to run after kind create cluster.

Hi hjaco,

Did not work for me.

PS C:\Users\lpolli> kubectl apply -f https://gist.githubusercontent.com/hjacobs/69b6844ba8442fcbc2007da316499eb4/raw/5b8678ac5e11d6be45aa98ca40d17da70dcb974f/kind-metrics-server.yaml
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
serviceaccount/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
resource mapping not found for name: "metrics-server:system:auth-delegator" namespace: "" from "https://gist.githubusercontent.com/hjacobs/69b6844ba8442fcbc2007da316499eb4/raw/5b8678ac5e11d6be45aa98ca40d17da70dcb974f/kind-metrics-server.yaml": no matches for kind "ClusterRoleBinding" in version "rbac.authorization.k8s.io/v1beta1"
ensure CRDs are installed first
resource mapping not found for name: "metrics-server-auth-reader" namespace: "kube-system" from "https://gist.githubusercontent.com/hjacobs/69b6844ba8442fcbc2007da316499eb4/raw/5b8678ac5e11d6be45aa98ca40d17da70dcb974f/kind-metrics-server.yaml": no matches for kind "RoleBinding" in version "rbac.authorization.k8s.io/v1beta1"
ensure CRDs are installed first
resource mapping not found for name: "v1beta1.metrics.k8s.io" namespace: "" from "https://gist.githubusercontent.com/hjacobs/69b6844ba8442fcbc2007da316499eb4/raw/5b8678ac5e11d6be45aa98ca40d17da70dcb974f/kind-metrics-server.yaml": no matches for kind "APIService" in version "apiregistration.k8s.io/v1beta1"
ensure CRDs are installed first
resource mapping not found for name: "metrics-server" namespace: "kube-system" from "https://gist.githubusercontent.com/hjacobs/69b6844ba8442fcbc2007da316499eb4/raw/5b8678ac5e11d6be45aa98ca40d17da70dcb974f/kind-metrics-server.yaml": no matches for kind "Deployment" in version "extensions/v1beta1"
ensure CRDs are installed first
PS C:\Users\lpolli>
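
The "no matches for kind" errors above indicate that the old gist uses API versions the cluster no longer serves (the v1beta1 RBAC and APIService versions and extensions/v1beta1 Deployments have been removed from current Kubernetes). A rough sketch for spotting these in a manifest before applying it follows. The function name `find_removed_apis` and the replacement table are illustrative and cover only the four kinds from the errors above; the scan is a plain regex over unquoted top-level keys, not a YAML parser.

```python
import re

# (old apiVersion, kind) -> replacement apiVersion; only the kinds seen in the errors above.
REPLACEMENTS = {
    ("rbac.authorization.k8s.io/v1beta1", "ClusterRoleBinding"): "rbac.authorization.k8s.io/v1",
    ("rbac.authorization.k8s.io/v1beta1", "RoleBinding"): "rbac.authorization.k8s.io/v1",
    ("apiregistration.k8s.io/v1beta1", "APIService"): "apiregistration.k8s.io/v1",
    ("extensions/v1beta1", "Deployment"): "apps/v1",
}


def find_removed_apis(manifest_text):
    """Return (kind, old apiVersion, suggested apiVersion) for each affected document."""
    findings = []
    # Split the multi-document manifest on '---' separator lines.
    for doc in re.split(r"(?m)^---\s*$", manifest_text):
        api = re.search(r"(?m)^apiVersion:\s*(\S+)", doc)
        kind = re.search(r"(?m)^kind:\s*(\S+)", doc)
        if api and kind:
            key = (api.group(1), kind.group(1))
            if key in REPLACEMENTS:
                findings.append((kind.group(1), api.group(1), REPLACEMENTS[key]))
    return findings
```

Changing the apiVersion alone may not be enough: apps/v1 Deployments also require an explicit spec.selector, so installing a current upstream release is likely the simpler route.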

Until this is official, here is how I installed it today using helm cli:

helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/

helm repo update

helm upgrade --install --set args={--kubelet-insecure-tls} metrics-server metrics-server/metrics-server --namespace kube-system

This installs the latest version of the chart/app from https://artifacthub.io/packages/helm/metrics-server/metrics-server

I've installed it successfully, but still i don't see any metrics in Lens. Do i have to configure something else?

I've installed it successfully, but still i don't see any metrics in Lens. Do i have to configure something else?

If kubectl top nodes works then metrics server is fine and everything else is configuration of other product. Raise it with Lens support.

I've installed it successfully, but still i don't see any metrics in Lens. Do i have to configure something else?

If kubectl top nodes works then metrics server is fine and everything else is configuration of other product. Raise it with Lens support.

you're right. The kubectl top nodes and kubectl top pods commands work successfully, it should be a configuration problem of Lens. Thanks!
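
The same check can be scripted by querying the Metrics API directly instead of reading kubectl top output. This is a minimal sketch, assuming kubectl is on the PATH and pointed at the kind cluster; the function names are made up for illustration.

```python
import json
import subprocess


def parse_node_metrics(raw_json):
    """Map node name -> (cpu, memory) usage strings from a NodeMetricsList document."""
    items = json.loads(raw_json).get("items", [])
    return {
        item["metadata"]["name"]: (item["usage"]["cpu"], item["usage"]["memory"])
        for item in items
    }


def fetch_node_metrics():
    """Query the Metrics API through kubectl; raises CalledProcessError if the call fails."""
    result = subprocess.run(
        ["kubectl", "get", "--raw", "/apis/metrics.k8s.io/v1beta1/nodes"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_node_metrics(result.stdout)
```

If `fetch_node_metrics()` returns an entry for every node, the metrics pipeline itself is working, and anything missing in a dashboard is more likely a configuration issue on that tool's side.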

Do we still need this? I agree, I don't think we want it included by default. The only action item I saw was potentially adding docs for adding it after cluster creation. But there has also been recent discussion about removing Ingress docs since it's not really within the scope, or resources, of the kind team to maintain docs for external projects. Unless there's something very unique and kind-specific, I'd think it would be better to leave it out.

I might be jumping the gun, but I'm going to close this. If one of the maintainers thinks there is something to be done in kind itself for this, please reopen.

/close

@stmcginnis: Closing this issue.

recent discussion about removing Ingress docs

#3625 (comment) I suppose.