Azure/aks-periscope

Memory requested too high in aks-periscope daemonSet

Closed this issue · 5 comments

Describe the bug
[Not a bug but a potential point of improvement]

$ az aks kollect -n xxx -g xxx --kube-objects 'default/deployment/ratings-v1' --storage-account xxx
...
$ kubectl get -n aks-periscope ds aks-periscope -o yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: aks-periscope
...
          resources:
            limits:
              cpu: '1'
              memory: 2000Mi
            requests:
              cpu: 250m
              memory: 500Mi      # <<<< this request
...

From an AKS user's practical perspective, this memory request is too high for a debugging feature/plugin. Consumers like omsagent and osm-controller cost around 250Mi, and even that is considered high.

To Reproduce
Deploy periscope through az aks kollect (in preview).

Expected behavior
Consider:

  • Use fewer subprocesses by collecting things sequentially
  • The debug deploy does not have to be a DaemonSet (ds).

Hi @kuzhao - thank you for highlighting this. We'll do some investigation to see how much memory we actually need, as I suspect you're right that it could be significantly lower.

Quick question: what's the 'debug deploy' you're referring to in your last bullet point?

Appreciate the attention, Peter. For "debug deploy", I might be using a clumsy term, IMO... I'm just referring to aks-periscope itself (deployed via az aks kollect for debugging, hence 'debug deploy').
To expand on this bullet: the periscope DaemonSet runs on all nodes, even though I specified a single k8s object to debug. Ideally, periscope should launch pods and collect only on the nodes where the pods of the target k8s object are running.
The alternative could be either a ReplicaSet (rs) with podAffinity or a DaemonSet (ds) with nodeSelector.
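
As a rough illustration of the second option, a DaemonSet restricted by nodeSelector might look like the sketch below. This is only a sketch under assumptions: the `periscope-target` node label, the `app: aks-periscope` pod label, the container name, and the image placeholder are all hypothetical and made up for this example. Only `nodeSelector` itself is a standard pod spec field.

```yaml
# Sketch only: run the collector just on nodes carrying a chosen label.
# The nodes hosting the target pods would need to be labeled first, e.g.:
#   kubectl label node <node-name> periscope-target=true
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: aks-periscope
  namespace: aks-periscope
spec:
  selector:
    matchLabels:
      app: aks-periscope            # hypothetical pod label
  template:
    metadata:
      labels:
        app: aks-periscope          # must match the selector above
    spec:
      # Standard pod-spec field: schedule only onto nodes with this label.
      nodeSelector:
        periscope-target: "true"    # hypothetical node label
      containers:
        - name: aks-periscope       # hypothetical container name
          image: "<periscope-image>"  # placeholder, not a real image reference
```

Note that nodeSelector does not track where the target pods run: the node label would have to be kept in sync by hand or by some other tooling if those pods get rescheduled.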

Hi @kuzhao - I've just merged a PR that sets lower requests (for non-Windows nodes), so that will be included in the next release (I'll update here when that happens).

We don't currently plan to support restricting Periscope deployments to specific nodes. Even if you select a specific k8s object in the configuration, it still collects a lot of other data/logs from all the nodes, by design.

If that is a need for you, feel free to submit a feature request issue. It's always good to understand people's use cases. @Tatsinnit might recall if that's ever been discussed before?

Thank you both so much for this discussion ❤️ 🙏 Thanks @peterbom for the question, and sorry for the delayed reply. To answer it: no, we had no plan for this in the early days of this tool. On day 1 we were focused on a broadcast mechanism, hence the use of a DaemonSet to run on all nodes.

💡 Just an idea: we could always move this to a discussion and see how many folks really want this scenario, etc. (what do you think?)

Moved the discussion to #225 - thank you @Tatsinnit and @kuzhao for the suggestions and feel free to contribute to the discussions.