Beyla traces not showing up
revathyrams opened this issue · 15 comments
Hey! I'm running Beyla on a local Kubernetes (Colima) cluster on an M2 Mac.
I'm following the example here with just a few tweaks to the beyla-config to enable debug logs and discover k8s namespaces:
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: beyla
  name: beyla-config
data:
  beyla-config.yml: |
    # this is required to enable kubernetes discovery and metadata
    attributes:
      kubernetes:
        enable: true
    # this will provide automatic routes report while minimizing cardinality
    print_traces: true
    log_level: DEBUG
    routes:
      unmatched: heuristic
    discovery:
      services:
        - k8s_namespace: beyla
        - k8s_namespace: default
        - k8s_namespace: kube-system
From the logs, I think Beyla is able to watch pods in the other namespaces:
level=DEBUG msg="inserting pod" component=kube.Metadata informer=Pod name=beyla-lzgxh namespace=beyla uid=a4df9638-84ce-4b32-a3d8-b6ad2148f375 owner=k8s.daemonset.name:beyla node=colima startTime="2024-07-26 00:05:29 +0000 UTC" containerIDs=[5c9dbe4c8a9efd132c26dfee176765a6df8bfb6bf8fc4a258a49ebcbe55aeb17]
But I do not see any traces in Grafana Cloud. How should I debug further?
I do see these logs about traces
time=2024-07-26T00:19:52.441Z level=DEBUG msg="new process" component=discover.WatcherKubeEnricher pid=1244
time=2024-07-26T00:19:52.441Z level=DEBUG msg="Found namespace" component=httpfltr.Tracer nsPid=pid:[4026531836]
time=2024-07-26T00:19:52.441Z level=DEBUG msg="new process" component=discover.WatcherKubeEnricher pid=2968
time=2024-07-26T00:19:52.441Z level=DEBUG msg="Found namespace" component=httpfltr.Tracer nsPid=pid:[4026531836]
time=2024-07-26T00:19:52.441Z level=DEBUG msg="can't get container info for PID" component=discover.WatcherKubeEnricher pid=2968 error="/proc/2968/cgroup: couldn't find any docker entry for process with PID 2968"
Do these logs mean Beyla is able to receive the traces but unable to send them to Grafana Cloud?
It's important to see if there are any errors/warnings reported in the Beyla logs; they might be hidden between the debug messages. Not seeing data in Grafana Cloud typically means Beyla is unable to send it, usually because of wrong credentials or a wrong OTLP endpoint.
I couldn't find any error/warning logs. Just a debug level log of
level=DEBUG msg="can't get container info for PID" component=discover.watcherKubeEnricher pid=141004 error="/proc/141004/cgroup: couldn't find any docker entry for process with PID 141004"
I double checked the credentials and OTLP endpoint too, and they look right.
Do you have any other pointers to debug this? Are there any specific logs to expect when Beyla creates a trace or sends it to the OTLP endpoint?
I don't see other logs besides these
time=2024-07-26T18:06:24.808Z level=DEBUG msg="process stopped" component=discover.watcherKubeEnricher pid=134837
time=2024-07-26T18:06:24.808Z level=DEBUG msg="filtering processes" component=discover.CriteriaMatcher len=1
time=2024-07-26T18:06:24.808Z level=DEBUG msg="deleted untracked process. Ignoring" component=discover.CriteriaMatcher pid=134837
time=2024-07-26T18:06:24.808Z level=DEBUG msg="processes matching selection criteria" component=discover.CriteriaMatcher len=0
Thanks for the info. Let's first see if Beyla is capturing anything. Can you please set BEYLA_PRINT_TRACES=1? This will show on the standard output which requests Beyla is capturing. If we see output there, then it's likely something with the connection to the cloud; otherwise it's something to do with the configuration of Beyla.
Attaching a full log will also help; it doesn't have to be with BEYLA_LOG_LEVEL=debug.
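For reference, on a Kubernetes deployment this variable would normally go in the env list of the Beyla container. A minimal sketch, assuming the container is named beyla (the name is illustrative); only the BEYLA_PRINT_TRACES entry matters here:

```yaml
# Fragment of a DaemonSet pod spec; the container name is an assumption.
containers:
  - name: beyla
    env:
      # Ask Beyla to print every captured request to standard output
      - name: BEYLA_PRINT_TRACES
        value: "1"
```

If captured requests then show up in the pod's standard output, the problem is more likely on the export side than in discovery.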
I already have print_traces: true in my Beyla config.
Attaching the complete logs here, in debug level:
logs.txt
It seems to be a permissions issue: we don't find any services we can instrument. Can you please share with us the full DaemonSet YAML file (clearing any secrets)?
Sure, here's the full beyla-config file
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: beyla
  name: beyla-config
data:
  beyla-config.yml: |
    attributes:
      kubernetes:
        enable: true
    print_traces: true
    log_level: INFO
    routes:
      unmatched: heuristic
    discovery:
      services:
        - k8s_namespace: beyla
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  namespace: beyla
  name: beyla
spec:
  selector:
    matchLabels:
      instrumentation: beyla
  template:
    metadata:
      labels:
        instrumentation: beyla
    spec:
      serviceAccountName: beyla
      hostPID: true # mandatory!
      containers:
        - name: beyla
          image: grafana/beyla:latest
          imagePullPolicy: IfNotPresent
          securityContext:
            privileged: true # mandatory!
            readOnlyRootFilesystem: true
          volumeMounts:
            - mountPath: /config
              name: beyla-config
            - mountPath: /var/run/beyla
              name: var-run-beyla
          env:
            - name: BEYLA_CONFIG_PATH
              value: "/config/beyla-config.yml"
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: "redacted"
            - name: OTEL_EXPORTER_OTLP_HEADERS
              value: "Authorization=Basic redacted"
      volumes:
        - name: beyla-config
          configMap:
            name: beyla-config
        - name: var-run-beyla
          emptyDir: {}
And this is the ServiceAccount, with its ClusterRole and ClusterRoleBinding:
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: beyla
  name: beyla
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: beyla
rules:
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["list", "watch"]
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: beyla
subjects:
  - kind: ServiceAccount
    name: beyla
    namespace: beyla
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: beyla
I think I might have to give additional cluster role permissions for services and deployments:
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["list"]
  - apiGroups: ["*"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch"]
I still don't see any traces in the logs. Do any permissions besides the ones mentioned here need to be set?
Not sure if this is relevant, but I noticed this change during an upgrade, and there's no mention of it:
print_traces: true (removed)
trace_printer: text (new)
https://github.com/grafana/beyla/pull/1035/files
The corresponding docs/notes haven't been updated.
@tbgbeansbot I don't believe the print_traces changes are related. As a matter of fact, print_traces should still work: if it's set, it has the same effect as trace_printer=text. We just deprecated it, and you should now get a warning when using it.
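In config-file terms, the change would look roughly like this. This is a sketch based only on the two option names mentioned in this thread, with the rest of the Beyla config file omitted:

```yaml
# Deprecated: reportedly still accepted, but logs a warning when used.
# print_traces: true

# Replacement with the same effect:
trace_printer: text
```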
By the way, we've updated these docs - if I missed anything on that front, could you point me to it so that I can update it?
Some points:
- I recently had problems with freshly created Colima instances. It seems something has changed in the way Colima (or its Lima backend) creates the VMs. Could you give Rancher Desktop + Kind a try? I also verified that Docker Desktop currently works well.
- Can you share the debug logs? The deployment files for your applications would also help.
We had a meeting and managed to get Beyla working in your environment, so I'm closing the task now. @revathyrams feel free to reopen it if you think it should not be closed.
thanks @mariomac! We tried yesterday in GKE and that works perfectly well!
But in my local setup (Colima on Mac) the error is:
level=ERROR msg="Unable to load eBPF watcher for process events" component=discover.ProcessWatcher interval=5s error="instrumenting function \"sys_bind\": setting kprobe: creating perf_kprobe PMU (arch-specific fallback for \"sys_bind\"): token sys_bind: opening perf event: permission denied"
I'm not focused on setting it up locally right now since it works fine in Google Cloud, so let the issue be closed 👍
Hi @revathyrams, it looks like your local Linux requires the SYS_ADMIN capability. Under certain Linux kernel configurations, you would need it anyway.
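For a container that does not run with privileged: true, a capability is granted through the container's securityContext. A minimal sketch using standard Kubernetes fields; the container name is illustrative, and any other capabilities Beyla may need depend on the kernel and Beyla version, so they are not listed here:

```yaml
# Fragment of a pod spec; the container name is an assumption.
containers:
  - name: beyla
    securityContext:
      capabilities:
        add:
          # Grants CAP_SYS_ADMIN to this container only
          - SYS_ADMIN
```

A container with privileged: true already holds every capability, so this fragment only matters for unprivileged setups.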
However, we recently realized that the latest versions of Colima don't work well with the pattern we use to run unprivileged containers (one pod mounting the BPF filesystem and another using it).
I verified that Rancher Desktop and Docker Desktop would work. You can follow the instructions here: https://github.com/mariomac/local-beyla-demo/