`proc.ReadFds()` hangs
keisku opened this issue · 5 comments
Description
This function hangs, specifically at the call `dest, err := os.Readlink(path.Join(fdDir, entry.Name()))`.
Lines 17 to 40 in 8e1fa82
Reproduction
$ git log -1
commit 8e1fa825ad97ce88d587e8991cd8357c19f90dd4 (HEAD -> main, origin/main, origin/HEAD, dd-trace)
Author: Nikolay Sivko <n.sivko@gmail.com>
Date: Wed Dec 20 17:19:07 2023 +0300
CRI-O: fix container log discovery
$ uname -a
Linux ip-10-0-133-150 6.2.0-1017-aws #17~22.04.1-Ubuntu SMP Fri Nov 17 21:07:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
$ docker version
Client: Docker Engine - Community
Version: 24.0.7
API version: 1.43
Go version: go1.20.10
Git commit: afdd53b
Built: Thu Oct 26 09:07:41 2023
OS/Arch: linux/amd64
Context: default
Server: Docker Engine - Community
Engine:
Version: 24.0.7
API version: 1.43 (minimum version 1.12)
Go version: go1.20.10
Git commit: 311b9ff
Built: Thu Oct 26 09:07:41 2023
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.26
GitCommit: 3dd1e886e55dd695541fdcd67420c2888645a495
runc:
Version: 1.1.10
GitCommit: v1.1.10-0-g18a0cb0
docker-init:
Version: 0.19.0
GitCommit: de40ad0
$ pwd
/home/ubuntu/workspace/coroot-node-agent
$ docker build . -t coroot-node-agent-dev
[+] Building 58.9s (18/18) FINISHED docker:default
=> [internal] load .dockerignore 0.0s
=> => transferring context: 59B 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 553B 0.0s
=> [internal] load metadata for docker.io/library/debian:bullseye 0.6s
=> [internal] load metadata for docker.io/library/golang:1.19-bullseye 0.6s
=> [builder 1/9] FROM docker.io/library/golang:1.19-bullseye@sha256:2fdfcb03b1445f06f1cf8a342516bfd34026b527fef8427f40ea7b140168fda2 0.0s
=> [stage-1 1/3] FROM docker.io/library/debian:bullseye@sha256:71f0e09d55a4042ddee1f114a0838d99266e185bf33e71f15c15bf6b9545a9a0 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 22.23kB 0.0s
=> CACHED [builder 2/9] RUN apt update && apt install -y libsystemd-dev 0.0s
=> CACHED [builder 3/9] COPY go.mod /tmp/src/ 0.0s
=> CACHED [builder 4/9] COPY go.sum /tmp/src/ 0.0s
=> CACHED [builder 5/9] WORKDIR /tmp/src/ 0.0s
=> CACHED [builder 6/9] RUN go mod download 0.0s
=> [builder 7/9] COPY . /tmp/src/ 0.1s
=> [builder 8/9] RUN CGO_ENABLED=1 go test ./... 51.7s
=> [builder 9/9] RUN CGO_ENABLED=1 go install -mod=readonly -ldflags "-X main.version=unknown" /tmp/src 5.7s
=> CACHED [stage-1 2/3] RUN apt update && apt install -y ca-certificates && apt clean 0.0s
=> [stage-1 3/3] COPY --from=builder /go/bin/coroot-node-agent /usr/bin/coroot-node-agent 0.2s
=> exporting to image 0.3s
[docker.log](https://github.com/coroot/coroot-node-agent/files/13790552/docker.log)
=> => exporting layers 0.2s
=> => writing image sha256:52fd0dd6da8116dae22bee78bb8c62f24917e5332f5d3ec880b0b68a2fc35f27 0.0s
=> => naming to docker.io/library/coroot-node-agent-dev 0.0s
$ docker run --detach --name coroot-node-agent-dev --privileged --pid host -p 8080:80 -v /sys/kernel/debug:/sys/kernel/debug:rw -v /sys/fs/cgroup:/host/sys/fs/cgroup:ro coroot-node-agent-dev --cgroupfs-root=/host/sys/fs/cgroup
I've inserted additional trace logs to precisely identify the code segment responsible for this issue.
I also followed this doc.
See the result of `git diff`.
Logs
See the attachment for the output of `docker logs coroot-node-agent-dev`.
Thanks, @keisku, for the detailed report. The agent uses a rate limiter for logging, which may create the impression of it hanging. Other than the log, what other problems do you see with the agent?
@def Thanks! I overlooked the rate limit.
> Other than the log, what other problems do you see with the agent?

`curl` failed for these endpoints and I didn't see these logs, so I assumed some operation had hung.
Lines 143 to 145 in 8e1fa82
But the actual problem was that I didn't set `--port` for `docker run`.
So, my issue has been solved 👍
Btw, why do we need the rate limit?