ECONNREFUSED when deploying a CSI node plugin with node-driver-registrar
ltson4121994 opened this issue · 1 comments
ltson4121994 commented
I am trying to deploy a simple CSI plugin, but my node server does not seem to work properly with node-driver-registrar.
This is my .yaml file:
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: demo-csi-node
spec:
  selector:
    matchLabels:
      app: demo-csi-node
  template:
    metadata:
      labels:
        app: demo-csi-node
    spec:
      serviceAccountName: demo-csi-sa
      containers:
        - name: node-driver-registrar
          image: k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.5.0
          args:
            - --csi-address=/csi/csi.sock
            - --kubelet-registration-path=/var/lib/kubelet/plugins/demo-csi/csi.sock
          volumeMounts:
            - name: socket-dir
              mountPath: /csi
            - name: registration-dir
              mountPath: /registration
        - name: demo-csi-node
          securityContext:
            privileged: true
            capabilities:
              add: ["SYS_ADMIN"]
            allowPrivilegeEscalation: true
          image: ltson1/demo-csi-img
          args:
            - "--endpoint=$(CSI_ENDPOINT)"
          env:
            - name: CSI_ENDPOINT
              value: unix:///csi/csi.sock
          imagePullPolicy: "IfNotPresent"
          volumeMounts:
            - name: socket-dir
              mountPath: /csi
            - name: pods-mount-dir
              mountPath: /var/lib/kubelet/
              mountPropagation: "Bidirectional"
      volumes:
        - name: socket-dir
          hostPath:
            path: /var/lib/kubelet/plugins/demo-csi
            type: DirectoryOrCreate
        - name: pods-mount-dir
          hostPath:
            path: /var/lib/kubelet/
            type: Directory
        - hostPath:
            path: /var/lib/kubelet/plugins_registry
            type: Directory
          name: registration-dir
This is the output when I run strace with node-driver-registrar:
connect(3, {sa_family=AF_UNIX, sun_path="/csi/csi.sock"}, 16) = -1 ECONNREFUSED (Connection refused)
close(3) = 0
epoll_pwait(4, [], 128, 375, NULL, 1) = 0
futex(0xc000050550, FUTEX_WAKE_PRIVATE, 1) = 1
epoll_pwait(4, [{EPOLLIN, {u32=16476080, u64=16476080}}], 128, 4999, NULL, 2) = 1
read(5, "\0", 16) = 1
epoll_pwait(4, [], 128, 556, NULL, 87161855399915) = 0
futex(0xc000050550, FUTEX_WAKE_PRIVATE, 1) = 1
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path="/csi/csi.sock"}, 16) = -1 ECONNREFUSED (Connection refused)
close(3) = 0
epoll_pwait(4, [], 128, 1010, NULL, 1) = 0
futex(0xc000050550, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc000050950, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc000050950, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc000050950, FUTEX_WAKE_PRIVATE, 1) = 1
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path="/csi/csi.sock"}, 16) = -1 ECONNREFUSED (Connection refused)
close(3) = 0
futex(0xc000050950, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc000050950, FUTEX_WAKE_PRIVATE, 1) = 1
write(6, "\0", 1) = 1
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=3000}, NULL) = 0
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL^Cstrace: Process 1647459 detached
<detached ...>
I am not sure how to debug this and have been stuck for quite a while. Any help would be much appreciated. Thanks.
ltson4121994 commented
It turns out this issue was caused by a stale socket from a previous run: the pod was terminated before the socket was closed, so the socket was never released.
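For anyone hitting the same thing, here is a minimal sketch of one way to guard against a leftover socket file at driver startup. The issue does not say what language the driver is written in, so Python is used purely for illustration, and the helper names `is_stale_unix_socket` and `listen_on_fresh_socket` are hypothetical. The idea is to probe the path first: if it is a socket file and connecting to it fails with ECONNREFUSED (the same error node-driver-registrar reports in the strace output), nothing is listening, so the file can be unlinked before binding again.

```python
import os
import socket
import stat


def is_stale_unix_socket(path: str) -> bool:
    """Return True if `path` is a socket file that nothing is listening on."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return False
    if not stat.S_ISSOCK(mode):
        # Not a socket at all; leave it alone.
        return False
    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(path)
    except ConnectionRefusedError:
        # Same ECONNREFUSED that the registrar sees: the file exists,
        # but no process is accepting connections on it.
        return True
    except OSError:
        # Any other failure (e.g. permissions) is not treated as stale.
        return False
    else:
        # Connect succeeded, so a live server owns this socket.
        return False
    finally:
        probe.close()


def listen_on_fresh_socket(path: str, backlog: int = 16) -> socket.socket:
    """Remove a stale socket file at `path` (if any), then bind and listen."""
    if is_stale_unix_socket(path):
        os.unlink(path)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(backlog)
    return server
```

With the manifest above, the path inside the driver container would be `/csi/csi.sock`. Probing before unlinking avoids deleting the socket of a driver instance that is still running. Closing and removing the socket on SIGTERM is the complementary fix, but it cannot cover a pod that is killed before the handler runs.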