seed-peer fails to start
Closed this issue · 4 comments
nicefuture2016 commented
Hi, when I deploy the seed peer it fails to start. Here is my configuration:
```yaml
scheduler:
  image: xxxxx/dragonflyoss/scheduler
  tag: v2.1.20
  config:
    verbose: true
    pprofPort: 18066
    manager:
      schedulerClusterID: 10
    console: true
    verbose: true
  metrics:
    enable: true
    serviceMonitor:
      enable: true
    prometheusRule:
      enable: true
  initContainer:
    image: xxxxx/dragonflyoss/busybox
    tag: latest
    pullPolicy: Always

seedPeer:
  image: xxxxx/dragonflyoss/dfdaemon
  tag: v2.1.20
  persistence:
    enable: true
    storageClass: nfs-storage
  config:
    scheduler:
      manager:
        seedPeer:
          clusterID: 10
    console: true
    verbose: true
  metrics:
    enable: true
    serviceMonitor:
      enable: true
    prometheusRule:
      enable: true
  initContainer:
    image: xxxxx/dragonflyoss/busybox
    tag: latest
    pullPolicy: Always

dfdaemon:
  image: xxxxx/dragonflyoss/dfdaemon
  tag: v2.1.20
  config:
    host:
      idc: pingxiang
    console: true
    verbose: true
  metrics:
    enable: true
    serviceMonitor:
      enable: true
    prometheusRule:
      enable: true
  initContainer:
    image: xxxxx/dragonflyoss/busybox
    tag: latest
    pullPolicy: Always

manager:
  enable: false

externalManager:
  enable: true
  host: xxxxxxxxxxx
  restPort: 443
  grpcPort: 840

redis:
  enable: false

externalRedis:
  addrs:
    - xxxxxxxx:839
  password: xxxxxxx

mysql:
  enable: false

containerRuntime:
  initContainerImage: xxxxx/dragonflyoss/openssl
  docker:
    enable: true
    injectHosts: true
    registryDomains:
      - 'harbor-test.cn'
```
Pod status:

```
[root@master01 dragonfly]# kubectl get pod -n dragonfly-system
NAME                       READY   STATUS             RESTARTS   AGE
dragonfly-dfdaemon-27rzg   1/1     Running            0          11m
dragonfly-dfdaemon-2zrzx   1/1     Running            0          11m
dragonfly-dfdaemon-6vrdq   1/1     Running            0          11m
dragonfly-dfdaemon-8lgrw   1/1     Running            0          11m
dragonfly-dfdaemon-9m6df   1/1     Running            0          11m
dragonfly-dfdaemon-bczbd   1/1     Running            0          11m
dragonfly-dfdaemon-fcvg6   1/1     Running            0          11m
dragonfly-dfdaemon-qhnpb   1/1     Running            0          11m
dragonfly-dfdaemon-rrz6l   1/1     Running            0          11m
dragonfly-dfdaemon-wf7g8   1/1     Running            0          11m
dragonfly-dfdaemon-wfzkq   1/1     Running            0          11m
dragonfly-dfdaemon-xzjds   1/1     Running            0          11m
dragonfly-scheduler-0      1/1     Running            0          11m
dragonfly-scheduler-1      1/1     Running            0          11m
dragonfly-scheduler-2      1/1     Running            0          11m
dragonfly-seed-peer-0      0/1     CrashLoopBackOff   7          11m
```
Pod describe output:

```
[root@master01 dragonfly]# kubectl describe pod -n dragonfly-system dragonfly-seed-peer-0
Name:         dragonfly-seed-peer-0
Namespace:    dragonfly-system
Priority:     0
Node:         10.49.8.25/10.49.8.25
Start Time:   Tue, 02 Jan 2024 15:08:57 +0800
Labels:       app=dragonfly
              component=seed-peer
              controller-revision-hash=dragonfly-seed-peer-869448c586
              release=dragonfly
              statefulset.kubernetes.io/pod-name=dragonfly-seed-peer-0
Annotations:  <none>
Status:       Running
IP:           172.20.113.182
IPs:
  IP:  172.20.113.182
Controlled By:  StatefulSet/dragonfly-seed-peer
Init Containers:
  wait-for-manager:
    Container ID:  docker://453e4f4f3e04aaa62d1535a91fdc3f727894e1209fa6a01f8caf3d3211a8115c
    Image:         xxxxx/dragonflyoss/busybox:latest
    Image ID:      docker-pullable://xxxxxxx/dragonflyoss/busybox@sha256:62ffc2ed7554e4c6d360bce40bbcf196573dd27c4ce080641a2c59867e732dee
    Port:          <none>
    Host Port:     <none>
    Command:
      sh
      -c
      until nslookup xxxxxxx && xxxxxxx 443; do echo waiting for external manager; sleep 2; done;
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 02 Jan 2024 15:08:58 +0800
      Finished:     Tue, 02 Jan 2024 15:09:03 +0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-pjs8k (ro)
Containers:
  seed-peer:
    Container ID:  docker://40e4abe69d26ea94555cdfbc8ead2ec30b84990062accf044249d8ded12eedb3
    Image:         xxxxxxx/dragonflyoss/dfdaemon:v2.1.20
    Image ID:      docker-pullable://xxxxxxx/dragonflyoss/dfdaemon@sha256:c541df9ddcde172350cf30084d9b8f28a2ab9974a5e9f2b3050c77a03790076f
    Ports:         65000/TCP, 65002/TCP, 8000/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Tue, 02 Jan 2024 15:18:14 +0800
      Finished:     Tue, 02 Jan 2024 15:18:59 +0800
    Ready:          False
    Restart Count:  7
    Limits:
      cpu:     2
      memory:  4Gi
    Requests:
      cpu:     0
      memory:  0
    Liveness:   exec [/bin/grpc_health_probe -addr=:65000] delay=15s timeout=1s period=10s #success=1 #failure=3
    Readiness:  exec [/bin/grpc_health_probe -addr=:65000] delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /etc/dragonfly from config (rw)
      /var/lib/dragonfly from storage (rw)
      /var/log/dragonfly/daemon from logs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-pjs8k (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  storage-dragonfly-seed-peer-0
    ReadOnly:   false
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      dragonfly-seed-peer
    Optional:  false
  logs:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  default-token-pjs8k:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-pjs8k
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                    From                 Message
  ----     ------     ----                   ----                 -------
  Normal   Scheduled  <unknown>              default-scheduler    Successfully assigned dragonfly-system/dragonfly-seed-peer-0 to 10.49.8.25
  Normal   Pulling    11m                    kubelet, 10.49.8.25  Pulling image "xxxxxxx/dragonflyoss/busybox:latest"
  Normal   Pulled     11m                    kubelet, 10.49.8.25  Successfully pulled image "xxxxxxx/dragonflyoss/busybox:latest"
  Normal   Created    11m                    kubelet, 10.49.8.25  Created container wait-for-manager
  Normal   Started    11m                    kubelet, 10.49.8.25  Started container wait-for-manager
  Normal   Pulled     11m (x2 over 11m)      kubelet, 10.49.8.25  Container image "xxxxxxx/dragonflyoss/dfdaemon:v2.1.20" already present on machine
  Normal   Created    11m (x2 over 11m)      kubelet, 10.49.8.25  Created container seed-peer
  Normal   Started    11m (x2 over 11m)      kubelet, 10.49.8.25  Started container seed-peer
  Warning  Unhealthy  10m (x7 over 11m)      kubelet, 10.49.8.25  Readiness probe failed: timeout: failed to connect service ":65000" within 1s
  Normal   Killing    10m (x2 over 11m)      kubelet, 10.49.8.25  Container seed-peer failed liveness probe, will be restarted
  Warning  BackOff    6m43s (x7 over 7m46s)  kubelet, 10.49.8.25  Back-off restarting failed container
  Warning  Unhealthy  116s (x24 over 11m)    kubelet, 10.49.8.25  Liveness probe failed: timeout: failed to connect service ":65000" within 1s
```
The attachments below are the dfdaemon, seed-peer, and scheduler logs:
gaius-qi commented
@nicefuture2016 Please use English, thanks.
gaius-qi commented
Your seed peer cannot dial the scheduler address.
nicefuture2016 commented
> Your seed peer cannot dial the scheduler address.

Strange: why does the seed peer dial another Dragonfly cluster's scheduler?
```
{"level":"warn","ts":"2024-01-05 02:10:40.508","caller":"config/dynconfig_manager.go:125","msg":"scheduler 172.20.117.177 dragonfly-scheduler-2.scheduler.dragonfly-system.svc.cluster.local 8002 has not reachable addresses","stacktrace":"d7y.io/dragonfly/v2/client/config.(*dynconfigManager).GetResolveSchedulerAddrs\n\t/go/src/d7y.io/dragonfly/v2/client/config/dynconfig_manager.go:125\nd7y.io/dragonfly/v2/pkg/resolver.(*SchedulerResolver).ResolveNow\n\t/go/src/d7y.io/dragonfly/v2/pkg/resolver/scheduler_resolver.go:73\nd7y.io/dragonfly/v2/pkg/resolver.(*SchedulerResolver).OnNotify\n\t/go/src/d7y.io/dragonfly/v2/pkg/resolver/scheduler_resolver.go:98\nd7y.io/dragonfly/v2/client/config.(*dynconfigManager).Notify\n\t/go/src/d7y.io/dragonfly/v2/client/config/dynconfig_manager.go:223\nd7y.io/dragonfly/v2/client/config.(*dynconfigManager).Serve\n\t/go/src/d7y.io/dragonfly/v2/client/config/dynconfig_manager.go:249\nd7y.io/dragonfly/v2/client/daemon.(*clientDaemon).Serve.func10\n\t/go/src/d7y.io/dragonfly/v2/client/daemon/daemon.go:744\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/go/pkg/mod/golang.org/x/sync@v0.4.0/errgroup/errgroup.go:75"}
```
The IP address 172.20.117.177 in `scheduler 172.20.117.177 dragonfly-scheduler-2.scheduler.dragonfly-system.svc.cluster.local 8002` belongs to another Kubernetes cluster.
gaius-qi commented
For multi-cluster deployment, please refer to https://d7y.io/docs/next/getting-started/quick-start/multi-cluster-kubernetes/.
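For readers hitting the same symptom: that guide boils down to registering a separate scheduler cluster and seed peer cluster in the shared manager for each Kubernetes cluster, then pointing each deployment at its own IDs. A hypothetical sketch of the relevant Helm values for a second cluster (the ID 11 is a placeholder, not taken from this issue):

```yaml
# Second Kubernetes cluster: use the IDs of the scheduler/seed peer clusters
# created for *this* cluster in the shared external manager, not the IDs
# already used by the first cluster.
scheduler:
  config:
    manager:
      schedulerClusterID: 11   # placeholder ID for this cluster

seedPeer:
  config:
    scheduler:
      manager:
        seedPeer:
          clusterID: 11        # placeholder ID for this cluster
```

With distinct IDs, the manager returns only the schedulers that actually run in the same cluster as the seed peer, so it never tries to dial pod IPs from a foreign network.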