dragonflyoss/Dragonfly2

Distributing a 3 GB zip file to 1300+ nodes causes massive back-to-source

When distributing a 3 GB zip file to 1300+ nodes, massive back-to-source occurred.

{"level":"error","ts":"2024-04-25 19:03:54.292","caller":"peer/peertask_piecetask_synchronizer.go:407","msg":"synchronizer receives with error: rpc error: code = Internal desc = peer task failed: peer task failed: 4000/digest not set","targetPeerID":"10.132.80.158-628-a68fc0c3-0a6d-4c2c-9581-36cb41d39ee7","peer":"10.132.80.154-568-9c4cce34-0993-4e54-9cb8-fe435f9beda5","task":"4a3dcdade0092c15dfdd78146ae83c707baffc12c50397def00c589322e0b669","component":"PeerTask","trace":"7ac392c3aa4d468eadd0a7756511a5a7","stacktrace":"d7y.io/dragonfly/v2/client/daemon/peer.(*pieceTaskSynchronizer).receive\n\t/home/runner/work/Dragonfly2/Dragonfly2/client/daemon/peer/peertask_piecetask_synchronizer.go:407\nd7y.io/dragonfly/v2/client/daemon/peer.(*pieceTaskSynchronizer).start\n\t/home/runner/work/Dragonfly2/Dragonfly2/client/daemon/peer/peertask_piecetask_synchronizer.go:297"}
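
For triage across many peers, a log entry like the one above can be reduced to its key fields with a short script. This is only a minimal sketch: it assumes each daemon log line is a single JSON object using the field names seen in this entry, and that the failure reason is the trailing `<number>/<text>` part of `msg`. The `summarize` helper and the output keys are hypothetical names chosen for illustration, not part of Dragonfly.

```python
import json
import re

# Shortened copy of the log entry above (stacktrace omitted).
SAMPLE = '{"level":"error","ts":"2024-04-25 19:03:54.292","caller":"peer/peertask_piecetask_synchronizer.go:407","msg":"synchronizer receives with error: rpc error: code = Internal desc = peer task failed: peer task failed: 4000/digest not set","targetPeerID":"10.132.80.158-628-a68fc0c3-0a6d-4c2c-9581-36cb41d39ee7","peer":"10.132.80.154-568-9c4cce34-0993-4e54-9cb8-fe435f9beda5","task":"4a3dcdade0092c15dfdd78146ae83c707baffc12c50397def00c589322e0b669"}'

# Assumption: the failure reason is the trailing "<number>/<text>" part of msg.
REASON = re.compile(r"(\d+)/([^:]+)$")


def summarize(line):
    """Return the key fields of one JSON log line, or None if it is not a JSON object."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(entry, dict):
        return None
    match = REASON.search(entry.get("msg", ""))
    return {
        "ts": entry.get("ts"),
        "task": entry.get("task"),
        "peer": entry.get("peer"),
        "target_peer": entry.get("targetPeerID"),
        "code": int(match.group(1)) if match else None,
        "reason": match.group(2).strip() if match else None,
    }


summary = summarize(SAMPLE)
print(summary)
```

Running `summarize` over every line of a peer log and grouping by `task` and `code` would show which peers failed on the same task and why, without reading each entry by hand.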

10.132.80.154.txt

@jim3ma digest not set

@kuafu-run Can you upload the seed peer logs?

I changed the config in the scheduler:

job:
  enable: true
  globalWorkerNum: 500
  schedulerWorkerNum: 500
  localWorkerNum: 1000

Then I cleared the cached dynconfig and restarted the 3 schedulers.

I also changed the config in both the seed peers and the peers:

disableAutoBackSource: true
calculateDigest: false

Then I cleared the cached dynconfig and restarted all seed peers and peers.

Then I started distributing the 3 GB zip file to 1300+ nodes every 5 minutes.
Initially, things went well.
Hours later, massive back-to-source occurred, and there was no traffic on the seed peers.
Here is a log from a peer, with a 5006 error code:

peer.10.133.82.138-5006-7052d9180c3e7109941aab53673546a3204fc3cd3463de20ac5c04597b07d7f7.txt

And here is the scheduler log, filtered by task ID with grep:

scheduler-7052d9180c3e7109941aab53673546a3204fc3cd3463de20ac5c04597b07d7f7.txt

@kuafu-run Please upload daemon logs.

Uploading peer-10.133.82.138-7052d9180c3e7109941aab53673546a3204fc3cd3463de20ac5c04597b07d7f7-2.txt…

The link is invalid.

peer-10.133.82.138-7052d9180c3e7109941aab53673546a3204fc3cd3463de20ac5c04597b07d7f7-2.txt