filecoin-project/boost

Boost doesn't retry network deals with libp2p transfers and fails them immediately

RobQuistNL opened this issue · 2 comments

Checklist

  • This is not a question or a support request. If you have any boost related questions, please ask in the discussion forum.
  • This is not a new feature request. If it is, please file a feature request instead.
  • This is not an enhancement request. If it is, please file an improvement suggestion instead.
  • I have searched on the issue tracker and the discussion forum, and there is no existing related issue or discussion.
  • I am running the latest release, the most recent RC (release candidate) for the upcoming release, or the dev branch (master), or have an issue updating to any of these.
  • I did not make any code changes to boost.

Boost component

  • boost daemon - storage providers
  • boost client
  • boost UI
  • boost data-transfer
  • boost index-provider
  • Other

Boost Version

2.1.1

Describe the Bug

Boost doesn't retry network deals with libp2p transfers and fails them immediately

Logging Information

2023-12-21 01:09:14.798	19s	
start deal data transfer
transfer client id: 
2023-12-21 01:09:14.798	0ms	
http-transport: execute transfer
deal size: 31,991,237,344
output file: /home/xxx/.boost/incoming/8f43e4a0-1016-4e18-bb8c-307deb1b9192.download
time before context deadline: 23h59m59.999991991s
2023-12-21 01:09:14.799	1ms	
http-transport: existing file size
deal size: 31,991,237,344
file size: 0
2023-12-21 01:09:14.799	0ms	
http-transport: libp2p-http url
multiaddr: /ip4/xxx/tcp/4001
peer id: 12D3xxx
url: libp2p://12D3xxx
2023-12-21 01:09:14.799	0ms	
http-transport: started async http transfer
2023-12-21 01:09:32.021	17s	
deal failed
err: data-transfer failed: failed to send HEAD http req: Head "libp2p://12D3xxx": failed to dial: failed to dial 12D3xxx: all dials failed * [/ip4/xxx/tcp/4001] dial tcp4 0.0.0.0:xxx->xxx:4001: i/o timeout
2023-12-21 01:09:32.023	2ms	
cleaning up deal
2023-12-21 01:09:32.023	0ms	
deal finished
2023-12-21 01:09:32.024	1ms	
untagged funds for deal as deal finished
err: 
untagged collateral: 8,302,094,077,429,460
untagged publish: 200,000,000,000,000,000
2023-12-21 01:09:32.025	1ms	
untagged storage space for deal
2023-12-21 01:09:32.025	0ms	
finished cleaning up deal
2023-12-21 01:09:32.026	1ms	
deal go-routine finished execution

Repro Steps

  1. Run network deal with libp2p transfer
  2. Make transfer fail
  3. See immediate deal failure

This is a hard requirement for online deals: the server must be available to answer the initial HEAD request. There are retries for download timeouts and for other stages of deal processing. We don't plan to change this requirement, as keeping these deals alive would unnecessarily clog the storage space.

Thanks for the explanation