Need an API to report how much data can be transferred in a single bulk_transfer()
bozhang-hpc commented
Is your feature request related to a problem? Please describe.
I get an HG_PROTOCOL_ERROR when the data passed to a single bulk_transfer() is too large. In the log below, the failing RMA operation is 1679616000 bytes:
# [1017487.534554] mercury->op: [warning] /work2/07555/bz186/frontera/apps/mercury/src/na/na_ofi.c:5103
# na_ofi_cq_readerr(): fi_cq_readerr() got err: 5 (Input/output error), prov_errno: 1 (local length error)
# [1019973.528566] mercury->rma: [debug] /work2/07555/bz186/frontera/apps/mercury/src/na/na_ofi.c:4965
# na_ofi_rma_post(): Posting RMA op (fi_writemsg, context=0x3f3c810), iov_count=1, desc[0]=0x4887770, msg_iov[0].iov_base=0x2b93d0e48010, msg_iov[0].iov_len=1679616000, addr=9, rma_iov_count=1, rma_iov[0].addr=47744078381072, rma_iov[0].len=1679616000, rma_iov[0].key=20992, d
Describe the solution you'd like
I would like to know the maximum size that a single bulk_transfer() can handle, so that I can split the data and call bulk_transfer() multiple times.
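As a rough illustration of the splitting described above, here is a minimal sketch in C. It assumes the limit is already known and that each piece is issued one after another. The names `chunked_transfer`, `chunk_xfer_fn`, `record_chunk`, and `struct xfer_stats` are hypothetical and not part of Mercury; the callback stands in for whatever actually performs one transfer. In Mercury itself, HG_Bulk_transfer() takes origin and local offsets plus a size, so each piece could reuse the same bulk handles with a different offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: transfer `len` bytes starting at byte `offset`.
 * Returns 0 on success. In real code this would wrap the actual bulk
 * transfer call and pass `offset` as both the origin and local offset. */
typedef int (*chunk_xfer_fn)(uint64_t offset, uint64_t len, void *arg);

/* Split the range [0, total) into pieces of at most `max_chunk` bytes and
 * issue them in order. Returns 0 on success, -1 on invalid arguments, or
 * the first non-zero value returned by the callback. */
static int
chunked_transfer(uint64_t total, uint64_t max_chunk, chunk_xfer_fn xfer,
    void *arg)
{
    uint64_t offset = 0;

    if (max_chunk == 0 || xfer == NULL)
        return -1;

    while (offset < total) {
        uint64_t remaining = total - offset;
        uint64_t len = remaining < max_chunk ? remaining : max_chunk;
        int rc = xfer(offset, len, arg);

        if (rc != 0)
            return rc; /* stop at the first failed piece */
        offset += len;
    }
    return 0;
}

/* Illustration-only bookkeeping so the splitting can be inspected without
 * a network: counts calls and bytes, remembers the last piece. */
struct xfer_stats {
    uint64_t calls;
    uint64_t bytes;
    uint64_t last_len;
};

static int
record_chunk(uint64_t offset, uint64_t len, void *arg)
{
    struct xfer_stats *stats = (struct xfer_stats *) arg;

    (void) offset;
    assert(len > 0); /* the loop above never produces an empty piece */
    stats->calls++;
    stats->bytes += len;
    stats->last_len = len;
    return 0;
}
```

The value passed as `max_chunk` is a placeholder: the real limit depends on the fabric provider and is exactly what this request asks Mercury to expose. Issuing the pieces sequentially is also only one option; they could be posted concurrently and completed together.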
Describe alternatives you've considered
Alternatively, could Mercury handle this splitting internally?
Additional context
soumagne commented
We might be able to do that once libfabric gives us the ability to query the maximum size of RMA messages.