Why do ib_read_bw and ib_write_bw tests succeed without FFO installed? And why can't drivers be found after installing libibverbs?
ling0329 opened this issue · 8 comments
In Section 4.3, where one-sided operations are discussed, the paper identifies two problems in supporting one-sided operations. The first is that the local FFR does not know the corresponding s-mem on the other side. To solve this, FreeFlow builds a central key-value store in FFO so that all FFRs can learn the mapping between the mem pointer in the application's virtual memory space and the corresponding s-mem pointer in the FFR's virtual memory space. However, our ib_read_bw and ib_write_bw tests all succeeded without FFO installed (in fact, we do not know how to install FFO).
Note that all of our ib_send/read/write_bw tests use rdma_cm mode, because if we install libibverbs we get the warning 'no userspace device-specific driver found'.
So we installed only libmlx4 and librdmacm, and all tests run on the standard libibverbs. If we test in non-rdma_cm mode, the traffic does not go through the router.
Have you met this problem before? We tried to debug it and found that the function try_driver in init.c fails to find drivers when executing
We then suspected driver initialization and traced the problem to the function mlx4_driver_init, defined in mlx4.c in libmlx4. We also found that you removed many lines from mlx4.c, which confused us. The problem we finally located is in the following code: it never reaches 'goto found', so it hits 'return NULL' early.
But why? Why doesn't rdma_cm mode hit this problem, while with libibverbs installed both modes are affected?
We look forward to your answer!
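For readers following along, the control flow described here (scan a table of supported devices, jump to 'found' on a hit, otherwise return NULL) can be pictured with the minimal Python sketch below. It is not the libmlx4 source: the function name `driver_init`, the table `SUPPORTED_DEVICES`, and the single device ID in it are illustrative assumptions.

```python
# Illustrative sketch only; this is not the libmlx4 source code.
MELLANOX_VENDOR_ID = 0x15B3  # PCI vendor ID used by Mellanox devices

# Hypothetical supported-device table; a real provider lists many more IDs.
SUPPORTED_DEVICES = [
    (MELLANOX_VENDOR_ID, 0x1003),  # assumed here to be a ConnectX-3 device ID
]


def driver_init(vendor_id, device_id, table=SUPPORTED_DEVICES):
    """Return a device handle on a table hit, or None when nothing matches."""
    for vendor, device in table:
        if vendor == vendor_id and device == device_id:
            # Equivalent of 'goto found': the provider claims the device.
            return {"vendor": vendor_id, "device": device_id}
    # Equivalent of the early 'return NULL': the provider declines the device.
    return None
```

With this table, `driver_init(0x15B3, 0x1003)` returns a handle, while any device ID that is not listed falls through to `None`, which is the symptom reported above.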
Did you install the Mellanox OFED driver outside the container and mount the user-space driver path into the container, e.g. -v /sys/class/:/sys/class/ ? You can find this in the command line in README.md. Do you have /sys/class/infiniband_verbs/uverbs0 ?
We have found out why it fails to find devices: the abi_version of the Mellanox NICs we use is 1, which is not within 3 to 4, so they need to match libmlx5, not libmlx4. However, we have to use newer Mellanox NICs. Anyway, thank you for your attention.
Sorry to bother you again, but we still want to know why our ib_read_bw and ib_write_bw tests succeeded under rdma_cm mode without FFO installed. According to the analysis in your paper, FFO is indispensable for one-sided operations, so how is that reflected in the open-source implementation?
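A quick way to answer the last question from inside the container is to list the uverbs entries under the mounted sysfs path. The sketch below is only a convenience check; the helper name `list_uverbs_devices` is hypothetical, and the default path is the one mentioned in the comment above.

```python
import os


def list_uverbs_devices(sysfs_root="/sys/class/infiniband_verbs"):
    """Return the sorted uverbs* entries under sysfs_root, or [] if it is missing."""
    if not os.path.isdir(sysfs_root):
        return []
    return sorted(e for e in os.listdir(sysfs_root) if e.startswith("uverbs"))


if __name__ == "__main__":
    devices = list_uverbs_devices()
    if devices:
        print("found:", ", ".join(devices))
    else:
        print("no uverbs devices visible; check the /sys/class mount")
```

An empty result inside the container, when the host does show entries, would suggest that the /sys/class mount is missing.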
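The range check described here can be sketched as follows. The bounds 3 and 4 are taken from the comment above and are treated as an assumption about libmlx4 rather than verified against its source; the helper names, and the idea of reading the value from a file such as /sys/class/infiniband_verbs/uverbs0/abi_version, are likewise assumptions for illustration.

```python
# Bounds quoted in the thread; an assumption, not checked against libmlx4.
MIN_ABI_VERSION = 3
MAX_ABI_VERSION = 4


def abi_supported(abi_version, lo=MIN_ABI_VERSION, hi=MAX_ABI_VERSION):
    """Return True when abi_version lies in the inclusive range [lo, hi]."""
    return lo <= abi_version <= hi


def read_abi_version(path):
    """Parse an integer ABI version from a sysfs-style text file."""
    with open(path) as f:
        return int(f.read().strip())
```

Under these assumed bounds, a device reporting abi_version 1 is rejected, which matches the behaviour described above.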
Here is our test case of ib_read_bw.
On the server side,
On the client side,
And we got the output
The above test was run between two containers on different hosts, and everything succeeded.
Maybe our testing method was wrong, but the traffic really did go through FFR.
Looking forward to your reply.
Thanks.
@ling0329 I think this implementation hard-codes the one-sided mapping information in the code.
From README:
the released implementation hard-codes the host IPs and virtual IP to host IP mapping in https://github.com/Microsoft/Freeflow/blob/master/ffrouter/ffrouter.cpp#L215 and https://github.com/Microsoft/Freeflow/blob/master/ffrouter/ffrouter.h#L76.
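If the hard-coding described in the README is indeed what stands in for the FFO lookup, the idea can be pictured as a static table that each router carries, so no central store has to be consulted. The sketch below is not the ffrouter code: the names `VIP_TO_HOST_IP` and `resolve_host` are hypothetical and the addresses are placeholders.

```python
# Placeholder addresses only; these are not the values used in ffrouter.
VIP_TO_HOST_IP = {
    "10.44.0.1": "192.0.2.10",
    "10.44.0.2": "192.0.2.11",
}


def resolve_host(virtual_ip, table=VIP_TO_HOST_IP):
    """Return the host IP for a container's virtual IP from a static table."""
    try:
        return table[virtual_ip]
    except KeyError:
        raise LookupError(f"no host mapping for virtual IP {virtual_ip}") from None
```

A static table like this works only for the hosts listed in it, which is why such a setup has to be edited by hand for each deployment.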
Also, it looks like you are trying to run Freeflow with newer NICs as well. Did you get it to work? And can you share which versions of Ubuntu and OFED you are running (on both the container and the host OS)?
We are still trying to solve this problem, so far without success. Actually, we are not ready to modify libmlx5, because there are many differences between libmlx4 and libmlx5.
This is the Ubuntu version on our host OS
The Ubuntu version in the container is the same
Our OFED is MLNX_OFED_LINUX-4.0-2.0.0.1-ubuntu14.04-x86_64. We use ConnectX-4 40G NICs, and you can see details here
The MT27700 Family is not listed in libmlx4, and we want to try even newer NICs, such as ConnectX-5 25G.
I see. Thanks for sharing your setup.
Yeah, porting these changes to libmlx5 is probably gonna take a lot of effort.
It seems that FreeFlow only works with ConnectX-3. I saw your workaround for the hca_table check and used that to get rdma_client/rdma_server working, although it still hangs 20% of the time.
The current architecture of Freeflow works only with libmlx4. It's possible to use the LD_PRELOAD trick to re-implement a cross-driver-version solution by intercepting the relevant calls. However, that requires quite a bit of effort, and all the authors of this project are now busy with something else...