zrlio/jNVMf

Have you verified that this Java host works well with the kernel NVMe-oF target?

Closed this issue · 53 comments

gwnet commented

I like this idea: the host uses this Java library and the target uses the kernel NVMe-oF target. Can we work together to make it happen?

Yes, using the Linux kernel NVMf target works and is how I run it most of the time. Just be aware that, since the kernel does not follow the specification 100%, I created a legacy flag which you need to set in order to talk to kernel targets (-Dnvmf.legacy=true).

gwnet commented

Got it, thank you so much. I will try it very soon. Did you also handle the admin keep-alive timer? And does the host controller go through the same states as the kernel host?

The API provides a keepAlive() method in the Controller class. You as a user of the API are responsible for calling this method at appropriate intervals. You can query the maximum timeout value from the Controller if need be. The controller state is updated whenever necessary. What scenario did you have in mind?
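
As a rough illustration of what "calling keepAlive() at appropriate intervals" could look like, the sketch below schedules a periodic task with a ScheduledExecutorService. It deliberately does not import jNVMf: the Runnable stands in for a call to the Controller's keepAlive() method, the class KeepAliveScheduler is hypothetical, and the timeout value is a placeholder for whatever the Controller reports. The exact signature and thread-safety of keepAlive() are not stated in this thread, so check both before driving it from a separate thread.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration; it is not part of jNVMf. */
final class KeepAliveScheduler implements AutoCloseable {
  private final ScheduledExecutorService executor =
      Executors.newSingleThreadScheduledExecutor(runnable -> {
        Thread thread = new Thread(runnable, "nvmf-keep-alive");
        thread.setDaemon(true); // do not keep the JVM alive just for keep-alives
        return thread;
      });

  /** Uses half the timeout as the interval so that one late tick is tolerated. */
  static long intervalMillis(long timeoutMillis) {
    if (timeoutMillis < 2) {
      throw new IllegalArgumentException("timeout too small: " + timeoutMillis);
    }
    return timeoutMillis / 2;
  }

  ScheduledFuture<?> start(Runnable keepAlive, long timeoutMillis) {
    long interval = intervalMillis(timeoutMillis);
    // If keepAlive throws, scheduleAtFixedRate suppresses all further runs,
    // so the Runnable should catch and report its own errors.
    return executor.scheduleAtFixedRate(keepAlive, interval, interval, TimeUnit.MILLISECONDS);
  }

  @Override
  public void close() {
    executor.shutdownNow();
  }

  public static void main(String[] args) throws InterruptedException {
    long timeoutMillis = 10_000; // placeholder; query the real value from the Controller
    try (KeepAliveScheduler scheduler = new KeepAliveScheduler()) {
      // Stand-in for something like controller.keepAlive() wrapped in try/catch.
      scheduler.start(() -> System.out.println("keep-alive sent"), timeoutMillis);
      Thread.sleep(100); // the application's real work would happen here
    }
  }
}
```

Sending at half the timeout is a common safety margin rather than anything the library prescribes; any interval comfortably below the reported maximum should serve the same purpose.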

gwnet commented

Buddy, the kernel host has a lot of states and state transitions, like resetting, reconnecting, live, deleting and dead. Does this project have this kind of error handling logic?
We want to use it in a Java-based object store, and we think this project fits our goal. Can you add me as a developer on this project?

There are timeouts for connections, requests, etc. In general we try to handle all errors gracefully and give the user of the API a chance to act accordingly. Since we want to give users the most freedom in how to react to these scenarios we do not e.g. do automatic reconnection but give this responsibility to the user of the API. Feel free to create pull requests and sign-off your commits agreeing to https://developercertificate.org/
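
Since reconnection is left to the caller, one possible pattern is a bounded retry loop with exponential backoff around whatever code establishes the connection. In this sketch the Callable is a stand-in for the user's own connect logic (for example a call into jNVMf); the class name Reconnector, the attempt count and the delays are illustrative assumptions, not part of the library.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; it is not part of jNVMf. */
final class Reconnector {
  private Reconnector() {}

  /**
   * Calls connect until it succeeds or maxAttempts is reached, doubling the
   * delay after each failure. Rethrows the last failure if all attempts fail.
   */
  static <T> T connectWithRetry(Callable<T> connect, int maxAttempts, long initialDelayMillis)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    long delayMillis = initialDelayMillis;
    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return connect.call();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt and stop retrying
        throw e;
      } catch (Exception e) {
        last = e;
        if (attempt < maxAttempts) {
          Thread.sleep(delayMillis);
          delayMillis *= 2;
        }
      }
    }
    throw last;
  }

  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    // Simulated connection that fails twice before succeeding.
    String result = connectWithRetry(() -> {
      if (++calls[0] < 3) {
        throw new java.io.IOException("target not reachable yet");
      }
      return "connected";
    }, 5, 10);
    System.out.println(result + " after " + calls[0] + " attempts");
  }
}
```

Keeping the policy outside the library matches the reasoning above: the application decides how often to retry and when to give up.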

gwnet commented

Buddy, can you give me an example of how to run it against a kernel target, with detailed steps? I can build this OK now, and I can create a kernel target. I need a step-by-step guide on how to use this application to connect to the kernel target.

I updated the README to reflect the latest version in the path of the examples. To test your setup you can run a small benchmark with the NvmfClientBenchmark:

java -cp target/jnvmf-1.5-jar-with-dependencies.jar:target/jnvmf-1.5-tests.jar com.ibm.jnvmf.benchmark.NvmfClientBenchmark -a 10.100.0.22 -p 4420 -g 4096 -i 3 -m RANDOM -n 10 -nqn nqn.2016-06.io.spdk:cnode1 -qd 1 -rw read -s 4096 -qs 64 -H -I

This is a random read test which connects to a target with IP 10.100.0.22 and port 4420 and controller NQN nqn.2016-06.io.spdk:cnode1. You need to change those according to your setup. Queue depth is 1 and the request size is 4096. Queue size is 64. -H creates a histogram and -I uses inline RDMA data.

gwnet commented

Got it. It would be better if we had a simple Java ping test application whose code we could just modify for a quick start. I will try this command first. My target is the kernel, so the NQN is different.

Of course your NQN is different, as are your IP and port. You need to change them accordingly, as pointed out above.

Take a look at the example code in the README for a small application.

gwnet commented

Buddy, another two questions.
Is there any well-known product or open source project that already uses it?
The Linux kernel has the LTP tests, which may cover NVMe-oF. Did we run any industry-standard test suites on this code?

gwnet commented

Buddy,
below is my configuration. I allow any host to connect to my kernel target.
But when I use nqn=testnqn, I get the error below. Do I need a host NQN here? Please advise.
Also, -Dnvmf.legacy=true is not recognized at all. Where should I configure this for the kernel target?

java -cp target/jnvmf-1.5-jar-with-dependencies.jar:target/jnvmf-1.5-tests.jar com.ibm.jnvmf.benchmark.NvmfClientBenchmark -a 192.168.147.130 -p 4420 -g 4096 -i 3 -m RANDOM -n 10 -nqn testnqn -qd 1 -rw read -s 4096 -qs 64 -H -I
Exception in thread "main" java.lang.IllegalArgumentException: Invalid NQN
    at com.ibm.jnvmf.NvmeQualifiedName.validate(NvmeQualifiedName.java:47)
    at com.ibm.jnvmf.NvmeQualifiedName.<init>(NvmeQualifiedName.java:36)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.<init>(NvmfClientBenchmark.java:177)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.main(NvmfClientBenchmark.java:551)

o- / ......................................................................................................................... [...]
o- hosts ................................................................................................................... [...]
o- ports ................................................................................................................... [...]
o- subsystems .............................................................................................................. [...]
o- testnqn ................................................................. [version=1.3, allow_any=1, serial=3a2d141200bb17b7]
o- allowed_hosts ....................................................................................................... [...]
o- namespaces .......................................................................................................... [...]
o- 1 ............................................... [path=/dev/nvme0n1, uuid=4fae3cf9-85ce-46af-86fc-f87498cc9edb, enabled]
/> cd ports
/ports> ls
o- ports ..................................................................................................................... [...]
/ports> create 1
/ports> ls
o- ports ..................................................................................................................... [...]
o- 1 ................................................................................................ [trtype=, traddr=, trsvcid=]
o- referrals ............................................................................................................. [...]
o- subsystems ............................................................................................................ [...]
/ports> cd 1
/ports/1> set addr trtype=rdma
Parameter trtype is now 'rdma'.
/ports/1> set addr adrfam=ipv4
Parameter adrfam is now 'ipv4'.
/ports/1> set addr traddr=192.168.147.130
Parameter traddr is now '192.168.147.130'.
/ports/1> set addr trsvcid=4420
Parameter trsvcid is now '4420'.
/ports/1> ls
o- 1 ........................................................................... [trtype=rdma, traddr=192.168.147.130, trsvcid=4420]
o- referrals ............................................................................................................... [...]
o- subsystems .............................................................................................................. [...]
/ports/1> cd subsystems
/ports/1/subsystems> ls
o- subsystems ................................................................................................................ [...]
/ports/1/subsystems> create testnqn
/ports/1/subsystems> ls
o- subsystems ................................................................................................................ [...]
o- testnqn ................................................................................................................. [...]
/ports/1/subsystems> cd /
/> saveconfig test.json

gwnet commented

Buddy, I have made progress. Now there is a new error; it looks like I need that RDMA library.
Could you please give me detailed step-by-step instructions?

wayne@ubuntu:~/jNVMf$ java -cp target/jnvmf-1.5-jar-with-dependencies.jar:target/jnvmf-1.5-tests.jar com.ibm.jnvmf.benchmark.NvmfClientBenchmark -a 192.168.147.130 -p 4420 -g 4096 -i 3 -m RANDOM -n 10 -nqn nqn.2014-08.org.nvmexpress:uuid:dba6e945-7f15-482f-a913-e215b4560356 -qd 1 -rw read -s 4096 -qs 64 -H -I
read 4096bytes with QD = 1, time[s] = 3, pattern = RANDOM, runs = 10
log4j:WARN No appenders could be found for logger (com.ibm.disni).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.lang.UnsatisfiedLinkError: no disni in java.library.path
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1867)
    at java.lang.Runtime.loadLibrary0(Runtime.java:870)
    at java.lang.System.loadLibrary(System.java:1122)
    at com.ibm.disni.verbs.impl.NativeDispatcher.<clinit>(NativeDispatcher.java:36)
    at com.ibm.disni.verbs.impl.RdmaProviderNat.<init>(RdmaProviderNat.java:43)
    at com.ibm.disni.verbs.RdmaProvider.provider(RdmaProvider.java:58)
    at com.ibm.disni.verbs.RdmaCm.open(RdmaCm.java:49)
    at com.ibm.disni.verbs.RdmaEventChannel.createEventChannel(RdmaEventChannel.java:66)
    at com.ibm.disni.RdmaCmProcessor.<init>(RdmaCmProcessor.java:48)
    at com.ibm.disni.RdmaEndpointGroup.<init>(RdmaEndpointGroup.java:61)
    at com.ibm.jnvmf.NvmfRdmaEndpointGroup.<init>(NvmfRdmaEndpointGroup.java:75)
    at com.ibm.jnvmf.Controller.<init>(Controller.java:58)
    at com.ibm.jnvmf.Nvme.connect(Nvme.java:50)
    at com.ibm.jnvmf.Nvme.connect(Nvme.java:44)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.connect(NvmfClientBenchmark.java:216)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.<init>(NvmfClientBenchmark.java:203)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.main(NvmfClientBenchmark.java:551)

gwnet commented

Hello buddy, I have fixed the RDMA issue. Now I am hitting a code logic issue.
Could you please help me? The regular kernel nvme connect tool works well; only this library fails.

wayne@ubuntu:~/jNVMf$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
wayne@ubuntu:~/jNVMf$ echo $LD_LIBRARY_PATH
:/usr/local/lib
wayne@ubuntu:~/jNVMf$ java -cp target/jnvmf-1.5-jar-with-dependencies.jar:target/jnvmf-1.5-tests.jar com.ibm.jnvmf.benchmark.NvmfClientBenchmark -a 192.168.147.130 -p 4420 -g 4096 -i 3 -m RANDOM -n 10 -nqn nqn.2014-08.org.nvmexpress:uuid:dba6e945-7f15-482f-a913-e215b4560356 -qd 1 -rw read -s 4096 -qs 64 -H -I
read 4096bytes with QD = 1, time[s] = 3, pattern = RANDOM, runs = 10
log4j:WARN No appenders could be found for logger (com.ibm.disni).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" com.ibm.jnvmf.UnsuccessfulComandException: Command was not successful. {StatusCodeType: 0 - Generic, SatusCode: 2 - A reserved coded value or an unsupported value in a defined field(other than opcode field), CID: 0, Do_not_retry: true, More: false, SQHD: 0}
    at com.ibm.jnvmf.QueuePair.connect(QueuePair.java:128)
    at com.ibm.jnvmf.AdminQueuePair.connect(AdminQueuePair.java:36)
    at com.ibm.jnvmf.QueuePair.<init>(QueuePair.java:195)
    at com.ibm.jnvmf.QueuePair.<init>(QueuePair.java:134)
    at com.ibm.jnvmf.AdminQueuePair.<init>(AdminQueuePair.java:31)
    at com.ibm.jnvmf.Controller.<init>(Controller.java:66)
    at com.ibm.jnvmf.Nvme.connect(Nvme.java:50)
    at com.ibm.jnvmf.Nvme.connect(Nvme.java:44)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.connect(NvmfClientBenchmark.java:216)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.<init>(NvmfClientBenchmark.java:203)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.main(NvmfClientBenchmark.java:551)

Is there any well-known product or open source project that already uses it?

Apache Crail is using this library. https://crail.apache.org/

The Linux kernel has the LTP tests, which may cover NVMe-oF. Did we run any industry-standard test suites on this code?

LTP does not cover NVMf, judging from the LTP repository. I'm not aware of any free NVMf compliance tests. There is an official compliance test through the University of New Hampshire, but prices are quite steep and you need a membership: https://www.iol.unh.edu/membership/fees
There is testing available in the library though; feel free to add tests that you deem appropriate (PR).

Hello buddy, I have fixed the RDMA issue. Now I am hitting a code logic issue.
Could you please help me? The regular kernel nvme connect tool works well; only this library fails.

As I told you above, unfortunately the Linux target is not standard-conformant. If you like I can elaborate why. However, to work with the kernel target I introduced a legacy flag. Add -Dnvmf.legacy=true after java.

gwnet commented

Thank you. Last time I added this legacy flag at the end of the command line. I need to add it right after java, right?
When I added it at the end, it told me it was not a valid parameter.

Yes, you need to add it before the class name. It is a parameter of the java command itself, not of the benchmark.
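
The position matters because -D defines a JVM system property: the java launcher only interprets options placed before the main class, and everything after the class name is handed to main as a plain argument. The small stand-alone program below makes the difference visible. It is not part of jNVMf, the property name demo.flag is made up for the demonstration, and how the library itself reads its flag is not shown in this thread.

```java
import java.util.Arrays;

/** Stand-alone demonstration; the property name "demo.flag" is hypothetical. */
final class SystemPropertyDemo {
  static String describe(String[] args) {
    // Boolean.getBoolean returns true only if the system property exists
    // and equals "true" (ignoring case).
    boolean flag = Boolean.getBoolean("demo.flag");
    return "demo.flag=" + flag + ", program args=" + Arrays.toString(args);
  }

  public static void main(String[] args) {
    System.out.println(describe(args));
  }
}
```

Running `java -Ddemo.flag=true SystemPropertyDemo x` prints `demo.flag=true, program args=[x]`, whereas `java SystemPropertyDemo -Ddemo.flag=true` prints `demo.flag=false, program args=[-Ddemo.flag=true]`: in the second form the option never reaches the JVM and the program has to deal with it as an unknown argument.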

gwnet commented

Hello buddy, I added it as you suggested, but it is still not working. My kernel is:
wayne@ubuntu:~/jNVMf$ uname -a
Linux ubuntu 4.18.0-21-generic #22-Ubuntu SMP Wed May 15 13:13:21 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

java -cp target/jnvmf-1.5-jar-with-dependencies.jar:target/jnvmf-1.5-tests.jar -Dnvmf.legacy=true com.ibm.jnvmf.benchmark.NvmfClientBenchmark -a 192.168.147.130 -p 4420 -g 4096 -i 3 -m RANDOM -n 10 -nqn nqn.2014-08.org.nvmexpress:uuid:2b31ae51-8aa7-4781-93f3-1876199e2eea -qd 1 -rw read -s 4096 -qs 64 -H -I
read 4096bytes with QD = 1, time[s] = 3, pattern = RANDOM, runs = 10
log4j:WARN No appenders could be found for logger (com.ibm.disni).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" com.ibm.jnvmf.UnsuccessfulComandException: Command was not successful. {StatusCodeType: 0 - Generic, SatusCode: 2 - A reserved coded value or an unsupported value in a defined field(other than opcode field), CID: 0, Do_not_retry: true, More: false, SQHD: 0}
    at com.ibm.jnvmf.QueuePair.connect(QueuePair.java:128)
    at com.ibm.jnvmf.AdminQueuePair.connect(AdminQueuePair.java:36)
    at com.ibm.jnvmf.QueuePair.<init>(QueuePair.java:195)
    at com.ibm.jnvmf.QueuePair.<init>(QueuePair.java:134)
    at com.ibm.jnvmf.AdminQueuePair.<init>(AdminQueuePair.java:31)
    at com.ibm.jnvmf.Controller.<init>(Controller.java:66)
    at com.ibm.jnvmf.Nvme.connect(Nvme.java:50)
    at com.ibm.jnvmf.Nvme.connect(Nvme.java:44)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.connect(NvmfClientBenchmark.java:216)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.<init>(NvmfClientBenchmark.java:203)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.main(NvmfClientBenchmark.java:551)

I'm sorry, there was a letter missing. It is -Djnvmf.legacy=true.

Since I have not heard from you I assume everything works now. I'm going to close this issue. Feel free to open a new issue if you run into other problems.

gwnet commented

Buddy, I am just busy with other things and have had no time to verify it yet. :) Can you give me more time?

No worries. Feel free to open a new issue at any time.

gwnet commented

Hello buddy,
I fixed the typo, but I still get an error. It tells me the namespace is invalid, but I copied the string from dmesg. Could you please help me?

wayne@ubuntu:~/jNVMf$ java -cp target/jnvmf-1.5-jar-with-dependencies.jar:target/jnvmf-1.5-tests.jar -Djnvmf.legacy=true com.ibm.jnvmf.benchmark.NvmfClientBenchmark -a 192.168.147.130 -p 4420 -g 4096 -i 3 -m RANDOM -n 10 -nqn nqn.2014-08.org.nvmexpress:uuid:ea997850-3aa5-47e0-b1eb-bd11046df1e5 -qd 1 -rw read -s 4096 -qs 64 -H -I
read 4096bytes with QD = 1, time[s] = 3, pattern = RANDOM, runs = 10
log4j:WARN No appenders could be found for logger (com.ibm.disni).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" com.ibm.jnvmf.UnsuccessfulComandException: Command was not successful. {StatusCodeType: 1 - Command Specific, SatusCode: 130 - One or more of the parameters (Host NQN, Subsystem NQN, Host Identifier, Controller ID, Queue ID) specified are not valid., CID: 0, Do_not_retry: true, More: false, SQHD: 0}
    at com.ibm.jnvmf.QueuePair.connect(QueuePair.java:128)
    at com.ibm.jnvmf.AdminQueuePair.connect(AdminQueuePair.java:36)
    at com.ibm.jnvmf.QueuePair.<init>(QueuePair.java:195)
    at com.ibm.jnvmf.QueuePair.<init>(QueuePair.java:134)
    at com.ibm.jnvmf.AdminQueuePair.<init>(AdminQueuePair.java:31)
    at com.ibm.jnvmf.Controller.<init>(Controller.java:66)
    at com.ibm.jnvmf.Nvme.connect(Nvme.java:50)
    at com.ibm.jnvmf.Nvme.connect(Nvme.java:44)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.connect(NvmfClientBenchmark.java:216)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.<init>(NvmfClientBenchmark.java:203)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.main(NvmfClientBenchmark.java:551)

== Below is the dmesg output from which I took the NQN used in the argument list above ==
[34926.170782] RPC: Registered rdma transport module.
[34926.170783] RPC: Registered rdma backchannel transport module.
[34993.373669] nvmet_rdma: enabling port 1 (192.168.147.130:4420)
[35059.810576] nvmet: creating controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:ea997850-3aa5-47e0-b1eb-bd11046df1e5.
[35059.810765] nvme nvme1: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.147.130:4420
[35059.810932] nvme nvme1: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
[35161.418867] nvmet: connect request for invalid subsystem nqn.2014-08.org.nvmexpress:uuid:ea997850-3aa5-47e0-b1eb-bd11046df1e5!
[35474.273405] nvmet: creating controller 1 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.org.nvmexpress:uuid:ea997850-3aa5-47e0-b1eb-bd11046df1e5.
[35474.273544] nvme nvme1: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 192.168.147.130:4420
[35474.275321] nvme nvme1: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
[35523.940618] nvmet: connect request for invalid subsystem nqn.2014-08.org.nvmexpress:uuid:ea997850-3aa5-47e0-b1eb-bd11046df1e5!
wayne@ubuntu:~/jNVMf$

connect request for invalid subsystem nqn.2014-08.org.nvmexpress:uuid:ea997850-3aa5-47e0-b1eb-bd11046df1e5

You are confusing the subsystem NQN and the host NQN. There is no subsystem with NQN nqn.2014-08.org.nvmexpress:uuid:ea997850-3aa5-47e0-b1eb-bd11046df1e5 running on your NVMf server. So make sure to use the correct subsystem NQN when starting the benchmark. The subsystem NQN for your setup seems to be nqn.2014-08.org.nvmexpress.discovery

gwnet commented

Buddy, I remember I tried nqn.2014-08.org.nvmexpress.discovery and it failed too, for this reason. Can you show me how I can find the correct NQN?

gwnet commented

Buddy, here is the error I get if I follow your suggestion.

wayne@ubuntu:~/jNVMf$ java -cp target/jnvmf-1.5-jar-with-dependencies.jar:target/jnvmf-1.5-tests.jar -Djnvmf.legacy=true com.ibm.jnvmf.benchmark.NvmfClientBenchmark -a 192.168.147.130 -p 4420 -g 4096 -i 3 -m RANDOM -n 10 -nqn nqn.2014-08.org.nvmexpress.discovery -qd 1 -rw read -s 4096 -qs 64 -H -I
Exception in thread "main" java.lang.IllegalArgumentException: Invalid NQN
    at com.ibm.jnvmf.NvmeQualifiedName.validate(NvmeQualifiedName.java:47)
    at com.ibm.jnvmf.NvmeQualifiedName.<init>(NvmeQualifiedName.java:36)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.<init>(NvmfClientBenchmark.java:177)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.main(NvmfClientBenchmark.java:551)

gwnet commented

Hello buddy, I redid the test. This time it seems to hang for a long time, but I no longer see the exception. Is there no log for me to check? There is no console output either.

wayne@ubuntu:~/jNVMf$ java -cp target/jnvmf-1.5-jar-with-dependencies.jar:target/jnvmf-1.5-tests.jar -Djnvmf.legacy=true com.ibm.jnvmf.benchmark.NvmfClientBenchmark -a 192.168.147.130 -p 4420 -g 4096 -i 3 -m RANDOM -n 10 -nqn nqn.2014-08.org.nvmexpress:uuid:b29af7ed-79b8-4aac-83ee-bfa685c3fe07 -qd 1 -rw read -s 4096 -qs 64 -H -I
read 4096bytes with QD = 1, time[s] = 3, pattern = RANDOM, runs = 10
log4j:WARN No appenders could be found for logger (com.ibm.disni).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

gwnet commented

Hello buddy, the last issue was my bad: I gave the wrong target IP address.
Now I have given the correct target IP address; my target and host are on the same machine for test purposes. I am hitting a new error now. Please help me. I also use Soft-RoCE. Did you test this before?

java -cp target/jnvmf-1.5-jar-with-dependencies.jar:target/jnvmf-1.5-tests.jar -Djnvmf.legacy=true com.ibm.jnvmf.benchmark.NvmfClientBenchmark -a 192.168.10.129 -p 4420 -g 4096 -i 3 -m RANDOM -n 10 -nqn nqn.2014-08.org.nvmexpress:uuid:b29af7ed-79b8-4aac-83ee-bfa685c3fe07 -qd 1 -rw read -s 4096 -qs 64 -H -I
read 4096bytes with QD = 1, time[s] = 3, pattern = RANDOM, runs = 10
log4j:WARN No appenders could be found for logger (com.ibm.disni).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.io.IOException: java.net.UnknownHostException: Address not defined
    at com.ibm.jnvmf.QueuePair.connect(QueuePair.java:81)
    at com.ibm.jnvmf.AdminQueuePair.connect(AdminQueuePair.java:36)
    at com.ibm.jnvmf.QueuePair.<init>(QueuePair.java:195)
    at com.ibm.jnvmf.QueuePair.<init>(QueuePair.java:134)
    at com.ibm.jnvmf.AdminQueuePair.<init>(AdminQueuePair.java:31)
    at com.ibm.jnvmf.Controller.<init>(Controller.java:66)
    at com.ibm.jnvmf.Nvme.connect(Nvme.java:50)
    at com.ibm.jnvmf.Nvme.connect(Nvme.java:44)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.connect(NvmfClientBenchmark.java:216)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.<init>(NvmfClientBenchmark.java:203)
    at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.main(NvmfClientBenchmark.java:551)
Caused by: java.net.UnknownHostException: Address not defined
    at com.ibm.disni.util.NetUtils.getIntIPFromInetAddress(NetUtils.java:46)
    at com.ibm.disni.verbs.impl.RdmaCmNat.resolveAddr(RdmaCmNat.java:157)
    at com.ibm.disni.verbs.RdmaCmId.resolveAddr(RdmaCmId.java:143)
    at com.ibm.disni.RdmaEndpoint.connect(RdmaEndpoint.java:100)
    at com.ibm.jnvmf.NvmfRdmaEndpoint.connect(NvmfRdmaEndpoint.java:59)
    at com.ibm.jnvmf.QueuePair.connect(QueuePair.java:79)
    ... 10 more

gwnet commented

Bear in mind that the full kernel host and target work on the same machine with Soft-RoCE.

Ok, let's go through this one by one:

  1. Can you please paste the output of ip addr or ifconfig? It seems that it does not find the IP address when trying to connect.
  2. Can you provide the output of find /sys/kernel/config/nvmet/subsystems?

gwnet commented

Got it. I will paste the ifconfig and subsystem output when I arrive at the office tomorrow. I can run the nvme discover command fine with the same IP and subsystem.
Could it be a DiSNI compatibility issue? I built DiSNI myself following your steps. Do you have time to run through my scenario on your side too?

gwnet commented

wayne@ubuntu:~/disni/libdisni$ sudo ifconfig
[sudo] password for wayne:
ens32: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.10.129 netmask 255.255.255.0 broadcast 192.168.10.255
inet6 fe80::7391:d822:36cd:bcc7 prefixlen 64 scopeid 0x20
ether 00:0c:29:9a:c9:ff txqueuelen 1000 (Ethernet)
RX packets 55350 bytes 67606149 (67.6 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 11930 bytes 1106924 (1.1 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

wayne@ubuntu:~/disni/libdisni$ find /sys/kernel/config/nvmet/subsystems
/sys/kernel/config/nvmet/subsystems
/sys/kernel/config/nvmet/subsystems/testnqn
/sys/kernel/config/nvmet/subsystems/testnqn/allowed_hosts
/sys/kernel/config/nvmet/subsystems/testnqn/namespaces
/sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1
/sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/enable
/sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/device_uuid
/sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/device_nguid
/sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/device_path
/sys/kernel/config/nvmet/subsystems/testnqn/attr_serial
/sys/kernel/config/nvmet/subsystems/testnqn/attr_version
/sys/kernel/config/nvmet/subsystems/testnqn/attr_allow_any_host

/> cd subsystems
/subsystems> ls
o- subsystems ................................................................................................................ [...]
/subsystems> create testnqn
/subsystems> ls
o- subsystems ................................................................................................................ [...]
o- testnqn ................................................................................................................. [...]
o- allowed_hosts ......................................................................................................... [...]
o- namespaces ............................................................................................................ [...]
/subsystems> cd testnqn
/subsystems/testnqn> ls
o- testnqn ................................................................................................................... [...]
o- allowed_hosts ........................................................................................................... [...]
o- namespaces .............................................................................................................. [...]
/subsystems/testnqn> set attr allow_any_host=1
Parameter allow_any_host is now '1'.
/subsystems/testnqn> ls
o- testnqn ................................................................................................................... [...]
o- allowed_hosts ........................................................................................................... [...]
o- namespaces .............................................................................................................. [...]
/subsystems/testnqn> cd namespaces
/subsystems/testnqn/namespaces> ls
o- namespaces ................................................................................................................ [...]
/subsystems/testnqn/namespaces> create 1
/subsystems/testnqn/namespaces> ls
o- namespaces ................................................................................................................ [...]
o- 1 ....................................................................................................................... [...]
/subsystems/testnqn/namespaces> cd 1
/subsystems/t.../namespaces/1> set device path=/dev/nvme0n1
Parameter path is now '/dev/nvme0n1'.
/subsystems/t.../namespaces/1> enable
The Namespace has been enabled.
/subsystems/t.../namespaces/1> cd ..
/subsystems/testnqn/namespaces> cd ..
/subsystems/testnqn> ls
o- testnqn ................................................................................................................... [...]
o- allowed_hosts ........................................................................................................... [...]
o- namespaces .............................................................................................................. [...]
o- 1 ..................................................................................................................... [...]
/subsystems/testnqn> cd ..
/subsystems> cd ..
/> ls
o- / ......................................................................................................................... [...]
o- hosts ................................................................................................................... [...]
o- ports ................................................................................................................... [...]
o- subsystems .............................................................................................................. [...]
o- testnqn ............................................................................................................... [...]
o- allowed_hosts ....................................................................................................... [...]
o- namespaces .......................................................................................................... [...]
o- 1 ................................................................................................................. [...]
/> cd ports
/ports> create 1
/ports> cd 1
/ports/1> ls
o- 1 ......................................................................................................................... [...]
o- referrals ............................................................................................................... [...]
o- subsystems .............................................................................................................. [...]
/ports/1> set device path=/dev/nvme0n1
Unknown configuration group: device
/ports/1> set addr trtype=rdma
Parameter trtype is now 'rdma'.
/ports/1> set addr adrfam=ipv4
Parameter adrfam is now 'ipv4'.
/ports/1> set addr traddr=192.168.10.129
Parameter traddr is now '192.168.10.129'.
/ports/1> set addr trsvcid=4420
Parameter trsvcid is now '4420'.
/ports/1> ls
o- 1 ......................................................................................................................... [...]
o- referrals ............................................................................................................... [...]
o- subsystems .............................................................................................................. [...]
/ports/1> cd subsystems
/ports/1/subsystems> ls
o- subsystems ................................................................................................................ [...]
/ports/1/subsystems> create testnqn
/ports/1/subsystems>

Buddy, here is the log. Let me know if you need anything else.

OK, so the kernel NVMf target allows non-standard subsystem NQNs, in this case "testnqn". See section 7.8 (page 215) of the NVMe specification for what an NQN is supposed to look like: https://nvmexpress.org/wp-content/uploads/NVM_Express_Revision_1.3.pdf. In jNVMf I enforce this pattern. So you need to create a new subsystem with a valid NQN, e.g. "nqn.2019-06.com.gwnet:test". You should then be able to use this with the benchmark tool of jNVMf.
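
To check a candidate name before reconfiguring the target, a simplified validator along the following lines may help. It only approximates the "nqn.yyyy-mm.reverse-domain:user-string" form with a 223-byte length limit, and it is not the check jNVMf actually performs (that lives in NvmeQualifiedName and may be stricter or looser). The class name NqnCheck and the regular expression are assumptions made for this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

/** Hypothetical, simplified NQN check; not the validation used by jNVMf. */
final class NqnCheck {
  private static final int MAX_BYTES = 223;

  // "nqn." + year-month + "." + reverse domain (at least two labels) + ":" + user string
  private static final Pattern NQN = Pattern.compile(
      "nqn\\.\\d{4}-(0[1-9]|1[0-2])\\.[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)+:.+");

  private NqnCheck() {}

  static boolean isValid(String nqn) {
    return nqn != null
        && nqn.getBytes(StandardCharsets.UTF_8).length <= MAX_BYTES
        && NQN.matcher(nqn).matches();
  }

  public static void main(String[] args) {
    String[] candidates = {
        "testnqn",
        "nqn.2019-06.com.gwnet:test",
        "nqn.2016-06.io.spdk:cnode1",
        "nqn.2014-08.org.nvmexpress.discovery"
    };
    for (String candidate : candidates) {
      System.out.println(candidate + " -> " + isValid(candidate));
    }
  }
}
```

With this sketch "testnqn" is rejected, while "nqn.2019-06.com.gwnet:test" and "nqn.2016-06.io.spdk:cnode1" are accepted. It also rejects the discovery name nqn.2014-08.org.nvmexpress.discovery, because that name has no ":user-string" part; a fuller check would special-case it.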

gwnet commented

Thank you so much!
I will test it. will keep you posted.

gwnet commented

buddy, we are making progress now. There is a new error, please help again.

wayne@ubuntu:~/jNVMf$ java -cp target/jnvmf-1.5-jar-with-dependencies.jar:target/jnvmf-1.5-tests.jar -Djnvmf.legacy=true com.ibm.jnvmf.benchmark.NvmfClientBenchmark -a 192.168.10.129 -p 4420 -g 4096 -i 3 -m RANDOM -n 10 -nqn nqn.2019-06.com.gwnet:test -qd 1 -rw read -s 4096 -qs 64 -H -I
read 4096bytes with QD = 1, time[s] = 3, pattern = RANDOM, runs = 10
log4j:WARN No appenders could be found for logger (com.ibm.disni).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Identify Controller Data:
PCI Vendor ID: 0
PCI Subsystem Vendor ID: 0
Serial Number: ad06e81bca493e11
Model Number: Linux
Firmware Revision: 4.18.0-2
Maximum Data Transfer Size: 1073741824
Controller ID: 1
NVMe Version: 1.3.0
Required Submission Queue Entry Size: 64
Maximum Submission Queue Entry Size: 64
Required Completion Queue Entry Size: 16
Maximum Completion Queue Entry Size: 16
IO Queue Command Capsule Size: 4160
IO Queue Response Capsule Size: 16
In Capsule Data Offset: 0
Maximum SGL Data Block Descriptors: 1

Controller Capabilities:
Maximum Queue Entries Supported: 1024
Contiguous Queues Required: false
Arbitration Mechanism Supported:

Timeout: 7500 milliseconds
NVM Subsystem Reset Supported: false
Memory Page Size Minimum: 4096
Memory Page Size Maximum: 4096

Namespaces:
Id: 1
Size: 41943040
Capacity: 41943040
Formatted LBA:
Data Size: 512
Metadata Size: 0
Relative Performance: 0

Exception in thread "main" java.io.IOException: No command with CID 18
at com.ibm.jnvmf.QueuePair.handleSendWc(QueuePair.java:257)
at com.ibm.jnvmf.QueuePair.poll(QueuePair.java:304)
at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.run(NvmfClientBenchmark.java:530)
at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.main(NvmfClientBenchmark.java:552)

gwnet commented

I just reviewed the code. Why is command null here?
    private final void handleSendWc(IbvWC wc) throws IOException {
        int wrId = (int) wc.getWr_id();
        Command command = commandMap[wrId];
        if (command == null) {
            throw new IOException("No command with CID " + wrId);
        }
        commandMap[wrId] = null;
        if (wc.getStatus() != IbvWC.IbvWcStatus.IBV_WC_SUCCESS.ordinal()) {
            command.getCallback().onFailure(RdmaException.fromInteger(wc.getOpcode(), wc.getStatus()));
        } else {
            command.getCallback().onComplete();
        }
    }

I'm not sure why this happens. I never had this error in my test environment, so we need to debug it. Can you please print the "commandId" in the method "post" above handleSendWc?
EDIT:
I set up SoftRoCE on my system and ran into the same issue. While debugging I found that SoftRoCE sometimes generates additional send completions. This is a bug in SoftRoCE. Check out this log, where the last post with work request id 40 generates two send completions:

post: 38
complete send: 38
complete recv: 38
post: 39
complete send: 39
complete recv: 39
post: 40
complete send: 40
complete send: 40
129129 129130
40 40
Exception in thread "main" java.io.IOException: No command with CID 40
	at com.ibm.jnvmf.QueuePair.handleSendWc(QueuePair.java:271)
	at com.ibm.jnvmf.QueuePair.poll(QueuePair.java:321)
	at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.run(NvmfClientBenchmark.java:530)
	at com.ibm.jnvmf.benchmark.NvmfClientBenchmark.main(NvmfClientBenchmark.java:552)

I was able to reproduce this with an SPDK client.

I recommend using a softRoCE alternative like SoftiWARP or real hardware.
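
To illustrate why a duplicated send completion ends in exactly this exception, here is a minimal stand-alone sketch. It only mimics the lookup-and-clear logic of handleSendWc shown earlier; the class, the plain Object slots, and the table size of 64 are made up for the illustration and are not the jNVMf implementation.

```java
import java.io.IOException;

public class DuplicateCompletionDemo {
    // Stand-in for the per-queue command table, indexed by work request id.
    private final Object[] commandMap = new Object[64];

    public void post(int wrId) {
        commandMap[wrId] = new Object();
    }

    public void handleSendCompletion(int wrId) throws IOException {
        Object command = commandMap[wrId];
        if (command == null) {
            throw new IOException("No command with CID " + wrId);
        }
        // The slot is released on the first completion, so a second
        // completion for the same id finds nothing.
        commandMap[wrId] = null;
    }

    public static void main(String[] args) {
        DuplicateCompletionDemo qp = new DuplicateCompletionDemo();
        try {
            qp.post(40);
            qp.handleSendCompletion(40); // the expected completion
            qp.handleSendCompletion(40); // the duplicate, as in the log
        } catch (IOException e) {
            System.out.println(e.getMessage()); // No command with CID 40
        }
    }
}
```

The first completion succeeds and frees the slot; the second one is what surfaces as the IOException.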

gwnet commented

OK, I have HW and will use it to test. But with softRoCE the kernel host runs well and I can run fio on the host device. Can you figure out the reason and fix this? It would help us develop code quickly. As you can see, the discover command runs fine with softRoCE.

gwnet commented

buddy, do you still need me to print the command ID?

gwnet commented

buddy, I remember one thing: I used to run SPDK perf against the NVMe kernel target over softRoCE, and I remember it working. I feel it should not be a common issue. Also, it is the Java code that raises the null pointer now. Can you try to fix it? I will find some HW to debug with too.

gwnet commented

ah, too bad, libdisni does not compile on my HW, so I need more time to test on HW. But I am pretty sure my SPDK perf client works well with the kernel target in my VM with softRoCE.

A few things:

  1. The kernel client runs different RDMA provider code than the user library. So it might very well be that the kernel client does not have this problem.
  2. No need to print anything from your side.
  3. I ran SPDK perf for a longer time (e.g. 30s) and ran into the same issue: completions either do not arrive, are generated twice, or are for the wrong operation type. In these scenarios SPDK perf hangs forever, polling the CQ.

The SPDK command that hung indefinitely:

perf -r 'trtype:RDMA adrfam:IPv4 traddr:10.100.0.23 trsvcid:4420 subnqn:nqn.2017-06.io.crail:cnode1' -q 1 -o 4096 -w randread -t 30

gwnet commented

hello buddy
I talked with the Intel SPDK guys before; they did not implement the keep alive timer well on the host side, and I guess the hang is caused by that. For my error I do not need to wait 30s, it happens right away. I will fix my HW env and verify again. Could you please paste your results for me? I feel softroce will be OK.

Keep alive of the kernel target is 120s, so no problem for the 30s run. Which results?

gwnet commented

I mean your HW test result and your softroce test result with jNVMf?
and also your SPDK perf test result on softroce?

Do you want to see the error, or actual results in terms of latency/throughput? For performance numbers take a look at this slide deck: https://www.openfabrics.org/images/2018workshop/presentations/111_BMetzler_NVMfLessons.pdf
It has performance numbers for jNVMf.

gwnet commented

buddy, good news. I tried it again at home and it works now.
wayne@ubuntu:~/jNVMf$ java -cp target/jnvmf-1.5-jar-with-dependencies.jar:target/jnvmf-1.5-tests.jar -Djnvmf.legacy=true com.ibm.jnvmf.benchmark.NvmfClientBenchmark -a 192.168.147.130 -p 4420 -g 4096 -i 3 -m RANDOM -n 10 -nqn nqn.2019-06.com.gwnet:test -qd 1 -rw read -s 4096 -qs 64 -H -I
read 4096bytes with QD = 1, time[s] = 3, pattern = RANDOM, runs = 10
log4j:WARN No appenders could be found for logger (com.ibm.disni).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Identify Controller Data:
PCI Vendor ID: 0
PCI Subsystem Vendor ID: 0
Serial Number: e8dc883736c0631f
Model Number: Linux
Firmware Revision: 4.18.0-2
Maximum Data Transfer Size: 1073741824
Controller ID: 1
NVMe Version: 1.3.0
Required Submission Queue Entry Size: 64
Maximum Submission Queue Entry Size: 64
Required Completion Queue Entry Size: 16
Maximum Completion Queue Entry Size: 16
IO Queue Command Capsule Size: 4160
IO Queue Response Capsule Size: 16
In Capsule Data Offset: 0
Maximum SGL Data Block Descriptors: 1

Controller Capabilities:
Maximum Queue Entries Supported: 1024
Contiguous Queues Required: false
Arbitration Mechanism Supported:

Timeout: 7500 milliseconds
NVM Subsystem Reset Supported: false
Memory Page Size Minimum: 4096
Memory Page Size Maximum: 4096

Namespaces:
Id: 1
Size: 41943040
Capacity: 41943040
Formatted LBA:
Data Size: 512
Metadata Size: 0
Relative Performance: 0

t[ns] 3000036600, #ops 52431, iops 17476, mean[ns] 56552, tp[MB/s] 68.27, min[ns] 39909, max[ns] 1126397, p1.0% 44470, p10.0% 44729, p20.0% 44866, p30.0% 45010, p40.0% 45181, p50.0% 45346, p75.0% 46825, p90.0% 58355, p95.0% 77176, p98.0% 102993, p99.0% 161042, p99.9% 1012069, p99.99% 1042264
t[ns] 3000029012, #ops 60043, iops 20014, mean[ns] 49417, tp[MB/s] 78.18, min[ns] 40303, max[ns] 4108952, p1.0% 44580, p10.0% 44882, p20.0% 45083, p30.0% 45211, p40.0% 45309, p50.0% 45396, p75.0% 45962, p90.0% 57566, p95.0% 75658, p98.0% 89892, p99.0% 107221, p99.9% 168185, p99.99% 297741
t[ns] 3000034476, #ops 62386, iops 20795, mean[ns] 47553, tp[MB/s] 81.23, min[ns] 44023, max[ns] 186082, p1.0% 44390, p10.0% 44620, p20.0% 44728, p30.0% 44813, p40.0% 44894, p50.0% 44980, p75.0% 45369, p90.0% 51735, p95.0% 65269, p98.0% 84023, p99.0% 90303, p99.9% 112268, p99.99% 154547
t[ns] 3000029898, #ops 61647, iops 20548, mean[ns] 48125, tp[MB/s] 80.27, min[ns] 44109, max[ns] 4011031, p1.0% 44462, p10.0% 44695, p20.0% 44824, p30.0% 44936, p40.0% 45051, p50.0% 45167, p75.0% 45514, p90.0% 53921, p95.0% 67576, p98.0% 85840, p99.0% 95250, p99.9% 131971, p99.99% 190153
t[ns] 3000027191, #ops 62069, iops 20689, mean[ns] 47791, tp[MB/s] 80.82, min[ns] 43961, max[ns] 4055123, p1.0% 44368, p10.0% 44595, p20.0% 44702, p30.0% 44786, p40.0% 44869, p50.0% 44969, p75.0% 45431, p90.0% 52629, p95.0% 66937, p98.0% 84890, p99.0% 92780, p99.9% 116874, p99.99% 192580
t[ns] 3000034053, #ops 62232, iops 20743, mean[ns] 47667, tp[MB/s] 81.03, min[ns] 39347, max[ns] 905297, p1.0% 44338, p10.0% 44581, p20.0% 44683, p30.0% 44760, p40.0% 44829, p50.0% 44905, p75.0% 45266, p90.0% 53077, p95.0% 66129, p98.0% 84845, p99.0% 93291, p99.9% 125315, p99.99% 184707
t[ns] 3000028854, #ops 57109, iops 19036, mean[ns] 51955, tp[MB/s] 74.36, min[ns] 40056, max[ns] 10681582, p1.0% 44733, p10.0% 45050, p20.0% 45183, p30.0% 45287, p40.0% 45396, p50.0% 45567, p75.0% 47118, p90.0% 60208, p95.0% 75245, p98.0% 89002, p99.0% 102351, p99.9% 405224, p99.99% 4316869
t[ns] 3000041285, #ops 61516, iops 20505, mean[ns] 48220, tp[MB/s] 80.10, min[ns] 41841, max[ns] 4286078, p1.0% 44607, p10.0% 44878, p20.0% 45028, p30.0% 45135, p40.0% 45222, p50.0% 45301, p75.0% 45571, p90.0% 53345, p95.0% 68105, p98.0% 84852, p99.0% 93962, p99.9% 155554, p99.99% 220105
t[ns] 3000007731, #ops 61132, iops 20377, mean[ns] 48514, tp[MB/s] 79.60, min[ns] 44069, max[ns] 4046099, p1.0% 44387, p10.0% 44610, p20.0% 44722, p30.0% 44825, p40.0% 44945, p50.0% 45102, p75.0% 45514, p90.0% 53166, p95.0% 70260, p98.0% 89212, p99.0% 101903, p99.9% 166211, p99.99% 222320
t[ns] 3000032004, #ops 62400, iops 20799, mean[ns] 47528, tp[MB/s] 81.25, min[ns] 43967, max[ns] 523793, p1.0% 44388, p10.0% 44604, p20.0% 44702, p30.0% 44776, p40.0% 44844, p50.0% 44911, p75.0% 45194, p90.0% 51606, p95.0% 65572, p98.0% 84624, p99.0% 91501, p99.9% 116844, p99.99% 212214

gwnet commented

can you also share some performance testing tips with me, and how to match your parameters to an FIO test? :)
my boss may ask me to compare.

Good for you, but keep in mind that you might have just been lucky. I had a few runs with SoftRoCE that went through; that does not mean the error is gone.
For real testing I recommend using real hardware and different machines for client/server. Make sure to discard all blocks on the device before doing any serious write benchmarks, otherwise write performance might be really low. Also, once all blocks have been discarded, write them at least once before starting any read test (otherwise reads will be super fast since zero blocks will be returned).
Check out the help for an explanation of the command line arguments. (Generating a histogram like above with "-H" costs some performance, so if you just want the mean of the operations don't use "-H".)
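
As a way to read the summary lines in the run above, here is a small sketch that recomputes iops and tp from the t[ns] and #ops fields of the first line. The formulas are inferred from the printed numbers, not taken from the benchmark source: the figures only match if iops is truncated to an integer and "MB" is taken as 2^20 bytes. The class and method names are made up for the example.

```java
import java.util.Locale;

public class BenchmarkLineCheck {
    // Operations per second over the measured interval.
    public static double iops(long ops, long elapsedNanos) {
        return ops / (elapsedNanos / 1e9);
    }

    // Throughput, assuming 1 MB = 2^20 bytes (inferred from the output).
    public static double throughputMBs(long ops, int ioSizeBytes, long elapsedNanos) {
        return iops(ops, elapsedNanos) * ioSizeBytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long elapsedNanos = 3000036600L; // t[ns] of the first line
        long ops = 52431;                // #ops of the first line
        int ioSize = 4096;               // -s 4096
        System.out.printf(Locale.ROOT, "iops %d, tp[MB/s] %.2f%n",
            (long) iops(ops, elapsedNanos),
            throughputMBs(ops, ioSize, elapsedNanos));
        // prints: iops 17476, tp[MB/s] 68.27
    }
}
```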

gwnet commented

got it. I will test on HW. and can you give me all the parameter details? and for example
fio -filename=/dev/nvme0n1 -direct=1 -rw=read -ioengine=libaio -iodepth=16 -bs=8k -size 10G -numjobs=1 -runtime=30 -group_reporting -name=wayne
how to translate this fio command to your perf tool?

gwnet commented

buddy, BTW what is the difference between inline and incapsule?

I'm not sure what you are planning to show with your fio example, but keep a few things in mind. If you want to show the device's IOPS performance, use random IO, small IO sizes (512 B or 4K, whatever the sector size is) and a large queue depth (e.g. 64 or 128). If you want to show throughput, use sequential IO with large transfer sizes (128KB is the typical max transfer size of most NVMe devices, so going above that seldom makes sense). If you want to measure latency, use random IO and a queue depth of 1. Your fio command translated:

  • -direct=1 => nothing to do for jNVMf since all IO is direct, i.e. there is no buffering
  • -rw=read => -rw read -m SEQUENTIAL
  • -ioengine=libaio => nothing to do for jNVMf since there is only an async API
  • -iodepth=16 => -qd 16 (-qs changes the queue size and should be at least qd; default is 128)
  • -bs=8k => -s 8192
  • -size 10G => does not exist in the jNVMf benchmark, i.e. the whole device is always used
  • -numjobs=1 => does not exist in the jNVMf benchmark, which is always single threaded; if you want >1 you need to run multiple benchmarks in parallel
  • -runtime=30 => -i 30 -n 1; however, I do not recommend this and would instead use something like -i 3 -n 10, which shows stats every 3 seconds (10x).
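
The mapping above can be sketched as a small helper that builds the workload part of the benchmark arguments. This is only an illustration: the class and method are hypothetical, the "randread" case is an assumption based on the -m RANDOM flag used in the runs above, and the fixed 3 second interval simply follows the recommendation in the last bullet. Connection flags (-a, -p, -nqn) still have to be added as in the earlier commands.

```java
import java.util.ArrayList;
import java.util.List;

public class FioToJnvmfArgs {
    // Hypothetical helper: maps a few fio options to jNVMf benchmark flags.
    public static List<String> translate(String rw, int ioDepth,
                                         int blockSizeBytes, int runtimeSeconds) {
        List<String> args = new ArrayList<>();
        switch (rw) {
            case "read":
                args.addAll(List.of("-rw", "read", "-m", "SEQUENTIAL"));
                break;
            case "randread":
                args.addAll(List.of("-rw", "read", "-m", "RANDOM"));
                break;
            default:
                // Other fio modes are not covered by this sketch.
                throw new IllegalArgumentException("unsupported rw mode: " + rw);
        }
        args.addAll(List.of("-qd", Integer.toString(ioDepth)));
        args.addAll(List.of("-s", Integer.toString(blockSizeBytes)));
        // Report every 3 seconds and round the run count up to cover the runtime.
        int intervalSeconds = 3;
        int runs = (runtimeSeconds + intervalSeconds - 1) / intervalSeconds;
        args.addAll(List.of("-i", Integer.toString(intervalSeconds),
                            "-n", Integer.toString(runs)));
        return args;
    }

    public static void main(String[] args) {
        // fio: -rw=read -iodepth=16 -bs=8k -runtime=30
        System.out.println(String.join(" ", translate("read", 16, 8192, 30)));
        // prints: -rw read -m SEQUENTIAL -qd 16 -s 8192 -i 3 -n 10
    }
}
```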

Regarding inline vs incapsule: for incapsule, take a look at page 11:
https://www.openfabrics.org/images/2018workshop/presentations/111_BMetzler_NVMfLessons.pdf
For inline, search the web for "inline RDMA"; you will find tons of explanations.

My tip:

  • Latency and IOPS measurements: use incapsule but not inline
  • Throughput: use inline but not incapsule

Essentially:

  • Large IO (>4KB) => inline
  • Small IO (<=4KB) => incapsule

Do not use inline and incapsule together.
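
The rule of thumb above can be written down as a tiny helper. The class and enum are hypothetical and only encode the 4KB threshold from the tip; note that this threshold lines up with the "IO Queue Command Capsule Size: 4160" reported by the target above, which leaves 4096 bytes for data after a 64 byte submission queue entry.

```java
public class TransferModeHint {
    public enum Mode { INLINE, INCAPSULE }

    // Small IO (<= 4KB) fits into the command capsule; larger IO does not.
    public static Mode suggest(int ioSizeBytes) {
        return ioSizeBytes > 4096 ? Mode.INLINE : Mode.INCAPSULE;
    }

    public static void main(String[] args) {
        System.out.println(suggest(4096));       // INCAPSULE
        System.out.println(suggest(128 * 1024)); // INLINE
    }
}
```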

gwnet commented

Thank you so much, man. Can we follow up in another issue? I think this issue is resolved, and I do have further questions. This issue is too long for me to load now, so please check the other issue.