I am confused about the initElapsedS & readElapsedS.
CKopoer opened this issue · 4 comments
As these lines show, we get initElapsedS & readElapsedS from the difference of each other. Is it a mistake, or something meaningful I haven't understood?
Separately, I got results on an H800 using another, closed-source tool, NV-STREAM. It seems to deliver better bandwidth results than BabelStream because of its optimized block size parameters. What's more, it also shows Read & Write results. Could I take the Init_kernel as the Write result and read_arrays as the Read result in BabelStream?
I think the time consumed by the read_arrays function actually depends on the PCIe bandwidth? That function copies the data from the device to the host.
Yes, Init and Read are new timings we report; they measure the setup and read-back time of the buffers.
BabelStream does not have a direct equivalent of the Read and Write kernels in NV-STREAM.
> It seems that it provided better bandwidth performance result compared with BabelStream because of the optimized block size parameters.
What's the performance difference you're observing?
@CKopoer the time intervals used for Init and Read are incorrect.
#186 fixes it (among other things).
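To illustrate what a correct interval looks like, here is a minimal C++ sketch in which each phase gets its own start and stop point, so neither timing is derived from the other. This is not BabelStream's actual code and not the change made in #186; the names time_phase_seconds, SetupTimings, and measure_setup are hypothetical, and the two callables stand in for whatever initialises and reads back the buffers.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (not part of BabelStream): time a single phase with its
// own start and stop points on a monotonic clock.
// For a GPU backend the callable must block until the device work has
// finished (for example by synchronising at its end), otherwise only the
// launch overhead is measured.
inline double time_phase_seconds(const std::function<void()>& phase) {
    assert(phase && "phase must be callable");
    const auto start = std::chrono::steady_clock::now();
    phase();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical result type holding the two independent measurements.
struct SetupTimings {
    double init_s;  // time spent initialising the buffers
    double read_s;  // time spent reading the buffers back
};

// Each interval is measured separately; neither is computed from the other.
inline SetupTimings measure_setup(const std::function<void()>& init_arrays,
                                  const std::function<void()>& read_arrays) {
    SetupTimings t{};
    t.init_s = time_phase_seconds(init_arrays);
    t.read_s = time_phase_seconds(read_arrays);
    return t;
}
```

Passing the phases as callables keeps the timing logic in one place, so the same start/stop pattern applies to both measurements.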
> Could I take the Init_kernel as the Write result and read_arrays as Read result in BabelStream?
The Init and Read timings stem from a single measurement, and are not intended to be a measure of Read and Write bandwidth.
After #186 is merged, adding proper Read & Write bandwidth benchmarks is straightforward (although still quite a bit of work, since they need to be added to all languages). I agree that these two help paint a more complete picture of the hardware than Copy.
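As a rough idea of what such kernels involve, here is a CPU-only C++ sketch of a write-only kernel, a read-only kernel, and the bandwidth arithmetic. It is an illustration under assumptions, not BabelStream's or NV-STREAM's implementation: the names write_kernel, read_kernel, and bandwidth_gb_per_s are hypothetical, and a real benchmark would repeat each kernel many times and report the best or average time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical write-only kernel: one store per element, no loads.
inline void write_kernel(std::vector<double>& a, double value) {
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] = value;
    }
}

// Hypothetical read-only kernel: one load per element. The sum is returned
// so that the compiler cannot discard the loads as dead code.
inline double read_kernel(const std::vector<double>& a) {
    double sum = 0.0;
    for (double x : a) {
        sum += x;
    }
    return sum;
}

// Bandwidth in GB/s (10^9 bytes per second) for a kernel that touches each
// of `elements` doubles exactly once in `seconds`.
inline double bandwidth_gb_per_s(std::size_t elements, double seconds) {
    assert(seconds > 0.0);
    const double bytes = static_cast<double>(elements * sizeof(double));
    return bytes / seconds / 1.0e9;
}
```

Unlike Copy, which moves two arrays' worth of data per iteration (one read plus one write), each of these kernels moves only one, which is why the byte count above uses a single array.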