quiver-team/quiver-feature

TLB test results

256GB feature

  • Warp-based: 13.3GB/s
  • Warp-based + sort: 20.2GB/s
  • Warp-based + sort + multi-kernel: 21.1GB/s
  • Block-based: 12.4GB/s
  • Block-based + sort: 17.9GB/s
  • Block-based + sort + multi-kernel: 22.9GB/s
  • Upper bound (cudaMemcpy): 26.3GB/s
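
To make the gap to the upper bound easier to read, the 256GB figures above can be expressed as a fraction of the cudaMemcpy throughput. This is only a small arithmetic sketch: the dictionary and function names are made up for illustration, and the numbers are copied from the list above.

```python
# Throughput figures (GB/s) copied from the 256GB list above.
UPPER_BOUND_GBPS = 26.3  # cudaMemcpy upper bound reported for the 256GB feature

results_256gb = {
    "Warp-based": 13.3,
    "Warp-based + sort": 20.2,
    "Warp-based + sort + multi-kernel": 21.1,
    "Block-based": 12.4,
    "Block-based + sort": 17.9,
    "Block-based + sort + multi-kernel": 22.9,
}


def fraction_of_bound(throughput_gbps, bound_gbps=UPPER_BOUND_GBPS):
    """Return measured throughput as a fraction of the upper bound."""
    return throughput_gbps / bound_gbps


# Hypothetical helper table: variant name -> fraction of the upper bound.
efficiency = {name: fraction_of_bound(v) for name, v in results_256gb.items()}

for name, frac in efficiency.items():
    print(f"{name}: {frac:.0%} of cudaMemcpy")
```

The upper bound is only reported for the 256GB feature, so the sketch does not apply it to the 64GB or 16GB results.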

64GB feature

  • Warp-based: 19.9GB/s
  • Warp-based + sort: 22.7GB/s
  • Warp-based + sort + multi-kernel: 22.0GB/s
  • Block-based: 16.3GB/s
  • Block-based + sort: 19.4GB/s
  • Block-based + sort + multi-kernel: 22.2GB/s

16GB feature

  • Warp-based: 20.2GB/s
  • Warp-based + sort: 22.1GB/s
  • Warp-based + sort + multi-kernel: 22.4GB/s
  • Block-based: 16.8GB/s
  • Block-based + sort: 19.5GB/s
  • Block-based + sort + multi-kernel: 22.3GB/s
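
The effect of sorting can be compared across feature sizes by dividing each "+ sort" figure by its unsorted baseline. The sketch below does this with the numbers listed above. The variable and function names are illustrative only, and any interpretation of why the ratio changes with feature size is not stated in the results themselves.

```python
# Throughput figures (GB/s) copied from the three lists above.
results = {
    "256GB": {"warp": 13.3, "warp_sort": 20.2, "block": 12.4, "block_sort": 17.9},
    "64GB": {"warp": 19.9, "warp_sort": 22.7, "block": 16.3, "block_sort": 19.4},
    "16GB": {"warp": 20.2, "warp_sort": 22.1, "block": 16.8, "block_sort": 19.5},
}


def sort_speedup(size_results, kind):
    """Ratio of sorted to unsorted throughput for one kernel kind."""
    return size_results[f"{kind}_sort"] / size_results[kind]


# Hypothetical summary table: feature size -> kernel kind -> speedup from sorting.
speedups = {
    size: {kind: sort_speedup(r, kind) for kind in ("warp", "block")}
    for size, r in results.items()
}

for size, by_kind in speedups.items():
    print(f"{size}: warp x{by_kind['warp']:.2f}, block x{by_kind['block']:.2f}")
```

In these measurements the gain from sorting is largest for the 256GB feature and smaller for the 64GB and 16GB features, for both the warp-based and block-based variants.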