berkeleyauv/perception

Convert code base to work with CUDA


Use CuPy, which runs NumPy-style code on CUDA; so far it is very slow and gives unusable results.

  • Parallelize the Cython code (currently hitting issues with the GIL; will try converting the code to pure C arrays to avoid them)
  • Convert repetitive code to for loops for easier parallelization
  • Convert Pythonic code to statically typed NumPy arrays for better C performance
  • Profile the current code to find other areas for performance improvement
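
For the profiling item, a minimal sketch of how hot spots could be located with the standard-library `cProfile` and `pstats` modules. The functions `threshold_frame` and `run_pipeline` are hypothetical stand-ins, not code from this repository; the real perception entry point would be passed to `profile_call` instead.

```python
import cProfile
import io
import pstats


def threshold_frame(frame, cutoff):
    """Hypothetical stand-in for a perception step: binarize a 2D list of pixels."""
    return [[1 if px > cutoff else 0 for px in row] for row in frame]


def run_pipeline(n_frames=20, size=64):
    """Hypothetical pipeline: build synthetic frames and threshold each one."""
    total = 0
    for i in range(n_frames):
        # Synthetic frame data; a real run would read camera frames instead.
        frame = [[(x * y + i) % 256 for x in range(size)] for y in range(size)]
        mask = threshold_frame(frame, cutoff=127)
        total += sum(map(sum, mask))
    return total


def profile_call(func, *args, top=5, **kwargs):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show only the `top` entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    result, report = profile_call(run_pipeline)
    print(report)
```

Sorting by cumulative time points at the functions that dominate a run, which are the likeliest candidates for the Cython or CUDA work listed above.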