HuwCampbell/grenade

OpenCL support

Opened this issue · 6 comments

I know it's a lot to ask, but can I expect OpenCL support for AMD/NVIDIA GPU cards?

I think Haskell is best for pretty much anything, so I'd like a Haskell library that supports GPU computation and makes it easier to experiment and do research with neural nets!

TL;DR: Can grenade give me performance on par with PyTorch on a GPU?

I'm not a contributor, but I noticed there's an effort to provide an accelerate implementation: #38

And, as far as AMD hardware goes, this seems to be the state of things at the moment: https://www.reddit.com/r/haskell/comments/4ehggz/accelerate_library_and_opencl/d20z05p/

At the moment, grenade's numerical capabilities are provided through hmatrix, so its BLAS operations are determined by what hmatrix supports. Now, hmatrix can support a few different BLAS backends, so I don't think it's infeasible that CUDA or OpenCL could be supported through that path.

There has been an attempt at an accelerate addition, but it has stalled. I had a chat to Trevor about it and we thought that the particular approach wasn't quite right. I think there's more upfront and explicit design required on that front before we try again.

@HuwCampbell Do you mean that if hmatrix supports a GPU backend, then it's done?

That could be one path. I know hmatrix can use different BLAS backends, so I don't think it would be impossible.

Really, TBH, I think Haskell could use a new, really solid NDArray library. hmatrix is pretty good, but its support for 3D and 4D matrices is poor.

I might close this issue, as there's nothing currently actionable in it. But feel free to continue the conversation.

@HuwCampbell Are there currently any new NDArray libraries that we can work with?

I read a few other issues and the Reddit thread linked above, and it seems Trevor agreed to get things done. Hopefully accelerate will find the manpower needed to develop an OpenCL backend someday. TBH, at one point I thought of getting into the literature on heterogeneous parallel programming and doing it myself, but it all costs time.