outer product ger and batched outer product batched_ger
Closed this issue · 0 comments
jli05 commented
Can we assume that 1-dimensional arrays are always contiguously allocated on CPUs and GPUs? I plan to set both `incx` and `incy` to 1 in the calls to `cblas_<x>ger()` from within `expr::BLASEngine::ger()`, and then to use the natural column stride of the matrices `X` and `Y` in `expr::BLASEngine::batched_ger()`. I'm implementing the Khatri-Rao product for tensor computations.
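To make the proposal concrete, here is a minimal sketch of the idea in plain C++, without linking BLAS. It assumes the vectors really are contiguous, which is exactly the open question above. `ger_ref`, `batched_ger_ref` and `khatri_rao_ref` are hypothetical stand-ins written for illustration; they are not the `expr::BLASEngine` methods. `ger_ref` only mimics the argument meaning of a row-major `cblas_sger` call, and the column-major layout of `X` and `Y` is likewise an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a row-major cblas_sger call:
// A (m x n, leading dimension lda) += alpha * x * y^T.
void ger_ref(std::size_t m, std::size_t n, float alpha,
             const float* x, std::size_t incx,
             const float* y, std::size_t incy,
             float* a, std::size_t lda) {
  assert(incx > 0 && incy > 0 && lda >= n);
  for (std::size_t i = 0; i < m; ++i)
    for (std::size_t j = 0; j < n; ++j)
      a[i * lda + j] += alpha * x[i * incx] * y[j * incy];
}

// Hypothetical batched variant. X is m x k and Y is n x k, both assumed
// column-major with column strides ldx and ldy. Each column is then a
// contiguous vector, so incx = incy = 1 and only the start pointer moves.
// out receives k row-major m x n blocks, one per column pair.
void batched_ger_ref(std::size_t m, std::size_t n, std::size_t k, float alpha,
                     const float* X, std::size_t ldx,
                     const float* Y, std::size_t ldy,
                     float* out) {
  for (std::size_t b = 0; b < k; ++b)
    ger_ref(m, n, alpha, X + b * ldx, 1, Y + b * ldy, 1, out + b * m * n, n);
}

// Khatri-Rao product as a batched outer product: the row-major flattening of
// x_b * y_b^T equals kron(x_b, y_b), so block b is column b of the
// (m*n) x k column-major result.
std::vector<float> khatri_rao_ref(std::size_t m, std::size_t n, std::size_t k,
                                  const float* X, std::size_t ldx,
                                  const float* Y, std::size_t ldy) {
  std::vector<float> out(m * n * k, 0.0f);  // zero-initialised accumulator
  batched_ger_ref(m, n, k, 1.0f, X, ldx, Y, ldy, out.data());
  return out;
}
```

If a vector were a strided view instead (for example a row of a column-major matrix), the unit increments would read the wrong elements, so the contiguity assumption matters for correctness and not only for speed.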