hidet-org/hidet

Will hidet launch all CUDA kernels on the same cudaStream?

Closed this issue · 2 comments

Hi, I noticed that the generated CUDA code has launch methods like this:

void launch(float * __restrict__ b, float * __restrict__ y, float * __restrict__ data, float * __restrict__ c) {
  batch_matmul_kernel<<<dim3(196, 1, 1), dim3(64, 1, 1), 0, (cudaStream_t)get_cuda_stream()>>>(b, data, y, c);
  {cudaError_t err = cudaGetLastError(); if (err != cudaSuccess) LOG(ERROR) << "CUDA error: " << cudaGetErrorString(err) << "\n";}
}

The cudaStream parameter is set to `(cudaStream_t)get_cuda_stream()`, which is returned by the singleton class `CudaContext::global()`. Is this stream used for all kernels, without ever being changed (or rearranged)?

Thanks.

Hi @VincentXWD,

Yes, it is used for all CUDA kernels. Hidet comes with two libraries: libhidet.so and libhidet_runtime.so (see also here). The former implements parts of the compilation pipeline (when we find something more efficient to implement in C++ than in Python, we implement it there). The latter implements the runtime that supports hidet's compiled models, e.g., tracking the current CUDA stream on which the next kernel should be launched, exposed via `get_cuda_stream()`.

We can use this API to change the current CUDA stream on which the kernels are launched.

I see, so it is determined by the user. Thanks @yaoyaoding.