Bug in creating `TensorGPU` when `stream` key is `None` in CUDA array interface

Version

1.36.0

Describe the bug.

A TensorGPU can be created from any object conforming to the CUDA Array Interface. Version 3 of this interface (according to the Numba docs) has an optional `stream` entry, which can be either an integer or None.

DALI, however, always assumes that it will be an integer, which can lead to bugs in some cases, such as the example below, where I am trying to convert a gpuarray from the pycuda package.

The code responsible for this bug is

DALI/dali/python/backend_impl.cc

Lines 283 to 286 in dedcfae

    
    if (cu_a_interface.contains("stream")) {
      auto order = AccessOrder(cudaStream_t(PyLong_AsVoidPtr(cu_a_interface["stream"].ptr())));
      batch->set_order(order);
    }

and the bug was introduced with #5125, which was released in 1.32.0
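For illustration, here is a minimal sketch of the None-aware handling that the snippet above lacks. It is written in Python rather than C++ so it can be run without building DALI. `stream_from_cai` is a hypothetical helper, not a DALI function, and it is not the fix that was eventually merged. The rule that `0` is disallowed comes from the Numba description of version 3 of the interface.

```python
from typing import Any, Mapping, Optional


def stream_from_cai(interface: Mapping[str, Any]) -> Optional[int]:
    """Return the stream handle from a CUDA Array Interface dict, or None.

    Hypothetical helper for illustration only; not part of DALI.
    """
    stream = interface.get("stream")  # the key is optional in version 3
    if stream is None:
        # Missing key or explicit None: there is no stream to order against.
        return None
    if isinstance(stream, bool) or not isinstance(stream, int):
        raise TypeError(
            f"'stream' must be an int or None, got {type(stream).__name__}"
        )
    if stream == 0:
        # The Numba docs disallow 0 because it is ambiguous.
        raise ValueError("'stream' must not be 0")
    return stream


# The interface dict printed by the pycuda gpuarray in this report.
cai = {
    "shape": (4, 4),
    "strides": (16, 4),
    "data": (139775629590528, False),
    "typestr": "<f4",
    "stream": None,
    "version": 3,
}
print(stream_from_cai(cai))
```

A consumer would then set a stream order only when the helper returns an integer, and leave the default order untouched otherwise.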

Minimum reproducible example

import numpy as np
import nvidia.dali.backend as backend
import pycuda.autoinit  # noqa
import pycuda.gpuarray as gpuarray

test_input = np.random.randn(4, 4).astype(np.float32)
g = gpuarray.to_gpu(test_input)
print(g.__cuda_array_interface__)
# {'shape': (4, 4), 'strides': (16, 4), 'data': (139775629590528, False), 'typestr': '<f4', 'stream': None, 'version': 3}
backend.TensorGPU(g)

Relevant log output

SystemError: <built-in method __init__ of PyCapsule object at 0x7f3cca6db060> returned a result with an exception set

Other/Misc.

No response

Check for duplicates

I have searched the open bugs/issues and have found no duplicates for this bug report

@tadejsv Thanks for reporting. It does indeed look like an omission. We'll look into that.

I think #5425 should fix the issue. Please check the nightly build after it is merged.

DALI 1.37 is available. Please reopen if this still doesn't work.
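For anyone who cannot upgrade yet, one possible workaround is to hide the `stream: None` entry from DALI, since the affected code only runs when the `stream` key is present. The sketch below is untested against DALI itself. `NoNoneStream` and `FakeGpuArray` are hypothetical names introduced for illustration, and whether `TensorGPU` keeps the wrapped array alive is an assumption this thread does not confirm, so keep your own reference to the original array.

```python
class NoNoneStream:
    """Hypothetical wrapper that drops a `stream: None` entry.

    The `stream` key is optional in version 3 of the CUDA Array Interface,
    so omitting it leaves a valid interface dict.
    """

    def __init__(self, array):
        self._array = array  # keeps the producer alive while the wrapper exists

    @property
    def __cuda_array_interface__(self):
        # Copy, so the producer's own dict is never modified.
        interface = dict(self._array.__cuda_array_interface__)
        if interface.get("stream") is None:
            interface.pop("stream", None)
        return interface


class FakeGpuArray:
    """Stand-in for a pycuda gpuarray, so the sketch runs without a GPU."""

    __cuda_array_interface__ = {
        "shape": (4, 4),
        "strides": (16, 4),
        "data": (139775629590528, False),
        "typestr": "<f4",
        "stream": None,
        "version": 3,
    }


wrapped = NoNoneStream(FakeGpuArray())
print("stream" in wrapped.__cuda_array_interface__)

# Assumed usage with the objects from the report (not verified here):
# backend.TensorGPU(NoNoneStream(g))
```

An integer `stream` value passes through unchanged, so the wrapper only affects producers that report `None`.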
