Expected behaviour of nd_item get_global_id(x) where x>NDims
What is the expected behaviour of nd_item::get_global_id(x) where x is greater than or equal to the number of dimensions templating the nd_item, i.e. where x is not a valid dimension index?
e.g.
constexpr int NDims = 1;
q.submit([&](sycl::handler &cgh) {
  cgh.parallel_for<KernelName>(sycl::nd_range<NDims>{globalSize, localSize},
                               [=](sycl::nd_item<NDims> it) {
    size_t dim0 = it.get_global_id(0);  // valid: 0 < NDims
    size_t dim1 = it.get_global_id(1);  // out of range: 1 >= NDims
    size_t dim2 = it.get_global_id(2);  // out of range: 2 >= NDims
  });
});
What should it.get_global_id(1) and it.get_global_id(2) return in this case? The spec only states, in the nd_item member description of size_t get_global_id(int dimension) const:
Return the constituent element of the global id representing the work-item’s position in the nd-range in the given Dimension.
I see a few potential options to remedy the spec:
- Return 0
- Throw an asynchronous error
- Undefined behaviour
Ideally, IMO, returning 0 would be preferable since it is precisely defined and does not affect the execution of the kernel.
As a point of context, OpenCL returns 0 in "out of bounds" cases:
The global work-item ID specifies the work-item ID based on the number of global work-items specified to execute the kernel. Valid values of dimindx are 0 to get_work_dim() - 1. For other values of dimindx, get_global_id() returns 0.
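For anyone who wants to see the quoted OpenCL rule spelled out, here is a minimal sketch in plain C++. MockItem and check_opencl_style_rule are hypothetical names made up for this illustration; this is neither the SYCL nor the OpenCL API, only a model of "valid indices return the id, everything else returns 0".

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a work-item; NOT the SYCL or OpenCL API.
template <int NDims>
struct MockItem {
    std::array<std::size_t, NDims> global_id;

    // OpenCL-style rule: valid dimensions are 0 .. NDims-1,
    // any other value of `dimension` yields 0.
    std::size_t get_global_id(int dimension) const {
        if (dimension < 0 || dimension >= NDims) {
            return 0;
        }
        return global_id[static_cast<std::size_t>(dimension)];
    }
};

// Mirrors the 1-D example from the question under the OpenCL-style rule.
inline void check_opencl_style_rule() {
    MockItem<1> it{{{42}}};
    assert(it.get_global_id(0) == 42);  // in range
    assert(it.get_global_id(1) == 0);   // out of range
    assert(it.get_global_id(2) == 0);   // out of range
}
```

Under this rule the kernel in the question would see dim1 == 0 and dim2 == 0, at the cost of a range check inside the accessor.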
That seems a little surprising. Presumably, providing a guarantee that 0 is returned requires a runtime check when dimindx is a variable. Does OpenCL assume that it will almost always be a constant? Specifying UB would require the least overhead.
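To make the constant-versus-variable point concrete, here is a rough sketch in plain C++. The names kDims, kGlobalId, checked_id and sum_over_dims are assumptions invented for this example and model only the shape of the check, not any real implementation. With a literal index the range check can be evaluated at compile time; with a runtime index the comparison generally has to stay in the generated code unless the compiler can prove the range.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 1-D global id storage; an illustration only, not SYCL or OpenCL.
constexpr int kDims = 1;
constexpr std::size_t kGlobalId[kDims] = {5};

// Checked accessor: one extra comparison per call when `dim` is a runtime value.
constexpr std::size_t checked_id(int dim) {
    return (dim >= 0 && dim < kDims) ? kGlobalId[dim] : 0;
}

// With a literal argument the whole check is evaluated at compile time.
static_assert(checked_id(0) == 5, "in range, evaluated at compile time");
static_assert(checked_id(1) == 0, "out of range, evaluated at compile time");

// With a runtime argument the comparison is part of every call.
inline std::size_t sum_over_dims(int requested_dims) {
    std::size_t sum = 0;
    for (int d = 0; d < requested_dims; ++d) {
        sum += checked_id(d);
    }
    return sum;
}
```

An unchecked accessor would simply index the array, which is the "least overhead" option and leaves out-of-range indices undefined.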
The WG has some concerns about mandating that get_global_id(int dim) returns 0 when dim is out of bounds:
- If we guarantee that get_global_id(int) does an out-of-bounds check, what about id[n]? Does the id subscript operator also need to do an out-of-bounds check?
- Some implementations (e.g. safety critical) might prefer to abort the kernel if the argument to get_global_id(int) is out of bounds. If the spec mandates that it must return 0 in this case, such an implementation would not be conformant. If the spec says that the behavior is UB in this case, that allows an implementation to report an error if it wants.
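The two concerns above can be sketched in plain C++ as follows. OobPolicy and MockId are hypothetical names for this illustration only, and the error_reported flag merely stands in for aborting the kernel or raising an error. The point is that if the spec leaves the out-of-bounds case undefined, either policy is a valid implementation choice, and the subscript operator can share whatever the accessor does.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical policies an implementation could choose if out-of-bounds
// access were undefined behaviour. None of these names come from SYCL.
enum class OobPolicy { ReturnZero, ReportError };

template <int NDims, OobPolicy Policy>
struct MockId {
    std::array<std::size_t, NDims> values;
    mutable bool error_reported = false;  // stand-in for aborting the kernel

    explicit MockId(std::array<std::size_t, NDims> v) : values(v) {}

    std::size_t get(int dim) const {
        if (dim >= 0 && dim < NDims) {
            return values[static_cast<std::size_t>(dim)];
        }
        if (Policy == OobPolicy::ReportError) {
            error_reported = true;  // a strict implementation flags the misuse
        }
        return 0;
    }

    // The subscript operator follows the same policy as get().
    std::size_t operator[](int dim) const { return get(dim); }
};
```

If the spec instead mandated returning 0, only the ReturnZero variant would be conformant, which is the second concern raised above.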