Passing Int128 as kernel parameter is not working
MoFtZ opened this issue · 3 comments
MoFtZ commented
I expected the new Int128 data type to just work, even if not necessarily performant. However, I have found an unexpected issue: the following kernel does not work on CUDA and generates the wrong output:
using System;
using ILGPU;
using ILGPU.Runtime;

class Program
{
    static void MyKernel(Index1D index, ArrayView<Int128> dataView, Int128 constant)
    {
        dataView[index] = index.X + constant;
    }

    static void Main()
    {
        using var context = Context.CreateDefault();
        foreach (var device in context)
        {
            using var accelerator = device.CreateAccelerator(context);
            var kernel = accelerator.LoadAutoGroupedStreamKernel<
                Index1D, ArrayView<Int128>, Int128>(MyKernel);
            using var buffer = accelerator.Allocate1D<Int128>(1024);
            kernel((int)buffer.Length, buffer.View, 42);

            var data = buffer.GetAsArray1D();
            for (int i = 0, e = data.Length; i < e; ++i)
            {
                if (data[i] != 42 + i)
                    Console.WriteLine($"Error at element location {i}: {data[i]} found");
            }
        }
    }
}
Expected Output:
data[0] = { Lower = 42, Upper = 0 }
data[1] = { Lower = 43, Upper = 0 }
data[2] = { Lower = 44, Upper = 0 }
etc
Actual Output on CUDA:
data[0] = { Lower = 0, Upper = 42 }
data[1] = { Lower = 1, Upper = 42 }
data[2] = { Lower = 2, Upper = 42 }
etc
m4rs-mt commented
@MoFtZ thanks for reporting this. A quick investigation suggests that 64-bit additions with carry are not mapped properly onto the data structure.