Reduce WebNN/DML tensor memory usage due to misalignment.
bbernhar opened this issue · 1 comment
bbernhar commented
WebNN/DML allocations are always rounded up to a power-of-two size when sub-allocating. Unfortunately, this causes significant internal fragmentation whenever the tensor size is not a power of two: a tensor just past a power-of-two boundary wastes nearly half of its allocation. Since tensors can be very large, this fragmentation becomes a significant memory bottleneck (approaching 2x overhead in the worst case). I plan to fix this via intel/GPGMM#130.
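For illustration, here is a minimal C++ sketch of why power-of-two rounding can approach 2x overhead. `NextPowerOfTwo` is a hypothetical helper written for this example, not GPGMM's actual allocation code:

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical helper: round |size| up to the next power of two,
// mimicking the sub-allocator's rounding behavior described above.
uint64_t NextPowerOfTwo(uint64_t size) {
    if (size <= 1) return 1;
    uint64_t result = 1;
    while (result < size) result <<= 1;
    return result;
}

int main() {
    // A tensor just past a power-of-two boundary is the worst case:
    // 65 MiB rounds up to 128 MiB, wasting nearly half the allocation.
    uint64_t tensorSize = 65ull * 1024 * 1024;        // requested: 65 MiB
    uint64_t allocated  = NextPowerOfTwo(tensorSize); // allocated: 128 MiB
    uint64_t wasted     = allocated - tensorSize;     // wasted: ~63 MiB (~49%)
    printf("requested: %llu MiB, allocated: %llu MiB, wasted: %llu MiB\n",
           (unsigned long long)(tensorSize >> 20),
           (unsigned long long)(allocated >> 20),
           (unsigned long long)(wasted >> 20));
    return 0;
}
```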