NPU Acceleration

Question

sikhness opened this issue 2 months ago · 4 comments

Currently, only a subset of devices can be passed through to Windows Containers. GPUs are one of them, albeit limited to DirectX-based frameworks.
With the rise of NPUs/IPUs built into processors, it would be beneficial to provide NPU acceleration in Windows Containers so that we can containerize our AI/ML workloads.

Answer 1 · 2024-06-04T14:40:43.000Z

Hey @sikhness, similar question to your other issue. Can you help me understand what sort of workloads you are trying to run with NPU acceleration? Understanding this use case will help us better prioritize this request as we explore AI/ML workloads.

Answer 2 · 2024-06-05T00:45:19.000Z

Hey @fady-azmy-msft!
As with my other question, I listed a few AI-related workloads there that would benefit from GPU acceleration through vendor-specific graphics APIs.

Some of those same AI workloads can now also benefit from offloading that work to the NPU. One example is Ryzen AI, which provides instructions on how to install, prepare, and run your AI models on the NPU on Windows. It would be very beneficial to be able to containerize these applications for isolation and portability while still leveraging the hardware.

Answer 3 · 2024-06-19T17:23:31.000Z

Got it. Tagging @NAWhitehead to look into this. He's driving the Windows containers GPU scenarios, and this is related.

Answer 4 · 2024-07-11T15:51:34.000Z

I think you should get the class GUID for "Neural processors", try passing it as --device class/the_guid, copy the drivers from the FileRepository into the container, and then see if the NPU works. Odds are low, but crazier things have been true.
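
For anyone who wants to try the suggestion above, here is a minimal Python sketch that only assembles the docker run command line for it. It is an untested experiment, not a supported path. Assumptions: the helper name build_device_run_command is made up for illustration, the image tag is just an example, and the class GUID has to be looked up on your own machine (for instance from the device's "Class Guid" property in Device Manager). The GUID used in the usage comment is the DirectX GPU class from the existing Windows containers GPU documentation, included only to show the format, not the NPU class. Copying the drivers from the FileRepository is not covered here.

```python
import uuid
from typing import List


def build_device_run_command(class_guid: str, image: str) -> List[str]:
    """Build a `docker run` argument list that requests a device class.

    class_guid: device interface class GUID, with or without braces.
    image:      container image to start (caller's choice).
    """
    # uuid.UUID validates the string and tolerates surrounding braces;
    # it raises ValueError for anything that is not a well-formed GUID.
    guid = str(uuid.UUID(class_guid)).upper()

    return [
        "docker", "run", "--rm",
        # Device passthrough on Windows is documented for process isolation.
        "--isolation", "process",
        # Same `class/<GUID>` form the thread suggests trying.
        "--device", f"class/{guid}",
        image,
    ]


# Usage (format illustration only; substitute the GUID found on your host):
# cmd = build_device_run_command(
#     "{5B45201D-F2F2-4F3B-85BB-30FF1F953599}",
#     "mcr.microsoft.com/windows/servercore:ltsc2022",
# )
# subprocess.run(cmd, check=True)
```

Building the arguments as a list rather than a single string avoids shell quoting issues with the braces and backslashes that tend to show up in Windows paths and GUIDs.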