Question: Will this design include an online-learning / on-device-training spec?
WenheLI opened this issue · 8 comments
Hi there, I am wondering whether this design will cover on-device training / the capability of updating the parameters of a model. For my use case, it would be useful (and interesting) to enable training a model for every single user, which gives each user a customized model.
@WenheLI It is not in the scope of the current phase of the proposal, but if you have ideas of specific APIs that you would like to add, let's have a discussion. Thanks for your interest!
@mingqiusun - Thanks for your response! May I know what the procedure is for adding and discussing API designs?
I'm interested in participating as well. One thing that's not clear to me is what level of abstraction wasi-nn is intended to operate at.
EDIT: I see there are some links to https://webmachinelearning.github.io/webnn/ and https://github.com/webmachinelearning/webnn/blob/master/explainer.md as background.
Your feedback would be great! The primary way to discuss is through GitHub issues like this one, but @mingqiusun and I are also usually online in the Bytecode Alliance Zulip channel.
> One thing that's not clear to me is what level of abstraction wasi-nn is intended to operate at.
Can you clarify? Here's the current API documentation if that helps.
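For a concrete sense of the abstraction level, the conceptual flow a guest module goes through today is roughly the following. This is only a sketch against the early Rust wasi-nn bindings; the function names (`load`, `init_execution_context`, `set_input`, `compute`, `get_output`) come from the spec, but the exact binding signatures, constants, and the input/output shapes shown here are illustrative assumptions:

```rust
// Sketch: run inference on a pre-built model from inside a Wasm guest via wasi-nn.
// Exact Rust binding details may differ from the published crate.
fn classify(xml: &[u8], weights: &[u8], input: &[u8]) -> Vec<f32> {
    unsafe {
        // Load an OpenVINO-encoded graph (IR xml + weights), targeting the CPU.
        let graph = wasi_nn::load(
            &[xml, weights],
            wasi_nn::GRAPH_ENCODING_OPENVINO,
            wasi_nn::EXECUTION_TARGET_CPU,
        )
        .expect("failed to load graph");

        // Create an execution context for the graph and attach the input tensor.
        let ctx = wasi_nn::init_execution_context(graph).expect("failed to create context");
        let tensor = wasi_nn::Tensor {
            dimensions: &[1, 3, 224, 224], // assumed input shape for this example
            r#type: wasi_nn::TENSOR_TYPE_F32,
            data: input,
        };
        wasi_nn::set_input(ctx, 0, tensor).expect("failed to set input");

        // Run inference and copy the output tensor back into guest memory.
        wasi_nn::compute(ctx).expect("inference failed");
        let mut output = vec![0f32; 1001]; // assumed output size
        wasi_nn::get_output(
            ctx,
            0,
            output.as_mut_ptr() as *mut u8,
            (output.len() * std::mem::size_of::<f32>()) as u32,
        )
        .expect("failed to get output");
        output
    }
}
```

In other words, the API sits at the level of "hand a whole model to the host and ask it to execute", not at the level of individual tensor operations.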
@abrown thank you for the invitation!
I think my questions about the level of abstraction are partially answered here:
https://github.com/webmachinelearning/webnn/blob/master/explainer.md#stay-the-course-and-build-machine-learning-solutions-on-webglwebgpu
and
https://github.com/webmachinelearning/model-loader/blob/master/explainer.md
I'm new to this effort, but my initial assumption about how NNs would work was that WASI would be responsible for making resources available, and standard NN frameworks (PyTorch, TensorFlow, etc.) would lean on Wasm-compiled variants of their workflows - deployment in the typical case, but on-device training would also work out of the box if you could get the framework compiled to Wasm, so long as one can tolerate larger binaries for now.
I get the points about hardware access, but the flip side is that there's also a usability and maintenance cost from impedance mismatches between the model development framework and a common-denominator WASI standard.
At any rate, it seems like there have been quite a few iterations on these design discussions, so I'll try to catch up and join the Zulip chat. This is an exciting initiative.
@austinvhuang With what we have done so far, you don't need to compile any framework into Wasm if all you want to do is load a model and run inference.
But you are right that training would work if you compile your framework into Wasm. If you need hardware acceleration for some operations via AVX-512, GPU, etc., WASI calls would need to be added to support that. This would be in scope for the next phase of wasi-nn. We would certainly welcome input on the priority of those operations.
@mingqiusun thanks for being so open to input.
My personal perspective is that hardware acceleration access is much more valuable than a narrow common-denominator inference-only specification.
There's a lot that goes into a framework like PyTorch, and with ongoing research the important components keep evolving (for example, transformer layers in the past few years, maybe more graph NNs in the future, etc.). If Wasm can piggyback on all that continuous development by providing access to hardware acceleration, that's a high point of leverage that will return dividends over time.
A baked-in spec for model inference is nice to have but not nearly as useful. There's always going to be some degree of impedance mismatch with the frameworks. That said, for an inference spec definition I would stay close to ONNX over the current tight integration with OpenVINO. ONNX might be as good as it gets in terms of a common denominator, and even there framework mismatches remain a pain point. I don't see OpenVINO widely used in the community, and its stated scope (computer vision, CNNs) doesn't cover NLP, which accounts for a huge fraction of use cases.
@austinvhuang Thanks for the good input. The rationale for starting with inference is that we want to support Wasm as a machine learning deployment vehicle for popular frameworks first. The main reason we selected OpenVINO for our reference implementation in Wasmtime is its multi-hardware support across CPU/GPU/FPGA. OpenVINO also supports multiple frameworks via its model converter utility.
We will be looking at supporting acceleration for individual operations in the next phase and would welcome your involvement. If you could, please provide input on the top ops for which you would like hardware acceleration supported.