dynamic dimension shape input
Closed this issue · 1 comments
zhuzhu18 commented
Does QNN support inputs with dynamic dimensions? For example, in super-resolution models the width and height of the input image are usually not fixed. Are there any plans to implement this feature?
ONNX Runtime supports dynamic dimension input for CPU and CUDA inference, but its documentation states that dynamic dimension input is not supported when QNN is used as the execution provider.
heydavid525 commented
To my knowledge, QNN doesn't currently support dynamic input shapes out of the box. However, there are a few ways to emulate that behavior:
- Multiple graphs with shared weights. QNN supports creating a context binary from multiple model libraries. The context binary deduplicates the weights, so you effectively have multiple models / functions to call within a single binary. Search the QNN docs for
"context": {
    "weight_sharing_enabled": true
}
This is how Llama, and LLMs in general, are able to process prompt tokens and generate tokens efficiently.
- Using slice / mask to turn fixed-shape inputs into variable-shape inputs. This may not reduce computation for shorter sequences, but it still produces the correct outputs.
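
Here is a rough sketch of how the two ideas can be combined on the application side: keep a few fixed-shape graphs, pick the smallest one that fits the input, pad the input up to that shape, and slice the output back down. It is only an illustration in plain Python. The bucket sizes, the scale factor, and every function name (`pick_bucket`, `pad_to`, `crop`, `fake_fixed_shape_model`, `run_dynamic`) are hypothetical, and `fake_fixed_shape_model` is a stand-in for a real fixed-shape graph call, not a QNN or ONNX Runtime API.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]

# Assumption: one precompiled fixed-shape graph per (height, width) bucket.
BUCKETS: Sequence[Tuple[int, int]] = ((64, 64), (128, 128), (256, 256))
SCALE = 2  # assumption: the model upscales by 2x


def pick_bucket(h: int, w: int, buckets: Sequence[Tuple[int, int]] = BUCKETS) -> Tuple[int, int]:
    """Return the smallest fixed shape that can hold an h x w input."""
    fitting = [b for b in buckets if b[0] >= h and b[1] >= w]
    if not fitting:
        raise ValueError(f"no fixed-shape graph is large enough for {h}x{w}")
    return min(fitting, key=lambda b: b[0] * b[1])


def pad_to(image: Image, shape: Tuple[int, int], fill: float = 0.0) -> Image:
    """Pad an image on the bottom and right up to the fixed shape."""
    height, width = shape
    padded = [list(row) + [fill] * (width - len(row)) for row in image]
    padded += [[fill] * width for _ in range(height - len(image))]
    return padded


def crop(output: Image, h: int, w: int) -> Image:
    """Slice the top-left h x w region out of a fixed-shape output."""
    return [row[:w] for row in output[:h]]


def fake_fixed_shape_model(x: Image) -> Image:
    """Stand-in for a fixed-shape graph: nearest-neighbour upscale by SCALE."""
    out: Image = []
    for row in x:
        wide = [v for v in row for _ in range(SCALE)]
        out.extend(list(wide) for _ in range(SCALE))
    return out


def run_dynamic(image: Image) -> Image:
    """Emulate a dynamic-shape call using only fixed-shape graphs."""
    h, w = len(image), len(image[0])
    bucket = pick_bucket(h, w)
    padded = pad_to(image, bucket)
    full = fake_fixed_shape_model(padded)  # always runs at the bucket shape
    return crop(full, h * SCALE, w * SCALE)
```

Note that the graph always runs at the full bucket size, so smaller inputs do not run faster. With a real convolutional model the padded pixels can also influence outputs near the border, so the padding value or mode needs some care.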