AlexanderLutsenko/nobuco

multihead attention - no converter found


❌ Validation exception on node 'MultiheadAttention':
PyTorch op: MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
)
Keras op: ChangeOrderingLayer(func=<function converter_MultiheadAttention.<locals>.func at 0x75b6b7e9f380>)
Input args: ('Tensor(shape=[4096, 1, 128], dtype=torch.float32)', 'Tensor(shape=[4096, 1, 128], dtype=torch.float32)', 'Tensor(shape=[4096, 1, 128], dtype=torch.float32)')
Input kwargs: {}
Output tensors: ['Tensor(shape=[4096, 1, 128], dtype=torch.float32)', 'Tensor(shape=[1, 4096, 4096], dtype=torch.float32)']
Exception: You called set_weights(weights) on layer "multi_head_attention" with a weight list of length 8, but the layer was expecting 0 weights. Provided weights: [array([[[-0.05060041, -0.01487129, 0.10044055, ....
Traceback:

❌ Validation exception on node 'MultiheadAttentionModel':
PyTorch op: MultiheadAttentionModel(
(multihead_attn): MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=128, out_features=128, bias=True)
)
)
Keras op: <nobuco.layers.container.TransientContainer object at 0x75b6b7d08290>
Input args: ('Tensor(shape=[1, 128, 4096], dtype=torch.float32)',)
Input kwargs: {}
Output tensors: ['Tensor(shape=[1, 128, 4096], dtype=torch.float32)']
Exception: You called set_weights(weights) on layer "multi_head_attention_1" with a weight list of length 8, but the layer was expecting 0 weights. Provided weights: [array([[[-0.05060041, -0.01487129, 0.10044055, ....
Traceback:

[Nobuco] Converting (DONE): |████████████████████████████████████████████████████████████████████████████████| 26/26 ops [00:00]
Legend:
Green — conversion successful
Yellow — conversion imprecise
Red — conversion failed
Red — no converter found
Bold — conversion applied directly
* — subgraph reused
Tensor — this output is not dependent on any of subgraph's input tensors
Tensor — this input is a parameter / constant
Tensor — this tensor is useless

Sample code is in #64.
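For context, here's a minimal sketch that should trigger the same validation error. The class name MultiheadAttentionModel and the shapes come from the traces above; num_heads=8 and the permutes are my guesses, not the actual code from #64:

```python
import torch
import torch.nn as nn
import nobuco

# Hypothetical reconstruction of the failing setup: embed_dim=128,
# input of shape [1, 128, 4096], as in the traces above.
class MultiheadAttentionModel(nn.Module):
    def __init__(self, embed_dim=128, num_heads=8):
        super().__init__()
        self.multihead_attn = nn.MultiheadAttention(embed_dim, num_heads)

    def forward(self, x):
        x = x.permute(2, 0, 1)  # [1, 128, 4096] -> [4096, 1, 128] (seq, batch, embed)
        out, _ = self.multihead_attn(x, x, x)
        return out.permute(1, 2, 0)  # back to [1, 128, 4096]

model = MultiheadAttentionModel().eval()
dummy = torch.randn(1, 128, 4096)
keras_model = nobuco.pytorch_to_keras(model, args=[dummy])  # fails with the set_weights error
```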

Is this a known issue, or just something that has slipped by because nobody needed it?

Basically, I've crafted some Keras/Torch classes, and they're mostly working now:
https://github.com/johndpope/IMF/blob/feat/tensorflow-cips/tf-export2.py

Exception: You called set_weights(weights) on layer "multi_head_attention_1" with a weight list of length 8, but the layer was expecting 0 weights.

Ah, I see. Some Keras layers do not initialize their weights until the first forward pass. If that's the case, the layer needs to be built inside that specific node's converter before its weights are set. I'll take a look at it later.
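For illustration, a minimal sketch of that lazy behavior with Keras' MultiHeadAttention (shapes mirror the log above; num_heads=8 is a guess since the log only shows embed_dim=128, and building via a dummy call is just one way to force weight creation, not necessarily how the converter will handle it):

```python
import tensorflow as tf

# Keras' MultiHeadAttention creates its variables lazily, on the first call.
mha = tf.keras.layers.MultiHeadAttention(num_heads=8, key_dim=16)
print(len(mha.get_weights()))  # 0 -- calling set_weights() now raises
                               # "was expecting 0 weights", as in the error above

# A dummy forward pass builds the layer and creates its variables.
dummy = tf.zeros((1, 4096, 128))
_ = mha(dummy, dummy)  # (query, value); key defaults to value
print(len(mha.get_weights()))  # 8 arrays: q/k/v and output kernels + biases,
                               # matching the "weight list of length 8" being assigned
```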