Question about L1Norm when training

Question

Closed this issue 7 months ago · 4 comments

Hi @Raincleared-Song ,
I'm confused about which dimension the L1 norm is computed over, given an input x of shape [bs, seq_len, hidden].
Is the L1 norm output of shape [bs] or [bs, seq_len]?
Thanks~
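
To make the two candidate shapes in the question concrete, here is a minimal NumPy sketch. The tensor sizes and variable names are hypothetical and chosen only for illustration; it does not claim which reduction the authors' formula 2 actually uses.

```python
import numpy as np

# Hypothetical sizes, for illustration only.
bs, seq_len, hidden = 2, 3, 4
x = np.arange(bs * seq_len * hidden, dtype=np.float32).reshape(bs, seq_len, hidden) - 10.0

# Option A: reduce over the hidden dimension only -> one L1 value per token.
l1_per_token = np.abs(x).sum(axis=-1)        # shape (bs, seq_len)

# Option B: reduce over seq_len and hidden together -> one L1 value per sample.
l1_per_sample = np.abs(x).sum(axis=(1, 2))   # shape (bs,)

print(l1_per_token.shape)   # (2, 3)
print(l1_per_sample.shape)  # (2,)
```

Summing option A over the seq_len axis gives option B, so the two differ only in where the reduction stops, which in turn changes how any later averaging or weighting is applied.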

Answer 1 · 2024-03-07T13:18:45.000Z

I'm sorry, but our operator on the master branch does not currently support batch operations. In other words, the batch size must be 1 for it to run properly.
However, an operator with batch processing is already under active development in this repo. Once the naive implementation is complete, it will be pushed to the current repo as a new branch.

Answer 2 · 2024-03-08T01:45:59.000Z

Thanks for your reply.
But I'm actually asking about the output shape of the L1 norm in formula 2 during the training stage:

The batch size is unlikely to be 1 during training, is that right?

Answer 3 · 2024-03-08T04:25:59.000Z

You are right. Because our operators were originally designed for inference, we did not add batch processing. We also did not use the operators in the training stage.
Still, you may want to follow the feature we are developing in the following repo.

Answer 4 · 2024-03-08T06:37:29.000Z

OK, thanks~