baudm/parseq

Doctr meets parseq

felixdittrich92 opened this issue · 8 comments

Hey @baudm 👋,

First of all, thank you for your great work and the paper; it was really interesting to read.
We are currently integrating PARSeq into doctr (along with a TensorFlow port of the architecture).
Would you be interested in reviewing our implementation?
We would be very happy if you could. 🔥

PR Link:
mindee/doctr#1205

Best regards
Felix

baudm commented

Hello @felixdittrich92! Didn't notice this until today. Thanks for integrating PARSeq into doctr. Will review it soon and add comments to the linked PR. 🙂

Nice, thanks 👍
We have faced some problems, so updates are currently in progress at: https://github.com/felixdittrich92/doctr/tree/parseq-fixes

Happy to get your feedback

baudm commented

Hi @felixdittrich92. I left some comments on the original PR, but I realized I could have commented on this commit instead.

Hey @baudm 👋 could you take another look at
https://github.com/mindee/doctr/pull/1205/files/02bf218ddd5aaae1d8dff34a6b8bac9ea49532b6#diff-1d5f7225deae1be1f87693e0c36a2eb0810f25e9f29ddaaecd174df372935472 ?
I'm still a bit stuck with the permutation attention masks :)
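
For reference, a minimal sketch of how the permutation attention masks are derived from a token permutation, adapted from the generate_attn_masks logic in baudm/parseq but rewritten with a boolean True = visible / False = masked convention; the function name and the boolean convention are illustrative, not the actual doctr code:

```python
import torch

def build_perm_masks(perm: torch.Tensor):
    # perm: 1-D permutation of the target positions (BOS at index 0, EOS at the end),
    # following the reference generate_attn_masks logic, but with True = visible.
    sz = perm.shape[0]
    visible = torch.zeros((sz, sz), dtype=torch.bool)
    for i in range(sz):
        query_idx = perm[i]
        # a query may attend to itself and to every position earlier in the permutation
        visible[query_idx, perm[: i + 1]] = True
    content_mask = visible[:-1, :-1].clone()
    # the query stream must not attend to its own position
    visible[torch.eye(sz, dtype=torch.bool)] = False
    query_mask = visible[1:, :-1]
    return content_mask, query_mask
```

Converting these boolean masks to the additive float('-inf') form used by torch.nn.MultiheadAttention, or to a 0/1 convention, is then just a matter of mapping True/False accordingly.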

@nikokks also asked if you could have a look 😅
mindee/doctr#1226

Sorry for all the links, and thanks for your help 👍
Once we have it working correctly in PyTorch, the TF port should also be much easier.

baudm commented

Sure. Will take a look again tomorrow. Which PR is the final one though?

> Sure. Will take a look again tomorrow. Which PR is the final one though?

For PyTorch:
https://github.com/mindee/doctr/pull/1227/files
I'm still not sure if the ANDed mask is correct 😅

For decoding, I have checked the masks and compared them with yours; they should work fine with our MHA, which takes 0 for masked and 1 for visible.
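
As a rough illustration of the ANDed mask discussed here (shapes and names are made up for the example, not taken from the doctr code): under a 1 = visible / 0 = masked convention, a key should be visible only if the target mask allows it and the position is not padding, so the two masks can be combined with a logical AND and broadcast over the batch:

```python
import torch

B, T = 2, 5  # illustrative batch size and target length

# target mask: (T, T), True = key visible to the query (e.g. a causal or permutation mask)
target_mask = torch.tril(torch.ones(T, T, dtype=torch.bool))

# padding mask: (B, T), True = real token, False = padding
lengths = torch.tensor([5, 3])
padding_mask = torch.arange(T).unsqueeze(0) < lengths.unsqueeze(1)

# combined mask: (B, T, T), True only where both masks allow attention
combined = target_mask.unsqueeze(0) & padding_mask.unsqueeze(1)
print(combined.shape)  # torch.Size([2, 5, 5])
```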

I have also added some test outputs

Hey @baudm 👋
Finally PARSeq will be available in doctr in PyTorch and TensorFlow after mindee/doctr#1228 and mindee/doctr#1227 are merged :)

Btw, the AND of the padding and target masks also worked well during training (with our own MHA implementation). The biggest problem was that I totally forgot that pad_sequences only pads to the longest sequence in the batch and not to a given maximum length, which is the exact opposite of what we need.
Of course that broke everything 😅
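
To illustrate the padding pitfall (a sketch assuming PyTorch's torch.nn.utils.rnn.pad_sequence; the max_length value is made up for the example): pad_sequence only pads to the longest sequence in the batch, so an extra pad step is needed when the model expects a fixed maximum length:

```python
import torch
import torch.nn.functional as F
from torch.nn.utils.rnn import pad_sequence

max_length = 32  # assumed fixed label length, for illustration only
seqs = [torch.tensor([3, 7, 9]), torch.tensor([4, 5])]

# pads only up to the longest sequence in the batch
batch = pad_sequence(seqs, batch_first=True, padding_value=0)
print(batch.shape)  # torch.Size([2, 3])

# pad the last dimension up to the fixed maximum length expected by the model
batch = F.pad(batch, (0, max_length - batch.shape[1]), value=0)
print(batch.shape)  # torch.Size([2, 32])
```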

But now everything works well 👍

baudm commented

Hi @felixdittrich92, sorry for the really late reply. Got swamped with my day job + random personal stuff this weekend. Glad to see that it's working fine now. :)