baudm/parseq

Doctr meets parseq

felixdittrich92 opened this issue · 8 comments

Hey @baudm 👋,

First of all, thank you for your great work and the paper; it was really interesting to read.
We are currently integrating PARSeq into doctr (along with a TensorFlow port of the architecture).
Would you be interested in reviewing our implementation?
We would be very happy if you could. 🔥

PR Link:
mindee/doctr#1205

Best regards
Felix

baudm commented

Hello @felixdittrich92! Didn't notice this until today. Thanks for integrating PARSeq into doctr. Will review it soon and add comments to the linked PR. 🙂

Nice, thanks 👍
We have faced some problems, so updates are currently in progress at: https://github.com/felixdittrich92/doctr/tree/parseq-fixes

Happy to get your feedback

baudm commented

Hi @felixdittrich92. I left some comments on the original PR, but I realized I could have commented on this commit instead.

Hey @baudm 👋 could you take another look at
https://github.com/mindee/doctr/pull/1205/files/02bf218ddd5aaae1d8dff34a6b8bac9ea49532b6#diff-1d5f7225deae1be1f87693e0c36a2eb0810f25e9f29ddaaecd174df372935472 ?
I'm still a bit stuck with the permutation attention masks :)
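
For reference, a minimal sketch of how the permutation attention masks are derived from a token permutation, adapted from the generate_attn_masks logic in baudm/parseq but rewritten with a boolean True = visible / False = masked convention; the function name and the boolean convention are illustrative, not the actual doctr code:

```python
import torch

def build_perm_masks(perm: torch.Tensor):
    # perm: 1-D permutation of the target positions (BOS at index 0, EOS at the end),
    # following the reference generate_attn_masks logic, but with True = visible.
    sz = perm.shape[0]
    visible = torch.zeros((sz, sz), dtype=torch.bool)
    for i in range(sz):
        query_idx = perm[i]
        # a query may attend to itself and to every position earlier in the permutation
        visible[query_idx, perm[: i + 1]] = True
    content_mask = visible[:-1, :-1].clone()
    # the query stream must not attend to its own position
    visible[torch.eye(sz, dtype=torch.bool)] = False
    query_mask = visible[1:, :-1]
    return content_mask, query_mask
```

Converting these boolean masks to the additive float('-inf') form used by torch.nn.MultiheadAttention, or to a 0/1 convention, is then just a matter of mapping True/False accordingly.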

@nikokks also asked if you could have a look 😅
mindee/doctr#1226

Sorry for all the links, and thanks for your help 👍
Once we have it working correctly in PyTorch, the TF port should also be much easier.

baudm commented

Sure. Will take a look again tomorrow. Which PR is the final one though?

> Sure. Will take a look again tomorrow. Which PR is the final one though?

For PyTorch:
https://github.com/mindee/doctr/pull/1227/files
I'm still not sure if the ANDed mask is correct 😅

For decoding, I have checked the masks and compared them with yours; they should work fine with our MHA, which takes 0 for masked and 1 for visible.
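
As a rough illustration of the ANDed mask discussed here (shapes and names are made up for the example, not taken from the doctr code): under a 1 = visible / 0 = masked convention, a key should be visible only if the target mask allows it and the position is not padding, so the two masks can be combined with a logical AND and broadcast over the batch:

```python
import torch

B, T = 2, 5  # illustrative batch size and target length

# target mask: (T, T), True = key visible to the query (e.g. a causal or permutation mask)
target_mask = torch.tril(torch.ones(T, T, dtype=torch.bool))

# padding mask: (B, T), True = real token, False = padding
lengths = torch.tensor([5, 3])
padding_mask = torch.arange(T).unsqueeze(0) < lengths.unsqueeze(1)

# combined mask: (B, T, T), True only where both masks allow attention
combined = target_mask.unsqueeze(0) & padding_mask.unsqueeze(1)
print(combined.shape)  # torch.Size([2, 5, 5])
```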

I have also added some test outputs

Hey @baudm 👋
Finally PARSeq will be available in doctr in PyTorch and TensorFlow after mindee/doctr#1228 and mindee/doctr#1227 are merged :)

Btw, the AND of the padding and target masks also worked well during training (with our own MHA implementation). The biggest problem was that I totally forgot that pad_sequences only pads to the longest sequence in the batch and not to a given maximum length, which is the exact opposite of what we need.
Of course that broke everything 😅
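
To illustrate the padding pitfall (a sketch assuming PyTorch's torch.nn.utils.rnn.pad_sequence; the max_length value is made up for the example): pad_sequence only pads to the longest sequence in the batch, so an extra pad step is needed when the model expects a fixed maximum length:

```python
import torch
import torch.nn.functional as F
from torch.nn.utils.rnn import pad_sequence

max_length = 32  # assumed fixed label length, for illustration only
seqs = [torch.tensor([3, 7, 9]), torch.tensor([4, 5])]

# pads only up to the longest sequence in the batch
batch = pad_sequence(seqs, batch_first=True, padding_value=0)
print(batch.shape)  # torch.Size([2, 3])

# pad the last dimension up to the fixed maximum length expected by the model
batch = F.pad(batch, (0, max_length - batch.shape[1]), value=0)
print(batch.shape)  # torch.Size([2, 32])
```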

But now everything works well 👍

baudm commented

Hi @felixdittrich92, sorry for the really late reply. Got swamped with my day job + random personal stuff this weekend. Glad to see that it's working fine now. :)