fh2019ustc/DocGeoNet

a question about prepossessing

Closed this issue · 3 comments

Hello Hao,
Thanks for your third awesome work on document image dewarping. I have a simple question.
In Section 5.2 of the paper, you point out that the preprocessing module and the following rectification module are trained independently.

My question is: why not train the whole pipeline jointly? Have you run any experiments to verify that the 2-stage rectification works better?

Sorry, the issue title should say "preprocessing", not "prepossessing".

Hi, thanks for your interest in our work; this is a good question.
In fact, the whole network can be trained jointly.
We use the 2-stage rectification here for the following reasons:

  1. The gradients cannot be backpropagated to the preprocessing network because of the non-differentiable mask: the distorted image has to be multiplied by the document mask (0 denotes the background and 1 denotes the foreground document region).
  2. Joint training would consume more GPU resources.

I hope this helps.
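
To illustrate the first reason, here is a minimal sketch in plain Python, not the DocGeoNet code. It assumes the binary mask comes from thresholding a sigmoid output at 0.5, which the thread does not state; all names (`hard_mask`, `soft_mask`, `masked_loss`, `finite_diff_grad`) and the toy values are hypothetical. It estimates the gradient of a masked loss with respect to the segmentation logits by finite differences, once with the 0/1 mask and once with a soft mask used only as a contrast.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hard_mask(logits, threshold=0.5):
    # Binary document mask: 1.0 = foreground, 0.0 = background.
    # Thresholding is an assumption made for this illustration.
    return [1.0 if sigmoid(z) > threshold else 0.0 for z in logits]


def soft_mask(logits):
    # Differentiable contrast case, not something the authors describe.
    return [sigmoid(z) for z in logits]


def masked_loss(logits, image, target, mask_fn):
    # Multiply the distorted image by the mask, then compare with a target.
    mask = mask_fn(logits)
    return sum((p * m - t) ** 2 for p, m, t in zip(image, mask, target))


def finite_diff_grad(logits, image, target, mask_fn, eps=1e-4):
    # Central-difference estimate of d(loss)/d(logit) for each pixel.
    grads = []
    for i in range(len(logits)):
        up = list(logits)
        down = list(logits)
        up[i] += eps
        down[i] -= eps
        diff = masked_loss(up, image, target, mask_fn) - masked_loss(
            down, image, target, mask_fn
        )
        grads.append(diff / (2 * eps))
    return grads


# Toy 4-pixel example with made-up values.
logits = [2.0, -1.5, 0.7, -0.3]
image = [0.9, 0.4, 0.6, 0.8]
target = [1.0, 0.0, 0.5, 0.0]

hard_grads = finite_diff_grad(logits, image, target, hard_mask)
soft_grads = finite_diff_grad(logits, image, target, soft_mask)

print("hard mask gradients:", hard_grads)
print("soft mask gradients:", soft_grads)
```

With the 0/1 mask, a small change in a logit does not flip any mask value, so the loss does not change and every estimated gradient is zero: nothing useful would reach the preprocessing network. The soft mask gives nonzero gradients, which is why it serves as the contrast case here.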

I see, that helps a lot. Thanks for your kind response.