Page dewarp cuts text
I have been running this from the command line with python page_dewarp.py image.jpg. It works on the included images, but with an image of my own the text in the output is cropped and cut off.
The image was a 2 MB JPEG with a resolution of 3400x4600 px.
These are the original and the resulting image:
Do you know what might be the problem? Thanks
Try setting PAGE_MARGIN_X and PAGE_MARGIN_Y to zero.
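To illustrate why a non-zero margin can clip text, here is a minimal sketch of how a pair of margin constants shrinks the area that gets analysed. It is not code from page_dewarp.py: the helper names `analysed_region` and `is_ignored` are hypothetical, and the script may apply its margins to a downscaled working copy rather than to the full-resolution image used in the numbers below.

```python
def analysed_region(width, height, margin_x, margin_y):
    """Return (xmin, ymin, xmax, ymax) of the area left after trimming margins.

    Hypothetical helper for illustration only; not part of page_dewarp.py.
    """
    xmin, ymin = margin_x, margin_y
    xmax, ymax = width - margin_x, height - margin_y
    if xmin >= xmax or ymin >= ymax:
        raise ValueError("margins leave no area to analyse")
    return xmin, ymin, xmax, ymax


def is_ignored(x, y, region):
    """True if pixel (x, y) falls outside the analysed region."""
    xmin, ymin, xmax, ymax = region
    return not (xmin <= x < xmax and ymin <= y < ymax)


# With margins of 50 and 20 px, text that starts 10 px from the left edge
# lies outside the analysed region; with zero margins it is included.
with_margins = analysed_region(3400, 4600, 50, 20)
without_margins = analysed_region(3400, 4600, 0, 0)
print(is_ignored(10, 100, with_margins), is_ignored(10, 100, without_margins))
```

If text in your scan runs close to the page edge, anything inside the margin band is treated as background, which is consistent with the cropping described above.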
I personally would prefer other defaults:
- no cropping
- no binarization (black/white)
- no subsampling (full resolution)
In other words: dewarping only.
How do I set this in the code?
diff --git a/page_dewarp.py b/page_dewarp.py
index 6ef5b33..d095244 100755
--- a/page_dewarp.py
+++ b/page_dewarp.py
@@ -20,8 +20,8 @@ import scipy.optimize
# for some reason pylint complains about cv2 members being undefined :(
# pylint: disable=E1101
-PAGE_MARGIN_X = 50 # reduced px to ignore near L/R edge
-PAGE_MARGIN_Y = 20 # reduced px to ignore near T/B edge
+PAGE_MARGIN_X = 0 # reduced px to ignore near L/R edge
+PAGE_MARGIN_Y = 0 # reduced px to ignore near T/B edge
OUTPUT_ZOOM = 1.0 # how much to zoom output relative to *original* image
OUTPUT_DPI = 300 # just affects stated DPI of PNG, not appearance
@@ -813,17 +813,13 @@ def remap_image(name, img, small, page_dims, params):
image_y_coords = cv2.resize(image_y_coords, (width, height),
interpolation=cv2.INTER_CUBIC)
- img_gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
-
- remapped = cv2.remap(img_gray, image_x_coords, image_y_coords,
+ remapped = cv2.remap(img, image_x_coords, image_y_coords,
cv2.INTER_CUBIC,
None, cv2.BORDER_REPLICATE)
- thresh = cv2.adaptiveThreshold(remapped, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
- cv2.THRESH_BINARY, ADAPTIVE_WINSZ, 25)
+ thresh = remapped
pil_image = Image.fromarray(thresh)
- pil_image = pil_image.convert('1')
threshfile = name + '_thresh.png'
pil_image.save(threshfile, dpi=(OUTPUT_DPI, OUTPUT_DPI))
Thank you very much.
Hi @jbarth-ubhd @KyleWang-Hunter, I have the same problem with cut text: the text is cut off at the end of the image.
I have set PAGE_MARGIN_X and PAGE_MARGIN_Y to zero. How can I fix it? Thanks in advance.
For such slightly skewed text without curvature from bent paper, I would use a much simpler algorithm, e.g. https://github.com/jbarth-ubhd/fix-perspective
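As a rough idea of what a simpler approach for flat but skewed pages can look like, here is a sketch of a plain four-point perspective mapping in pure Python. It is not taken from the linked repository, and it assumes the four page corners are already known; the function names and the corner coordinates are made up for illustration.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def perspective_matrix(src, dst):
    """3x3 matrix mapping four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b) + [1.0]  # bottom-right entry fixed to 1
    return [h[0:3], h[3:6], h[6:9]]


def apply_perspective(h, x, y):
    """Map one point through the perspective matrix."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    px = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    py = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return px, py


# Hypothetical corners of a skewed page (clockwise from top-left) and the
# upright rectangle they should map to.
skewed = [(30, 20), (980, 60), (1000, 1400), (10, 1380)]
upright = [(0, 0), (1000, 0), (1000, 1400), (0, 1400)]
matrix = perspective_matrix(skewed, upright)
print(apply_perspective(matrix, 30, 20))
```

A mapping like this has only eight free parameters and no page-curvature model, so it cannot flatten bent paper, but for flat pages photographed at an angle there is far less that can go wrong.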