ZeroDivisionError: float division by zero
I'm using page-dewarp that was installed from pip into venv on Ubuntu 24.04.
venv/bin/pip list
Package Version
--------------- -----------
contourpy 1.2.1
cycler 0.12.1
fonttools 4.51.0
kiwisolver 1.4.5
matplotlib 3.8.4
mpmath 1.3.0
numpy 1.26.4
opencv-python 4.9.0.80
packaging 24.0
page-dewarp 0.1.5
pillow 10.3.0
pip 24.0
pyparsing 3.1.2
python-dateutil 2.9.0.post0
scipy 1.13.0
six 1.16.0
sympy 1.12
toml 0.10.2
tomlkit 0.12.5
It reports:
venv/bin/page-dewarp -d 3 ep.jpg
Loaded ep.jpg at size='723x854' --> resized='362x427'
Traceback (most recent call last):
File "/tmp/venv/bin/page-dewarp", line 8, in <module>
sys.exit(main())
^^^^^^
File "/tmp/venv/lib/python3.12/site-packages/page_dewarp/__main__.py", line 21, in main
processed_img = WarpedImage(imgfile)
^^^^^^^^^^^^^^^^^^^^
File "/tmp/venv/lib/python3.12/site-packages/page_dewarp/image.py", line 55, in __init__
self.contour_list = self.contour_info(text=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/venv/lib/python3.12/site-packages/page_dewarp/image.py", line 155, in contour_info
return mask.contours()
^^^^^^^^^^^^^^^
File "/tmp/venv/lib/python3.12/site-packages/page_dewarp/mask.py", line 59, in contours
return get_contours(self.name, self.small, self.value)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/venv/lib/python3.12/site-packages/page_dewarp/contours.py", line 107, in get_contours
contours_out.append(ContourInfo(contour, rect, tight_mask))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/venv/lib/python3.12/site-packages/page_dewarp/contours.py", line 60, in __init__
self.center, self.tangent = blob_mean_and_tangent(contour)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/venv/lib/python3.12/site-packages/page_dewarp/contours.py", line 40, in blob_mean_and_tangent
mean_x = moments["m10"] / area
~~~~~~~~~~~~~~~^~~~~~
ZeroDivisionError: float division by zero
Thanks for including the image, I reproduced this.
I've added test instrumentation (if you run uv sync you'll get development dependencies now, equivalent to uv pip install pysnooper, or, if not using uv, pip install pysnooper).
It arises from an all-zero moments dict:
14:41:36.505573 call 32 def blob_mean_and_tangent(contour):
14:41:36.505844 line 39 moments = cv2_moments(contour)
New var:....... moments = {'m00': 0.0, 'm10': 0.0, 'm01': 0.0, 'm20': 0.0,...u30': 0.0, 'nu21': 0.0, 'nu12': 0.0, 'nu03': 0.0}
14:41:36.506094 line 40 area = moments["m00"]
New var:....... area = 0.0
14:41:36.506349 line 41 mean_x = moments["m10"] / area
14:41:36.506605 exception 41 mean_x = moments["m10"] / area
Exception:..... ZeroDivisionError: float division by zero
so we can set a breakpoint on the bug with:
area = moments["m00"]
if not area:
    breakpoint()
Then in PDB we can pprint the moments, which are all zero:
(Pdb) pp moments
{'m00': 0.0,
'm01': 0.0,
'm02': 0.0,
'm03': 0.0,
'm10': 0.0,
'm11': 0.0,
'm12': 0.0,
'm20': 0.0,
'm21': 0.0,
'm30': 0.0,
'mu02': 0.0,
'mu03': 0.0,
'mu11': 0.0,
'mu12': 0.0,
'mu20': 0.0,
'mu21': 0.0,
'mu30': 0.0,
'nu02': 0.0,
'nu03': 0.0,
'nu11': 0.0,
'nu12': 0.0,
'nu20': 0.0,
'nu21': 0.0,
'nu30': 0.0}
The contour giving rise to this zero moment:
(Pdb) p contour
array([[[279, 351]],
[[278, 352]],
[[277, 352]],
[[276, 352]],
[[275, 352]],
[[274, 352]],
[[273, 352]],
[[272, 352]],
[[271, 352]],
[[270, 352]],
[[269, 352]],
[[268, 352]],
[[267, 352]],
[[266, 352]],
[[265, 352]],
[[266, 352]],
[[267, 352]],
[[268, 352]],
[[269, 352]],
[[270, 352]],
[[271, 352]],
[[272, 352]],
[[273, 352]],
[[274, 352]],
[[275, 352]],
[[276, 352]],
[[277, 352]],
[[278, 352]],
[[279, 351]],
[[280, 351]],
[[281, 351]],
[[282, 351]],
[[283, 351]],
[[284, 351]],
[[285, 351]],
[[286, 351]],
[[287, 351]],
[[288, 351]],
[[289, 351]],
[[290, 351]],
[[291, 351]],
[[290, 351]],
[[289, 351]],
[[288, 351]],
[[287, 351]],
[[286, 351]],
[[285, 351]],
[[284, 351]],
[[283, 351]],
[[282, 351]],
[[281, 351]],
[[280, 351]]], dtype=int32)
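To see why a contour like this yields m00 == 0, note that OpenCV documents contour moments as being computed with Green's formula, so m00 is the signed area enclosed by the outline. A contour that walks out along a line of pixels and then back over the same pixels encloses nothing. The sketch below is a plain-Python shoelace formula, not OpenCV's implementation, and the point lists are small made-up stand-ins for the contours dumped above:

```python
def polygon_area(points):
    """Signed area of a closed polygon via the shoelace formula."""
    twice_area = 0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0
    return twice_area / 2


# Hypothetical miniature of the failing contour: every segment is
# traversed once in each direction, so the contributions cancel.
out_and_back = [
    (3, 0), (2, 1), (1, 1), (0, 1), (1, 1),
    (2, 1), (3, 0), (4, 0), (5, 0), (4, 0),
]

# A 2x1 rectangle for comparison: it encloses a real region.
rectangle = [(0, 0), (2, 0), (2, 1), (0, 1)]

print(polygon_area(out_and_back))
print(polygon_area(rectangle))
```

The first call gives zero area and the second gives a non-zero area, which mirrors the difference between the failing contour and a regular one.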
If we do the opposite, and set a breakpoint on a regular (truthy) area with if area: breakpoint() (giving area = 3.0), we get a regular contour:
(Pdb) pp contour
array([[[ 98, 405]],
[[ 97, 406]],
[[ 96, 406]],
[[ 95, 406]],
[[ 94, 406]],
[[ 93, 407]],
[[ 92, 407]],
[[ 91, 407]],
[[ 90, 407]],
[[ 89, 407]],
[[ 88, 407]],
[[ 87, 407]],
[[ 86, 407]],
[[ 85, 407]],
[[ 84, 407]],
[[ 83, 407]],
[[ 82, 407]],
[[ 81, 407]],
[[ 80, 407]],
[[ 79, 407]],
[[ 78, 407]],
[[ 77, 407]],
[[ 78, 407]],
[[ 79, 407]],
[[ 80, 407]],
[[ 81, 407]],
[[ 82, 407]],
[[ 83, 407]],
[[ 84, 407]],
[[ 85, 407]],
[[ 86, 407]],
[[ 87, 407]],
[[ 88, 407]],
[[ 89, 407]],
[[ 90, 407]],
[[ 91, 407]],
[[ 92, 407]],
[[ 93, 407]],
[[ 94, 406]],
[[ 95, 406]],
[[ 96, 406]],
[[ 97, 406]],
[[ 98, 406]],
[[ 99, 406]],
[[100, 406]],
[[101, 405]],
[[102, 405]],
[[103, 405]],
[[104, 405]],
[[105, 405]],
[[106, 405]],
[[105, 405]],
[[104, 405]],
[[103, 405]],
[[102, 405]],
[[101, 405]],
[[100, 405]],
[[ 99, 405]]], dtype=int32)
I looked it up and found something that might be informative/similar:
The issue is that cv2.moments() has a bug and OpenCV contours are weird.
I think the underlying issue is that a single bad contour should be omitted rather than spoiling the entire operation. We are not writing an OCR algorithm, only acquiring contours to provide to the dewarping routine: not comprehensively defining all text line contours is not going to harm this overall operation.
As the traceback shows:
File "/tmp/venv/lib/python3.12/site-packages/page_dewarp/image.py", line 55, in __init__
self.contour_list = self.contour_info(text=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/venv/lib/python3.12/site-packages/page_dewarp/image.py", line 155, in contour_info
return mask.contours()
^^^^^^^^^^^^^^^
File "/tmp/venv/lib/python3.12/site-packages/page_dewarp/mask.py", line 59, in contours
return get_contours(self.name, self.small, self.value)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/venv/lib/python3.12/site-packages/page_dewarp/contours.py", line 107, in get_contours
contours_out.append(ContourInfo(contour, rect, tight_mask))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/venv/lib/python3.12/site-packages/page_dewarp/contours.py", line 60, in __init__
self.center, self.tangent = blob_mean_and_tangent(contour)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We are:
- [in image.py] building a contour_list attribute on the WarpedImage object (at init time) using its contour_info() method, with contour type text=True, which is just a wrapper on...
- [in mask.py] converting a Mask to contours via its contours() method, which in turn just passes along to...
- [in contours.py] running the get_contours() function, which uses cv2.findContours() and then postprocesses the contours received to drop any unsuitable ones before appending to the contours_out variable, which ends up going back to the WarpedImage.contour_list
This last step is where the fix seems most suitable:
- We are already looping over a contour list and skipping 'bad eggs' with continue
- The simplest solution is to try to append, and pass on the errors in the except block (rather than halt the entire program)
We could also do it without side effects (i.e. without allowing an error to be raised), by checking for zero area and returning some sentinel value like (None, None), which we then check for in get_contours, but catching the ZeroDivisionError seems proactive and clear enough.
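The two options can be sketched side by side. This is a minimal illustration rather than the actual patch: blob_center, blob_center_or_none, and the two collector functions are hypothetical stand-ins for blob_mean_and_tangent and the loop in get_contours, and the moments dicts are hand-written rather than produced by cv2.moments().

```python
def blob_center(moments):
    """Centroid from a moments dict; raises ZeroDivisionError if m00 is 0."""
    area = moments["m00"]
    return moments["m10"] / area, moments["m01"] / area


def centers_try_except(moments_list):
    """Option 1: let the error be raised and skip the offending contour."""
    centers = []
    for moments in moments_list:
        try:
            centers.append(blob_center(moments))
        except ZeroDivisionError:
            continue  # drop the degenerate contour, keep processing the rest
    return centers


def blob_center_or_none(moments):
    """Option 2 helper: return a (None, None) sentinel instead of raising."""
    area = moments["m00"]
    if not area:
        return None, None
    return moments["m10"] / area, moments["m01"] / area


def centers_sentinel(moments_list):
    """Option 2: check for the sentinel in the calling loop."""
    centers = []
    for moments in moments_list:
        center = blob_center_or_none(moments)
        if center == (None, None):
            continue
        centers.append(center)
    return centers


sample = [
    {"m00": 3.0, "m10": 6.0, "m01": 9.0},  # regular contour
    {"m00": 0.0, "m10": 0.0, "m01": 0.0},  # zero-area contour
]
print(centers_try_except(sample))
print(centers_sentinel(sample))
```

Both variants keep only the first contour's center. The try/except form leaves the helper's signature untouched, while the sentinel form makes the "no usable center" case explicit to every caller.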
I ran it after allowing those errors to be suppressed (and their contours dropped) and the result is pretty poor.
I suspect that the error is a hint that you may want to use a larger input image, as then the contours will be detected by OpenCV, and you'll get more accurate text line angles and subsequent dewarping. I expect this result would be unusable.
Perhaps a better approach would be to print a message to the user when skipping these zero-moment contours, suggesting a larger image (or perhaps sampling the image more densely; I forget which parameters do this, but I think there's a setting for it).
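One way such a message could be emitted is a single summary warning after filtering. This is only a sketch: keep_nonzero_area is a hypothetical helper that works on a plain list of areas, and the warning text is illustrative, not what page-dewarp prints.

```python
import warnings


def keep_nonzero_area(areas):
    """Return the non-zero areas, warning once if any were skipped."""
    kept = [area for area in areas if area]
    skipped = len(areas) - len(kept)
    if skipped:
        warnings.warn(
            f"Skipped {skipped} zero-area contour(s); "
            "a larger input image may help.",
            stacklevel=2,
        )
    return kept
```

Warning once per image, rather than once per contour, keeps the output readable when many contours are degenerate.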
Ah, I see the problem: the left margin is not being included! If you run the program with the -d (debug level) flag set to 3 you will produce all intermediate steps for debugging:
zerodivisionerror_image_debug_4_keypoints_before.png
zerodivisionerror_image_debug_5_keypoints_after.png
That explains why enlarging the image does not resolve the issue (I ran it again on the enlarged one to confirm):
zerodivisionerror_image_2x_enlarged_debug_4_keypoints_before.png
zerodivisionerror_image_2x_enlarged_debug_5_keypoints_after.png
The assumption of this program is that you will be supplying a page with a margin, which in this case is a suitable assumption for the right side but not the left.
The flag for this is -x (see page-dewarp --help for all flags; I just refactored these to use defaults from this module).
In this case my intuition that the small image was too small seems to be correct: the "keypoints" (i.e. samples taken from the text line contours whose detection we were debugging above) are notably sparse and irregular when the program is run with -x 0 (no left/right page margin).
zerodivisionerror_image_debug_4_keypoints_before (with the -x 0 flag)
...resulting in another poor result:
If I use the enlarged version it works nicely:
zerodivisionerror_image_2x_enlarged_debug_4_keypoints_before.png
zerodivisionerror_image_2x_enlarged_debug_5_keypoints_after.png
...leading to a nice result:
zerodivisionerror_image_2x_enlarged_debug_6_output.png