Zulko/vapory

Some .ppm files trigger a ValueError exception in ppm_to_numpy()

pklaus opened this issue · 3 comments

I found a bug with the regex search expression in lines 36-40 of vapory/io.py:

        header, width, height, maxval = re.search(
            b"(^P\d\s(?:\s*#.*[\r\n])*"
            b"(\d+)\s(?:\s*#.*[\r\n])*"
            b"(\d+)\s(?:\s*#.*[\r\n])*"
            b"(\d+)\s(?:\s*#.*[\r\n]\s)*)", buffer).groups()

The outer group of this regex should match the header of .ppm files. Sometimes (in the case of my image) it matches not only the header but the full file. When the "header" is then stripped away using offset=len(header) in line 49, no payload is left over and reading the data fails.

Why does this happen? The regex in its current form also tries to capture comments after the last header field, maxval. That appears to be disallowed in the binary P6 PPM format:

Comments can only occur before the last field of the header and only one byte may appear after the last header field, normally a carriage return or line feed.

[source: http://paulbourke.net/dataformats/ppm/]
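To illustrate the over-match, here is a minimal sketch that uses only the `re` module. The 2x1 image and its payload bytes are made up for the demonstration (they are not taken from the attached file), and the pattern is the one quoted above, written as raw byte strings. The payload is chosen so that it starts with `#` and contains a newline followed by a space, which is what the trailing comment group looks for.

```python
import re

# The header pattern quoted above, written with raw byte strings.
ORIGINAL_HEADER = re.compile(
    rb"(^P\d\s(?:\s*#.*[\r\n])*"
    rb"(\d+)\s(?:\s*#.*[\r\n])*"
    rb"(\d+)\s(?:\s*#.*[\r\n])*"
    rb"(\d+)\s(?:\s*#.*[\r\n]\s)*)"
)

# Hypothetical 2x1 image: an 11-byte header followed by 6 payload bytes.
# The payload starts with '#' (0x23) and contains a newline followed by
# a space, so the trailing comment group treats it as a comment line.
true_header = b"P6\n2 1\n255\n"
payload = b"#ab\n c"
buffer = true_header + payload

header, width, height, maxval = ORIGINAL_HEADER.search(buffer).groups()

needed = int(width) * int(height) * 3   # payload bytes for 8-bit RGB
remaining = len(buffer) - len(header)   # bytes left after the matched "header"

print(len(true_header), len(header))    # the matched header is longer than the real one
print(needed, remaining)                # fewer bytes remain than the image needs
```

With fewer bytes remaining than the image needs, reading the pixel data at offset=len(header) cannot succeed, which is consistent with the ValueError reported in this issue.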

Way to reproduce the bug:

  1. Get the attached file problematic_image.ppm.zip and unpack it.
  2. Run the following code:
        from vapory.io import ppm_to_numpy
        with open('problematic_image.ppm', 'rb') as f:
            out = f.read()
        ppm_to_numpy(buffer=out)

This leads to the following stack trace:

  File "/opt/python3.7/site-packages/vapory/vapory.py", line 102, in render
    quality, antialiasing, remove_temp)
  File "/opt/python3.7/site-packages/vapory/io.py", line 117, in render_povstring
    return ppm_to_numpy(buffer=out)
  File "/opt/python3.7/site-packages/vapory/io.py", line 49, in ppm_to_numpy
    offset=len(header))
ValueError: buffer is smaller than requested size

Hi, have you solved the problem? Apparently I get the same error for all my pictures. I might look into a fix.

I think I replaced the regex in vapory/io.py by something like:

        header, width, height, maxval = re.search(
            b"(^P\d\s(?:\s*#.*[\r\n])*"
            b"(\d+)\s(?:\s*#.*[\r\n])*"
            b"(\d+)\s(?:\s*#.*[\r\n])*"
            b"(\d+)[\r\n])", buffer).groups()

Hope it helps.
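As a rough check of the pattern above, the following sketch applies it to a made-up buffer whose payload begins with `#`. The helper name `split_ppm` and the test bytes are hypothetical and not part of vapory; the pattern is the one from this comment, written as raw byte strings.

```python
import re

# Pattern from the comment above: maxval must be followed by one CR or LF.
FIXED_HEADER = re.compile(
    rb"(^P\d\s(?:\s*#.*[\r\n])*"
    rb"(\d+)\s(?:\s*#.*[\r\n])*"
    rb"(\d+)\s(?:\s*#.*[\r\n])*"
    rb"(\d+)[\r\n])"
)

def split_ppm(buffer):
    """Hypothetical helper: return (width, height, maxval, payload)."""
    match = FIXED_HEADER.search(buffer)
    if match is None:
        raise ValueError("not a raw PPM/PGM buffer")
    header, width, height, maxval = match.groups()
    # Everything after the matched header is treated as pixel data.
    return int(width), int(height), int(maxval), buffer[len(header):]

# Made-up 2x1 image with a header comment and a payload that starts with '#'.
data = b"P6\n# made-up comment\n2 1\n255\n#ab\n c"
width, height, maxval, payload = split_ppm(data)
print(width, height, maxval, len(payload))
```

One caveat: this pattern only accepts a carriage return or line feed after maxval. A file that ends its header with a space or tab would not match, so `\s` in place of `[\r\n]` may be a safer choice if such files need to be supported.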

I could not get it to work reliably with reasonable effort, so I wrote my own Python wrapper instead. Thanks for your help anyway!