ryfeus/lambda-packs

No module named PIL error when using Skimage_numpy package

stasov opened this issue · 10 comments

When I replace the service.py with the following code and try to test skimage with a simple imread,
Lambda returns:
Unable to import module 'service': No module named PIL

# -*- coding: utf-8 -*-
from skimage import io
import urllib

def handler(event, context):
    urllib.urlretrieve("http://image.pbs.org/video-assets/pbs/operation-wild/177014/images/mezzanine_928.jpg.focalcrop.767x431.50.10.jpg", "/tmp/hi.jpg")
    img = io.imread('/tmp/hi.jpg')
    return 0

Are there any working code examples with skimage and the package together?
Any help would be greatly appreciated.

Could I do anything to help debug?
Would greatly appreciate your help with this.

@stasov I have a combination of Pillow and OpenCV at https://github.com/ryfeus/lambda-packs/tree/master/Opencv_pil/source . You can try adding the Pillow libs from there to the skimage package; it may work. I will look into it more in the future.

I tried copying the Pillow libs, but then got errors about other missing libraries such as scipy. I repeated the process of adding each missing library, but quickly went over the 50 MB limit for the Lambda package. Cheers
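One way to shorten that add-one-library-and-redeploy loop is to try every import in a single invocation and collect the failures. The following is only a minimal sketch: `find_missing` and the module list are hypothetical names chosen for illustration, not part of any pack. The imports happen inside the function because a failing import at the top of service.py stops Lambda from loading the handler at all.

```python
# -*- coding: utf-8 -*-
import importlib


def find_missing(module_names):
    """Return (name, error message) pairs for modules that fail to import."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            # Record the failure and keep going, so one run lists every gap.
            missing.append((name, str(exc)))
    return missing


def handler(event, context):
    # Assumed list of modules to probe; adjust it to what the code imports.
    report = find_missing(["PIL", "numpy", "scipy", "skimage.io"])
    print(report)
    return report
```

An empty list means every probed module imported cleanly. Note that this only reports the first missing dependency inside each probed module, so a second run may still be needed after adding a library.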

+1 to this. Skimage isn't useful unless it can open images!

@ryfeus I'm happy to help debug this as well--it's killing me. Which steps did you follow to build the skimage-numpy pack? Maybe one of these?

@stasov Have you tried uploading the zip to S3? The 50 MB limit does not apply in that case.

@stasov @MattFerraro solved the issue. The new pack is here:
https://github.com/ryfeus/lambda-packs/blob/master/Skimage_numpy/Pack.zip

The pack now contains skimage, scipy, PIL, and numpy.

The test code is as follows:

# -*- coding: utf-8 -*-
from skimage import io
import urllib

def handler(event, context):
    urllib.urlretrieve("https://upload.wikimedia.org/wikipedia/commons/3/38/JPEG_example_JPG_RIP_001.jpg", "/tmp/hi.jpg")
    img = io.imread('/tmp/hi.jpg')
    print(img)
    return 0

Let me know if it works.

Excellent, thank you @ryfeus! I confirmed that with the new pack, the example code you cited runs correctly.

However, lots of other imports still break:

# -*- coding: utf-8 -*-
from skimage import io
import urllib

def handler(event, context):
    import skimage.segmentation as segmentation # Crashes loudly
    return 0

When run as a lambda, the error produced looks like this:

No module named matplotlib: ImportError
Traceback (most recent call last):
  File "/var/task/service.py", line 7, in handler
    import skimage.segmentation as segmentation
  File "/var/task/skimage/segmentation/__init__.py", line 6, in <module>
    from .boundaries import find_boundaries, mark_boundaries
  File "/var/task/skimage/segmentation/boundaries.py", line 5, in <module>
    from ..morphology import dilation, erosion, square
  File "/var/task/skimage/morphology/__init__.py", line 1, in <module>
    from .binary import (binary_erosion, binary_dilation, binary_opening,
  File "/var/task/skimage/morphology/binary.py", line 6, in <module>
    from .misc import default_selem
  File "/var/task/skimage/morphology/misc.py", line 5, in <module>
    from .selem import _default_selem
  File "/var/task/skimage/morphology/selem.py", line 3, in <module>
    from .. import draw
  File "/var/task/skimage/draw/__init__.py", line 1, in <module>
    from .draw import circle, ellipse, polygon_perimeter, set_color
  File "/var/task/skimage/draw/draw.py", line 6, in <module>
    from .._shared._geometry import polygon_clip
  File "/var/task/skimage/_shared/_geometry.py", line 4, in <module>
    from matplotlib import _path, path, transforms
ImportError: No module named matplotlib

segmentation may seem like a strange one-off, but inspecting the error shows that all drawing functions rely on the matplotlib import in _geometry.py, and consequently so do all morphology operations. In fact, a large part of skimage depends on this import. So maybe adding matplotlib is the last tweak needed to make this fully useful?

Actually, there may be a way to do this without matplotlib!

If we look at the current master of _geometry.py (here), we see that matplotlib is imported only inside polygon_clip(), so it is loaded only when that function is called. In the version present in this pack, matplotlib is imported directly at the top of the file. This suggests to me that the version of skimage used to create the pack is just out of date!

Maybe rebuilding with recent sources is all that is necessary to fix this? The relevant change seems to date from April 6, 2017.
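To illustrate why moving the import matters, here is a minimal sketch of the deferred-import pattern. It is not the skimage source: `bounding_box`, `clip_polygon`, and the `module_name` parameter are hypothetical stand-ins. Because the heavy dependency is imported inside the function body, the module can be imported without it, and only the one function that needs it fails.

```python
# -*- coding: utf-8 -*-
import importlib


def bounding_box(points):
    """Needs no optional dependency, so it works in a slim package."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def clip_polygon(points, module_name="matplotlib.path"):
    """Hypothetical function that needs a heavy, optional dependency."""
    # Deferred import: this line runs only when clip_polygon is called,
    # so a missing matplotlib cannot break importing this module.
    path = importlib.import_module(module_name)
    return path.Path(points)
```

With a top-level `from matplotlib import path`, even `bounding_box` would be unusable without matplotlib installed. With the deferred form, only `clip_polygon` raises ImportError.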

@MattFerraro The new version of skimage weighs a lot more than the old one, so I just changed the _geometry.py file:
https://github.com/ryfeus/lambda-packs/blob/master/Skimage_numpy/Pack_nomatplotlib.zip

This code works on my AWS Lambda:

# -*- coding: utf-8 -*-
from skimage import io
import urllib
import skimage.segmentation as segmentation

def handler(event, context):
    urllib.urlretrieve("https://upload.wikimedia.org/wikipedia/commons/3/38/JPEG_example_JPG_RIP_001.jpg", "/tmp/hi.jpg")
    img = io.imread('/tmp/hi.jpg')
    print(img)
    return 0

P.S. AWS enforces a hard limit of 50 MB on the Lambda archive (regardless of S3). That is why it is usually a problem to combine libs like:
scikit-image (compressed) 32 MB
scipy (compressed) 42 MB
numpy (compressed) 15 MB
pillow (compressed) 5 MB
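As a quick worked check of those numbers, the sketch below sums the compressed sizes listed above and compares the total with the 50 MB figure quoted in this thread. The names `LIMIT_MB`, `total_size_mb`, and `fits` are made up for illustration, and the sizes are the approximate values from the list, not measurements.

```python
# -*- coding: utf-8 -*-
LIMIT_MB = 50  # archive limit as quoted in this thread

# Approximate compressed sizes in MB, copied from the list above.
sizes_mb = {"scikit-image": 32, "scipy": 42, "numpy": 15, "pillow": 5}


def total_size_mb(sizes):
    """Sum the compressed sizes of all libraries."""
    return sum(sizes.values())


def fits(sizes, limit_mb=LIMIT_MB):
    """True if the combined size stays within the limit."""
    return total_size_mb(sizes) <= limit_mb


total = total_size_mb(sizes_mb)
print("total: {} MB, over the limit by {} MB".format(total, total - LIMIT_MB))
# prints "total: 94 MB, over the limit by 44 MB"
```

Unmodified, the four libraries add up to almost twice the limit, which is why the pack relies on trimmed builds rather than stock releases.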

@MattFerraro thank you for the input on the issue. Let me know if I can close it.

This works for me now, ready to close!