serverless invoke local is failing for package tensorflow
Closed this issue · 6 comments
Hello,
When using tensorflow, serverless invoke local is failing. I think serverless invoke local is broken when using serverless-ephemeral.
$ serverless invoke local --function <my-function> --path test/events/frame.json
Traceback (most recent call last):
File "/home/pierre/.nvm/versions/node/v7.7.3/lib/node_modules/serverless/lib/plugins/aws/invokeLocal/invoke.py", line 57, in <module>
module = import_module(args.handler_path.replace('/', '.'))
File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
File "/home/pierre/dev/lbd-py-tensorflow/custom-function.py", line 5, in <module>
import lib.classify as classify
File "/home/pierre/dev/lbd-py-tensorflow/lib/__init__.py", line 1, in <module>
from .network import Network
File "/home/pierre/dev/lbd-py-tensorflow/lib/network.py", line 4, in <module>
import tensorflow as tf
File "/usr/local/lib/python2.7/dist-packages/tensorflow/__init__.py", line 24, in <module>
from tensorflow.python import *
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 52, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory
Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/install_sources#common_installation_problems
for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
The native runtime is failing to load because libcudnn.so.6 cannot be found.
I will suggest a PR for this too.
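As a quick sanity check (a minimal sketch, independent of serverless), you can ask the dynamic loader directly whether it can resolve the cuDNN library; if ctypes cannot load it either, the problem is the library search path rather than the invoke local plugin:

```python
# Sketch: check whether the dynamic loader can find/load libcudnn.
# "cudnn" / "libcudnn.so.6" are the names taken from the traceback above.
import ctypes
import ctypes.util

name = ctypes.util.find_library("cudnn")  # searches the standard loader paths
print("find_library:", name)

try:
    ctypes.CDLL("libcudnn.so.6")
    print("libcudnn.so.6 loaded OK")
except OSError as err:
    # Same failure mode tensorflow reports when LD_LIBRARY_PATH is not set
    print("load failed:", err)
```

If this fails outside serverless too, the fix belongs in the environment, not the framework.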
Can you provide this info:
- What OS did you use?
- Do you have a use case you can share or a way to use the available examples to triage this?
I'm trying to reproduce the exact issue locally but I haven't been able to.
I think this issue might be related to my local tensorflow installation.
I'm using:
- Ubuntu 16.04
- a CUDA installation
- an NVIDIA GPU card
- a cuDNN-enabled tensorflow installation
I had to set LD_LIBRARY_PATH to make tensorflow run on my local machine (see details below), following the instructions from https://www.tensorflow.org/install/install_linux:
NVIDIA requirements to run TensorFlow with GPU support
If you are installing TensorFlow with GPU support using one of the mechanisms described in this guide, then the following NVIDIA software must be installed on your system:
- CUDA® Toolkit 8.0. For details, see NVIDIA's documentation. Ensure that you append the relevant Cuda pathnames to the LD_LIBRARY_PATH environment variable as described in the NVIDIA documentation.
I'm not sure whether every tensorflow installation needs this LD_LIBRARY_PATH setup.
@alexleonescalera did you add specific values to LD_LIBRARY_PATH in your local environment to make tensorflow run? Or does it run without it?
More info about my environment
nvidia-smi
➜ $ nvidia-smi
Wed Feb 7 09:20:17 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.111 Driver Version: 384.111 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 970 Off | 00000000:01:00.0 On | N/A |
| 0% 39C P8 17W / 200W | 916MiB / 4030MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1326 G /usr/lib/xorg/Xorg 513MiB |
| 0 2254 G compiz 62MiB |
| 0 3104 G ...-token=<token> 291MiB |
| 0 17160 G ...passed-by-fd --v8-snapshot-passed-by-fd 43MiB |
+-----------------------------------------------------------------------------+
LD_LIBRARY_PATH
$ echo $LD_LIBRARY_PATH
:/usr/local/cuda/lib64:/home/pierre/dev/cudnn/cuda/lib64
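For reference, a minimal sketch of that setup, assuming a default CUDA install under /usr/local/cuda and cuDNN unpacked in a user directory (both paths must be adjusted to your machine):

```shell
# Append the CUDA and cuDNN library directories to the loader search path.
# Both paths are examples; point them at your actual install locations.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/local/cuda/lib64:${HOME}/dev/cudnn/cuda/lib64"
echo "$LD_LIBRARY_PATH"
```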
Unfortunately, due to time constraints, we didn't go too far with TensorFlow. As you can see, the only packager available is for CPU and Python 2.7. Added to that, our tests were run directly on AWS, not locally. Thus, this specific scenario needs more exploration to come to a more generic solution.
Meanwhile, referring to https://github.com/serverless/serverless/blob/4b71faf2128308894646940ce2fb64e826450972/lib/plugins/aws/invokeLocal/index.js#L93, the lambdaDefaultEnvVars are merged with providerEnvVars and functionEnvVars. Can you try setting LD_LIBRARY_PATH in your serverless.yml, in either the provider or the function env vars?
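For example, something along these lines (a sketch only; the function name is hypothetical and the library paths are placeholders that would need to match your local CUDA/cuDNN install):

```yaml
# serverless.yml — sketch of setting LD_LIBRARY_PATH for invoke local.
provider:
  name: aws
  runtime: python2.7
  environment:                      # providerEnvVars: shared by all functions
    LD_LIBRARY_PATH: /usr/local/cuda/lib64

functions:
  classify:                         # hypothetical function name
    handler: custom-function.handler
    environment:                    # functionEnvVars: per-function override
      LD_LIBRARY_PATH: /usr/local/cuda/lib64:/home/pierre/dev/cudnn/cuda/lib64
```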
@alexleonescalera thank you for your support :-)
If you do not face this issue, I understand you may not want to merge this.
Can you try setting the LD_LIBRARY_PATH in your serverless.yml in either the provider or the function env vars?
It will work locally, but it may break the deployed version (which is worse than the current situation :-( ) by overriding the amazonlinux LD_LIBRARY_PATH environment variable on the server, which creates side problems.
I have created a separate plugin: https://github.com/piercus/serverless-local-environment.
I hope we'll be able to discuss other issues soon :-)
Thank you for your feedback.
The separate plugin looks like a better approach since you are addressing an issue that comes from the Serverless core code. I would recommend contacting the Serverless team about this and requesting that they add your plugin to their list: https://github.com/serverless/plugins
I can see the benefit of flexible settings when running locally vs deployed, so your plugin might be a solution for other people as well.
Thanks for your collaboration.
Yes, thank you for your advice. I'm waiting for them to accept my PR on serverless/plugins#129.