snuspl/nimble

ModuleNotFoundError: No module named 'torch._C'

beomwookang opened this issue · 7 comments

Hi, I'm writing in English in case someone else runs into the same issue.
I built Nimble in a Docker container with the same environment described in the instruction guide, except for cuDNN (7605 in my case).

When I run the provided inference code:

import torch
import torchvision

# Instantiate a PyTorch Module and move it to a GPU
model = torchvision.models.resnet50()
model = model.cuda()
model.eval()

# Prepare a dummy input
input_shape = [1, 3, 224, 224]
dummy_input = torch.randn(*input_shape).cuda()

# Create a Nimble object
nimble_model = torch.cuda.Nimble(model)
nimble_model.prepare(dummy_input, training=False)

# Execute the object
rand_input = torch.rand(*input_shape).cuda()
output = nimble_model(rand_input)

I get the following error:

(nimble) root@d137ad00a74b:/workspace/nimble# python3 installation_test.py 
Traceback (most recent call last):
  File "installation_test.py", line 1, in <module>
    import torch
  File "/workspace/nimble/torch/__init__.py", line 81, in <module>
    from torch._C import *
ModuleNotFoundError: No module named 'torch._C'

At first I thought this happened because I ran the script in nimble/, where another torch folder exists, but I believe I am supposed to run it there because torch.cuda.Nimble lives in that directory.

Could you please clarify how to run the code after installation?

My environment is shown below (Python was launched from the parent directory):
image

Thanks!

Hi @beomwookang, thanks for reaching out!
As you pointed out, the problem occurs because you launched the script from the root directory of nimble, where the torch source directory shadows the installed torch package.
The location of the nimble directory does not matter, because Nimble is already installed in your Python environment (Anaconda in this case) if you followed the installation instructions correctly.
So try again from a different working directory.
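One way to confirm which torch would be picked up, without actually importing it, is to ask Python's import machinery where the name resolves. The sketch below is only illustrative: `locate_module` and `is_loaded_from` are hypothetical helper names, not part of Nimble or PyTorch.

```python
import os
from importlib.machinery import PathFinder


def locate_module(name, search_path=None):
    """Return the file a top-level module would load from, without importing it.

    search_path=None means "search sys.path", as a normal import would.
    """
    spec = PathFinder.find_spec(name, search_path)
    if spec is None or not spec.has_location:
        return None
    return spec.origin


def is_loaded_from(name, directory, search_path=None):
    """Return True if `name` would resolve to a file inside `directory`."""
    origin = locate_module(name, search_path)
    if origin is None:
        return False
    root = os.path.realpath(directory)
    return os.path.realpath(origin).startswith(root + os.sep)


if __name__ == "__main__":
    print("torch resolves to:", locate_module("torch"))
    if is_loaded_from("torch", os.getcwd()):
        print("Warning: torch resolves to the current directory, "
              "not to the installed package.")
```

If the reported path points into the cloned repository rather than into site-packages, the script is being launched from a directory that shadows the installed package.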

Note that Nimble does not support cudnn 7. Please use cudnn 8.

Thanks for your reply @gyeongin !

I forgot to mention that I get an AttributeError saying that torch.cuda has no attribute 'Nimble' if I run the script from a different working directory.
image

Does this mean that the installation has not been completed?
I re-installed Nimble by running "setup.py" in the root directory of nimble with a few flags (the ones mentioned in the instruction guide), but I'm not sure whether it installed successfully, even though I didn't see any errors during installation.
I saved the log; its last lines are shown below:
image

Is there any way to check whether Nimble has been installed successfully?
Also, I replaced cuDNN with version 8002, so I think that should no longer be a problem.

It looks like the build was not successful.
Can you run this command and share the result? You should have a nimble.py file under this path.

$ ls /root/anaconda3/envs/nimble/lib/python3.7/site-packages/torch

Also, it would be great if you could share the full installation log (maybe via Gist?).
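The same check can be scripted so that it does not depend on a hard-coded site-packages path. This is a minimal sketch with hypothetical helper names (`installed_package_dir`, `package_contains`); it should be run from outside the nimble source tree, otherwise it will inspect the source directory instead of the installed package.

```python
import importlib.util
import os


def installed_package_dir(package):
    """Return the directory of a top-level package, without importing it."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    return list(spec.submodule_search_locations)[0]


def package_contains(package, filename):
    """Return True if `filename` exists directly inside the package directory."""
    pkg_dir = installed_package_dir(package)
    return pkg_dir is not None and os.path.isfile(os.path.join(pkg_dir, filename))


if __name__ == "__main__":
    print("torch package dir:", installed_package_dir("torch"))
    print("nimble.py present:", package_contains("torch", "nimble.py"))
```

A missing nimble.py here would be consistent with the AttributeError on torch.cuda.Nimble reported above.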

@gyeongin, sorry for the late follow-up.

As you said, it seems the build did not complete, since I don't see nimble.py under the path you mentioned.

I posted the full installation log at the link below:
https://gist.github.com/beomwookang/c29907777b82e4361ec26fddbfac9d0f
I would really appreciate any comments on it.

I ran into some conflict errors while working on this, so I think I'll need to start over with a new container.
I'll keep you updated if I run into any other issues!

Your log shows only 1569 build tasks, while my build had 2923.
This means you need to clean up your environment (e.g., clone the repository again).
Also, the build command should print a summary of the CMake configuration like this:

-- ******** Summary ********
-- General:
--   CMake version         : 3.14.0
--   CMake command         : /home/gyeongin/anaconda3/envs/nimble/bin/cmake
--   System                : Linux
--   C++ compiler          : /usr/bin/c++
--   C++ compiler id       : GNU
--   C++ compiler version  : 7.5.0
--   BLAS                  : MKL
--   CXX flags             :  -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow
--   Build type            : Release
--   Compile definitions   : TH_BLAS_MKL;ONNX_ML=1;ONNX_NAMESPACE=onnx_torch;MAGMA_V2;IDEEP_USE_MKL;HAVE_MMAP=1;_FILE_OFFSET_BITS=64;HAVE_SHM_OPEN=1;HAVE_SHM_UNLINK=1;HAVE_MALLOC_USABLE_SIZE=1
--   CMAKE_PREFIX_PATH     : /home/gyeongin/anaconda3/envs/nimble;/home/gyeongin/cuda-10.2
--   CMAKE_INSTALL_PREFIX  : /home/gyeongin/workspace/nimble/torch
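The task-count comparison above can be automated when a saved log is available. The sketch below assumes the log contains Ninja-style progress markers of the form `[done/total]` (for example `[12/2923]`); the thread does not show the exact log format, so treat the pattern and the helper name `max_task_total` as assumptions.

```python
import re

# Assumed progress-marker format: "[<finished>/<total>]", as Ninja prints it.
PROGRESS = re.compile(r"\[(\d+)/(\d+)\]")


def max_task_total(log_text):
    """Return the largest task total seen in the log, or None if none is found."""
    totals = [int(match.group(2)) for match in PROGRESS.finditer(log_text)]
    return max(totals) if totals else None


if __name__ == "__main__":
    sample = "[1/1569] Building C object a.o\n[2/1569] Building CXX object b.o\n"
    print("tasks in build:", max_task_total(sample))
```

A total well below the one reported for a clean checkout would suggest that the build reused stale state and should be repeated from a fresh clone.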

If you still have other issues, please let us know.
We can build a docker image or even share the Dockerfile that you can use to build the image.

Do you have a recent Nimble Docker image that I can use?

I haven't tried building a Nimble Docker image yet.
It "should" work if we follow the official instructions from PyTorch (link).
Please refer to issue #15 for further discussion.