Cannot Build fx2ait with setup.py
ioeddk opened this issue · 3 comments
ioeddk commented
When I try to build fx2ait with setup.py, I get the following error:
-- Added CUDA NVCC flags for: -gencode;arch=compute_86,code=sm_86
CMake Warning at /usr/local/lib/python3.8/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
/usr/local/lib/python3.8/dist-packages/torch/share/cmake/Torch/TorchConfig.cmake:127 (append_torchlib_if_found)
CMakeLists.txt:4 (find_package)
-- Found Torch: /usr/local/lib/python3.8/dist-packages/torch/lib/libtorch.so
-- Configuring done (3.7s)
-- Generating done (0.0s)
-- Build files have been written to: /home/AITemplate/fx2ait/build/temp.linux-x86_64-3.8
---------- Building extensions ----------------------------------------
[ 33%] Building CXX object CMakeFiles/ait_model.dir/fx2ait/csrc/AITModel.cpp.o
[ 66%] Building CXX object CMakeFiles/ait_model.dir/fx2ait/csrc/AITModelImpl.cpp.o
/home/AITemplate/fx2ait/fx2ait/csrc/AITModelImpl.cpp: In member function ‘void torch::aitemplate::AITModelImpl::allocateOutputs(std::vector<c10::intrusive_ptr<c10::StorageImpl> >&, std::vector<AITData>&, std::vector<std::vector<long int> >&, std::vector<long int*>&, const c10::Device&)’:
/home/AITemplate/fx2ait/fx2ait/csrc/AITModelImpl.cpp:328:44: error: ‘struct c10::StorageImpl’ has no member named ‘mutable_data’
328 | ait_outputs.emplace_back(storage_impl->mutable_data(), shape, ait_dtype);
| ^~~~~~~~~~~~
/home/AITemplate/fx2ait/fx2ait/csrc/AITModelImpl.cpp: In function ‘c10::ScalarType torch::aitemplate::{anonymous}::AITemplateDtypeToTorchDtype(AITemplateDtype)’:
/home/AITemplate/fx2ait/fx2ait/csrc/AITModelImpl.cpp:261:1: warning: control reaches end of non-void function [-Wreturn-type]
261 | }
| ^
make[2]: *** [CMakeFiles/ait_model.dir/build.make:90: CMakeFiles/ait_model.dir/fx2ait/csrc/AITModelImpl.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/ait_model.dir/all] Error 2
make: *** [Makefile:91: all] Error 2
Traceback (most recent call last):
File "setup.py", line 101, in <module>
setup(
File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 144, in setup
return distutils.core.setup(**attrs)
File "/usr/lib/python3.8/distutils/core.py", line 148, in setup
dist.run_commands()
File "/usr/lib/python3.8/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/usr/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3/dist-packages/setuptools/command/install.py", line 67, in run
self.do_egg_install()
File "/usr/lib/python3/dist-packages/setuptools/command/install.py", line 109, in do_egg_install
self.run_command('bdist_egg')
File "/usr/lib/python3.8/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3/dist-packages/setuptools/command/bdist_egg.py", line 172, in run
cmd = self.call_command('install_lib', warn_dir=0)
File "/usr/lib/python3/dist-packages/setuptools/command/bdist_egg.py", line 158, in call_command
self.run_command(cmdname)
File "/usr/lib/python3.8/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3/dist-packages/setuptools/command/install_lib.py", line 23, in run
self.build()
File "/usr/lib/python3.8/distutils/command/install_lib.py", line 109, in build
self.run_command('build_ext')
File "/usr/lib/python3.8/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "setup.py", line 82, in run
subprocess.check_call(cmake_cmd, cwd=self.build_temp)
File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--config', 'Release', '--', '-j2']' returned non-zero exit status 2.
This is on an SM86 platform with CUDA 11.6. It fails both on bare metal and in the Docker image. I've also tried an SM75 platform with CUDA 12.0, and it fails with the same error during the CMake build. The error occurs with both setup.py install and setup.py bdist_wheel.
ipiszy commented
cc fx2ait poc @wushirong @frank-wei to take a look.
tomerkeren42 commented
I also encountered this, running on an A10G instance.
ymwangg commented
Updating PyTorch should solve this problem. StorageImpl->mutable_data() was added only recently, in pytorch/pytorch#97647, so older PyTorch installs do not have it.
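To check whether an installed PyTorch is new enough before rebuilding, one option is to look for the member in the headers that ship with the package. The following is a minimal sketch, not part of fx2ait or PyTorch: the helper names are hypothetical, and it assumes the pip-installed layout keeps the header at torch/include/c10/core/StorageImpl.h.

```python
import re
from pathlib import Path


def header_declares(header: Path, member: str) -> bool:
    """Return True if `member` appears as a whole word in the header text."""
    if not header.is_file():
        return False
    text = header.read_text(encoding="utf-8", errors="replace")
    # \b keeps "mutable_data" from matching longer names such as "mutable_data_ptr".
    return re.search(rf"\b{re.escape(member)}\b", text) is not None


def installed_storage_header():
    """Return the assumed path of StorageImpl.h, or None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    # Assumption: headers live under <torch package>/include/c10/core/.
    return Path(torch.__file__).parent / "include" / "c10" / "core" / "StorageImpl.h"


if __name__ == "__main__":
    header = installed_storage_header()
    if header is None:
        print("torch is not installed in this environment")
    else:
        import torch

        found = header_declares(header, "mutable_data")
        status = "found" if found else "NOT found"
        print(f"torch {torch.__version__}: mutable_data {status} in {header}")
```

If the script reports that mutable_data is not found, the build would be expected to fail with the error above until PyTorch is updated. This is a plain text search, so it is only a rough indicator, not a guarantee that the build will succeed.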