Unknown symbol error in `dpkg -i /tmp/nvidia-peer-memory-dkms_1.0-8_all.deb`
czkkkkkk opened this issue · 2 comments
czkkkkkk commented
Environment
System: Ubuntu 16.04
CUDA version: 10.0
Mellanox ofed: 4.6-1.0.1
$ uname -r
4.4.0-131-generic
$ ls -l /lib/modules
total 16
drwxr-xr-x 7 root root 4096 Nov 26 20:02 4.4.0-131-generic
drwxr-xr-x 3 root root 4096 Jul 31 22:32 4.4.0-21-generic
drwxr-xr-x 3 root root 4096 Jul 31 22:32 4.4.0-64-generic
drwxr-xr-x 3 root root 4096 Jul 31 22:33 4.4.0-66-generic
$ ls -l /usr/src/ofa_kernel/
total 4
drwxr-xr-x 7 root root 4096 Aug 2 02:46 4.4.0-131-generic
lrwxrwxrwx 1 root root 17 Aug 2 02:46 default -> 4.4.0-131-generic
Description
Hi. I tried to install nv_peer_memory. I ran the following commands:
./build_module.sh
cd /tmp
tar xzf /tmp/nvidia-peer-memory_1.0.orig.tar.gz
cd nvidia-peer-memory-1.0
dpkg-buildpackage -us -uc
dpkg -i /tmp/nvidia-peer-memory_1.0-8_all.deb
dpkg -i /tmp/nvidia-peer-memory-dkms_1.0-8_all.deb
It failed when I tried to install the DKMS deb. The full install log is:
$ dpkg -i /tmp/nvidia-peer-memory-dkms_1.0-8_all.deb
(Reading database ... 133469 files and directories currently installed.)
Preparing to unpack .../nvidia-peer-memory-dkms_1.0-8_all.deb ...
------------------------------
Deleting module version: 1.0
completely from the DKMS tree.
------------------------------
Done.
Unpacking nvidia-peer-memory-dkms (1.0-8) over (1.0-8) ...
Setting up nvidia-peer-memory-dkms (1.0-8) ...
Loading new nvidia-peer-memory-1.0 DKMS files...
Building only for 4.4.0-131-generic
Building initial module for 4.4.0-131-generic
Secure Boot not enabled on this system.
Done.
nv_peer_mem:
Running module version sanity check.
- Original module
- No original module exists within this kernel
- Installation
- Installing to /lib/modules/4.4.0-131-generic/updates/dkms/
depmod....
DKMS: install completed.
modprobe: ERROR: could not insert 'nv_peer_mem': Invalid argument
dpkg: error processing package nvidia-peer-memory-dkms (--install):
subprocess installed post-installation script returned error exit status 1
Errors were encountered while processing:
nvidia-peer-memory-dkms
The dmesg errors are:
$ dmesg | grep nv_peer_mem
[1624474.366292] nv_peer_mem: Unknown symbol nvidia_p2p_dma_map_pages (err -22)
[1624474.366314] nv_peer_mem: disagrees about version of symbol nvidia_p2p_free_dma_mapping
[1624474.366316] nv_peer_mem: Unknown symbol nvidia_p2p_free_dma_mapping (err -22)
[1624474.366338] nv_peer_mem: disagrees about version of symbol nvidia_p2p_free_page_table
[1624474.366340] nv_peer_mem: Unknown symbol nvidia_p2p_free_page_table (err -22)
[1633847.270244] nv_peer_mem: disagrees about version of symbol nvidia_p2p_dma_unmap_pages
[1633847.270249] nv_peer_mem: Unknown symbol nvidia_p2p_dma_unmap_pages (err -22)
[1633847.270275] nv_peer_mem: disagrees about version of symbol nvidia_p2p_get_pages
[1633847.270277] nv_peer_mem: Unknown symbol nvidia_p2p_get_pages (err -22)
[1633847.270296] nv_peer_mem: disagrees about version of symbol nvidia_p2p_put_pages
[1633847.270298] nv_peer_mem: Unknown symbol nvidia_p2p_put_pages (err -22)
[1633847.270347] nv_peer_mem: disagrees about version of symbol nvidia_p2p_dma_map_pages
[1633847.270349] nv_peer_mem: Unknown symbol nvidia_p2p_dma_map_pages (err -22)
[1633847.270367] nv_peer_mem: disagrees about version of symbol nvidia_p2p_free_dma_mapping
[1633847.270369] nv_peer_mem: Unknown symbol nvidia_p2p_free_dma_mapping (err -22)
[1633847.270386] nv_peer_mem: disagrees about version of symbol nvidia_p2p_free_page_table
[1633847.270388] nv_peer_mem: Unknown symbol nvidia_p2p_free_page_table (err -22)
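The log above repeats the same few symbols. When "disagrees about version of symbol" is paired with "Unknown symbol", the kernel found the symbol but rejected it. The version checksum recorded in `nv_peer_mem.ko` differs from the one exported by the currently loaded NVIDIA driver module. As a rough sketch for narrowing this down, the distinct unresolved names can be pulled out of the log. The helper name `unresolved_symbols` is made up for illustration and is not part of nv_peer_memory.

```shell
# Hypothetical helper: reads kernel log lines on stdin and prints each
# distinct symbol reported as "Unknown symbol", one per line.
unresolved_symbols() {
    grep -o 'Unknown symbol [A-Za-z0-9_]*' | awk '{ print $3 }' | sort -u
}

# Usage on the affected machine (commented out so the snippet has no side effects):
#   dmesg | grep nv_peer_mem | unresolved_symbols
# Checksums that the built module expects for those symbols:
#   modprobe --dump-modversions /lib/modules/4.4.0-131-generic/updates/dkms/nv_peer_mem.ko | grep nvidia_p2p
```

If the checksums expected by the module do not match what the installed NVIDIA driver exports, the module was probably built against different driver sources or headers than the driver that is actually loaded.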
I checked the following similar issues, but neither appears to be the source of the problem.
- Kernel mismatch. I think in my case I was using the same kernel, `4.4.0-131-generic`, to compile and install. And I think this problem has been fixed.
- Wrong kernel module name. I ran `make` in the `/tmp` dir and it built `nv_peer_mem.ko`, so I think it is not the problem.
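The kernel-mismatch check can be made explicit rather than assumed. A minimal sketch, assuming the built module is visible to `modinfo`: compare the running kernel release with the first field of the module's `vermagic` string. The helper name `vermagic_matches` is hypothetical.

```shell
# Hypothetical helper: succeeds when the first field of a vermagic string
# equals the given kernel release.
#   $1: kernel release, e.g. the output of `uname -r`
#   $2: vermagic string, e.g. the output of `modinfo -F vermagic <module>`
vermagic_matches() {
    [ "${2%% *}" = "$1" ]
}

# Usage on the affected machine (commented out so the snippet has no side effects):
#   vermagic_matches "$(uname -r)" "$(modinfo -F vermagic nv_peer_mem)" && echo "same kernel"
```

A match here only rules out a kernel version mismatch. It does not rule out a symbol version mismatch against the NVIDIA driver, which is what the dmesg output points to.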
czkkkkkk commented
I solved this problem by switching to the release branch.