Mellanox/nv_peer_memory

Errors when installing nvidia_peer_memory-1.0-8.x86_64.rpm

lytofd opened this issue · 3 comments

Hi,
I hit an error when installing nv_peer_mem on a node, using this package:
nvidia_peer_memory-1.0-8.x86_64.rpm
[centos@gpu x86_64]$ sudo rpm -ivh nvidia_peer_memory-1.0-8.x86_64.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:nvidia_peer_memory-1.0-8         ################################# [100%]
depmod: ERROR: fstatat(4, nvidia-uvm.ko.xz): No such file or directory
depmod: ERROR: fstatat(4, nvidia.ko.xz): No such file or directory
depmod: ERROR: fstatat(4, nvidia-modeset.ko.xz): No such file or directory
This is the newest version of nv_peer_mem; my kernel version is 3.10.0-957.27.2.el7.x86_64.
After that, I tried an older version of nv_peer_mem on another node and it succeeded; that node's kernel version is 3.10.0-957.12.2.el7.x86_64.
Both nodes have CUDA 10.1 installed.
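
One way to narrow down the depmod errors above is to check whether the three NVIDIA modules it names actually resolve to files under the running kernel's module tree. The following is only a sketch: the helper name missing_modules is made up for illustration, and it assumes the modules would live somewhere under /lib/modules/$(uname -r).

```shell
#!/bin/sh
# Hypothetical helper (not part of nv_peer_mem): print every module name
# that has no regular file named <name>.ko* under the given directory.
# find -L follows symlinks, so a dangling symlink counts as missing.
missing_modules() {
    dir=$1
    shift
    for name in "$@"; do
        found=$(find -L "$dir" -name "${name}.ko*" -type f 2>/dev/null | head -n 1)
        if [ -z "$found" ]; then
            echo "$name"
        fi
    done
}

# Assumed location of the running kernel's modules.
missing_modules "/lib/modules/$(uname -r)" nvidia nvidia-uvm nvidia-modeset
```

If this prints any of the three names, the corresponding module file (or the target of its symlink) is absent for that kernel, which would be consistent with the "No such file or directory" messages.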

Hi,
I guess you tried to recompile "version 1.0.8" from the git sources.

I have an issue as well on 3.10.0-957.el7.x86_64 with CUDA 10.1 / driver 418.67.

I "reverted" the changes in 25774c3:

# grep modules_pat= create_nv.symvers.sh
modules_pat="__crc_nvidia_p2p_|T nvidia_p2p_"
modules_pat="__crc_nvidia_p2p_"
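
To illustrate how the two patterns differ, here is a small sketch. It assumes the pattern is applied as an extended regular expression to an nm-style symbol listing, which I have not verified against the script. The sample lines, the addresses and the helper names sample_symbols and count_matches are invented for the example.

```shell
#!/bin/sh
# Hypothetical nm-style listing; these lines are illustrative only and
# were not taken from a real nvidia.ko.
sample_symbols() {
    printf '%s\n' \
        '0000000000001000 T nvidia_p2p_get_pages' \
        '00000000a1b2c3d4 A __crc_nvidia_p2p_get_pages' \
        '0000000000002000 T nvidia_unrelated_symbol'
}

# Count the sample lines matched by an extended regular expression.
count_matches() {
    sample_symbols | grep -E -c "$1"
}

count_matches "__crc_nvidia_p2p_|T nvidia_p2p_"   # prints 2
count_matches "__crc_nvidia_p2p_"                 # prints 1
```

On this sample, the longer pattern also picks up the exported text symbol itself, while the shorter one only matches the __crc_ entry. A module whose listing has no __crc_ lines would therefore match only the longer pattern.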

Pull request #60 should fix the issue; please give it a try.

Fixed by #60; closing.