Mellanox/mstflint

mstconfig fails to query device configuration on ConnectX-3 EN cards

sotiris-bos opened this issue · 5 comments

I have two ConnectX-3 EN single-port SFP+ cards whose configuration I am trying to change. FW 2.42.5000 / FlexBoot 3.4.752

I have used multiple versions of the OFED driver as well as the EN driver with multiple versions of mstflint.

I have tried CentOS 7.6, Fedora 29 and Ubuntu 16.04.

I have reflashed the cards multiple times as well as reset the config. I even flashed them from livefish mode.

Nothing seems to work: I can't query the config of the cards with either mstflint or MFT.

Edit: Others have the same problem:

https://community.mellanox.com/s/question/0D51T00006RVv0vSAD/unable-to-set-mellanox-connectx3-to-ethernet-failed-to-query-device-current-configuration

https://community.mellanox.com/s/question/0D51T00006RVujaSAD/issue-with-connectx3-failed-to-query-device-current-configuration?t=1549276511711

I'm encountering the issue as well, on Debian Bullseye.

I think it's worth mentioning that, since updating the firmware to 2.42.5000, ethtool is unable to retrieve info about the transceiver.

$ uname -r
5.10.0-7-amd64
$ sudo ethtool -m enp4s0
Cannot get module EEPROM information: Input/output error
mlx4_core 0000:04:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) port(1) i2c_addr(50) offset(0) size(2): Response Mad Status(31c) - cable is not connected
$ sudo ethtool -i enp4s0
driver: mlx4_en
version: 4.0-0
firmware-version: 2.42.5000
expansion-rom-version: 
bus-info: 0000:04:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
$ sudo ethtool enp4s0
Settings for enp4s0:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseKX/Full
                                10000baseKR/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes:  1000baseKX/Full
                                10000baseKR/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: 10000Mb/s
        Duplex: Full
        Auto-negotiation: off
        Port: FIBRE
        PHYAD: 0
        Transceiver: internal
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000014 (20)
                               link ifdown
        Link detected: yes

With the previous firmware, supports-eeprom-access was reported as yes, but many of the transceiver values were definitely wrong.
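
For anyone comparing cards or firmware versions, here is a minimal sketch that turns the `ethtool -i` text above into a dictionary so the firmware version and the supports-eeprom-access flag can be checked side by side. The function name `parse_ethtool_info` is made up for this example, and the sample text is copied from the output above rather than read from a live device.

```python
def parse_ethtool_info(text):
    """Parse `ethtool -i` style "key: value" lines into a dict of strings."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as the PCI
        # address "0000:04:00.0" keep their own colons.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


# Sample copied from the `ethtool -i enp4s0` output in this comment.
sample = """driver: mlx4_en
version: 4.0-0
firmware-version: 2.42.5000
expansion-rom-version: 
bus-info: 0000:04:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
"""

info = parse_ethtool_info(sample)
print(info["firmware-version"], info["supports-eeprom-access"])
```

On a real system the text would come from running `ethtool -i <interface>` instead of the hard-coded sample.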

Also, at boot the driver seems to take ~6 seconds to load:

[    3.457298] hid-generic 0003:0416:B23C.0006: input,hidraw5: USB HID v1.10 Keyboard [Gaming Keyboard] on usb-0000:05:00.3-6.4/input1
[    3.464809] hid-generic 0003:0416:B23C.0007: hiddev3,hidraw6: USB HID v1.10 Device [Gaming Keyboard] on usb-0000:05:00.3-6.4/input2
[    9.103498] mlx4_core 0000:04:00.0: DMFS high rate steer mode is: disabled performance optimized steering
[    9.107346] mlx4_core 0000:04:00.0: 31.504 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x4 link)
[    9.182189] pps_core: LinuxPPS API ver. 1 registered
[    9.185302] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[    9.190023] PTP clock support registered
[    9.195434] mlx4_en: Mellanox ConnectX HCA Ethernet driver v4.0-0
[    9.198816] mlx4_en 0000:04:00.0: Activating port:1
[    9.204756] mlx4_en: 0000:04:00.0: Port 1: Using 16 TX rings
[    9.204757] mlx4_en: 0000:04:00.0: Port 1: Using 16 RX rings
[    9.205188] mlx4_en: 0000:04:00.0: Port 1: Initializing port
[    9.205713] mlx4_en 0000:04:00.0: registered PHC clock
[    9.205944] <mlx4_ib> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v4.0-0
[    9.206633] mlx4_core 0000:04:00.0 enp4s0: renamed from eth0
[    9.206875] <mlx4_ib> mlx4_ib_add: counter index 1 for port 1 allocated 1
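
To put a number on that delay, here is a small sketch that finds the widest gap between consecutive kernel log timestamps. The helper name `largest_gap` is hypothetical, and the sample lines are shortened copies of the log above. Note that a gap only shows that nothing was logged in between; on its own it does not prove that mlx4_core was the component that was busy.

```python
import re

# Matches "[    9.103498] message" and captures the timestamp and message.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def largest_gap(lines):
    """Return (gap_seconds, message_before, message_after) for the widest
    gap between consecutive timestamped lines, or None if fewer than two
    timestamped lines are present."""
    prev = None
    best = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue
        cur = (float(match.group(1)), match.group(2))
        if prev is not None:
            gap = cur[0] - prev[0]
            if best is None or gap > best[0]:
                best = (gap, prev[1], cur[1])
        prev = cur
    return best


# Shortened copies of lines from the boot log above.
log = [
    "[    3.457298] hid-generic 0003:0416:B23C.0006: input,hidraw5",
    "[    3.464809] hid-generic 0003:0416:B23C.0007: hiddev3,hidraw6",
    "[    9.103498] mlx4_core 0000:04:00.0: DMFS high rate steer mode is: disabled performance optimized steering",
    "[    9.107346] mlx4_core 0000:04:00.0: 31.504 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x4 link)",
]

gap, before, after = largest_gap(log)
print(f"{gap:.3f} s between '{before[:11]}' and '{after[:9]}'")
```

For this excerpt the widest gap is about 5.6 seconds, between the last hid-generic line and the first mlx4_core line.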

I have the same issue with a ConnectX-3 EN card. Did anyone end up finding a solution?

mstconfig still exhibits the same issue:

$ sudo mstconfig -d 04:00.0 query

Device #1:
----------

Device type:    ConnectX3       
Device:         04:00.0         

Configurations:                              Next Boot
-E- Failed to query device current configuration
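
If you want to check several cards in a script, the failure can be spotted from the text alone. The sketch below pulls out the device type and any line starting with `-E-`, which is the prefix on the error in the output above. The function name `summarize_mstconfig` is made up for this example, and treating every `-E-` line as an error is an assumption based only on this one output.

```python
def summarize_mstconfig(output):
    """Collect the 'Device type' value and any '-E-' lines from
    mstconfig query text."""
    summary = {"device_type": None, "errors": []}
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Device type:"):
            summary["device_type"] = stripped.split(":", 1)[1].strip()
        elif stripped.startswith("-E-"):
            # Assumption: "-E-" marks an error line, as in the output above.
            summary["errors"].append(stripped[3:].strip())
    return summary


# Sample copied from the `mstconfig -d 04:00.0 query` output above.
sample = """
Device #1:
----------

Device type:    ConnectX3
Device:         04:00.0

Configurations:                              Next Boot
-E- Failed to query device current configuration
"""

result = summarize_mstconfig(sample)
print(result["device_type"], result["errors"])
```

An empty `errors` list would mean no `-E-` line was printed. It would not by itself confirm that the query returned a usable configuration.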

I now see a different output with ethtool though:

$ uname -r
5.14.0-2-amd64
$ sudo ethtool -m enp4s0
netlink error: Input/output error
mlx4_core 0000:04:00.0: MLX4_CMD_MAD_IFC Get Module ID attr(ff60) port(1) i2c_addr(50) offset(0) size(1): Response Mad Status(31c) - cable is not connected
$ sudo ethtool -i enp4s0
driver: mlx4_en
version: 4.0-0
firmware-version: 2.42.5000
expansion-rom-version: 
bus-info: 0000:04:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
$ sudo ethtool enp4s0
Settings for enp4s0:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseKX/Full
                                10000baseKR/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes:  1000baseKX/Full
                                10000baseKR/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: 10000Mb/s
        Duplex: Full
        Auto-negotiation: off
        Port: FIBRE
        PHYAD: 0
        Transceiver: internal
        Supports Wake-on: d
        Wake-on: d
        Current message level: 0x00000014 (20)
                               link ifdown
        Link detected: yes
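
To make the difference between the two kernel messages easier to see, here is a rough sketch that splits each MLX4_CMD_MAD_IFC line into named fields and lists the ones that changed between the 5.10 and 5.14 kernels. The regular expression is written against the two lines posted in this thread only, and the field names and the helper `parse_mad_ifc` are my own. Values are kept as strings because the log does not state their radix.

```python
import re

# Pattern written against the two log lines in this thread only.
MAD_IFC = re.compile(
    r"MLX4_CMD_MAD_IFC (?P<op>.+?) attr\((?P<attr>\w+)\) "
    r"port\((?P<port>\w+)\) i2c_addr\((?P<i2c_addr>\w+)\) "
    r"offset\((?P<offset>\w+)\) size\((?P<size>\w+)\): "
    r"Response Mad Status\((?P<status>\w+)\) - (?P<reason>.*)$"
)


def parse_mad_ifc(line):
    """Return the named fields of a MAD_IFC log line, or None if the
    line does not match the pattern."""
    match = MAD_IFC.search(line)
    return match.groupdict() if match else None


# The two messages quoted in this thread (kernel 5.10 and kernel 5.14).
old = parse_mad_ifc(
    "mlx4_core 0000:04:00.0: MLX4_CMD_MAD_IFC Get Module info attr(ff60) "
    "port(1) i2c_addr(50) offset(0) size(2): Response Mad Status(31c) - "
    "cable is not connected"
)
new = parse_mad_ifc(
    "mlx4_core 0000:04:00.0: MLX4_CMD_MAD_IFC Get Module ID attr(ff60) "
    "port(1) i2c_addr(50) offset(0) size(1): Response Mad Status(31c) - "
    "cable is not connected"
)

changed = sorted(key for key in old if old[key] != new[key])
print(changed)
```

Only the operation name and the requested size differ. The status (31c) and the "cable is not connected" reason are the same in both messages.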

Also, the ~6-second delay at boot has been gone for a while now.

Closed due to the time limit. If this still requires addressing, please open a new ticket.