facebook/openbmc

Using Tioga Pass Channel onboard shared-NIC causes BMC reset

MennoCB opened this issue · 2 comments

Hi,

I am using the Tioga Pass Channel version from Wiwynn and I am trying to use the on-board Intel I210 with RJ45 connector for management, instead of using NCSI through the Mezzanine card (Mezzanine works just fine).
It seems to detect the NIC, but when I try to configure an IP on it with ifconfig or through /etc/network/interfaces, the BMC just resets/reboots without any message on the console.

[ 18.675494] ftgmac100 ftgmac100.0 eth0: irq 2, mapped at dca40000
[ 18.687826] ftgmac100 ftgmac100.0 eth0: generated random MAC address b6:8b:68:aa:f0:30
[ 18.703743] dev_attr_powerup_prep_host_id registered
[ 18.713748] ncsi_nl_socket_init: created netlink socket
[ 18.724270] ftgmac: aen work queue created for eth0
[ 18.734789] ftgmac100 ftgmac100.1 eth1: irq 3, mapped at dca80000
[ 18.747487] ftgmac100 ftgmac100.1 eth1: generated random MAC address 66:61:4e:a1:8a:9e
[ 18.763413] dev_attr_powerup_prep_host_id registered
[ 18.773425] ftgmac: aen work queue created for eth1

Configuring network interfaces... [ 26.995441] Found NCSI NW Controller at (0, 0)
[ 27.506454] NCSI: Mezz Vendor = Mellanox
[ 28.015603] NCSI: MAC 24:8A:07:3F:96:B0:
[ 28.025810] Using NCSI Network Controller (0, 0)
[ 28.400902] random: nonblocking pool is initialized

This option is described in the Facebook 2S Server Tioga Pass specification, section 9.1:
Option 3: Shared-NIC uses RMII/NCSI interfaces to pass management traffic on data network of Intel® I210-AT. Intel® I210-AT has 10/100/1000 MDI interface to RJ45.

OpenBMC Release fbtp-25474cc

root@bmc-oob:~# fruid-util all

FRU Information : Mother Board
--------------- : ------------------
Chassis Type : Rack Mount Chassis
Chassis Part Number :
Chassis Serial Number :
Chassis Custom Data 1 : CPU:
Chassis Custom Data 2 : CPU:
Board Mfg Date : Wed Jun 6 03:55:00 2018
Board Mfg : Wistron
Board Product : Tioga-Pass Channel
Board Serial : B5501N01000182200010J0A1
Board Part Number : B91.01N10.0001
Board FRU ID : N/A
Product Manufacturer : Wiwynn
Product Name : Tioga-Pass Channel
Product Part Number :
Product Version : FAB1,SA
Product Serial :
Product Asset Tag :
Product FRU ID : N/A
Product Custom Data 1 :
Product Custom Data 2 : PVT
Product Custom Data 3 : 2016-03-24T00:00:00
Failed print FRUID for NIC Mezzanine
Check syslog for errors!
Failed print FRUID for FRU content on the riser slot 2
Check syslog for errors!
Failed print FRUID for FRU content on the riser slot 3
Check syslog for errors!
Failed print FRUID for FRU content on the riser slot 4
Check syslog for errors!

Thanks

Also, after installing the OS, this message sometimes appears on the BMC when I reboot the OS:
[ 2214.445881] ftgmac100 ftgmac100.0 eth0: rx crc err
[ 2214.455650] ftgmac100 ftgmac100.0 eth0: rx runt

After that the BMC is not reachable over NCSI (no reply to ping), and I need to run "mc reset cold" from the OS to make it reboot.

Just built another fbtp and now it's working:

[ 18.888541] ftgmac100 ftgmac100.1 eth1: irq 3, mapped at dca80000
[ 18.901234] ftgmac100 ftgmac100.1 eth1: generated random MAC address e6:71:91:3b:8f:4b
[ 18.927180] ftgmac: aen work queue created for eth1

It still doesn't use it, though: when I set a metric for eth0 and eth1 in /etc/network/interfaces, the metric does not show up in the routing table. But it's one step closer.

The 10G Mezz is eth0 and the on-board 1G is eth1, so it's still not using the 1G RJ45 interface ;-)
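
For anyone checking the same thing: when two default routes exist, the kernel prefers the one with the lowest metric, and a route listed without a "metric" field counts as metric 0. Below is a rough sketch of a helper that reads `ip route show` style output and prints which interface would win. The function name `preferred_default_dev` is made up for illustration, and it assumes `awk` and the `ip` applet are present in the BMC image, which I have not verified for fbtp.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenBMC): print the interface that owns
# the preferred default route, given `ip route show` style text on stdin.
# Default routes without an explicit "metric N" are treated as metric 0.
preferred_default_dev() {
    awk '
        $1 == "default" {
            dev = ""; metric = 0
            # Scan the key/value pairs that follow the word "default".
            for (i = 2; i < NF; i++) {
                if ($i == "dev")    dev = $(i + 1)
                if ($i == "metric") metric = $(i + 1)
            }
            # Keep the route with the lowest metric seen so far.
            if (best == "" || metric + 0 < best_metric) {
                best = dev; best_metric = metric + 0
            }
        }
        END { if (best != "") print best }
    '
}

# On the BMC, assuming the ip applet is available:
#   ip route show | preferred_default_dev
```

If this prints eth0 even though eth1 has the lower metric in /etc/network/interfaces, the metric never reached the routing table, which would match the behaviour described above.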