mchehab/rasdaemon

rasdaemon dimm label format / sysfs content missing

dgcampea opened this issue · 0 comments

Labels don't seem to be correctly applied when using rasdaemon.
If I edit /etc/edac/labels.db and add:

Vendor: ASRock
    Model: X99M Killer
        DDR4_A1: 0.0.0;
        DDR4_B1: 0.0.1;
        DDR4_C1: 0.0.2;
        DDR4_D1: 0.0.3;
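
To make the shape of this entry explicit, here is a minimal sketch that parses it into numeric tuples. It only handles the form shown above (one `Vendor:` line, one `Model:` line, and `label: a.b.c;` lines); `parse_labels` is a made-up helper for illustration, not part of edac-utils or rasdaemon.

```python
SAMPLE = """\
Vendor: ASRock
    Model: X99M Killer
        DDR4_A1: 0.0.0;
        DDR4_B1: 0.0.1;
        DDR4_C1: 0.0.2;
        DDR4_D1: 0.0.3;
"""


def parse_labels(text):
    """Return {(vendor, model): {label: (n, n, ...)}} for entries shaped like SAMPLE."""
    entries = {}
    vendor = model = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        key, value = key.strip(), value.strip()
        if key.lower() == "vendor":
            vendor, model = value, None
        elif key.lower() == "model":
            model = value
        else:
            # "0.0.1;" -> (0, 0, 1); what each number means is up to the tool
            location = tuple(int(n) for n in value.rstrip(";").split("."))
            entries.setdefault((vendor, model), {})[key] = location
    return entries


if __name__ == "__main__":
    for board, labels in parse_labels(SAMPLE).items():
        for label, location in labels.items():
            print(board, label, location)
```

The parser deliberately leaves the numbers uninterpreted, since how each tool maps them onto sysfs is exactly what is in question here.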

edac-ctl seems to detect the correct sysfs dimm directories:

# edac-ctl --print-labels
LOCATION                            CONFIGURED LABEL     SYSFS CONTENTS      
mc0/csrow0/ch0_dimm_label           DDR4_A1              CPU_SrcID#0_Ha#0_Chan#0_DIMM#0
mc0/csrow0/ch1_dimm_label           DDR4_B1              CPU_SrcID#0_Ha#0_Chan#1_DIMM#0
mc0/csrow0/ch2_dimm_label           DDR4_C1              CPU_SrcID#0_Ha#0_Chan#2_DIMM#0
mc0/csrow0/ch3_dimm_label           DDR4_D1              CPU_SrcID#0_Ha#0_Chan#3_DIMM#0

But when I do the same for rasdaemon, adding the same content to /etc/ras/dimm_labels.d/asrock, I get:

# ras-mc-ctl --print-labels
LOCATION                            CONFIGURED LABEL     SYSFS CONTENTS      
mc0 channel 0 slot 0 
              DDR4_A1              CPU_SrcID#0_Ha#0_Chan#0_DIMM#0
                                    DDR4_B1              0:0:1 missing       
                                    DDR4_C1              0:0:2 missing       
                                    DDR4_D1              0:0:3 missing       
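
One way to see which locations actually exist is to read every dimm_location file and print it in dotted form next to its directory. This is only a sketch: it assumes dimm_location holds name/number pairs such as `channel 0 slot 0` (which the first row above suggests), and the helper names are hypothetical. If the other files turn out to read `channel 1 slot 0` and so on, the matching entries would be 0.1.0, 0.2.0 and 0.3.0 rather than 0.0.1, 0.0.2 and 0.0.3, but that is a guess until the files are read.

```python
import glob
import os
import re


def parse_dimm_location(text):
    """Turn text like 'channel 0 slot 0' into (0, 0)."""
    pairs = re.findall(r"([A-Za-z_]+)\s+(\d+)", text)
    return tuple(int(number) for _name, number in pairs)


def mc_index(path):
    """Extract N from the '/mcN/' component of a sysfs path."""
    match = re.search(r"/mc(\d+)/", path)
    return int(match.group(1))


def sysfs_locations(root="/sys/devices/system/edac/mc"):
    """Map (mc, pos...) tuples to the dimm directory they came from."""
    found = {}
    pattern = os.path.join(root, "mc*", "dimm*", "dimm_location")
    for path in sorted(glob.glob(pattern)):
        with open(path) as fh:
            position = parse_dimm_location(fh.read())
        found[(mc_index(path),) + position] = os.path.dirname(path)
    return found


if __name__ == "__main__":
    # Prints nothing on a machine without EDAC sysfs entries.
    for location, directory in sysfs_locations().items():
        print(".".join(str(n) for n in location), directory)
```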

System Info

Motherboard: Fatal1ty X99M Killer
CPU: Intel(R) Xeon(R) CPU E5-2620 v4
rasdaemon version: 0.6.4
Kernel: 5.13.19-200.fc34.x86_64
Distribution: Fedora 34

sysfs entries found by searching for dimm:

# find /sys/ -iname '*dimm*'
/sys/devices/system/edac/mc/mc0/dimm3
/sys/devices/system/edac/mc/mc0/dimm3/dimm_ue_count
/sys/devices/system/edac/mc/mc0/dimm3/dimm_mem_type
/sys/devices/system/edac/mc/mc0/dimm3/dimm_dev_type
/sys/devices/system/edac/mc/mc0/dimm3/dimm_ce_count
/sys/devices/system/edac/mc/mc0/dimm3/dimm_label
/sys/devices/system/edac/mc/mc0/dimm3/dimm_location
/sys/devices/system/edac/mc/mc0/dimm3/dimm_edac_mode
/sys/devices/system/edac/mc/mc0/csrow0/ch2_dimm_label
/sys/devices/system/edac/mc/mc0/csrow0/ch0_dimm_label
/sys/devices/system/edac/mc/mc0/csrow0/ch3_dimm_label
/sys/devices/system/edac/mc/mc0/csrow0/ch1_dimm_label
/sys/devices/system/edac/mc/mc0/dimm6
/sys/devices/system/edac/mc/mc0/dimm6/dimm_ue_count
/sys/devices/system/edac/mc/mc0/dimm6/dimm_mem_type
/sys/devices/system/edac/mc/mc0/dimm6/dimm_dev_type
/sys/devices/system/edac/mc/mc0/dimm6/dimm_ce_count
/sys/devices/system/edac/mc/mc0/dimm6/dimm_label
/sys/devices/system/edac/mc/mc0/dimm6/dimm_location
/sys/devices/system/edac/mc/mc0/dimm6/dimm_edac_mode
/sys/devices/system/edac/mc/mc0/dimm0
/sys/devices/system/edac/mc/mc0/dimm0/dimm_ue_count
/sys/devices/system/edac/mc/mc0/dimm0/dimm_mem_type
/sys/devices/system/edac/mc/mc0/dimm0/dimm_dev_type
/sys/devices/system/edac/mc/mc0/dimm0/dimm_ce_count
/sys/devices/system/edac/mc/mc0/dimm0/dimm_label
/sys/devices/system/edac/mc/mc0/dimm0/dimm_location
/sys/devices/system/edac/mc/mc0/dimm0/dimm_edac_mode
/sys/devices/system/edac/mc/mc0/dimm9
/sys/devices/system/edac/mc/mc0/dimm9/dimm_ue_count
/sys/devices/system/edac/mc/mc0/dimm9/dimm_mem_type
/sys/devices/system/edac/mc/mc0/dimm9/dimm_dev_type
/sys/devices/system/edac/mc/mc0/dimm9/dimm_ce_count
/sys/devices/system/edac/mc/mc0/dimm9/dimm_label
/sys/devices/system/edac/mc/mc0/dimm9/dimm_location
/sys/devices/system/edac/mc/mc0/dimm9/dimm_edac_mode
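
The directory numbers found here (dimm0, dimm3, dimm6, dimm9) would fit one populated slot per channel if the index is computed as channel * slots_per_channel + slot. That formula and the value of 3 slots per channel (taken from the slot0..slot2 columns in the --layout output) are assumptions; the sketch below only shows the arithmetic.

```python
def dimm_index(channel, slot, slots_per_channel=3):
    """Assumed mapping from (channel, slot) to the N in dimmN."""
    return channel * slots_per_channel + slot


def dimm_position(index, slots_per_channel=3):
    """Inverse of dimm_index: the N in dimmN back to (channel, slot)."""
    return divmod(index, slots_per_channel)


if __name__ == "__main__":
    for index in (0, 3, 6, 9):
        channel, slot = dimm_position(index)
        print(f"dimm{index}: channel {channel} slot {slot}")
```

Under this assumption each of the four directories sits in slot 0 of a different channel.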

ras-mc-ctl --layout output:

# ras-mc-ctl --layout
Use of uninitialized value $max_pos[3] in modulus (%) at /usr/sbin/ras-mc-ctl line 868.
Use of uninitialized value $d in numeric ge (>=) at /usr/sbin/ras-mc-ctl line 869.
Use of uninitialized value $d in sprintf at /usr/sbin/ras-mc-ctl line 872.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
    +-----------------------------------------------------------------------------------------------------------------------------------------------+
    |                                                                      mc0                                                                      |
    |             channel0              |             channel1              |             channel2              |             channel3              |
    |   slot0   |   slot1   |   slot2   |   slot0   |   slot1   |   slot2   |   slot0   |   slot1   |   slot2   |   slot0   |   slot1   |   slot2   |
----+-----------------------------------------------------------------------------------------------------------------------------------------------+

0: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
----+-----------------------------------------------------------------------------------------------------------------------------------------------+

ras-mc-ctl --error-count output:

# ras-mc-ctl --error-count
Label                         	CE	UE
CPU_SrcID#0_Ha#0_Chan#0_DIMM#0	0	0
CPU_SrcID#0_Ha#0_Chan#3_DIMM#0	0	0
CPU_SrcID#0_Ha#0_Chan#1_DIMM#0	0	0
CPU_SrcID#0_Ha#0_Chan#2_DIMM#0	0	0
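
For comparison, the same three columns can be read straight from the dimm_label, dimm_ce_count and dimm_ue_count files that show up in the sysfs listing. This is a rough sketch and not necessarily how ras-mc-ctl produces its table; `error_counts` is a hypothetical helper, and it assumes the count files contain a single integer.

```python
import glob
import os


def read_text(path):
    """Read a small sysfs-style file and strip surrounding whitespace."""
    with open(path) as fh:
        return fh.read().strip()


def error_counts(root="/sys/devices/system/edac/mc"):
    """Return a list of (label, ce_count, ue_count), one per dimm directory."""
    rows = []
    for dimm_dir in sorted(glob.glob(os.path.join(root, "mc*", "dimm*"))):
        if not os.path.isdir(dimm_dir):
            continue
        label = read_text(os.path.join(dimm_dir, "dimm_label"))
        ce = int(read_text(os.path.join(dimm_dir, "dimm_ce_count")))
        ue = int(read_text(os.path.join(dimm_dir, "dimm_ue_count")))
        rows.append((label, ce, ue))
    return rows


if __name__ == "__main__":
    print("Label\tCE\tUE")
    for label, ce, ue in error_counts():
        print(f"{label}\t{ce}\t{ue}")
```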

dmidecode -t memory output:

# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 2.8 present.

Handle 0x000E, DMI type 16, 23 bytes
Physical Memory Array
	Location: System Board Or Motherboard
	Use: System Memory
	Error Correction Type: Multi-bit ECC
	Maximum Capacity: 256 GB
	Error Information Handle: Not Provided
	Number Of Devices: 4

Handle 0x0010, DMI type 17, 40 bytes
Memory Device
	Array Handle: 0x000E
	Error Information Handle: Not Provided
	Total Width: 72 bits
	Data Width: 72 bits
	Size: 8 GB
	Form Factor: RIMM
	Set: None
	Locator: DIMM_A1
	Bank Locator: NODE 1
	Type: DDR4
	Type Detail: Synchronous
	Speed: 2133 MT/s
	Manufacturer: Micron
	Serial Number: 1323637C
	Asset Tag: DIMM_A1_AssetTag
	Part Number: 18ASF1G72PZ-2G1B1  
	Rank: 1
	Configured Memory Speed: 2133 MT/s
	Minimum Voltage: Unknown
	Maximum Voltage: Unknown
	Configured Voltage: Unknown

Handle 0x0012, DMI type 17, 40 bytes
Memory Device
	Array Handle: 0x000E
	Error Information Handle: Not Provided
	Total Width: 72 bits
	Data Width: 72 bits
	Size: 8 GB
	Form Factor: RIMM
	Set: None
	Locator: DIMM_B1
	Bank Locator: NODE 1
	Type: DDR4
	Type Detail: Synchronous
	Speed: 2133 MT/s
	Manufacturer: Micron
	Serial Number: 13236327
	Asset Tag: DIMM_B1_AssetTag
	Part Number: 18ASF1G72PZ-2G1B1  
	Rank: 1
	Configured Memory Speed: 2133 MT/s
	Minimum Voltage: Unknown
	Maximum Voltage: Unknown
	Configured Voltage: Unknown

Handle 0x0014, DMI type 17, 40 bytes
Memory Device
	Array Handle: 0x000E
	Error Information Handle: Not Provided
	Total Width: 72 bits
	Data Width: 72 bits
	Size: 8 GB
	Form Factor: RIMM
	Set: None
	Locator: DIMM_C1
	Bank Locator: NODE 1
	Type: DDR4
	Type Detail: Synchronous
	Speed: 2133 MT/s
	Manufacturer: Micron
	Serial Number: 13236324
	Asset Tag: DIMM_C1_AssetTag
	Part Number: 18ASF1G72PZ-2G1B1  
	Rank: 1
	Configured Memory Speed: 2133 MT/s
	Minimum Voltage: Unknown
	Maximum Voltage: Unknown
	Configured Voltage: Unknown

Handle 0x0016, DMI type 17, 40 bytes
Memory Device
	Array Handle: 0x000E
	Error Information Handle: Not Provided
	Total Width: 72 bits
	Data Width: 72 bits
	Size: 8 GB
	Form Factor: RIMM
	Set: None
	Locator: DIMM_D1
	Bank Locator: NODE 1
	Type: DDR4
	Type Detail: Synchronous
	Speed: 2133 MT/s
	Manufacturer: Micron
	Serial Number: 13236332
	Asset Tag: DIMM_D1_AssetTag
	Part Number: 18ASF1G72PZ-2G1B1  
	Rank: 1
	Configured Memory Speed: 2133 MT/s
	Minimum Voltage: Unknown
	Maximum Voltage: Unknown
	Configured Voltage: Unknown