rasdaemon dimm label format / sysfs content missing
dgcampea opened this issue · 0 comments
Labels don't seem to be applied correctly when using rasdaemon.
If I edit /etc/edac/labels.db and add:
Vendor: ASRock
Model: X99M Killer
DDR4_A1: 0.0.0;
DDR4_B1: 0.0.1;
DDR4_C1: 0.0.2;
DDR4_D1: 0.0.3;
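For reference, the entries above follow a `label: a.b.c;` pattern. Below is a minimal Python sketch of how such a block could be parsed, assuming dot-separated integers per label and `;` as the terminator. The name `parse_labels` is hypothetical and is not part of edac-utils or rasdaemon.

```python
import re

# Matches entries such as "DDR4_B1: 0.0.1;" (label, then dot-separated integers).
ENTRY = re.compile(r"([\w#-]+)\s*:\s*(\d+(?:\.\d+)+)\s*;")

def parse_labels(text):
    """Return {label: (int, ...)} for every 'label: a.b.c;' entry in text.

    Vendor:/Model: lines have no numeric position and no ';', so they are skipped.
    """
    labels = {}
    for name, pos in ENTRY.findall(text):
        labels[name] = tuple(int(p) for p in pos.split("."))
    return labels

SAMPLE = """\
Vendor: ASRock
Model: X99M Killer
DDR4_A1: 0.0.0;
DDR4_B1: 0.0.1;
DDR4_C1: 0.0.2;
DDR4_D1: 0.0.3;
"""

if __name__ == "__main__":
    for name, pos in parse_labels(SAMPLE).items():
        print(name, pos)
```

How the three numbers are interpreted (for example as mc, csrow, channel) is up to the tool that consumes the file; the sketch only extracts them.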
edac-ctl seems to detect the correct sysfs dimm directories:
# edac-ctl --print-labels
LOCATION                    CONFIGURED LABEL  SYSFS CONTENTS
mc0/csrow0/ch0_dimm_label   DDR4_A1           CPU_SrcID#0_Ha#0_Chan#0_DIMM#0
mc0/csrow0/ch1_dimm_label   DDR4_B1           CPU_SrcID#0_Ha#0_Chan#1_DIMM#0
mc0/csrow0/ch2_dimm_label   DDR4_C1           CPU_SrcID#0_Ha#0_Chan#2_DIMM#0
mc0/csrow0/ch3_dimm_label   DDR4_D1           CPU_SrcID#0_Ha#0_Chan#3_DIMM#0
But when I do the same for rasdaemon, adding the same content to /etc/ras/dimm_labels.d/asrock, I get:
# ras-mc-ctl --print-labels
LOCATION               CONFIGURED LABEL  SYSFS CONTENTS
mc0 channel 0 slot 0   DDR4_A1           CPU_SrcID#0_Ha#0_Chan#0_DIMM#0
                       DDR4_B1           0:0:1 missing
                       DDR4_C1           0:0:2 missing
                       DDR4_D1           0:0:3 missing
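One possible reading of this output, offered as an assumption rather than a confirmed diagnosis: `ras-mc-ctl` reports the one matching location as "mc0 channel 0 slot 0", so a position such as 0.0.1 might be looked up as mc0, channel 0, slot 1 rather than csrow 0, channel 1. The Python sketch below illustrates that kind of lookup on made-up data. `find_missing` and the `present` set are hypothetical and do not come from rasdaemon.

```python
def find_missing(configured, present):
    """Split configured labels into those whose position exists and those that do not.

    configured: {label: (mc, channel, slot)}  (this interpretation is an assumption)
    present:    set of (mc, channel, slot) tuples that actually hold a DIMM
    """
    found, missing = {}, {}
    for label, pos in configured.items():
        (found if pos in present else missing)[label] = pos
    return found, missing

# Positions as written in the label file from this report.
configured = {
    "DDR4_A1": (0, 0, 0),
    "DDR4_B1": (0, 0, 1),
    "DDR4_C1": (0, 0, 2),
    "DDR4_D1": (0, 0, 3),
}
# Hypothetical population: one DIMM in slot 0 of each of four channels.
present = {(0, ch, 0) for ch in range(4)}

found, missing = find_missing(configured, present)

if __name__ == "__main__":
    for label, pos in missing.items():
        print(label, ":".join(map(str, pos)), "missing")
```

Under that assumed population only DDR4_A1 resolves, which resembles the pattern reported above. Whether this is what actually happens on this board is not established by the output alone.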
System Info
Motherboard: Fatal1ty X99M Killer
CPU: Intel(R) Xeon(R) CPU E5-2620 v4
rasdaemon version: 0.6.4
Kernel: 5.13.19-200.fc34.x86_64
Distribution: Fedora 34
sysfs contents found by searching for dimm:
# find /sys/ -iname '*dimm*'
/sys/devices/system/edac/mc/mc0/dimm3
/sys/devices/system/edac/mc/mc0/dimm3/dimm_ue_count
/sys/devices/system/edac/mc/mc0/dimm3/dimm_mem_type
/sys/devices/system/edac/mc/mc0/dimm3/dimm_dev_type
/sys/devices/system/edac/mc/mc0/dimm3/dimm_ce_count
/sys/devices/system/edac/mc/mc0/dimm3/dimm_label
/sys/devices/system/edac/mc/mc0/dimm3/dimm_location
/sys/devices/system/edac/mc/mc0/dimm3/dimm_edac_mode
/sys/devices/system/edac/mc/mc0/csrow0/ch2_dimm_label
/sys/devices/system/edac/mc/mc0/csrow0/ch0_dimm_label
/sys/devices/system/edac/mc/mc0/csrow0/ch3_dimm_label
/sys/devices/system/edac/mc/mc0/csrow0/ch1_dimm_label
/sys/devices/system/edac/mc/mc0/dimm6
/sys/devices/system/edac/mc/mc0/dimm6/dimm_ue_count
/sys/devices/system/edac/mc/mc0/dimm6/dimm_mem_type
/sys/devices/system/edac/mc/mc0/dimm6/dimm_dev_type
/sys/devices/system/edac/mc/mc0/dimm6/dimm_ce_count
/sys/devices/system/edac/mc/mc0/dimm6/dimm_label
/sys/devices/system/edac/mc/mc0/dimm6/dimm_location
/sys/devices/system/edac/mc/mc0/dimm6/dimm_edac_mode
/sys/devices/system/edac/mc/mc0/dimm0
/sys/devices/system/edac/mc/mc0/dimm0/dimm_ue_count
/sys/devices/system/edac/mc/mc0/dimm0/dimm_mem_type
/sys/devices/system/edac/mc/mc0/dimm0/dimm_dev_type
/sys/devices/system/edac/mc/mc0/dimm0/dimm_ce_count
/sys/devices/system/edac/mc/mc0/dimm0/dimm_label
/sys/devices/system/edac/mc/mc0/dimm0/dimm_location
/sys/devices/system/edac/mc/mc0/dimm0/dimm_edac_mode
/sys/devices/system/edac/mc/mc0/dimm9
/sys/devices/system/edac/mc/mc0/dimm9/dimm_ue_count
/sys/devices/system/edac/mc/mc0/dimm9/dimm_mem_type
/sys/devices/system/edac/mc/mc0/dimm9/dimm_dev_type
/sys/devices/system/edac/mc/mc0/dimm9/dimm_ce_count
/sys/devices/system/edac/mc/mc0/dimm9/dimm_label
/sys/devices/system/edac/mc/mc0/dimm9/dimm_location
/sys/devices/system/edac/mc/mc0/dimm9/dimm_edac_mode
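To see what each of these `dimmN` directories actually contains, the `dimm_location` and `dimm_label` files listed above can be read side by side. The following is a minimal Python sketch. The function name `read_dimms` is hypothetical, and the root path is a parameter so the function can be pointed at a copy of the tree.

```python
from pathlib import Path

def read_dimms(edac_root="/sys/devices/system/edac/mc"):
    """Return a sorted list of (mc, dimm, location, label) for each mc*/dimm* dir."""
    rows = []
    for dimm_dir in Path(edac_root).glob("mc*/dimm*"):
        if not dimm_dir.is_dir():
            continue

        def read(name):
            # Missing files are reported as an empty string instead of raising.
            f = dimm_dir / name
            return f.read_text().strip() if f.is_file() else ""

        rows.append((dimm_dir.parent.name, dimm_dir.name,
                     read("dimm_location"), read("dimm_label")))
    return sorted(rows)

if __name__ == "__main__":
    for mc, dimm, location, label in read_dimms():
        print(f"{mc}/{dimm}: location={location!r} label={label!r}")
```

Comparing the printed locations with the positions in the label file shows directly which configured positions have a matching directory.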
ras-mc-ctl --layout
output:
# ras-mc-ctl --layout
Use of uninitialized value $max_pos[3] in modulus (%) at /usr/sbin/ras-mc-ctl line 868.
Use of uninitialized value $d in numeric ge (>=) at /usr/sbin/ras-mc-ctl line 869.
Use of uninitialized value $d in sprintf at /usr/sbin/ras-mc-ctl line 872.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
Use of uninitialized value $pos[3] in join or string at /usr/sbin/ras-mc-ctl line 791.
+-----------------------------------------------------------------------------------------------------------------------------------------------+
| mc0 |
| channel0 | channel1 | channel2 | channel3 |
| slot0 | slot1 | slot2 | slot0 | slot1 | slot2 | slot0 | slot1 | slot2 | slot0 | slot1 | slot2 |
----+-----------------------------------------------------------------------------------------------------------------------------------------------+
0: | 0 MB | 0 MB | 0 MB | 0 MB | 0 MB | 0 MB | 0 MB | 0 MB | 0 MB | 0 MB | 0 MB | 0 MB |
----+-----------------------------------------------------------------------------------------------------------------------------------------------+
ras-mc-ctl --error-count
output:
# ras-mc-ctl --error-count
Label CE UE
CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 0 0
CPU_SrcID#0_Ha#0_Chan#3_DIMM#0 0 0
CPU_SrcID#0_Ha#0_Chan#1_DIMM#0 0 0
CPU_SrcID#0_Ha#0_Chan#2_DIMM#0 0 0
dmidecode -t memory
output:
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 2.8 present.
Handle 0x000E, DMI type 16, 23 bytes
Physical Memory Array
Location: System Board Or Motherboard
Use: System Memory
Error Correction Type: Multi-bit ECC
Maximum Capacity: 256 GB
Error Information Handle: Not Provided
Number Of Devices: 4
Handle 0x0010, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x000E
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 72 bits
Size: 8 GB
Form Factor: RIMM
Set: None
Locator: DIMM_A1
Bank Locator: NODE 1
Type: DDR4
Type Detail: Synchronous
Speed: 2133 MT/s
Manufacturer: Micron
Serial Number: 1323637C
Asset Tag: DIMM_A1_AssetTag
Part Number: 18ASF1G72PZ-2G1B1
Rank: 1
Configured Memory Speed: 2133 MT/s
Minimum Voltage: Unknown
Maximum Voltage: Unknown
Configured Voltage: Unknown
Handle 0x0012, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x000E
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 72 bits
Size: 8 GB
Form Factor: RIMM
Set: None
Locator: DIMM_B1
Bank Locator: NODE 1
Type: DDR4
Type Detail: Synchronous
Speed: 2133 MT/s
Manufacturer: Micron
Serial Number: 13236327
Asset Tag: DIMM_B1_AssetTag
Part Number: 18ASF1G72PZ-2G1B1
Rank: 1
Configured Memory Speed: 2133 MT/s
Minimum Voltage: Unknown
Maximum Voltage: Unknown
Configured Voltage: Unknown
Handle 0x0014, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x000E
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 72 bits
Size: 8 GB
Form Factor: RIMM
Set: None
Locator: DIMM_C1
Bank Locator: NODE 1
Type: DDR4
Type Detail: Synchronous
Speed: 2133 MT/s
Manufacturer: Micron
Serial Number: 13236324
Asset Tag: DIMM_C1_AssetTag
Part Number: 18ASF1G72PZ-2G1B1
Rank: 1
Configured Memory Speed: 2133 MT/s
Minimum Voltage: Unknown
Maximum Voltage: Unknown
Configured Voltage: Unknown
Handle 0x0016, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x000E
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 72 bits
Size: 8 GB
Form Factor: RIMM
Set: None
Locator: DIMM_D1
Bank Locator: NODE 1
Type: DDR4
Type Detail: Synchronous
Speed: 2133 MT/s
Manufacturer: Micron
Serial Number: 13236332
Asset Tag: DIMM_D1_AssetTag
Part Number: 18ASF1G72PZ-2G1B1
Rank: 1
Configured Memory Speed: 2133 MT/s
Minimum Voltage: Unknown
Maximum Voltage: Unknown
Configured Voltage: Unknown
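Note that the firmware reports the locators as DIMM_A1 through DIMM_D1, while the label file above uses DDR4_A1 through DDR4_D1. As a small aid for cross-checking, here is a Python sketch that pulls the `Locator:` values out of `dmidecode -t memory` text. `memory_locators` is a hypothetical helper name, and the sample is a shortened excerpt of the output above.

```python
def memory_locators(dmidecode_text):
    """Return the value of every 'Locator:' field, in order of appearance."""
    locators = []
    for line in dmidecode_text.splitlines():
        key, sep, value = line.strip().partition(":")
        # "Bank Locator" is a different field, so the key must match exactly.
        if sep and key == "Locator":
            locators.append(value.strip())
    return locators

SAMPLE = """\
Handle 0x0010, DMI type 17, 40 bytes
Memory Device
Size: 8 GB
Locator: DIMM_A1
Bank Locator: NODE 1
Handle 0x0012, DMI type 17, 40 bytes
Memory Device
Size: 8 GB
Locator: DIMM_B1
Bank Locator: NODE 1
"""

if __name__ == "__main__":
    print(memory_locators(SAMPLE))
```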