Mellanox/mstflint

mlxreg: can't write large registers

kcgthb opened this issue · 9 comments

Hi!

I'm trying to use mlxreg --set to update a switch's node description, but the register appears to be larger than the maximum register size that can be sent in-band.

Specifically, I can read the SPZR register just fine:

# mlxreg -d /dev/mst/SW_MT54000_sh03-isw-c03_lid-0x0010 --reg_name SPZR --get --indexes "swid=0x0"
Sending access register...

Field Name              | Data
=====================================
enh_sw_p0               | 0x00000000
ng                      | 0x00000000
sig                     | 0x00000000
mp                      | 0x00000000
vk                      | 0x00000000
cm                      | 0x00000000
enh_sw_p0_mask          | 0x00000000
ndm                     | 0x00000000
cm2                     | 0x00000000
swid                    | 0x00000000
capability_mask         | 0x4450c848
system_image_guid_h     | 0x1c34da03
system_image_guid_l     | 0x004f6244
node_guid_h             | 0x1c34da03
node_guid_l             | 0x004f6244
v_key_h                 | 0x00000000
v_key_l                 | 0x00000000
capability_mask2        | 0x0000003b
max_pkey                | 0x00000008
node_description[0]     | 0x73683033
node_description[1]     | 0x2d697377
node_description[2]     | 0x2d633033
node_description[3]     | 0x00000000
node_description[4]     | 0x00000000
node_description[5]     | 0x00000000
node_description[6]     | 0x00000000
node_description[7]     | 0x00000000
node_description[8]     | 0x00000000
node_description[9]     | 0x00000000
node_description[10]    | 0x00000000
node_description[11]    | 0x00000000
node_description[12]    | 0x00000000
node_description[13]    | 0x00000000
node_description[14]    | 0x00000000
node_description[15]    | 0x00000000
=====================================
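
As a side note, the node_description words in this dump look like the description string packed as ASCII, four characters per 32-bit word, most significant byte first. Below is a minimal Python sketch of that decoding and of the reverse encoding. The helper names are made up for illustration and are not part of mlxreg or MFT, and the byte order is only inferred from the output above.

```python
import struct

def decode_node_description(words):
    """Join 32-bit words (big-endian) into a string, dropping NUL padding."""
    raw = b"".join(struct.pack(">I", w) for w in words)
    return raw.rstrip(b"\x00").decode("ascii")

def encode_node_description(text, n_words=16):
    """Pack an ASCII string into n_words 32-bit big-endian words, NUL-padded."""
    raw = text.encode("ascii")
    if len(raw) > 4 * n_words:
        raise ValueError("description does not fit in %d words" % n_words)
    raw = raw.ljust(4 * n_words, b"\x00")
    return list(struct.unpack(">%dI" % n_words, raw))

# Words copied from the SPZR dump above (assumption: big-endian ASCII packing).
words = [0x73683033, 0x2d697377, 0x2d633033] + [0] * 13
print(decode_node_description(words))          # sh03-isw-c03
print(hex(encode_node_description("test")[0])) # 0x74657374
```

The decoded string matches the switch name in the device path, which suggests the packing assumption holds for this register.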

But setting a value for node_description[0] fails with -E- Failed to send access register: Register is too large:

# mlxreg -d /dev/mst/SW_MT54000_sh03-isw-c03_lid-0x0010 --reg_name SPZR --set "ndm=0x1,node_description[0]=0x74657374" --indexes "swid=0x0"
You are about to send access register: SPZR with the following data:
Field Name              | Data    
=====================================
enh_sw_p0               | 0x00000000
ng                      | 0x00000000
sig                     | 0x00000000
mp                      | 0x00000000
vk                      | 0x00000000
cm                      | 0x00000000
enh_sw_p0_mask          | 0x00000000
ndm                     | 0x00000001
cm2                     | 0x00000000
swid                    | 0x00000000
capability_mask         | 0x4450c848
system_image_guid_h     | 0x1c34da03
system_image_guid_l     | 0x004f6244
node_guid_h             | 0x1c34da03
node_guid_l             | 0x004f6244
v_key_h                 | 0x00000000
v_key_l                 | 0x00000000
capability_mask2        | 0x0000003b
max_pkey                | 0x00000008
node_description[0]     | 0x74657374
node_description[1]     | 0x2d697377
node_description[2]     | 0x2d633033
node_description[3]     | 0x00000000
node_description[4]     | 0x00000000
node_description[5]     | 0x00000000
node_description[6]     | 0x00000000
node_description[7]     | 0x00000000
node_description[8]     | 0x00000000
node_description[9]     | 0x00000000
node_description[10]    | 0x00000000
node_description[11]    | 0x00000000
node_description[12]    | 0x00000000
node_description[13]    | 0x00000000
node_description[14]    | 0x00000000
node_description[15]    | 0x00000000
=====================================

 Do you want to continue ? (y/n) [n] : y
 Sending access register...
-E- Failed to send access register: Register is too large

This seems to be because the SPZR register has a size of 0x70, while the maximum register size allowed in-band is 0x2c (INBAND_MAX_REG_SIZE = 44), as defined here:

#define INBAND_MAX_REG_SIZE 44
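
The arithmetic behind the error can be sketched in a few lines of Python. The 0x70 size and the limit of 44 are the values quoted above. The function name is hypothetical, and the sketch only mirrors the size comparison, not the actual mstflint code path (the real check may differ in detail).

```python
INBAND_MAX_REG_SIZE = 44   # 0x2c, from the #define quoted above
SPZR_REG_SIZE = 0x70       # SPZR register size as reported in this issue

def fits_inband(reg_size, limit=INBAND_MAX_REG_SIZE):
    """Return True if a register of reg_size fits within the in-band limit."""
    return reg_size <= limit

print(SPZR_REG_SIZE, fits_inband(SPZR_REG_SIZE))  # 112 False
```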

The same issue occurs with both mlxreg from MFT and mstreg from mstflint.

What would be the best way around this?
Thanks!

A fix for this issue can be found in the master_devel branch as part of PR#540.
The fix will also be included in the next master release.

The issue doesn't seem to be fixed in MFT 4.20:

# mst version
mst, mft 4.20.0-34, built on Apr 25 2022, 20:48:31. Git SHA Hash: 62bbc33

# mlxreg -d lid-8 --reg_name SPZR --set "ndm=0x1,node_description[0]=0x666f6f6f"  --indexes "swid=0x0"
You are about to send access register: SPZR with the following data:
Field Name              | Data
=====================================
enh_sw_p0               | 0x00000000
ng                      | 0x00000000
sig                     | 0x00000000
mp                      | 0x00000000
vk                      | 0x00000000
cm                      | 0x00000000
enh_sw_p0_mask          | 0x00000000
ndm                     | 0x00000001
cm2                     | 0x00000000
swid                    | 0x00000000
capability_mask         | 0x4450d848
system_image_guid_h     | 0x1c34da03
system_image_guid_l     | 0x004f6244
node_guid_h             | 0x1c34da03
node_guid_l             | 0x004f6244
capability_mask2        | 0x0000003b
max_pkey                | 0x00000008
node_description[0]     | 0x666f6f6f
node_description[1]     | 0x2d697377
node_description[2]     | 0x2d633033
node_description[3]     | 0x00000000
node_description[4]     | 0x00000000
node_description[5]     | 0x00000000
node_description[6]     | 0x00000000
node_description[7]     | 0x00000000
node_description[8]     | 0x00000000
node_description[9]     | 0x00000000
node_description[10]    | 0x00000000
node_description[11]    | 0x00000000
node_description[12]    | 0x00000000
node_description[13]    | 0x00000000
node_description[14]    | 0x00000000
node_description[15]    | 0x00000000
=====================================

 Do you want to continue ? (y/n) [n] : y
 Sending access register...
-E- Failed to send access register: Register is too large

The same issue is still present in MFT 4.21.

@tomer540 could you please clarify if MFT >= 4.20 is supposed to include the fix from PR#540?

I will check with the relevant owners and update you.

It seems only the 'get' command was fixed; 'set' still has some internal issues.
The team and I are working on a solution for it.
I will update the master_devel branch as soon as we have it.

Thanks @tomer540 !

@tomer540 long overdue update, but I just wanted to confirm that this issue has been fixed in MFT 4.22.
Thanks a lot!

Closing #329

I am happy I was able to help, even if it took so long.