sonic-net/sonic-sairedis

Warm reboot: How the FlexCounter was handled after init view?

shuaishang opened this issue · 13 comments

Description
With the buffer pool in config db, then do warm reboot, the syncd can't handle the FlexCounter with new VID, in "processFlexCounterEvent".
Even after apply view, the syncd should not be able to get the counter for this new VID.

Steps to reproduce the issue:

  1. the config db has the buffer_pool.
  2. do warm reboot
  3. check syslog

config db:
"BUFFER_POOL": { "egress_pool": { "mode": "dynamic", "size": "100000", "type": "egress" } },

Describe the results you received:
syslog after warmboot:
Jul 15 07:40:46.978278 sonic ERR syncd#syncd: :- translateVidToRid: unable to get RID for VID oid:0x18000000000647
Jul 15 07:40:46.978385 sonic WARNING syncd#syncd: :- processFlexCounterEvent: port VID oid:0x18000000000647, was not found (probably port was removed/splitted) and will remove from counters now
Jul 15 07:40:46.978474 sonic NOTICE syncd#syncd: :- removeBufferPool: Trying to remove nonexisting buffer pool 0x18000000000647 from flex counter BUFFER_POOL_WATERMARK_STAT_COUNTER

Describe the results you expected:
Error free warmboot, and buffer pool counter should be collected correctly by Flexcounter after warmboot.

Additional information you deem important (e.g. issue happens only occasionally):
Attach sai rec
sairedis.rec.zip

The buffer pool VID before warm reboot is 0x180000000005cd.
After warm reboot, in init view, the buffer pool VID is 0x18000000000647, the orchagent attach the flex counter to the new VID, but syncd failed because no RID yet.

Even after apply view, the syncd should not be able to get the counter for this new VID 0x180000000005cd.
Because the processFlexCounterEvent will not be called for this new VID after apply view.

Because there is no RID in INIT_VIEW, for objects which VID may change between different boot, does it mean the orchagent can't do the following for them?

  1. get the object after create. (macsecorch, get_macsec_attribute should be called after create_macsec)
  2. attach the flexcounter for it. (for example, this case)

hey, i don't exactly know how flex counters are handled in orchagent, but subscribe for flex counters should be done after apply view, and it does not matter whether this is cold or warm boot.

  1. yes, if you create object in init view, you can't do GET operation on it, exception is create_switch, because in apply view stage those new VID will be matched by comparison logic with possibly existing objects already to match RID, to provide minimal dataplane disruption.
  2. if i remember correctly, now a RID value for counter is required to subscribe for a counter, that's why OA should subscribe to flex after apply view, and even if we would move that ability to subscribe for VID, then during init-view phase we still would not be able to query those, since RID was not matched yet. so overall result is to subscribe for counters after apply view

But currently, in OA warmboot process, warmRestoreAndSyncUp, it restored the data from APP_DB and processed them before APPLY_VIEW.
It means buffer orch subscribed the flex counter, macsec orch called get_macsec_attribute before APPLY_VIEW.
There is no difference in OA doTask between init and apply view.

this is design problem on OA and should be addressed on OA side, sairedis/syncd will not know which RID/VID is which before apply view in warm boot mode

@lguohan could you pls help to check this issue?

@stephenxs can you please have a look ?

Any update for this issue?

Any update for this issue?

I'm checking and will share any findings once I have them.

Any progress?

There is no guarantee that the OIDs of buffer pools keep unchanged across warm reboots. It will always regenerate the counter ID list for buffer pools by calling BufferOrch::generateBufferPoolWatermarkCounterIdList when the system (cold or warm) starts. So OIDs changed across warm reboot is not the issue.

The issue is that the orchagent should NOT clear counters before APPLY_VIEW. I think it is caused by BufferOrch::generateBufferPoolWatermarkCounterIdList as well.

When the system starts, enable_counters will wait for ~180 seconds and then push the field {"FLEX_COUNTER_STATUS": "enable"} into CONFIG_DB.FLEX_COUNTER_TABLE|BUFFER_POOL_WATERMARK, meaning to enable the buffer pool watermark.
Flex counter orch is listening to this table and will call BufferOrch::generateBufferPoolWatermarkCounterIdList to push the counter IDs list to FLEX_COUNTER_DB once it heard the field.
By default, the watermark is fetched in read-and-clear mode which is not supported by all vendors.
To avoid further errors, a logic was introduced to check the capability of the read-and-clear by clearing the counters and checking the return value.

From the attached sairedis.rec we see there was clear_stats before APPLY_VIEW, which indicates generateBufferPoolWatermarkCounterIdList was called before APPLY_VIEW.

2021-07-15.07:40:46.976150|q|clear_stats|SAI_OBJECT_TYPE_BUFFER_POOL:oid:0x18000000000647|SAI_BUFFER_POOL_STAT_WATERMARK_BYTES=|SAI_BUFFER_POOL_STAT_XOFF_ROOM_WATERMARK_BYTES=
2021-07-15.07:40:46.977456|Q|clear_stats|SAI_STATUS_INVALID_OBJECT_ID
2021-07-15.07:40:46.983732|a|APPLY_VIEW
2021-07-15.07:40:47.894567|A|SAI_STATUS_SUCCESS

Even though there is no guarantee that generateBufferPoolWatermarkCounterIdList is always called after APPLY_VIEW, it is unlikely to be called before APPLY_VIEW given that there is a 180 seconds delay before it is triggered.

Can you check in your platform

  • Probably the syncd docker starts a bit late in your platform?
  • Is there a delay logic in enable_counters.py?

In case the logic exists but generateBufferPoolWatermarkCounterIdList is still called before APPLY_VIEW in your platform, I have some preliminary thoughts to handle it:

  • Buffer Orch not to test the capability of CLEAR. SAI redis to test it when buffer pool counter IDs are inserted for the first time, just like what has been done for counter IDs in setBufferPoolCounterList
  • Don't try clearing counters immediately after CONFIG_DB.FLEX_COUNTER_TABLE|BUFFER_POOL_WATERMARK is updated in a warm reboot flow. Just record it instead and trigger clearing counters after APPLY_VIEW has been triggered.
  • To find another way to test whether CLEAR is supported by the vendor. Eg., reading some capabilities. Not sure whether it is supported and need further check.

@shuaishang @qiluo-msft @kcudnik @liat-grozovik FYI.

@stephenxs You mentioned

Even though there is no guarantee that generateBufferPoolWatermarkCounterIdList is always called after APPLY_VIEW

Actually it is possible. The orchagent warm-reboot includes below steps. (check the code in OrchDaemon::warmRestoreAndSyncUp)

  1. bake(): it is a virtual function in every Orch subclass. The default implementation is just loaded Redis data into Consumer objects. In the warm-reboot shutdown process, we stored Redis data into files, and in the warm-reboot bootup process, we load files into Redis data, and erase all the data not relating warm-reboot. bake() could be reimplemented in subclass, in order to customize. So if you want to clean ConfigDB entries, please reimplement bake() or just erase them.
  2. doTask() will consume all the data loaded in the Consumer objects in last step. Be aware we are only processing data from init Redis content, ie, indirectly from files.
  3. apply view
  4. doTask() will start process data from upstream. Even there are upstream data coming during above steps, they are waiting and not processed until this step.

So you could take advantage of Step 1 and Step 4 to make sure the guarantee.

Guys, can we close this bug as it has been resolved by sonic-net/sonic-swss#1987?