cisco-system-traffic-generator/trex-core

ierrors are observed frequently in ASTF mode.

jsmoon opened this issue · 4 comments

jsmoon commented

I observed the ierrors counter under low CPU load. I want to reduce these errors as much as possible.

Situations where the ierrors counter is observed:

  1. When a big dynamic profile is started on multiple cores --> the counter increases during startup. `process_at_cp=true` reduces it.
  2. When packet capturing is activated under high throughput on multiple cores.
  3. When packets are sent in bursts --> this causes bursty RX traffic, and missed packets are observed.

The ierrors counters are caused by missed packets: sometimes there is not enough time to service the RX queues even though the core has spare CPU capacity. These errors cause TCP throughput degradation even though there are no packet drops anywhere in the SUT.

In my opinion, RX packets should be handled on a real-time basis.
Since the DP scheduler performs its scheduled work on a simulation-time basis, several long operations between periodic TCP_RX_FLUSH jobs can create a large time gap (i.e., TCP_RX_FLUSH cannot be handled in real time).
If the TCP_RX_FLUSH work could be performed on a real-time basis, ierrors would disappear whenever the CPU is not busy.

@hhaim what do you think about this issue?

hhaim commented

@jsmoon the problem is simple. TRex is event-driven, and RX and TX work share the same cores. A very long operation on one of the DP cores will starve the RX and TX work, which produces the ierrors. There are also cases where one DP core (e.g. core 1) sends traffic toward core 7, which is stuck in a long operation and is starved from servicing its RX.
I think the solution is to split the long operations into smaller ones, as this ierror is only one symptom of the issue.

jsmoon commented

@hhaim, I agree that the long operations should be split into smaller ones.
But we should also consider the case where many small operations get scheduled before the next TCP_RX_FLUSH, for example when many dynamic profiles are running.
The current scheduler performs the next TCP_RX_FLUSH only after all previously scheduled operations have finished; this is simulation-time-based scheduling. My suggestion is to run the next TCP_RX_FLUSH immediately once its real-time deadline has passed, even if its simulation time has not yet been reached.

jsmoon commented

In the ASTF DP core scheduler, there are 4 kinds of work scheduled.

| Node Name | Schedule Period | Work Load |
| --- | --- | --- |
| TCP_TX_FIF | on demand | depends on number of nodes |
| TCP_RX_FLUSH | every 20 µs | max 256 packets on dual port |
| TCP_TW | every 20 µs | depends on number of timers |
| FLOW_SYNC | every 1 ms | depends on number of messages |

FLOW_SYNC and TCP_TW can become long operations when there are many messages and timers to handle. If we assume that each individual message/timer handling step is short, I think we can trigger an RX-packet prefetch after each step.

@hhaim what do you think about this?

jsmoon commented

According to my tests, the other concerns show up only under somewhat abnormal conditions (too many dynamic profiles, too much flow generation causing high CPU load, etc.) and can be managed.
But the following case needs to be discussed.
image
It is observed under normal CPU load; I guess the DP core is momentarily busy.
Since received packets stay in the RX descriptors while other work is being handled, the descriptor ring is the only buffering available for RX packets. 4096 descriptors can hold about 1 ms of traffic at 1500-byte MTU on a 40 Gbps NIC.
According to my investigation, TCP_RX_FLUSH (even with the 256-packet throttle) sometimes takes longer than 1 ms, because it can also end up sending TX packets.
So I think more buffering for RX packets is needed. @hhaim, do you agree?