tohojo/flent

Plots are broken somehow

klukonin opened this issue · 16 comments

The v2.0.0 release is wonderful and has a lot of improvements.
But it currently has a plotting bug.
For some reason, combined plots such as "totals" are shifted. I mean the graphs are not aligned.

Command to reproduce:
flent --input some_test_file.flent --plot totals --bounds-x 300 --figure-width 26 --figure-height 14 --zero-y --no-title --bounds-y=500 --output ./test_totals.png

The master version seems to be OK.

dtaht commented

What is causing the 3000ms!!!! spikes in your test???

Hello, Dave.

It's not a network issue. Bluetooth/Wi-Fi scanning causes these spikes. But honestly, I did a lot of research into the current Android networking internals. There is a lot to improve. What you see are test results from an early release of the SberDevices SberBox (unpatched firmware).
The current release is better because we did a lot of work to decrease latency and improve the stability of the Wi-Fi connection. I can say that scanning and roaming are the two main diseases of Android TV boxes/sticks. RTT can go up to 2000+ ms. We can discuss it on the mailing list, but not here =)

I hope to share our results and Flent extensions for Android devices soon.

Actually, it's a good chance to say thank you, because some years ago your articles inspired me to learn more about traffic and queues.

@tohojo

Thank you very much. The master version works perfectly.
Would it make sense to make a minor 2.0.1 release without this issue?

dtaht commented

thx very much for absorbing my (our) work.

what's the wifi chipset in this device? (can I get one?) did you implement https://www.usenix.org/conference/atc17/technical-sessions/presentation/hoilan-jorgesen

yes, I had written about the impact of channel scans before over here: http://blog.cerowrt.org/post/disabling_channel_scans/

dtaht commented

also try adding net.ipv4.tcp_ecn=1 to sysctl
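
A minimal sketch of the suggestion above, assuming a shell with `sysctl` available on the device. The `run` helper and `enable_ecn` function are hypothetical names used only for illustration. The script prints the command by default and only applies it when `APPLY=1` is set (which needs root).

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set APPLY=1 (as root) to actually change the setting.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# tcp_ecn=1: request ECN on outgoing connections and accept it
# when requested by incoming ones.
enable_ecn() {
    run sysctl -w net.ipv4.tcp_ecn=1
}

enable_ecn
```

A setting applied with `sysctl -w` does not survive a reboot. How to persist it depends on the platform and is not covered in this thread.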

dtaht commented

It looks like we lost the g+ thread but basically we suppressed channel scans if the existing rssi was < 80 or so.

@dtaht

It's a BCM4359 with the bcmdhd driver (Amlogic fork).
The FullMAC architecture doesn't let us use airtime fairness.
So we did some work to decrease buffering inside the driver, and we turned off power_save, roaming, periodic scans, and background scanning.
We also use a combination of the fq_codel qdisc on the wlan0 interface and the BBR TCP congestion control algorithm. So ECN is enabled =)
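
The fq_codel plus BBR combination described above could be set up roughly as follows. This is a sketch, not the exact configuration used on the device: it assumes `tc` and `sysctl` are present, that the kernel has the BBR module available, and that fq_codel can be attached as the root qdisc of `wlan0`. The `run` helper and `setup_queueing` function are hypothetical names. Commands are printed unless `APPLY=1` is set.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set APPLY=1 (as root) to actually apply them.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Interface name taken from the comment above; override with IFACE=...
IFACE="${IFACE:-wlan0}"

setup_queueing() {
    # Replace the root qdisc with fq_codel (default parameters).
    run tc qdisc replace dev "$IFACE" root fq_codel
    # Select BBR as the TCP congestion control algorithm.
    run sysctl -w net.ipv4.tcp_congestion_control=bbr
}

setup_queueing
```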

The bcmdhd driver does some weird things. According to my limited understanding, tx_glom and rx_glom work independently of the kernel scatter-gather mechanism. So a bunch of TX frames gets split and aligned according to the tx_glom bucket size every time we try to send something.
This chip uses a single queue for TX and RX, so given its architecture there is no way to achieve low delay in medium and heavy traffic scenarios.
Nowadays many Android devices use this driver without any optimizations. And it works awfully with the pfifo_fast qdisc, which is the default on Android.
So much to improve...

P.S.
It looks like the default fq_codel interval (100ms) is not enough for mobile devices in some scenarios.
300ms ought to be enough for everybody.
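
For illustration, raising the fq_codel interval from its 100ms default to 300ms might look like the sketch below. It assumes fq_codel is attached as the root qdisc of the interface; the `run` helper and `set_codel_interval` function are hypothetical names. Only `interval` is changed here, so `target` stays at its default. Commands are printed unless `APPLY=1` is set.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set APPLY=1 (as root) to actually apply them.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# $1 = interface name, $2 = fq_codel interval (for example 300ms)
set_codel_interval() {
    run tc qdisc replace dev "$1" root fq_codel interval "$2"
}

set_codel_interval wlan0 300ms
```

The resulting parameters can be checked afterwards with `tc qdisc show dev wlan0`.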

dtaht commented

can you email me, I have something in the works you are going to love (davet AT teklibre.net)

dtaht commented

@klukonin I am curious if you have been up to anything interesting since this?

Otherwise, closing this bug.

@dtaht Hello Dave.
Sorry, I missed your previous message.

I have nothing to add here. So please feel free to close this bug =)