kclyu/rpi-webrtc-streamer

What is the build's performance on pi4 64bit aarch64 ?

Joel-Mckay opened this issue · 2 comments

I was able to build the regular webrtc-streamer main repo (all 13.5GB of the deps), and wanted to compare this accelerated fork's performance.

For the V4L2 driver on my pi4, a single 320x240 video-only stream with the P5V04A pi camera module shows:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND     
 3210 ubuntu    20   0 2810992  24240  13220 S  21.5  0.6   0:54.94 webrtc-str+

Yet with software encoding for older USB cameras, the same stream resolution pins a core at 95%... not really usable. =(

I am not cross compiling, and wanted to try building RWS against the same webrtc.a lib as the other project to compare the two, to answer the question of whether it is worth the effort to use mmal directly. Do you have a native build script documented, or can you please post some top dumps of it running at the same encoded resolution?
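For anyone posting comparable numbers, here is a minimal sketch of averaging the %CPU column over several batch-mode top samples instead of reading a single snapshot. The helper name `avg_cpu` is made up for illustration, and it assumes the default top field layout shown in this thread (%CPU in field 9, COMMAND in field 12); a customized top layout would need different field numbers.

```shell
#!/bin/sh
# Hypothetical helper: average %CPU for one command across `top -b` samples.
# Assumes the default top columns (field 9 = %CPU, field 12 = COMMAND).
avg_cpu() {
    # $1 = command-name prefix, e.g. "webrtc-str" (top may truncate the name)
    awk -v name="$1" '
        # Only process rows: numeric PID in field 1, command starts with name
        $1 ~ /^[0-9]+$/ && index($12, name) == 1 { sum += $9; n++ }
        END { if (n) printf "%.1f\n", sum / n; else print "no samples" }
    '
}

# Live use (10 samples, 2 seconds apart):
#   LC_ALL=C top -b -n 10 -d 2 | avg_cpu webrtc-str
```

Matching on a prefix means the same call works whether top prints `webrtc-streamer` or the truncated `webrtc-str+`.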

I am using a partial 64bit userland build from here, and am unsure if there is enough library compatibility left in 64bit mode to use the camera directly (there should be, as raspivid still functions):

git clone https://github.com/6by9/userland.git
cd userland
git checkout 64bit_mmal
./buildme --aarch64

I am out of weekend to try the build again... any advice or performance metrics would be berry appreciated. =)

Cheers,
J

kclyu commented

RWS does not yet have a binary package for 64-bit OSes. A 64-bit binary should perform somewhat better than the 32-bit binary, but the improvement is not expected to be significant.

Below is a top capture at 1152x864@28 on a Raspberry Pi 2.

top - 12:49:07 up 33 days, 16:17,  2 users,  load average: 0.60, 0.29, 0.11
Tasks: 104 total,   1 running, 103 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.2 us,  0.2 sy,  0.0 ni, 99.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :    874.5 total,    379.8 free,    106.4 used,    388.3 buff/cache
MiB Swap:    100.0 total,    100.0 free,      0.0 used.    691.1 avail Mem 
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                      
 6925 pi        20   0  172712  25972  23752 S  79.7   2.9   1:54.92 webrtc-streamer                         

The P5V04A pi camera module at 1440x1080 28fps (I assumed the fps, as I couldn't set this anywhere on the pi4 64bit version):

top - 22:17:09 up 1 day,  3:15,  2 users,  load average: 0.94, 0.93, 0.63
Tasks: 153 total,   1 running,  98 sleeping,   1 stopped,   0 zombie
%Cpu(s): 20.6 us,  1.8 sy,  0.0 ni, 77.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem :  3835456 total,  2457252 free,   485372 used,   892832 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  3278188 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND     
23703 ubuntu    20   0 2819208 103404  10492 S  84.1  2.7   4:31.93 webrtc-str+

I noted the memory and cpu use tended to climb about 15% depending on movement and on the Chromium or Firefox VP8-based streaming clients. The v4l2 driver also likely restricts the frame height and width options we get, as it defaulted to 1080p when I tried to compare berries-to-berries, so to speak. ;-)
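As a rough sanity check on the two pi4 captures in this thread (320x240 at 21.5 %CPU and 1440x1080 at 84.1 %CPU), the sketch below compares how the pixel count per frame scales against how the CPU figure scales. It assumes both streams ran at a comparable frame rate, which was not confirmed here, and the `ratio` helper is just an illustrative name.

```shell
#!/bin/sh
# Hypothetical helper: print a / b to two decimal places.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Pixels per frame: 1440x1080 versus 320x240
ratio $((1440 * 1080)) $((320 * 240))   # prints 20.25

# %CPU from the two top captures: 84.1 versus 21.5
ratio 84.1 21.5                          # prints 3.91
```

Under that assumption, roughly 20 times the pixels cost a bit under 4 times the CPU in these two snapshots; a single top reading is noisy, so treat the figures as indicative only.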

The webrtc-streamer identifies the camera as mmal service 16.1, and the same name appears in the chrome://webrtc-internals/ stream details.

Took a bit to port, but I will submit my clang-10 based build howto for webrtc.a at the main repo if you guys go 64bit at some point. =)

Thanks again,
J