Multiple issues when building for rpi
Closed this issue · 17 comments
mpv stopped working for me on rpi with release 0.30.0, but only now with release 0.31.0 I've managed to debug it a bit deeper. Here goes the list of issues found:
- When building with rpi, mpv uses EGL headers provided by platform (/opt/vc/include), which in turn appear to lack some optional typedefs used by mpv. First such typedef is EGLAttrib provided only in EGL 1.5 which is missing in relevant section of video/out/opengl/egl_helpers.c:
// EGL 1.5
#ifndef EGL_CONTEXT_OPENGL_PROFILE_MASK
#define EGL_CONTEXT_MAJOR_VERSION 0x3098
#define EGL_CONTEXT_MINOR_VERSION 0x30FB
#define EGL_CONTEXT_OPENGL_PROFILE_MASK 0x30FD
#define EGL_CONTEXT_OPENGL_CORE_PROFILE_BIT 0x00000001
#define EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE 0x31B1
#define EGL_OPENGL_ES3_BIT 0x00000040
#endif
- Second missing typedef is PFNEGLGETPLATFORMDISPLAYEXTPROC used in video/out/opengl/context_drm_egl.c
- After fixing compilation mpv most commonly fails with:
Could not get DISPMANX objects.
and sometimes segfaults, both with -vo gpu and with -vo gpu --gpu-context=rpi. It appears to be a regression from db09d77, in which mpv started to link against libbrcmEGL instead of libEGL (same goes for libbrcmGLESv2 vs libGLESv2). As far as I understand, libbrcmEGL is a replacement for libEGL, named that way to avoid confusion about a custom library being required. The problem is that mpv now links with both libbrcmEGL (--rpi) and libEGL (egl-helpers), which breaks it completely. Not really sure how this could be solved cleanly. The only way to make it work now would be to link only against libbrcmEGL, even for egl-helpers; however, that most likely rules out using such a build of mpv with the Mesa stack (not sure if that's possible though). I suppose the ideal solution would be to have VOs as dynamically loaded modules to isolate linked libraries, though in that case the "module" with egl-helpers shouldn't be linked against anything and should use the VO's symbols. I'll leave it to you guys though.
- Once libbrcmEGL is forced as the only linked library, mpv starts to reliably segfault with -vo rpi. That's because video/out/vo_rpi.c still refers to "non-brcm" variant of libGLESv2:
void *h = dlopen("/opt/vc/lib/libGLESv2.so", RTLD_LAZY);
after correcting it to libbrcmGLESv2 -vo rpi started to work correctly again!
- Regarding -vo gpu --gpu-context=rpi -- it still doesn't work:
[vo/gpu/opengl] Initializing GPU context 'rpi'
[vo/gpu/opengl] EGL_VERSION=1.4
[vo/gpu/opengl] EGL_VENDOR=Broadcom
[vo/gpu/opengl] EGL_CLIENT_APIS=OpenGL_ES OpenVG
[vo/gpu/opengl] Trying to create Desktop OpenGL context.
[vo/gpu/opengl] Could not bind API!
[vo/gpu/opengl] Trying to create GLES 3.x context.
[vo/gpu/opengl] Could not choose EGLConfig for GLES 3.x!
[vo/gpu/opengl] Trying to create GLES 2.x context.
[vo/gpu/opengl] Chosen EGLConfig:
[vo/gpu/opengl] EGL_CONFIG_ID=8
[vo/gpu/opengl] EGL_RED_SIZE=8
[vo/gpu/opengl] EGL_GREEN_SIZE=8
[vo/gpu/opengl] EGL_BLUE_SIZE=8
[vo/gpu/opengl] EGL_ALPHA_SIZE=0
[vo/gpu/opengl] EGL_COLOR_BUFFER_TYPE=12430
[vo/gpu/opengl] EGL_CONFIG_CAVEAT=12344
[vo/gpu/opengl] EGL_CONFORMANT=7
[vo/gpu/opengl] Recreating DISPMANX state...
[vo/gpu/opengl] glGetString(GL_VERSION) returned NULL.
[vo/gpu] Failed initializing any suitable GPU context!
Error opening/initializing the selected video_out (--vo) device.
But with working -vo rpi I gave up investigation. If needed I can go back to it.
- Now for something a little different -- are there any plans on dropping waf build system? Performance wise it's literally the worst. My rpi, which is slow as hell already, uses one entire core during build only for waf process. That's like 33% longer build time. On my Intel-based workstation I happen to run mpv builds with verbose output to see exact compiler flags. This in turn makes waf process use also single core and bottle-neck whole process. No matter how many jobs I will give it, system has some spare resources because waf itself struggles to keep up.
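For the vo_rpi item in the list above (the hard-coded dlopen of /opt/vc/lib/libGLESv2.so), one possible workaround is to try the brcm-prefixed library first and fall back to the old name. This is only a minimal sketch, not what mpv does: `dlopen_first` and `gles_candidates` are hypothetical names, and the candidate order is an assumption based on the paths mentioned in this report.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Try each path in a NULL-terminated candidate list and return the first
 * handle that dlopen() yields, or NULL if none of them could be loaded. */
void *dlopen_first(const char *const *candidates, int flags)
{
    for (size_t i = 0; candidates[i] != NULL; i++) {
        void *handle = dlopen(candidates[i], flags);
        if (handle != NULL)
            return handle;
    }
    return NULL;
}

/* Assumed order: prefer the brcm-prefixed name that mpv links against
 * since db09d77, then the older unprefixed name. */
const char *const gles_candidates[] = {
    "/opt/vc/lib/libbrcmGLESv2.so",
    "/opt/vc/lib/libGLESv2.so",
    NULL,
};
```

In vo_rpi.c the call would then look like `void *h = dlopen_first(gles_candidates, RTLD_LAZY);`. Whether falling back to the unprefixed library is safe when libbrcmEGL is already linked is exactly the open question above, so the fallback entry may be better dropped.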
On my Intel-based workstation I happen to run mpv builds with verbose output to see exact compiler flags. This in turn makes waf process use also single core and bottle-neck whole process.
Known bug. Might be related to #5502.
@jpalus I think in general the RPI doesn't get much love, because AFAIK none of the devs maintains its build, and it seems to use some proprietary stuff too. I guess it also depends on the distro one uses.
Would you be able to provide some clean patchset to make it work in common RPI distros/setups?
@sfan5 not sure if that's the same issue. If I understand #5502 correctly, it states that only a single core/thread is used for the actual build. In my case (system-installed waf 2.0.11, 8 threads, waf instructed to use 10 jobs) I see all 8 threads being used but not saturated; utilization is at about 60%. A single process stands out though: python running waf itself.
@avih Normally I would submit patches along with the bug report, but to be honest I'm completely clueless about all the *GL* libraries and lack confidence in deciding what is right, i.e.:
- is it ok to add typedef for EGLAttrib or is it better to drop its usage, maybe someone was unaware that it's EGL 1.5 specific
- same goes for the other typedef
- libbrcmEGL is generally not trivial case and I start to wonder whether best solution isn't to revert wscript changes done in db09d77. Not sure if there are any advantages of libbrcmEGL over libEGL, or is there some EOL planned for the latter
Well, you already did a good job with identifying specific issues, so that's a good start. Personally I'm completely unfamiliar with the platform, but maybe someone else can provide answers to your questions.
In general though, it's hard to maintain it without someone who knows it, and currently among the devs it seems no one does or is interested in actively pursuing it...
Note that the typedef issues are not specific to rpi; they are mainly about EGL itself. One typedef is bound to a specific version (1.5), while the other lives in the "eglext.h" header, so I guess it's also optional.
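A minimal sketch of how both fallbacks could be guarded, assuming the Khronos header conventions (egl.h defines EGL_VERSION_1_5 when it provides EGLAttrib, and eglext.h defines EGL_EXT_platform_base when it provides the function pointer type). In mpv this would sit after the EGL includes; the stand-in types below exist only so the snippet compiles without any EGL headers installed. The real header declares the pointer with EGLAPIENTRYP, which is simplified to a plain `*` here.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the basic EGL types. They are skipped when <EGL/egl.h>
 * was included first, because that header defines EGL_VERSION_1_0. */
#ifndef EGL_VERSION_1_0
typedef void *EGLDisplay;
typedef unsigned int EGLenum;
typedef int32_t EGLint;
#endif

/* EGL 1.5 only: absent from EGL 1.4 headers such as those reported
 * for /opt/vc/include in this issue. */
#ifndef EGL_VERSION_1_5
typedef intptr_t EGLAttrib;
#endif

/* EGL_EXT_platform_base: normally declared in eglext.h. */
#ifndef EGL_EXT_platform_base
typedef EGLDisplay (*PFNEGLGETPLATFORMDISPLAYEXTPROC)(
    EGLenum platform, void *native_display, const EGLint *attrib_list);
#endif
```

Guarding on the version and extension macros rather than on an unrelated constant keeps the fallbacks from clashing with headers that already provide the types.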
The other big issue is more of a problem with linking of dynamic libraries in C and somewhat questionable decision of rpi platform owners to deliver libEGL with completely different name combined with a fact that mpv does not have dynamic modules/plugins. That's why I would opt for avoiding the issue by going back to "libEGL".
For me, adding typedef intptr_t EGLAttrib;
suffices to successfully finish a 0.31.0 build, at least for my setup.
However, as with 0.30.0 I presumably suffer from the blank video problem described in #5405 (mpv --vo=gpu --hwdec=mmal …).
Which PI/raspbian are you using? I had some issues on RPi4 + buster...
Thanks for looking into this. I'm currently playing around with a spare Model B Rev 2 running Buster.
The other big issue is more of a problem with linking of dynamic libraries in C and somewhat questionable decision of rpi platform owners to deliver libEGL with completely different name combined with a fact that mpv does not have dynamic modules/plugins. That's why I would opt for avoiding the issue by going back to "libEGL".
In theory, the vo_gpu mechanism that can use the rpi overlay does not need broadcom EGL, as long as the Mesa one can create a transparent overlay and put it above the video one.
Also, EGL is already an abstraction library that should make dynamically loading it unnecessary. For example, Mesa provides EGL backends for X11, Wayland, and GBM.
Not sure if the effort to dynamically load EGL is worth it. There may be clashes anyway at runtime.
That's why I think it would be better to go back to linking against -lEGL instead of -lbrcmEGL. Personally I use Arch on my rpi, where /opt/vc/lib has both libEGL and libbrcmEGL, but in case the former is gone, I would still prefer to solve the situation by simply doing symbolic link between the two instead of debugging issues with multiple libraries being loaded/linked.
Now that I look at the /opt/vc/lib contents, at least on Arch, it lacks dynamic libraries with an soname number, which makes me think that the whole /opt/vc/lib thing is quite crappy.
That's why I think it would be better to go back to linking against -lEGL instead of -lbrcmEGL.
This would definitely break vo_rpi too.
@jpalus did you ever solve the "Could not get DISPMANX objects." problem? v0.33.1 builds fine but I'm still seeing that error.
I managed to bisect the "Could not get DISPMANX objects." error to this one commit: db09d77
Bizarrely, it seems like there is a race condition in the build. The same tree would sometimes produce a working binary and sometimes not. It seems to have something to do with the order in which waf decides to link the various /opt/vc libraries:
Good (db09d77):
[root@rpi3 ~]# ldd ./mpv.good | grep opt.vc
libmmal.so => /opt/vc/lib/libmmal.so (0x76bd5000)
libmmal_core.so => /opt/vc/lib/libmmal_core.so (0x76bb7000)
libmmal_util.so => /opt/vc/lib/libmmal_util.so (0x76b97000)
libmmal_vc_client.so => /opt/vc/lib/libmmal_vc_client.so (0x76b7c000)
libbcm_host.so => /opt/vc/lib/libbcm_host.so (0x76b55000)
libvcsm.so => /opt/vc/lib/libvcsm.so (0x76b3b000)
libvcos.so => /opt/vc/lib/libvcos.so (0x76b22000)
libbrcmEGL.so => /opt/vc/lib/libbrcmEGL.so (0x74b26000)
libbrcmGLESv2.so => /opt/vc/lib/libbrcmGLESv2.so (0x74b01000)
libvchiq_arm.so => /opt/vc/lib/libvchiq_arm.so (0x74aeb000)
libmmal_components.so => /opt/vc/lib/libmmal_components.so (0x747c4000)
libcontainers.so => /opt/vc/lib/libcontainers.so (0x747a3000)
Bad (db09d77):
[root@b827eb68a81e ~]# ldd ./mpv.bad | grep opt.vc
libbrcmEGL.so => /opt/vc/lib/libbrcmEGL.so (0x75b38000)
libbrcmGLESv2.so => /opt/vc/lib/libbrcmGLESv2.so (0x75b13000)
libbcm_host.so => /opt/vc/lib/libbcm_host.so (0x75aec000)
libvcos.so => /opt/vc/lib/libvcos.so (0x75ad3000)
libvchiq_arm.so => /opt/vc/lib/libvchiq_arm.so (0x75abd000)
libmmal.so => /opt/vc/lib/libmmal.so (0x74cb3000)
libmmal_core.so => /opt/vc/lib/libmmal_core.so (0x74c95000)
libmmal_util.so => /opt/vc/lib/libmmal_util.so (0x74c75000)
libmmal_vc_client.so => /opt/vc/lib/libmmal_vc_client.so (0x74c5a000)
libvcsm.so => /opt/vc/lib/libvcsm.so (0x74c40000)
libmmal_components.so => /opt/vc/lib/libmmal_components.so (0x723e4000)
libcontainers.so => /opt/vc/lib/libcontainers.so (0x723c3000)
Currently RTFMing the waf build system to figure out why this happens.
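To compare listings like the two above mechanically, a small helper can report whether one library is listed before another in captured ldd output. This is a rough sketch; `lib_line_index` and `listed_before` are hypothetical helpers, and that the relative order of libbcm_host and libbrcmEGL is what decides good vs. bad is only the hypothesis stated above, not a confirmed cause.

```c
#include <stddef.h>
#include <string.h>

/* Return the 0-based index of the first line in `ldd_output` that
 * contains `lib`, or -1 if no line contains it. */
int lib_line_index(const char *ldd_output, const char *lib)
{
    size_t liblen = strlen(lib);
    const char *p = ldd_output;
    int line = 0;

    while (*p != '\0') {
        const char *eol = strchr(p, '\n');
        size_t len = eol ? (size_t)(eol - p) : strlen(p);

        /* Search for `lib` inside the current line only. */
        for (size_t i = 0; liblen > 0 && i + liblen <= len; i++) {
            if (memcmp(p + i, lib, liblen) == 0)
                return line;
        }
        if (eol == NULL)
            break;
        p = eol + 1;
        line++;
    }
    return -1;
}

/* Return 1 if `first` is listed before `second`, 0 if it is not,
 * and -1 if either library is missing from the output. */
int listed_before(const char *ldd_output, const char *first,
                  const char *second)
{
    int a = lib_line_index(ldd_output, first);
    int b = lib_line_index(ldd_output, second);

    if (a < 0 || b < 0)
        return -1;
    return a < b;
}
```

With the listings above, `listed_before(out, "libbcm_host.so", "libbrcmEGL.so")` would distinguish the good binary from the bad one.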
@cyph84 I've stopped using /opt/vc libraries altogether some time ago because they are one big mess. Switched to mainline but that also means no accelerated video on rpi. Please check:
and related to this issue:
@jpalus That's a bummer, without hwaccel on Rpi3, HD videos are barely watchable. I just sent a pull request, tested on Arch Linux. Hopefully this fixes the last of your issues.
On rpi4 as far as I have heard regarding hwdec:
- v4l2-m2m is being utilized for wrapping MMAL for older codecs (where completely new drivers are not being developed)
- and there is a new non-MMAL v4l2-requests based kernel hwdec driver for HEVC.
This all sounds good (nobody seems to have liked MMAL, which was also heavily paired with the broadcom EGL driver for most of the time if not still), but unfortunately it means two separate v4l2 hwdec APIs for different codecs (and MMAL still being behind one of them), and -requests is still not finished on the kernel side for all video formats (I think only H.264 or so got stabilized?).
For older ones you're essentially Shit Out of Luck and only having Old MMAL.
The waf build is most likely as bugged as ever. That said, the meson build should work without any problems, so I'll close this. I think that's a reasonable enough solution since the chances of fixing waf are probably really small.