davisking/dlib

[Bug]: Multiple errors using dlib on macOS (either dynamic linked or statically compiled) with test target

objectivecosta opened this issue · 5 comments

What Operating System(s) are you seeing this problem on?

macOS (Apple Silicon)

dlib version

19.24

Python version

N/A

Compiler

clang 15.0.0

Expected Behavior

Following dlib's examples/CMakeLists.txt, one should be able to embed dlib as a static library in a project on macOS/aarch64.

Current Behavior

I get ld: library 'libpng' not found when I follow the instructions in examples/CMakeLists.txt inside a static library that a test target links against.

Forcing dlib to build libpng (and others) instead of using system installs results in other errors such as:

Undefined symbols for architecture arm64:
  "_png_do_expand_palette_rgb8_neon", referenced from:
      _png_do_expand_palette in libdlib.a[66](pngrtran.c.o)
  "_png_do_expand_palette_rgba8_neon", referenced from:
      _png_do_expand_palette in libdlib.a[66](pngrtran.c.o)
  "_png_riffle_palette_neon", referenced from:
      _png_do_read_transformations in libdlib.a[66](pngrtran.c.o)
ld: symbol(s) not found for architecture arm64

Steps to Reproduce

Here were my debugging steps:

  • Created a new static library project using CMake.
  • Followed the steps in examples/CMakeLists.txt.
    • I double-checked libpng: dlib finds my copy at /opt/homebrew/Cellar/libpng/1.6.[...] and prints Found system copy of libpng.
    • Despite that, and although the static library itself builds successfully, as soon as I try to build a test target (which links to the static library from step one), the linker fails with ld: library 'libpng' not found.
  • I understand that the linker is not finding libpng when building the test target (I'm still figuring out why), so I edited the .cmake files inside dlib to force libpng to be built from source. After that, I hit similar errors for libjpeg and libwebp, so I assume something is wrong with how this setup locates the libraries. To proceed, I forced dlib to compile libjpeg and libwebp from source too.
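
For reference, the project layout described in the steps above can be sketched roughly as follows. This is a minimal sketch, not the reporter's actual build files: the target names mylib and mylib_tests, the source file paths, and the dlib location are all assumptions (the dlib path is guessed from the build folder quoted later in this report). Only add_subdirectory() and the dlib::dlib target come from dlib's examples/CMakeLists.txt.

```cmake
cmake_minimum_required(VERSION 3.14)
project(dlib_static_repro CXX)

# Build dlib from source as part of this project, the way
# examples/CMakeLists.txt does. The path is an assumption.
add_subdirectory(third_party/dlib-19.24/dlib dlib_build)

# Hypothetical static library that wraps dlib.
add_library(mylib STATIC src/mylib.cpp)
target_link_libraries(mylib PUBLIC dlib::dlib)

# Hypothetical test target that links against the static library.
add_executable(mylib_tests tests/main.cpp)
target_link_libraries(mylib_tests PRIVATE mylib)
```

A static library is only archived, never linked, so a missing or misnamed dependency such as libpng does not show up when mylib is built. It surfaces at the first real link step, which here is the mylib_tests executable.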

I assumed this would work. However, with dlib, libpng, libjpeg and libwebp all compiled inside the project, I now hit:

Undefined symbols for architecture arm64:
  "_png_do_expand_palette_rgb8_neon", referenced from:
      _png_do_expand_palette in libdlib.a[66](pngrtran.c.o)
  "_png_do_expand_palette_rgba8_neon", referenced from:
      _png_do_expand_palette in libdlib.a[66](pngrtran.c.o)
  "_png_riffle_palette_neon", referenced from:
      _png_do_read_transformations in libdlib.a[66](pngrtran.c.o)
ld: symbol(s) not found for architecture arm64

This happens when I try to build and run my test target.

Looking at my build folder, I see that dlib did build multiple object files, including pngrtran.c.o, the one referenced in the error, inside cmake-build-debug/third_party/dlib-19.24/dlib/CMakeFiles/dlib.dir/external/libpng. So I'd expect those symbols to be defined and findable. Since the symbol names mention _neon, I assume the build is picking up the arm64 code paths correctly, which confuses me even more.

Is there anything obvious that I am missing? Or is dlib not compatible with macOS aarch64 setups?

Anything else?

Basically, this was an attempt at rebuilding an old project that I had on macOS x86_64, so I am assuming this is something arch-related.

I'd be glad to help out and update the documentation if this is indeed something that can be improved for other newcomers to the project.

It should all work, but maybe the neon code in the copy of libpng we have in external/ just doesn't work on your machine. The libpng in external/ is only there as a fallback for people who don't have libpng installed or can't figure out how to install it for whatever reason, or who have broken copies like yours, apparently. There are a lot of package managers out there that install broken copies of libpng or other libraries, which is something I have no control over 🤷

Anyway, maybe the neon code just doesn't work and should be deleted. Try removing the libpng files with arm or neon in the name. Although we had this come up a while ago, which is how one of them got there: #2664

Frankly, I would prefer not to have any non-portable code in external/ at all, since it's only a fallback and shouldn't be used. People should use a more official libpng if they really want it to be super fast. That way it isn't dlib's responsibility to have a build system that can build all these other libraries with their platform-specific hardware acceleration :)

That's all to say, yeah, see what you can do to make it work on your machine. I would be fine with a PR that just disabled all the neon stuff if that's what makes it work, since that would be fundamentally more portable (i.e. easily built across all platforms).
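
One way to try the "disable all the neon stuff" idea without deleting files is a compile definition on the dlib target. This is an untested sketch: PNG_ARM_NEON_OPT is the switch upstream libpng uses to turn its NEON code paths off, and it is an assumption that the copy bundled in dlib's external/ folder honors it the same way. It is also an assumption that the library target is named dlib (dlib::dlib is an alias and cannot be modified).

```cmake
# Place after the add_subdirectory() call that creates the dlib target.
# PNG_ARM_NEON_OPT=0 asks libpng to compile without its NEON routines,
# so pngrtran.c should no longer reference the *_neon symbols.
# Whether dlib's bundled libpng honors this macro is an assumption.
if(TARGET dlib)
    target_compile_definitions(dlib PRIVATE PNG_ARM_NEON_OPT=0)
endif()
```

If the undefined _neon symbols disappear with this definition in place, that would suggest the bundled build is compiling the NEON call sites without the files that implement them.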

Thanks for the quick reply! Interestingly enough, if I remove the "lib" prefix from the png/jpeg library names in find_libpng and find_libjpeg, it works just fine: it links successfully using -lpng instead of -llibpng. I achieved that by manually using set(PNG_LIBRARIES "png;z").

I would have expected the "lib" prefix to make test_for_libpng fail to link as well, but for some reason it doesn't, which still baffles me.
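
Some background on why the prefix matters: a bare name such as libpng in a link list is passed to the linker as -llibpng, and ld then looks for a file named liblibpng.dylib or liblibpng.a, which does not exist. An alternative to hard-coding "png;z" is to resolve the library to an absolute path, as in the sketch below. The variable name PNG_LIBRARY_PATH is hypothetical, and the HINTS directory assumes a default Homebrew prefix on Apple Silicon.

```cmake
# Resolve libpng to an absolute path instead of handing a bare name to the linker.
# PNG_LIBRARY_PATH is a hypothetical cache variable name; the HINTS directory
# is an assumption about where Homebrew links its libraries.
find_library(PNG_LIBRARY_PATH NAMES png HINTS /opt/homebrew/lib)

if(PNG_LIBRARY_PATH)
    message(STATUS "libpng resolved to ${PNG_LIBRARY_PATH}")
else()
    message(WARNING "libpng was not found by find_library()")
endif()
```

With NAMES png, find_library() itself adds the platform's lib prefix and file extension when searching, so the name passed in should not already include "lib".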

As for the embedded libpng copy, I'll look into removing the neon stuff and seeing if it compiles locally. If so, I'll submit a PR!

Warning: this issue has been inactive for 35 days and will be automatically closed on 2024-09-28 if there is no further activity.

If you are waiting for a response but haven't received one it's possible your question is somehow inappropriate. E.g. it is off topic, you didn't follow the issue submission instructions, or your question is easily answerable by reading the FAQ, dlib's official compilation instructions, dlib's API documentation, or a Google search.

Notice: this issue has been closed because it has been inactive for 45 days. You may reopen this issue if it has been closed in error.