lz4/lz4-java

Unexpected error of the latest v1.7.0 in Spark on Mac OS X 10.13 (older macOS)

maropu opened this issue · 17 comments

Hi, @odaira, thanks in advance for your hard work!

We now use the latest lz4-java (v1.7.0), and a Spark user reported the following error:

dyld: lazy symbol binding failed: Symbol not found: ____chkstk_darwin
  Referenced from: /private/var/folders/1v/ckh8py712_n_5r628_16w0l40000gn/T/liblz4-java-820584040681098780.dylib (which was built for Mac OS X 10.15)
  Expected in: /usr/lib/libSystem.B.dylib

dyld: Symbol not found: ____chkstk_darwin
  Referenced from: /private/var/folders/1v/ckh8py712_n_5r628_16w0l40000gn/T/liblz4-java-820584040681098780.dylib (which was built for Mac OS X 10.15)
  Expected in: /usr/lib/libSystem.B.dylib

https://issues.apache.org/jira/browse/SPARK-30196?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&focusedCommentId=17005066#comment-17005066

Any info about that?

Bests,
Takeshi

Thanks for opening the issue. I should note that I'm on Mac OS 10.13 not 10.15. A bit of googling leads me to believe that this might affect others as well but I'm not sure.

I built the JNI binding on macOS 10.15, but it looks like the result is not backward compatible with 10.13. Unfortunately, I no longer have access to 10.13. I can think of three options, though I am not sure whether they are feasible. I'll think about them, but I would appreciate any suggestions.

  • Somehow specify the target version when building the JNI binding.
  • Create a VM and install 10.13.
  • Get access to a 10.13 build machine.

I tried digging a bit but didn't actually run any code. It looks like you are using the cpptasks Ant contrib to compile the JNI code. It uses gcc by default, which on macOS is actually clang (LLVM), and clang supports a -mmacosx-version-min=<value> option. You can set that to "10.13", "10.9", or any other value.
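To check whether such a flag took effect, one could read the minimum OS version recorded in the built dylib. The sketch below is only an illustration and is not part of lz4-java: it assumes a thin (non-universal), little-endian 64-bit Mach-O file, and the function name macho_min_macos is made up for this example. It looks for the two load commands that can carry the deployment target, LC_VERSION_MIN_MACOSX and LC_BUILD_VERSION.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF        # thin 64-bit Mach-O, little-endian on disk
LC_VERSION_MIN_MACOSX = 0x24    # older style deployment-target command
LC_BUILD_VERSION = 0x32         # newer style deployment-target command
PLATFORM_MACOS = 1
HEADER_SIZE = 32                # mach_header_64 is eight 32-bit fields


def _decode_version(packed):
    """Unpack the xxxx.yy.zz encoding into a 'major.minor.patch' string."""
    return f"{packed >> 16}.{(packed >> 8) & 0xFF}.{packed & 0xFF}"


def macho_min_macos(data):
    """Return the minimum macOS version recorded in the image, or None.

    Hypothetical helper for illustration; handles only thin 64-bit files.
    """
    if len(data) < HEADER_SIZE:
        raise ValueError("too short to be a Mach-O file")
    magic, _, _, _, ncmds, _, _, _ = struct.unpack_from("<8I", data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O file")

    offset = HEADER_SIZE
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<2I", data, offset)
        if cmdsize < 8:
            raise ValueError("corrupt load command")
        if cmd == LC_VERSION_MIN_MACOSX:
            (version,) = struct.unpack_from("<I", data, offset + 8)
            return _decode_version(version)
        if cmd == LC_BUILD_VERSION:
            platform, minos = struct.unpack_from("<2I", data, offset + 8)
            if platform == PLATFORM_MACOS:
                return _decode_version(minos)
        offset += cmdsize
    return None
```

Usage would be something like macho_min_macos(open(path_to_dylib, "rb").read()), where path_to_dylib is whatever the build produced. On a Mac, otool -l shows the same load commands without any custom code.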

I briefly tried building LZ4 using Ant/Ivy but ran into issues.

BUILD FAILED
/Users/lars/dev/external/lz4-java/build.xml:202: /Users/lars/dev/external/lz4-java/src/lz4/lib does not exist.

This was easily fixed by creating the missing directory, but I believe there should probably be another mkdir in the compile-jni step.

/Users/lars/dev/external/lz4-java/src/jni/net_jpountz_xxhash_XXHashJNI.c:16:10: fatal error: 'xxhash.h' file not found

And indeed, I don't have that file. No idea where that should come from.

I believe one can add a <compilerarg value="-mmacosx-version-min=10.13"/> to the cpptasks compiler configuration, but I wasn't able to test it due to this last error.

Thanks for the reply, @odaira! If possible, option 1 (somehow specify the target version when building the JNI binding) looks best to me. Since macOS 10.13 High Sierra was released in September 2017, I think many users still use it.

@lfrancke I think you just forgot to do git submodule init/update.
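A quick way to tell whether the submodule checkout is the problem is to look for the paths named in the two build errors. This is a minimal sketch, not something from the lz4-java build: the function name is made up, and the assumption that xxhash.h lives under src/lz4/lib is inferred from the error messages rather than confirmed.

```python
from pathlib import Path

# Paths taken from the two build errors above. That xxhash.h sits inside
# src/lz4/lib is an assumption, not something confirmed in this thread.
EXPECTED_PATHS = ["src/lz4/lib", "src/lz4/lib/xxhash.h"]


def missing_submodule_paths(repo_root):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]
```

If the returned list is non-empty for your checkout, running git submodule init followed by git submodule update (or git submodule update --init) from the repository root is the likely fix.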

Anyway, I checked on macOS v10.12 myself: master...maropu:LZ4Java4MacOS
In this environment, I got the same error as above, and it disappeared when I used an lz4-java build whose binary was compiled on v10.12.

Since I have the latest macOS at my office, next week I'll check whether the error also disappears with an lz4 binary compiled on the latest macOS.

ok, I'll check

hmmm, I still hit the same error on macOS v10.12:

SQLQueryTestSuite:
22:50:47.913 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
dyld: lazy symbol binding failed: Symbol not found: ____chkstk_darwin
  Referenced from: /Users/maropu/Repositories/spark/spark-master/sql/core/target/tmp/liblz4-java-707696892123264349.dylib
  Expected in: /usr/lib/libSystem.B.dylib

dyld: Symbol not found: ____chkstk_darwin
  Referenced from: /Users/maropu/Repositories/spark/spark-master/sql/core/target/tmp/liblz4-java-707696892123264349.dylib
  Expected in: /usr/lib/libSystem.B.dylib

Can you check it on v10.13? @lfrancke

In order to make this easier to test, maybe we could use a Travis CI cross-build to test on both Ubuntu and OS X. According to its docs, it looks like Travis supports macOS v10.10 through v10.14.

Oh, I didn't know that. It looks useful, thanks, @JoshRosen

Looks great and the error has gone away on macOS 10.12! How did you fix that?

I was using the "if" attribute of compilerarg incorrectly. I'll release 1.7.1 by the end of next week.

Many thanks!

Indeed, thank you for your prompt response and work on this.

Released 1.7.1.

Thanks!

I'll close this because the issue has been fixed. Thanks again, @odaira!