emacs-tree-sitter/elisp-tree-sitter

Provide `arm64` build of `tsc-dyn.dylib` for Apple Silicon (M1) Macs

3c1u opened this issue · 28 comments

3c1u commented

Currently, tsc-dyn.dylib is only available as an x86_64 binary.

$ file tsc-dyn.dylib
tsc-dyn.dylib: Mach-O 64-bit dynamically linked shared library x86_64

tsc-dyn.dylib can be cross-compiled using the latest Rust toolchain.

This will likely need a system toolchain for arm64 as well, in addition to the Rust toolchain.
Then we will need someone with an M1 Mac to test it.

Have you been able to cross-compile it? Or do you have a link to a working guide for targeting aarch64-apple-darwin on x86_64-apple-darwin?

3c1u commented

Tested on x86_64 macOS 11.1.

Use the master branch of tree-sitter (arm64 is now supported) and the emacs crate with an updated bindgen (libclang somehow fails to load otherwise):

emacs = { path = "../../emacs-module-rs" }
tree-sitter = { git = "https://github.com/tree-sitter/tree-sitter.git" }

and run this command:

env SDKROOT="/Library/Developer/CommandLineTools/SDKs/MacOSX11.1.sdk" cargo build --all --release --target aarch64-apple-darwin
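The SDK path in the command above is tied to one installed SDK version. A minimal sketch of the same build that asks xcrun for the path instead, assuming a rustup-managed toolchain and the Xcode command line tools. The helper names `run` and `cross_build_arm64` and the `DRY_RUN` switch are made up for illustration and are not part of the repository:

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Cross-compile for Apple Silicon from an Intel Mac (hypothetical helper).
cross_build_arm64() {
  # Install the Rust standard library for the target if it is missing.
  run rustup target add aarch64-apple-darwin || return 1
  # Use SDKROOT if it is already set; otherwise ask Xcode for the macOS SDK.
  sdk=${SDKROOT:-$(xcrun --sdk macosx --show-sdk-path)} || return 1
  run env SDKROOT="$sdk" cargo build --all --release --target aarch64-apple-darwin
}
```

Running `DRY_RUN=1 SDKROOT=/some/sdk cross_build_arm64` only prints the two commands, which is a cheap way to check them before a real build.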

Did you install Xcode Beta?

My macOS SDK doesn't support arm64. I'm getting an "architecture not supported" error.
I don't have access to Xcode Beta, so I cannot try this out at the moment.

https://stackoverflow.com/questions/64313634/xcode-12-compiling-for-macos-arm64-arch

3c1u commented

Nope, I'm using the latest stable Xcode 12.3 (12C33) on macOS Big Sur 11.1.

Ah ok, I misread as 10.11. Yeah, Big Sur's SDK supports arm64 arch.

I'll check this again when I have access to an SDK supporting arm64.

and emacs crate with updated bindgen (libclang fails to load somehow otherwise)

Does it work if you delete Cargo.lock instead?

3c1u commented

Deleting Cargo.lock worked for cross-compilation on x86_64. I also tried on an Apple Silicon Mac, where libclang fails to load, and deleting Cargo.lock does not fix that.

I'm struggling with how to build this on my M1 Mac. I seem to be stuck on resolving dependencies, the earliest error being:

   Compiling emacs_module v0.12.0
The following warnings were emitted during compilation:

warning: couldn't execute `llvm-config --prefix` (error: No such file or directory (os error 2))
warning: set the LLVM_CONFIG_PATH environment variable to a valid `llvm-config` executable

error: failed to run custom build command for `emacs_module v0.12.0`

Caused by:
  process didn't exit successfully: `/Users/theodor/Git/emacs-tree-sitter/target/debug/build/emacs_module-e1ade98717ceb887/build-script-build` (exit code: 101)
  --- stdout
  cargo:warning=couldn't execute `llvm-config --prefix` (error: No such file or directory (os error 2))
  cargo:warning=set the LLVM_CONFIG_PATH environment variable to a valid `llvm-config` executable

  --- stderr
  thread 'main' panicked at 'libclang error; possible causes include:
  - Invalid flag syntax
  - Unrecognized flags
  - Invalid flag arguments
  - File I/O errors
  If you encounter an error missing from this list, please file an issue or a PR!', /Users/theodor/.cargo/registry/src/github.com-1ecc6299db9ec823/bindgen-0.51.1/src/ir/context.rs:571:15
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

With some help from someone more knowledgeable in Rust, I can try to build this.

@theothornhill Seems like you need to install llvm. It's needed to generate the Rust binding for emacs-module.h
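The warning in the log asks for `LLVM_CONFIG_PATH` to point at a valid `llvm-config` executable. A minimal sketch of locating one, assuming LLVM was installed through Homebrew (which does not put `llvm-config` on the PATH by default). The helper name `find_llvm_config` is hypothetical, and this addresses only the warning; the bindgen version may still matter:

```shell
# Print the first executable llvm-config found among the given directories.
find_llvm_config() {
  for dir in "$@"; do
    if [ -x "$dir/llvm-config" ]; then
      printf '%s\n' "$dir/llvm-config"
      return 0
    fi
  done
  return 1
}

# Example (assumes Homebrew; not run here):
#   brew install llvm
#   export LLVM_CONFIG_PATH="$(find_llvm_config "$(brew --prefix llvm)/bin")"
```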

Yeah, you're right. emacs_module fails on bindgen-0.51, as mentioned elsewhere.

I think I managed to compile it for M1 now.

Running file on the created dylib returns:

tsc-dyn.dylib: Mach-O 64-bit dynamically linked shared library arm64

However, now I'm stuck at compiling rust.dylib.
Running make ensure/rust fails with:

Debugger entered--Lisp error: (tsc-lang-load-failed "dlopen(/Users/theodor/Git/emacs-tree-sitter/langs/...")
  tsc--load-language("/Users/theodor/Git/emacs-tree-sitter/langs/bin/rus..." "tree_sitter_rust" rust)
  tree-sitter-load(rust nil nil)
  tree-sitter-require(rust)
  (condition-case nil (tree-sitter-require lang-symbol) (error (display-warning 'tree-sitter-langs (format "Could not load grammar for `%s', trying to compile..." lang-symbol)) (tree-sitter-langs-compile lang-symbol) (tree-sitter-require lang-symbol)))
  (unwind-protect (condition-case nil (tree-sitter-require lang-symbol) (error (display-warning 'tree-sitter-langs (format "Could not load grammar for `%s', trying to compile..." lang-symbol)) (tree-sitter-langs-compile lang-symbol) (tree-sitter-require lang-symbol))) (tree-sitter-langs--copy-query lang-symbol))
  tree-sitter-langs-ensure(rust)
  (progn (require 'tree-sitter-langs) (tree-sitter-langs-ensure 'rust))
  eval((progn (require 'tree-sitter-langs) (tree-sitter-langs-ensure 'rust)) t)
  command-line-1(("--directory" "/Users/theodor/Git/emacs-tree-sitter/core" "--directory" "/Users/theodor/Git/emacs-tree-sitter/lisp" "--directory" "/Users/theodor/Git/emacs-tree-sitter/langs" "--eval" "(progn (require 'tree-sitter-langs) (tree-sitter-l..."))
  command-line()
  normal-top-level()

This happens after it has run tree-sitter test. I've tried it both from a normal terminal and from a shell inside Emacs, with load-path explicitly set before running the command. Any tips on how to proceed?

I'm seeing something similar. I built tsc-dyn.dylib but I see the following when running make ensure/python:

$ EMACS=/Applications/Emacs.app/Contents/MacOS/Emacs make ensure/python
[...]
Debugger entered--Lisp error: (tsc-lang-abi-too-new 13 (9 . 12) "/Users/ddavis/software/repos/emacs-tree-sitter/lan...")
  tsc--load-language("/Users/ddavis/software/repos/emacs-tree-sitter/lan..." "tree_sitter_python" python)
  (let ((language (tsc--load-language full-path native-symbol-name lang-symbol))) (let* ((key lang-symbol)) (condition-case nil (with-no-warnings (map-put! tree-sitter-languages key language nil)) (map-not-inplace (setq tree-sitter-languages (map-insert tree-sitter-languages key language))))) language)
  (let* ((lang-name (symbol-name lang-symbol)) (fallback-name (replace-regexp-in-string "-" "_" lang-name)) (native-symbol-name (or native-symbol-name (format "tree_sitter_%s" fallback-name))) (files (if file (list file) (cons lang-name (if (string= lang-name fallback-name) nil (list fallback-name))))) (full-path (seq-some #'(lambda (base-name) (locate-file base-name tree-sitter-load-path tree-sitter-load-suffixes)) files))) (if full-path nil (error "Cannot find shared library for language: %S" lang-symbol)) (let ((language (tsc--load-language full-path native-symbol-name lang-symbol))) (let* ((key lang-symbol)) (condition-case nil (with-no-warnings (map-put! tree-sitter-languages key language nil)) (map-not-inplace (setq tree-sitter-languages (map-insert tree-sitter-languages key language))))) language))
  tree-sitter-load(python nil nil)
  (or (alist-get lang-symbol tree-sitter-languages) (tree-sitter-load lang-symbol file native-symbol-name))
  tree-sitter-require(python)
  (condition-case nil (tree-sitter-require lang-symbol) (error (display-warning 'tree-sitter-langs (format "Could not load grammar for `%s', trying to compile..." lang-symbol)) (tree-sitter-langs-compile lang-symbol) (tree-sitter-require lang-symbol)))
  (unwind-protect (condition-case nil (tree-sitter-require lang-symbol) (error (display-warning 'tree-sitter-langs (format "Could not load grammar for `%s', trying to compile..." lang-symbol)) (tree-sitter-langs-compile lang-symbol) (tree-sitter-require lang-symbol))) (tree-sitter-langs--copy-query lang-symbol))
  tree-sitter-langs-ensure(python)
  (progn (require 'tree-sitter-langs) (tree-sitter-langs-ensure 'python))
  eval((progn (require 'tree-sitter-langs) (tree-sitter-langs-ensure 'python)) t)
  command-line-1(("--directory" "/Users/ddavis/software/repos/emacs-tree-sitter/cor..." "--directory" "/Users/ddavis/software/repos/emacs-tree-sitter/lis..." "--directory" "/Users/ddavis/software/repos/emacs-tree-sitter/lan..." "--eval" "(progn (require 'tree-sitter-langs) (tree-sitter-l..."))
  command-line()
  normal-top-level()

make: *** [ensure/python] Error 255

Debugger entered--Lisp error: (tsc-lang-load-failed "dlopen(/Users/theodor/Git/emacs-tree-sitter/langs/...")

Emacs collapses the error message. You can click on the dots ... to expand it. You can also modify/advise tree-sitter-load to catch that signal and print the full error data.

Debugger entered--Lisp error: (tsc-lang-abi-too-new 13 (9 . 12) "/Users/ddavis/software/repos/emacs-tree-sitter/lan...")

This is a different error. It means that the ABI version of your python.dylib is too new (13, while the error data gives the supported range as 9 to 12). It was probably compiled by tree-sitter CLI version 0.19. If you compile it yourself, you can try installing an older version of the CLI, e.g. 0.18.
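A generated parser.c records its ABI version in a `#define LANGUAGE_VERSION` line, so the mismatch can be checked before loading the grammar in Emacs. A minimal sketch; the helper names `grammar_abi` and `abi_supported` are hypothetical, and the range 9 to 12 is taken from the error data above:

```shell
# Print the ABI version recorded in a generated parser.c.
grammar_abi() {
  awk '$1 == "#define" && $2 == "LANGUAGE_VERSION" { print $3; exit }' "$1"
}

# Succeed if the ABI in parser.c ($1) lies within [min ($2), max ($3)].
abi_supported() {
  abi=$(grammar_abi "$1")
  [ -n "$abi" ] && [ "$abi" -ge "$2" ] && [ "$abi" -le "$3" ]
}

# Example (not run here):
#   abi_supported src/parser.c 9 12 || echo "regenerate with an older tree-sitter CLI"
```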

Compiling the tree-sitter CLI at the 0.18.0 tag indeed worked. Thanks!

tsujp commented

I don't know the ins and outs of tree-sitter or elisp intimately, as I'm still new to both, but I played around with all this for an hour or two and got as far as manually compiling a parser for arm64 (using gcc, not clang). Emacs segfaults when that's done, though.

I've looked at the source for emacs-tree-sitter as well as tree-sitter-langs and tree-sitter-c-sharp, and as far as I understand it the parser.c is generated from here. Where this is compiled into a binary, I don't know. I searched all 3 repos for "clang" and "ld" (which are the tools being used; I mucked around a bit and ended up getting the output below). This was the result after I managed to feed an arm64 parser.so into it. So I know roughly where in the pipeline this is happening, but there is no mention of "clang" or "ld" anywhere in the codebase, so I don't know what to edit :(

Parser compilation failed.
Stdout: 
Stderr: Undefined symbols for architecture x86_64:
  "_tree_sitter_c_sharp_external_scanner_create", referenced from:
      _tree_sitter_c_sharp.language in parser-fc6bff.o
  "_tree_sitter_c_sharp_external_scanner_deserialize", referenced from:
      _tree_sitter_c_sharp.language in parser-fc6bff.o
  "_tree_sitter_c_sharp_external_scanner_destroy", referenced from:
      _tree_sitter_c_sharp.language in parser-fc6bff.o
  "_tree_sitter_c_sharp_external_scanner_scan", referenced from:
      _tree_sitter_c_sharp.language in parser-fc6bff.o
  "_tree_sitter_c_sharp_external_scanner_serialize", referenced from:
      _tree_sitter_c_sharp.language in parser-fc6bff.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Error calling (tree-sitter test), exit code is 1
make: *** [Makefile:13: ensure/c-sharp] Error 255

So I've hit a dead end.

Emacs collapses the error message. You can click on the dots ... to expand it. You can also modify/advise tree-sitter-load to catch that signal and print the full error data.

The error was just that c-sharp.dylib was built for the wrong architecture. I tried to manually run tree-sitter generate.

Now I got it to work using this recipe:

Steps to reproduce:

  1. clone this repo
  2. cd /langs
  3. git submodule update --checkout
  4. cd /repos/c-sharp
  5. tree-sitter generate
  6. gcc -o parser.so --shared src/parser.c src/scanner.c -I./src
  7. mv parser.so /path/to/langs/bin/c-sharp.dylib
  8. cd /emacs-tree-sitter-root
  9. make ensure/c-sharp
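Step 6 of the recipe links scanner.c together with parser.c. Undefined `external_scanner` symbols like those in the earlier linker log are what a link without the scanner looks like, though whether that was the cause there is a guess. A minimal sketch that collects the sources before compiling; the helper names `grammar_c_sources` and `build_grammar` are hypothetical, and grammars whose scanner is written in C++ (scanner.cc) are not handled:

```shell
# List the C sources of a grammar directory ($1): parser.c, plus scanner.c if present.
grammar_c_sources() {
  src="$1/src"
  [ -f "$src/parser.c" ] || return 1
  printf '%s\n' "$src/parser.c"
  if [ -f "$src/scanner.c" ]; then
    printf '%s\n' "$src/scanner.c"
  fi
}

# Compile the grammar directory ($1) into a shared library at $2.
build_grammar() {
  srcs=$(grammar_c_sources "$1") || return 1
  # $srcs is unquoted on purpose so each path becomes its own argument;
  # this breaks on paths that contain whitespace.
  "${CC:-cc}" -shared -fPIC -I "$1/src" $srcs -o "$2"
}

# Example (not run here):
#   build_grammar repos/c-sharp bin/c-sharp.dylib
```

Setting `CC=echo` prints the compiler arguments instead of compiling, which makes it easy to confirm that the scanner is on the command line.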

What remains from here for M1 support in MELPA? It is possible to generate bindings just fine, it seems :)

Now I got it to work using this recipe:

Awesome!

6. gcc -o parser.so --shared src/parser.c src/scanner.c -I./src

That's weird. IIRC, this is what tree-sitter test does underneath, if the shared lib was not compiled and put in place (path/to/langs/bin/c_sharp.so)

What remains from here for M1 support in MELPA? It is possible to generate bindings just fine, it seems :)

We'd need to update CI jobs to build M1 binaries, and update the code (tsc-dyn-get and tree-sitter-langs-build) to handle both M1 and Intel chips.

However, since GitHub Actions and Azure Pipelines don't support M1 at the moment, what we can do for now is update the docs with M1 installation instructions.

edit: 2021-08-05 - I had to update my build process to handle the fact that emacs-tree-sitter doesn't work with the trunk version of tree-sitter (v0.20.0 at time of writing) - so now I explicitly pull v0.19.5 and also more carefully deal with some intermediate artifacts

That's weird. IIRC, this is what tree-sitter test does underneath, if the shared lib was not compiled and put in place (path/to/langs/bin/c_sharp.so)

I just went through getting emacs-tree-sitter set up on an M1 laptop, and the tree-sitter I installed through npm install --global tree-sitter-cli was the x86_64 version. Similarly, cargo install tree-sitter got an (old and) x86_64 executable. Either of them, when running tree-sitter test, would invoke the x86_64 version of the compiler and create an x86_64 dylib, rather than an arm64 one. I had to clone the tree-sitter repo and then run cd cli && cargo install --path . to install an up-to-date version built for the correct architecture.
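Since an x86_64 tree-sitter CLI silently produces x86_64 grammars, it may help to scan the built libraries with `file` afterwards. A minimal sketch that relies on the single-architecture output format quoted earlier in this thread (universal binaries are not handled); the helper names `file_output_has_arch` and `report_wrong_arch` are hypothetical:

```shell
# Succeed if a line of `file` output ($1) ends with the architecture name $2.
file_output_has_arch() {
  [ "$(printf '%s\n' "$1" | awk '{ print $NF }')" = "$2" ]
}

# Print every .dylib in directory $1 that is not built for architecture $2.
report_wrong_arch() {
  dir=$1 want=$2
  for lib in "$dir"/*.dylib; do
    # Skip the unexpanded pattern when the directory has no .dylib files.
    [ -e "$lib" ] || continue
    file_output_has_arch "$(file "$lib")" "$want" || printf 'wrong arch: %s\n' "$lib"
  done
}

# Example (not run here):
#   report_wrong_arch langs/bin arm64
```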

tl;dr: for any other M1 users treading this path:

I had to brew install cask llvm node rust for dependencies.
My full use-package + straight.el (note that I have straight-use-package-by-default set to true) invocation looks like this, though I can't guarantee at all that this will work for you, reader:

(use-package tsc
  :straight `(:pre-build ,(when (and (memq window-system '(mac ns))
                                     (string-match-p (rx string-start "arm-")
                                                     system-configuration))
                            (unless (and
                                     ;; required for tree-sitter
                                     (executable-find "cargo")
                                     ;; required for building bindings
                                     (executable-find "cask")
                                     (executable-find "git")
                                     ;; required for tree-sitter to generate
                                     (executable-find "npm")
                                     ;; required for bindings
                                     (executable-find "llvm-gcc"))
                              (warn "tree-sitter build will fail"))
                            (setf lyn--self-compiled-tsc t)
                            ;; get tree-sitter v0.19.5 - last to put files in a reasonable place
                            '(("sh" "-c" "test -d rust-tree-sitter || git clone https://github.com/tree-sitter/tree-sitter rust-tree-sitter; cd rust-tree-sitter && git fetch && git checkout v0.19.5")
                              ("sh" "-c" "cd rust-tree-sitter/cli && cargo install --path .")
                              ;; needed or it will download x86_64 dylibs over the arm64 ones we just built
                              ("sh" "-c" "file core/tsc-dyn.dylib | grep -q arm64 || rm -f core/tsc-dyn.dylib")
                              ("sh" "-c" "grep -q LOCAL core/DYN-VERSION || printf LOCAL >core/DYN-VERSION")
                              ("sh" "-c" "grep -q DYN-VERSION bin/build && sed -e '/DYN-VERSION/d' bin/build >bin/build.tmp && mv bin/build.tmp bin/build && chmod +x bin/build || :")
                              ;; rebuild bindings
                              ("sh" "-c" "EMACS=emacs ./bin/setup && EMACS=emacs ./bin/build")
                              ;; ensure all language definitions
                              ("find" "langs/repos" "-type" "f" "-name" "grammar.js" "-not" "-path" "*/node_modules/*" "-not" "-path" "*/ocaml/interface/*" "-exec" "sh" "-c" "targets=''; for grammar_file in \"$@\"; do grammar_dir=\"${grammar_file%/*}\"; targets=\"$targets ensure/${grammar_dir##*/}\"; done; EMACS=emacs make -j7 $targets" "sh" "{}" "+")))
                         :files ("core/DYN-VERSION" "core/tsc-dyn.*" "core/*.el")))
(use-package tree-sitter
  :commands (tree-sitter-hl-mode))
(use-package tree-sitter-langs
  ;; Don't clone the separate tree-sitter-langs repo, use the dylibs we
  ;; already built
  :straight (:host github :repo "ubolonton/emacs-tree-sitter"
             :files ("langs/*.el" ("bin" "langs/bin/*.dylib") ("queries" "langs/queries/*")))
  :after tree-sitter
  ;; If this isn't set then it'll download x86_64 dylibs over the arm64
  ;; dylibs we built
  :init (setf tree-sitter-langs--testing lyn--self-compiled-tsc))

Still can't get it to work on my M1. @lynlevenick Could you please write a short walkthrough on how to do it manually?

Does anyone have instructions on how to compile it manually?

Any update?

The binaries are available starting from release 0.16.1.

Notes:

  • The downloading code hasn't been updated, so you will have to manually download tsc-dyn.aarch64-apple-darwin.dylib, rename it to tsc-dyn.dylib, and put it next to tsc.el.
  • It has not been tested. (Neither I nor GitHub Actions has M1).
  • tree-sitter-langs hasn't provided M1 binaries yet.
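The manual step from the first note can be scripted. A minimal sketch, assuming the release asset has already been downloaded; the helper name `install_tsc_dyn` is hypothetical, and the directory that holds tsc.el depends on your package manager:

```shell
# Copy a downloaded tsc-dyn binary ($1) into the directory that holds tsc.el ($2),
# renaming it to tsc-dyn.dylib as described in the notes above.
install_tsc_dyn() {
  downloaded=$1 tsc_dir=$2
  [ -f "$tsc_dir/tsc.el" ] || { echo "no tsc.el in $tsc_dir" >&2; return 1; }
  cp "$downloaded" "$tsc_dir/tsc-dyn.dylib"
}

# Example (paths are placeholders; not run here):
#   install_tsc_dyn ~/Downloads/tsc-dyn.aarch64-apple-darwin.dylib /path/to/tsc
```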

Pre-compiled grammars for Apple Silicon are available starting from tree-sitter-langs 0.10.13.

Notes:

  • The downloading code hasn't been updated, so you will have to manually download tree-sitter-grammars.aarch64-apple-darwin.v0.10.13.tar.gz, and extract the contents into (tree-sitter-langs--bin-dir).
  • They have not been tested. (Neither I nor GitHub Actions has M1).
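Extracting the grammar bundle can be sketched the same way. This assumes the tarball has already been downloaded and that you have looked up the target directory by evaluating (tree-sitter-langs--bin-dir) in Emacs; the helper name `extract_grammars` is hypothetical:

```shell
# Unpack a grammar tarball ($1) into the grammar directory ($2), creating it if needed.
extract_grammars() {
  tarball=$1 bin_dir=$2
  mkdir -p "$bin_dir" && tar -xzf "$tarball" -C "$bin_dir"
}

# Example (the target path is a placeholder; not run here):
#   extract_grammars tree-sitter-grammars.aarch64-apple-darwin.v0.10.13.tar.gz /path/to/bin-dir
```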

Just reporting that I tried these now (hadn't been using Emacs on my M1 in a bit) and it seems to work fine so far.

@timlod what version of Emacs are you using? I'm using emacs-mac with use-package and am getting the error

Error (use-package): tree-sitter/:catch: Cannot open load file: No such file or directory, tsc-dyn
Error (use-package): tree-sitter-langs/:catch: Cannot open load file: No such file or directory, tsc-dyn

I tried using straight-use-package with the same result.

See the last two messages by ubolonton: you still need to manually download the files and extract them into the tree-sitter directories.

Thank you @timlod