Rewrite C dependency in Rust?
LegNeato opened this issue ยท 22 comments
Less of an issue, more questions:
Are there any plans to get rid of the C dependency? How much work do you think it would be? Cmake dep is a huge bummer It would also be nice to use const rather than all the generated tables.
Additionally, as you have the most relevant experience to answer this... the h3 api is very C-shaped with a lot of freestanding functions...if you were to write an idiomatic rust lib from the start, would the API be similar to h3ron or would you do things differently?
There are no such plans currently, although it would be very helpful to not require cmake and it would also resolve the issues people having compiling for WASM (#21) and on blockchain platforms (#30).
The main reasons for sticking with the C dependency for me would be:
- Its the buildsystem supported upstream.
- There is a lot of work done in the C library including an increasing amount of test coverage. I am not able to put this much effort into this library - a project I am mostly doing in my spare time.
- Currently uber/h3 is preparing version 4 (uber/h3#573), this will bring quite a few changes which h3ron needs to accommodate for. Thats reasonably easy when compiling in the upstream implementation, but would again be a lot of effort for keeping up with when maintaining a rewrite.
That being said, I would also be in favour of a rust-only crate. What may be worth exploring would be experimenting with tools like c2rust on how good the translation from C to rust is, and how much manual effort has to be put in when updating to newer h3 versions. Everything would be unsafe of course, but the current API wrappers can simply continue to exist. Those are a better fit for a rust-centered API than the single functions.
It also would be feasible to compile the C library using the cc
crate by replicating a few build steps in the h3ron-h3-sys crate. That would remove the cmake requirement, but would be of little help for the WASM and blockchain issues. So it may not be worth it.
I'll have a lot of time on my hands to devote to open source soon and I'd love to contribute to a pure rust port. Would you want it as part of this project or would you rather me work elsewhere and then potentially combine once I have something to show?
Naively, the c code doesn't look like it would be hard to port or at least keep the C api for tests.
Wow - sounds great. Just go ahead when you like and start working on a fork of this repo.
Somehow I like the idea of solution where the user of this crate can decide which implementation to use, the rust-port or the C library. That could be accomplished using a feature gate to switch between the h3ron-h3-sys
crate and the rust port. The rust port could maybe be a part of the h3ron
crate itself.
cc'ing @dfellis to this discussion as we've had some discussions about possibly rewriting the C library in Rust. Our primary goal with H3 being written in C was the portability of binding the core library to other languages - JavaScript, Python, Java, etc. which we felt C was the best option for. If bindings could be made to a Rust core library from these other languages, which I think could be the case, it may obviate the need for the C core library.
Agreed with @isaacbrodsky. A combination of DGGRID being C and Rust being much less mature 6-7 years ago is why we went with C.
We intentionally went with a memory ownership model where the callers of H3 provide the memory blocks to the library that then relinquish control back at the end to make integration with VMs simpler and roughly modeled that on Rust, so it has always been in the back of my mind.
I am confused by the statement that WASM support is hindered by the code being in C, however. Emscripten is designed with C as the first class citizen, and we only haven't done a WASM port due to personal time and API breakage with h3-js. It should be pretty straightforward to make a WASM port and re-expose the C functions for other WASM projects and minor tweaks to h3-js for JS exposure. (Worst case you can use WASM's ability to call JS code to bridge in h3-wasm (or h3-js right now).)
Where the code should live is a different discussion. I personally would like it in the H3 repo, but how that affects 4.0 is something to consider. It could be another barrier to upgrading if the code base is fully rewritten instead of what we have been doing. Perhaps it could live in a crate within the H3 repo and we write compatibility testing and fuzzing to confirm equivalent behavior and switch over in some 4.x release?
It would definitely be a lot easier to write the CLI executable in Rust, along with the advantages of a real dependency management system and the improved type safety.
Things are evolving quickly here.
I for my part would be very much in favour of an official rust crate.
The concept of having H3 only work on pre-allocated memory blocks is quite good.
As C i pretty much the optimum for creating bindings to other languages, the situation in rust may be a bit worse, although certainly not bad:
- The rust implementation certainly could provide the same C-API the C implementation does. That also could be the way to stay compatible with most of the current H3 ecosystem.
- For python bindings there is pyo3 and rust-numpy which works very well
- JS is somewhat covered by compiling to WASM
- There are crates for interfacing with the JVM - I never used one of those.
- extendr for R
- rustler for the erlang VM
- ... certainly more.
@dfellis The WASM issue is caused that - at least what I found when doing a little search - the ABI created when compiling C to WASM appears to be incompatible to the ABI WASM emitted by rust expects. But I do not know many more details on this, there are a few links in #21. The way through JS is just less straight forward and also less comfortable, and brings in some additional communication overhead.
Pre-allocated blocks are good in Rust for no_std
support, which would be desirable from the Rust side anyway. I've personally never written a no_std
rust lib though, just used them.
I am confused by the statement that WASM support is hindered by the code being in C, however
I've recently been building a rust-wasm library, so I think I can add a little context here. The issue is that there are two separate WASM targets from rust. There's wasm32-unknown-emscripten
and wasm32-unknown-unknown
. The vast majority of the rust-wasm ecosystem, projects like wasm-bindgen
, js-sys
, and wasm-pack
are built around wasm32-unknown-unknown
. As I understand it, the wasm32-unknown-unknown
target generally creates smaller bundles but does not have a default C toolchain. Generally any pure-rust project is able to compile to wasm32-unknown-unknown
without changes. Meanwhile I haven't seen any easy-to-use tooling made to work with wasm32-unknown-emscripten
.
There is a bit more context in rustwasm/team#291, but it seems like the core maintainers barely have time to keep up with wasm-bindgen
as it is.
Last time I tried, I got lots of compilation errors when trying to compile h3-sys
to wasm32-unknown-unknown
(e.g. with wasm-pack build
). I think it worked out of the box with cargo build --target wasm32-unknown-emscripten
, but the issue is that only builds the wasm bundle (technically a shared object in the target
dir?). I don't know how you'd go from that to actually using it from JS.
Thank you for this detailed writeup of the C + WASM situation @kylebarron
Ok, I haven't had time to try to port this "for real", but I was able to run c2rust
on h3 and then manually tweaked it:
https://github.com/LegNeato/unsafe-h3lib
All C-based tests pass except for:
The following tests FAILED:
91 - testH3Api_test91 (Failed)
95 - testPolygonToCells_test95 (Failed)
99 - testLatLng_test99 (Failed)
This is likely because I couldn't use f128
as rust has no built-in support and the f128
crate on crates.io only works on linux.
I think @isaacbrodsky may have started hacking on an actual rust port
Sweet!
Interesting. @LegNeato Seems you had to do quite a few manual fixes to reach the working state, so c2rust may not be too great when the source needs to be updated periodically. In general, how would you rate the c2rust experience?
I am happy to hear the actual port of H3 to rust really appears to be in consideration.
It actually wasn't too bad to get it working, took me a couple of hours of actual work. It would have been quicker if my find and replace regex skills were better. The main changes / patterns were:
- Formatting
- Replacing
f128
. I just made thesef64
, but I should probably change torust_decimal
with thec_repr
feature. This is probably the source of the test failures. - Adding the various
Cargo.toml
files. - Adding the various
lib.rs
files. This was needed as the app binaires aren't as standalone as they could be. - Silencing warnings, including things like using
PI
consts and unnecessarymut
. - Fixing clippy lints
- There was a c2rust bug in some of the tests I needed to work around by copying an informational output string. Oddly, it got it right in other cases so I need to investigate.
I also did a hacky change to cmake in the h3
project to call out to the Rust binaries so I could leverage the project's testing infra rather than rewrite them in Rust.
In hindsight, I shouldn't have bothered with any clippy or warning stuff and should have just stuck with changes that mattered so it would have been easier to forward-port. I actually thought it would be harder / c2rust
would be worse so was just kinda exploring and then it worked ๐ .
I may take another crack at it tomorrow and see if I can't do it in such a way where it could be kept in lock-step with h3
. That seems like a better use of time for a couple of hours work than rewriting in Rust right now (though I would love to contribute to that effort some day). Note that it would take more work to have the generated code work under the target wasm32-unknown-unknown
as the generated code relies on libc
(I think? it didn't immediately build but I haven't done wasm stuff before).
Done ๐
Great work!
Looks very complete and wonderful it is already available on crates.io. I will put a link in the README of this repository here to your project. Your implementations solves many issues people are having ๐
Thanks!
Yup, in term of completeness I'm on par with the current H3 release (4.0). My C binding even pass their test suites.
And clearly, this will make integration in Rust projects smoother (it was one of the goal), while giving a nice perf boost.
I've also made an announcement on Reddit for the visibility, linking to an article that goes into more detail (for those who want to know more).
Thanks for the link. The performance improvements described in the article look quite impressive.
I guess I will need to find some time to port some of my stuff to your library soon ๐
Haha thanks, I'm glad to hear that.
Don't hesitate to reach out if you need help (I tried to document it as best as I could, but external/fresh eyes are more than welcome), or have feedback on the API (rough edges, improvements, etc.)
@grim7reaper Looks amazing!
Exciting!
I am closing this issue here now. I linked to h3o from the main README of this repository.