apache/tvm

[RFC] Rust Support

tqchen opened this issue ยท 21 comments

There are two parts of the things that can be done in rust:

  • Rust "frontend" as in java/js, that wraps the current C API and expose a rust programming model
    • This allows most users to use tvm and its runtime(gpu/vulkan support etc)
    • Need to link runtime(by c++, or "backend" in next part)
  • Rust "backend" which serves as an alternative to current C++ runtime.
    • It gives a standalone WASM module that even cpp does not yet offer(due to stdc++)
    • Security stuffs e.g. SGX

Both have values and I think we should have a good discussion on how can we organize them.

  • #1597 is mainly about "backend"
  • @ehsanmok has implemented a version of "frontend"

cc @jroesch @nhynes @ehsanmok

From an organizational perspective, we can have

tvm/rust/
    Cargo.toml  # a virtual manifest
    frontend/  # what @ehsanmok has been developing
        Cargo.toml
        src/
    runtime/  # what's currently in /tvm/rust/runtime
         Cargo.toml

the runtime/ would continue to be the tvm package on crates.io and the frontend could be named elsewise.

I agree @nhynes and I'd love to hear your feedbacks of my frontend when is completed as this is my first major Rust project and still learning a lot. The same to @jroesch as well :)

Please also comment on official code-styles that we should generally follow in rust package

Please also comment on official code-styles that we should generally follow in rust package

The .rustfmt.toml currently used in #1597 is the default produced by rustup --dump-config for rustfmt version 0.6.1. We should just use the default unless there's a good reason not to.

I would be happy to jump in and do some Rust hacking in the future with you guys, but am pretty busy with some new TVM stuff @tqchen and I have been working on. I'm happy to provide feedback and code review though, until I have a few more cycles available. Current incremental plan to getting started with Rust sounds good to me.

rust runtime is merged in #1597

Here are a few next steps:

regarding docs, the conventional approach is to simply upload each release of the tvm crate to crates.io which will automatically build the docs and post them on docs.rs

OK, we can do that then if that is the conventional approach of rust

Rust CI is up and running

@tqchen Sorry, I was quite detached for some time and trying to finish it as soon as I can in my free time. I'm working on ergonimic registration of callbacks in Rust, when that's done I'd probably need help for the RPC as looking at the python and java RPC it's seems they have been sort of done from scratch!

RPC [...] seems they have been sort of done from scratch!

Now that we're on the topic, might it be a good time to start experimenting with a more established framework like grpc? There's even a nice crate for it!

I may be completely incorrect in saying this, but I think that many of the roll-your-own components are a result of C++ having a terrible package ecosystem. For instance, the dmlc-core json parser, the memory stream reader used for params serdes, the thread-local store. In Rust, these become serde, nom, and thread_local!, respectively.

In some version of reality, the TVM compiler is rewritten in Rust (with clear separation from the python components!) and is much more maintainable because of it.

We roll out our own RPC because the requirement is somewhat different from grpc. We need a lightweight stateful session that persists over multiple function calls, it also helps porting into devices like android.

So there is a reason for rolling out the rpc module. I agree that json parser and thread local are somewhat unnecessary(thread_local is already supported in c++11 and it is there only for old compilers)

On the topic of TVM compiler in Rust, I have seriously thought about it and discussed with @jroesch on possibilities.

My current take is that the things would be fine if everything is in Rust, but that is not what we really want. We want something that can interop (i.e. a good runtime system) and still enables writing pass in python, things might break in that land because of that ventures into a new territory. We will need a clear C data structure, and rust wrappers that gives you the safety guarantee you want.

In the most ideal world, we want the following thing:

  • A core data structure, in C that can be shared across c++/rust/python
  • A primary language(currently c++, maybe it could be rust), which serves the task of scaffolding.
  • Patchable pass modules that can be in any language and can be attached to the project

@nhynes right! I already looked a little into tarpc and jsonrpc to get some ideas. Given

We need a lightweight stateful session that persists over multiple function calls

I cannot find a Rust lib that we can leverage so probably should write our own! Do you know such a lib and what do you suggest?

@tqchen I like the most ideal case, but since

We want something that can interop (i.e. a good runtime system) and still enables writing pass in python, things might break in that land because of that ventures into a new territory

is a major requirement, maybe I'm wrong but I don't think using some promising Rust python bindings such as pyo3 can handle that easily yet!

Of {tarpc, jsonrpc}, I'd definitely go with not jsonrpc. The reason jsonrpc exists is to serve web3 clients on the ethereum blockchain (which is what paritytech is all about). They don't necessarily care about this use case. tarpc doesn't look bad and might work with some lazy_static!s, but it also looks a bit more complicated than necessary. Whatever you'd find productive, I guess--as long as it's your rust lib on both ends; interop with existing TVM RPC is probably better, though.

the most ideal case

rust-python programming isn't bad if there's a C repr. Unfortunately, that's the really hard part since it'd need yet another IR, as the HalideIR is also C++.

The real issue I have with python-c++ is that even if the ops are written in C++, the schedules and patterns are still in python!

I won't worry about the IR, but what we really want is a node system(like the one we had in tvm::Node), once we have that, we can always build an IR around

Hopefully before the end of this week, I'll complete writing the docs and be ready for PR. In the meantime, you can checkout the resnet example.

@nhynes I'd like to know in the proposed setup

tvm/rust/
    Cargo.toml  # a virtual manifest
    frontend/ 
        Cargo.toml
        src/
    runtime/  # what's currently in /tvm/rust/
         Cargo.toml
        src/
    tvm-sys  # <-- Raw extern C API generated from bindgen
        Cargo.toml
        src/

shall we have a common, unified tvm-sys (matching the released version) with bindgen so that both runtime and frontend can import?

Also for the .rustfmt.toml I'd highly suggest to use the official and very common tab_spaces = 4 instead of tab_spaces = 2 for the whole rust repo.

@tqchen Btw I don't want to go through impl of RPC as I haven't found a good Rust option satisfying the requirements yet. It needs separate work after the frontend is merged as well as contrib graph runtime support.

common, unified tvm-sys

Unfortunately, no because bindgen doesn't work when the target is exotic (e.g., wasm or sgx). That's why the headers are checked in as code. You're welcome to keep it as a top-level crate, though

Let's see how else your implementation looks, though. We might want a top-level common/ crate for shared functionality.

tab_spaces = 4

Two spaces is also common. At Google, 2 is used to make deeply nested code more readable. I'll defer to other community members here, but my strong preference is two spaces since it makes code more readable (esp. for rust where functional patterns encourage deep nesting). Again, let's see how your PR looks before we make a final decision. We should update (and stabilize on) rustfmt 1.0 at the same time, too.

@tqchen @nhynes Frontend is finally here.

@tqchen This can be closed now.