tokio-rs/tokio

Restructure documentation from the ground up

Closed this issue · 14 comments

One of the most frequent pieces of feedback we get about futures and Tokio in Rust is "it's confusing!" While https://tokio.rs has some great content, it doesn't seem to have stood the test of time. It's time we rewrite/restructure our documentation from the ground up. I hope to use this issue to track this work and provide a location to discuss it as well!

@aturon and I talked a bit today about how to start structuring docs, and we came initially to some high-level conclusions:

  • We should assume that async/await syntax will happen "relatively soon". This syntax is such a game changer for futures that we should make liberal use of it in documentation, even if it's unstable. Documentation will have a disclaimer about why it is unstable, links to tracking issues for stability progress, and links to all code examples which work on stable Rust as well. Our hope is that by the time this is done async/await is at least in the standard nightly compilers rather than having to work off a rustc fork.
  • We likely want to not mention tokio-proto/tokio-service at all in the first pass of documentation. The current documentation is so concretely centered around tokio-proto that it has caused quite a bit of confusion.
  • We need to inundate ourselves with examples. We cannot have enough examples. The current documentation suffers quite a bit in this respect.

Other than that, here's the thinking for an outline so far:

Compare and contrast sync and async

  • Goal: (1) async isn’t scary! (2) async gives you super-powers (3) you are already empowered to kick the tires
  • Async programming is actually super easy! Looks just like sync
  • Async is really expressive
    • show off something super easy with futures that’d be a pain with std
      • timeouts
      • multiplexing heterogeneous events
      • incredibly cheap tasks
        • Rayon integration
      • not critical to understand everything, just show some high-level things to get across the “power” point
  • Async is really performant
    • Killer memory usage comparison
    • Killer throughput comparison
    • Both should use relatively simple/easy to understand but quasi-realistic servers. Ideally people can download and play themselves

Behind the curtain: Futures

  • Kernel
    • Basic conceptual model of polling etc
    • The trait
      • fn poll only
    • Task model
  • Combinators
    • How does async/await map down
  • Streams/sinks
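As a rough illustration of the "fn poll only" and task-model bullets above, here is a minimal sketch of the polling model. It uses `std::future::Future` from today's standard library rather than the futures 0.1 trait this thread is about, so the signatures differ from what the docs would show. `Countdown`, `ThreadWaker`, and `block_on` are made-up names for this sketch, not Tokio APIs.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// A future that reports `Pending` a fixed number of times before completing.
struct Countdown {
    remaining: u32,
    polls: u32,
}

impl Future for Countdown {
    /// The total number of times `poll` was called.
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        self.polls += 1;
        if self.remaining == 0 {
            return Poll::Ready(self.polls);
        }
        self.remaining -= 1;
        // Not ready yet: ask the task to poll again. A real I/O future would
        // hand the waker to the reactor and wait for an OS event instead.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Wakes the "task" by unparking the thread that is driving it.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// A toy single-future executor: poll, sleep until woken, poll again.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

fn main() {
    let polls = block_on(Countdown { remaining: 3, polls: 0 });
    println!("completed after {} polls", polls);
}
```

The point of the sketch is that the future never blocks: it either returns a value or says "not yet" and arranges to be polled again.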

I/O with futures

  • Tokio!

There's definitely quite a bit to fill out here! We'll try to do that over time.

Here is my current (rough) outline for docs:

  • "Hello world" tutorial

    • Echo server using tokio-io helpers
  • Foundational knowledge

    • Execution model / non-blocking / event-driven
    • Reactor
    • Tasks (basics)
      • Async type
    • Using a socket
  • AsyncRead / AsyncWrite (using a socket)

    • Reading / writing in more depth
      • bytes crate for buffers
      • Echo server reading / writing to the socket
    • Composing / decorating
      • TLS
    • Shutdown (drop can't block).
  • Futures

    • What is a future
    • Combinators
      • Functional / passing state
    • How futures work on top of tasks
      • Callbacks do not fire immediately!!! unlike other libs
    • Architecting w/ futures
    • Returning futures
    • Timeouts
    • Cancellation
  • Transports (Sink + Stream)

    • Framing concepts
    • Implement transport manually
    • Using transport
      • Split, forward
    • tokio-io codec module
  • More transports

    • Handshake
    • Ping / pong
    • Timeouts
    • Reconnect
  • Organizing an application

    • Tasks to manage resources / message passing (mpsc)
    • Chat server
    • File operations / blocking ops / CPU bound
  • Advanced topics

    • Back pressure
    • Routing / broadcast
  • Cookbook / FAQ?

    • Shutting down the reactor
    • ???
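The outline's note that callbacks do not fire immediately could be shown with something like the sketch below. It assumes today's `std::future::Future` rather than futures 0.1, and `Lazy`, `NoopWaker`, and `callback_timing` are hypothetical names for this example only. The callback runs when the future is polled, not when it is created.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A future that runs its callback the first time it is polled.
struct Lazy<F>(Option<F>);

impl<F, T> Future for Lazy<F>
where
    F: FnOnce() -> T + Unpin,
{
    type Output = T;

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<T> {
        let callback = self.0.take().expect("polled after completion");
        Poll::Ready(callback())
    }
}

/// A waker that does nothing; enough for a future that never returns `Pending`.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Returns whether the callback had fired (after creation, after polling).
fn callback_timing() -> (bool, bool) {
    let fired = Rc::new(Cell::new(false));
    let flag = Rc::clone(&fired);

    // Creating the future does not run the callback.
    let mut fut = Lazy(Some(move || flag.set(true)));
    let fired_after_creation = fired.get();

    // Only polling (normally done by the task's executor) runs it.
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let _ = Pin::new(&mut fut).poll(&mut cx);

    (fired_after_creation, fired.get())
}

fn main() {
    let (before, after) = callback_timing();
    println!("fired after creation: {}, fired after poll: {}", before, after);
}
```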

Re: Async / await, I strongly feel that the documentation should be built off stable Rust by default. Examples can have both stable & async / await versions and docs can reference / explain async / await & link to external resources on it, but the flow should not be based on it until it hits stable.

It should also be plausible to work on the docs incrementally vs. a long lived branch.

Ixrec commented

Take this with a grain of salt since I haven't had an opportunity to use tokio (or any other Rust code) for any serious projects, but I did experience some confusion back when I read the tokio docs "cover to cover". I believe that was primarily because of the counterintuitiveness of the poll method/the Future trait, but also partially because a large portion of it simply wasn't useful information for writing high-level applications. Some concrete suggestions:

  • If the actual Future trait/poll() signature is not something the average tokio user should need to understand (which is the impression I'm getting here), we should make it very obvious when they get to that part of the documentation that this is optional low-level stuff that is not required. The impression the current docs gave me is that understanding poll() is technically not required, but in practice you're gonna have to know it to be productive with any non-toy code, much the same way many niggling C++ details are technically not need-to-know but still get thorough coverage in Effective C++.

  • When we do get to the part that explains poll(), do not start with the signature, or with "low-level" code that talks to futures directly. Such code feels a lot like the definition of a monad to me, i.e. it's so abstract and generic that it's likely to appear totally meaningless unless you're already familiar with it (I feel obligated to link http://thecodelesscode.com/case/143 at this point). Instead, start with a conceptual execution trace of some simple but high-level async I/O program. Like, these methods return a main future that's the combination of futures A, B, C which gets compiled into a state machine like enum { A(...), }, then calling core.run() calls .poll() on the main future, which calls poll() on A, then A says it's ready so we actually call the callback for it, then the core calls .poll() again and this time since A already finished it calls poll() on B which says it's not ready and registers the core to wake up on some OS event rather than busy waiting...and so on. At least, that's how I think a typical tokio program operates at runtime based on the current docs, but I'd really like to know for sure if I've gotten that right or not. Then we reveal the Future trait/poll() method after we've mentioned all the key things that may or may not actually happen at runtime via that trait/method.
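A sketch of the kind of execution trace described above, for a main future that runs A and then B. It is written against today's `std::future::Future`, not futures 0.1 and `core.run()`, and `Leaf`, `AThenB`, `NoopWaker`, and `run_trace` are hypothetical names. Two simplifications to keep in mind: the driver loop here simply polls again instead of sleeping until an OS event wakes it, and the chain moves on to B within the same `poll` call in which A finishes.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type Log = Rc<RefCell<Vec<String>>>;

/// A leaf future that is `Pending` for `waits` polls, then yields `value`.
struct Leaf {
    name: &'static str,
    waits: u32,
    value: u32,
    log: Log,
}

impl Future for Leaf {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.waits > 0 {
            self.waits -= 1;
            self.log.borrow_mut().push(format!("{}: not ready", self.name));
            cx.waker().wake_by_ref(); // a real leaf would register with the reactor
            return Poll::Pending;
        }
        self.log.borrow_mut().push(format!("{}: ready", self.name));
        Poll::Ready(self.value)
    }
}

/// The hand-written state machine for "run A, then B, then add the results".
enum AThenB {
    RunningA { a: Leaf, b: Leaf },
    RunningB { a_out: u32, b: Leaf },
    Done,
}

impl Future for AThenB {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        loop {
            // Take the current state out, leaving `Done` as a placeholder.
            match std::mem::replace(&mut *self, AThenB::Done) {
                AThenB::RunningA { mut a, b } => match Pin::new(&mut a).poll(cx) {
                    Poll::Ready(a_out) => *self = AThenB::RunningB { a_out, b },
                    Poll::Pending => {
                        *self = AThenB::RunningA { a, b };
                        return Poll::Pending;
                    }
                },
                AThenB::RunningB { a_out, mut b } => match Pin::new(&mut b).poll(cx) {
                    Poll::Ready(b_out) => return Poll::Ready(a_out + b_out),
                    Poll::Pending => {
                        *self = AThenB::RunningB { a_out, b };
                        return Poll::Pending;
                    }
                },
                AThenB::Done => panic!("polled after completion"),
            }
        }
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Drives the main future to completion and returns (result, trace).
fn run_trace() -> (u32, Vec<String>) {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let mut main_future = AThenB::RunningA {
        a: Leaf { name: "A", waits: 0, value: 1, log: Rc::clone(&log) },
        b: Leaf { name: "B", waits: 1, value: 2, log: Rc::clone(&log) },
    };
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut round = 0;
    let out = loop {
        round += 1;
        log.borrow_mut().push(format!("core: poll #{}", round));
        if let Poll::Ready(v) = Pin::new(&mut main_future).poll(&mut cx) {
            break v;
        }
    };
    let trace: Vec<String> = log.borrow().clone();
    (out, trace)
}

fn main() {
    let (out, trace) = run_trace();
    for line in &trace {
        println!("{}", line);
    }
    println!("result = {}", out);
}
```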

  • There should probably be very brief but dedicated sections somewhere that specifically address the handful of other major languages with strong async I/O conventions. Otherwise newcomers will have to waste a lot of cognitive energy on questions like "Is this conceptually the same as X from my language or is it a totally separate thing I need to pay close attention to?" For example, JavaScript developers should get told that "reactor::Core is a single-threaded event loop, so when you're using futures (or #[async]/await!()) driven by reactor::Core the semantics are very similar to JavaScript with A+ promises (or ES7 async/await keywords)" (that is correct, right?) while C# developers and Python developers and so on get told something else.

Everything else I can think of (e.g., more examples!) is already in the OP post.

I know you have already noted this above, but yes, could you please strive towards more real world examples?

Such as:

  • An http server that also makes http requests (using a single core reactor).
  • How do I add a periodic timer task to the example above?

Some notes from a chat earlier today:

We'll probably want an FAQ-style page (or something similar) about maintaining state with futures. Right now most futures are 'static but idiomatic methods in Rust already take &self, and this causes quite some tension when you, say, want to use &self inside of an and_then closure.

Today you likely need to resort to taking self by value and threading ownership through, or to using Rc or some other reference-counted sharing solution. We should document this explicitly and explore the various tradeoffs as well.
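To make the two workarounds concrete, here is a rough sketch that uses a boxed `'static` closure as a stand-in for a `'static` future with an and_then callback. `Deferred`, `Stats`, `record_owned`, and `record_shared` are hypothetical names for illustration, and no futures crate is involved.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Stand-in for a `'static` future: deferred work that must own everything
/// it touches, because it may run long after the caller has returned.
type Deferred<T> = Box<dyn FnOnce() -> T + 'static>;

struct Stats {
    hits: u32,
}

impl Stats {
    // What we would like to write, but cannot: the closure would borrow
    // `self` for longer than the `&self` borrow lasts.
    //
    // fn record_borrowed(&mut self) -> Deferred<u32> {
    //     Box::new(move || { self.hits += 1; self.hits }) // does not compile
    // }

    /// Option 1: thread ownership through. Take `self` by value and hand it
    /// back to the caller together with the result.
    fn record_owned(mut self) -> Deferred<(Stats, u32)> {
        Box::new(move || {
            self.hits += 1;
            let hits = self.hits;
            (self, hits)
        })
    }
}

/// Option 2: share the state through a reference-counted handle.
fn record_shared(stats: &Rc<RefCell<Stats>>) -> Deferred<u32> {
    let stats = Rc::clone(stats);
    Box::new(move || {
        let mut s = stats.borrow_mut();
        s.hits += 1;
        s.hits
    })
}

fn main() {
    // Ownership threading: the state comes back out of the deferred work.
    let (stats, hits) = (Stats { hits: 0 }.record_owned())();
    println!("owned: hits = {} (state says {})", hits, stats.hits);

    // Shared handle: several pieces of deferred work can exist at once.
    let shared = Rc::new(RefCell::new(Stats { hits: 0 }));
    let first = record_shared(&shared);
    let second = record_shared(&shared);
    println!("shared: {} then {}", first(), second());
}
```

The tradeoff in this sketch: threading ownership has no runtime cost but allows only one pending operation at a time, while the Rc version allows several at the price of reference counting and runtime borrow checks.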

qmx commented

something I just stumbled upon: it would be nice to have an example where you pass a &msg down to a spawn_fn. That's pretty desirable when you're parsing binary data from things like kafka

I struggled a lot while learning futures/tokio. I will note down the reasons; this might give you guys some perspective from a normal newcomer.

  1. A simple "Hello, world" that clearly describes how to get data from a stream and how to send data through a sink, instead of using Copy or some abstract method.

  2. How to work with raw byte streams instead of using the abstract framed on a socket with an encoder and decoder. Why? Because if I am coming from another language, I will be familiar with byte streams, so encoders and decoders will be a little confusing at first, though no doubt they are really cool features.

  3. Is a single thread enough for the C10K problem? This comes to my mind a lot: will a single-threaded loop be able to handle such a big connection pool? (Why? Because I don't know that many lower-level details, and I have always been taught that multithreaded applications are always better. Just clarifying this will reassure new users that Tokio is well suited for building applications.)

  4. A simple example of how to scale Tokio to big applications, e.g. how to architect the code and all the modules. What should be the standard way of handling functions and returning values?

    • Should I return "boxed futures"?
    • Should I return "one of the combinators"?
    • Should I return direct values and then encapsulate them into a combinator chain?
  5. If possible, a small example of how to create multiple reactors on different threads and then provide work to those threads from other threads.

  6. A simple standard for building client applications.

  7. A simple standard for building server applications.

Points 6 and 7 are not about giving traits to implement; rather, demonstrate with an example how you guys would build a Tokio server yourselves. This may boost newcomers' confidence a lot.
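For the return-type questions in point 4, a sketch of the three options might look like the following. It assumes today's `std::future` module (`ready`, `Ready`) in place of the futures 0.1 combinators, and `impl Trait` in return position, which was not yet stable when this thread was written. `Zero`, `poll_now`, and the `double_*` functions are hypothetical names.

```rust
use std::future::{ready, Future, Ready};
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A hand-written future, used only to give the boxed version a second type.
struct Zero;

impl Future for Zero {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        Poll::Ready(0)
    }
}

/// Option 1: a boxed future. Costs an allocation and dynamic dispatch, but
/// each branch may return a different concrete future type.
fn double_boxed(n: Option<u32>) -> Pin<Box<dyn Future<Output = u32>>> {
    match n {
        Some(n) => Box::pin(ready(n * 2)),
        None => Box::pin(Zero),
    }
}

/// Option 2a: name the concrete future (or combinator) type. No allocation,
/// but the type leaks into the signature and grows with every combinator.
fn double_concrete(n: u32) -> Ready<u32> {
    ready(n * 2)
}

/// Option 2b: hide the concrete type behind `impl Future`, still unboxed.
fn double_opaque(n: u32) -> impl Future<Output = u32> {
    ready(n * 2)
}

/// Option 3: return a plain value and let the caller lift it into a future.
fn double_value(n: u32) -> u32 {
    n * 2
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future once; enough here because every future above is ready
/// on its first poll.
fn poll_now<F: Future>(fut: F) -> Option<F::Output> {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(v) => Some(v),
        Poll::Pending => None,
    }
}

fn main() {
    println!("boxed:    {:?}", poll_now(double_boxed(Some(21))));
    println!("concrete: {:?}", poll_now(double_concrete(21)));
    println!("opaque:   {:?}", poll_now(double_opaque(21)));
    println!("value:    {:?}", poll_now(ready(double_value(21))));
}
```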

It's not much, but I hope my input helps grow Tokio.

Thanks @0freeman00! I'm working actively on a small book for async programming in Rust, and have already taken into account many of the suggestions you present. Talking specifically about the C10K problem (or similar), though, is an interesting twist; I'll give it some thought.

@aturon waiting for the book to come out then. :)

Just saw a great source of inspiration for "advanced user"/"reference" documentation passing by in TWIR: https://cafbit.com/post/tokio_internals/ .

@aturon do you have any updates on the book?

There has been miscommunication and @aturon has not been planning on writing this book as documentation, so this is still an open issue.

An extremely early version is up here, but it's based on a bunch of changes to the relevant crates that have since seen further iteration. I've been holding off further work on it until the dust settles a bit on the design.