A pure `alloc` library, with today's `alloc` becoming `global_alloc`
Ericson2314 opened this issue · 4 comments
@SimonSapin and @rkruppe asked me to lay down my plan in its own thread, rather than just in a comment in another thread plus tons of references in other threads.
N.B. I'm probably going to end up editing this a fair amount.
Motivation
Most of `alloc` is "pure" in that it doesn't depend on any external capabilities, and doesn't use unstable language features. That makes this code very portable, across platforms and hypothetical Rust implementations. Leaving it with the other items in `alloc`, however, encumbers it with less pure things like `GlobalAlloc`, global OOM handling, etc. But if we pull it out into a separate library, we get more "free" portability: i.e. the code itself need not (or hardly) be changed, just moved.
Also, while the current global hooks in `alloc` are fine for implementing `std`, they are less than satisfactory on their own. @japaric in rust-lang/rust#51607 (comment) points out that for resource-constrained environments the dynamism and indirection impose a meaningful cost. I also consider them unergonomic and unidiomatic. Rust today boasts of its lack of virtual functions, imposed global state, and singletons, and of its simple `Result`-based error handling. These are all violated by these global hooks.
Many (most?) libraries need allocation, but couldn't care less about changing these hooks. And consumers of those libraries really want them not to change any global state. If those libraries use the pure `alloc`, then there is a static guarantee that they cannot change the global state. Circling back to @japaric's concerns, this also makes those libraries closer to being accidentally portable to those obscure resource-constrained platforms, and "accidental portability", where code ends up being usable in more situations than its authors intended, should be our gold standard and goal.
Besides the portability benefits, having more of `std`'s implementation on crates.io is good for bringing in more developers and prototyping interfaces before their permanent stabilization in `std`. External repos, and a simplified build system (at least when building the crate on its own), will allow more distributed development, and the use of regular Rust (insofar as most of the unstable features used here are library interfaces, not language features) means that users can fully grok more of `std` without knowing the "extended" Rust language.
Finally, as part of the plan towards this goal, I propose adding an associated error type to the `Alloc` trait. This is technically an independent change, but I suppose if I use it in the plan I should defend it here too. Unlike the accepted `try_reserve` methods, it is race-free in that there's no additional method call, and the return value fully indicates whether the operation failed or succeeded.
Unlike having tons of separate methods on the same type, it allows code to be polymorphic over OOM-divergence, and even in the monomorphic case it better conveys intent and enforces that all errors be manually handled.
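To make the associated-error idea concrete, here is a minimal sketch on stable Rust. It is not the real `Alloc` trait: the trait shape, the `Fallible` and `Aborting` allocators, the `AllocErr` type, and the `roundtrip` helpers are all hypothetical names chosen for illustration, and `std::convert::Infallible` stands in for `!`, which is not yet usable as a general type on stable.

```rust
use std::alloc::{alloc as raw_alloc, dealloc as raw_dealloc, handle_alloc_error, Layout};
use std::convert::Infallible;
use std::ptr::NonNull;

/// Hypothetical allocator trait with an associated error type.
pub trait Alloc {
    type Err;
    fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, Self::Err>;
    /// Safety: `ptr` must come from `alloc` on this allocator with the same `layout`.
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout);
}

/// Hypothetical error reported by the fallible allocator.
#[derive(Debug, PartialEq)]
pub struct AllocErr;

/// Reports allocation failure to the caller.
pub struct Fallible;

impl Alloc for Fallible {
    type Err = AllocErr;
    fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, AllocErr> {
        assert!(layout.size() > 0, "the raw allocation call requires a non-zero size");
        // SAFETY: the layout has a non-zero size, checked above.
        NonNull::new(unsafe { raw_alloc(layout) }).ok_or(AllocErr)
    }
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { raw_dealloc(ptr.as_ptr(), layout) }
    }
}

/// Diverges on allocation failure, so its error type is uninhabited.
pub struct Aborting;

impl Alloc for Aborting {
    type Err = Infallible;
    fn alloc(&mut self, layout: Layout) -> Result<NonNull<u8>, Infallible> {
        assert!(layout.size() > 0, "the raw allocation call requires a non-zero size");
        // SAFETY: the layout has a non-zero size, checked above.
        match NonNull::new(unsafe { raw_alloc(layout) }) {
            Some(ptr) => Ok(ptr),
            None => handle_alloc_error(layout),
        }
    }
    unsafe fn dealloc(&mut self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { raw_dealloc(ptr.as_ptr(), layout) }
    }
}

/// Written once, polymorphic over whether allocation can fail.
pub fn roundtrip<A: Alloc>(a: &mut A, value: u64) -> Result<u64, A::Err> {
    let layout = Layout::new::<u64>();
    let ptr = a.alloc(layout)?.cast::<u64>();
    // SAFETY: `ptr` is valid and aligned for one `u64`, and is freed exactly once.
    unsafe {
        ptr.as_ptr().write(value);
        let out = ptr.as_ptr().read();
        a.dealloc(ptr.cast::<u8>(), layout);
        Ok(out)
    }
}

/// With an uninhabited error type, the `Err` arm is statically unreachable.
pub fn roundtrip_infallible(value: u64) -> u64 {
    match roundtrip(&mut Aborting, value) {
        Ok(v) => v,
        Err(never) => match never {},
    }
}
```

The generic function is the only place the allocation logic lives. The caller's choice of allocator decides whether a `Result` must be handled or can be discharged with an empty match.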
Plan
Compiler
- rust-lang/rust#50097: preparatory work in rustc to allow extra type params on `Box`. Take 2; it's actually merged, but only supports ZSTs.
- Get non-zero-sized types working for `Box` too. Some of rust-lang/rust#47043 (upon which the previous PR was based) might be useful. Thankfully no library effort (just stabilization) is blocked on this.
Library
- We make a new library called `alloc` and move `core::alloc` into it. `core::alloc` is unstable, so there is no impediment to doing this.
Then we convert collections to use an `alloc` parameter, and `Alloc` to have an associated error type so that fallible/infallible allocation is reflected in the type system. [Adding the allocator parameter is already a tentative goal tracked in rust-lang/rust#42774.]
- rust-lang/rust#50882: convert `Box` to use the `Alloc` trait. Per rust-lang/rust#50882 (comment) I think there are stop-gap solutions that allow us to merge this imminently without preventing better solutions later.
- https://github.com/quiltos/rust/tree/allocator-error: convert the `Alloc` trait to use an associated error type, and then add the allocator parameter to many collections besides `Box`. `Box` is included, so this needs to be rebased on the previous PR, but this is good because the rebased `Box` changes will show the benefits of the associated error type alone.
We incrementally move collections out of `global_alloc` into `alloc`. We don't know yet how to provide default parameters away from where the underlying items are defined, so we newtype them in `global_alloc` instead as a stop-gap. Until `alloc` is stable in the sysroot (which could even never happen with std-aware cargo), there is nothing forcing us to commit to `alloc::Foo<T, E>` and `std::Foo::<T>` unifying or not at some `E`.
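The newtype stop-gap could look roughly like the sketch below. Everything here is hypothetical: `PureStack` stands for a collection living in the pure `alloc`, `GlobalStack` for its wrapper in `global_alloc`, and the toy `Alloc` trait carries only the associated error type. For brevity the pure collection is backed by a plain `Vec` and never actually consults its allocator, so only the shape of the types is meaningful.

```rust
use std::convert::Infallible;

// --- would live in the pure `alloc` crate (hypothetical) ---

/// Toy allocator trait: only the associated error type matters for this sketch.
pub trait Alloc {
    type Err;
}

/// Allocator-generic collection. Backed by `Vec` purely to keep the sketch short.
pub struct PureStack<T, A: Alloc> {
    items: Vec<T>,
    _alloc: A,
}

impl<T, A: Alloc> PureStack<T, A> {
    pub fn new_in(alloc: A) -> Self {
        PureStack { items: Vec::new(), _alloc: alloc }
    }
    /// Failure is reported in the allocator's own error type.
    pub fn push(&mut self, value: T) -> Result<(), A::Err> {
        self.items.push(value);
        Ok(())
    }
    pub fn pop(&mut self) -> Option<T> {
        self.items.pop()
    }
    pub fn len(&self) -> usize {
        self.items.len()
    }
}

// --- would live in `global_alloc` (hypothetical) ---

/// Handle to the global allocator; it diverges on OOM, so it has no error values.
pub struct Global;

impl Alloc for Global {
    type Err = Infallible;
}

/// Newtype that fixes the allocator parameter, standing in for a "deferred default".
pub struct GlobalStack<T>(PureStack<T, Global>);

impl<T> GlobalStack<T> {
    pub fn new() -> Self {
        GlobalStack(PureStack::new_in(Global))
    }
    /// Infallible signature, recovered because `Global::Err` is uninhabited.
    pub fn push(&mut self, value: T) {
        match self.0.push(value) {
            Ok(()) => (),
            Err(never) => match never {},
        }
    }
    pub fn pop(&mut self) -> Option<T> {
        self.0.pop()
    }
    pub fn len(&self) -> usize {
        self.0.len()
    }
}
```

The cost of the stop-gap is visible here: `GlobalStack<T>` and `PureStack<T, Global>` are distinct types, and every method has to be forwarded by hand.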
`alloc` will be left with the pure code of the collections and the `Alloc` trait. `global_alloc` will contain the less pure stuff like `GlobalAlloc`, the OOM hook, etc. `std` should be able to reexport most of `alloc` as-is. `alloc` should be safe to go on crates.io, as the only unstable features it contains are unstable library interfaces that do not interact with the compiler. (We probably should have a different mechanism to put unstable items in stable crates.io libraries, so as to avoid all-or-nothing stabilization.)
As a final note, we can continue the path of rust-lang/rust#51846 and move `HashMap` into the pure `alloc` too. `global_alloc` would reexport `alloc::HashMap` with a default for the `Alloc` parameter, and `std` would reexport `global_alloc::HashMap` with a default for the hasher parameter. Note that the "deferred default" problem is identical for the hasher and the allocator.
Ramifications
Stabilizing today's `alloc` is technically no issue as it is mainly a subset of `std`, so retrofitting today's `alloc` or `std` onto the "pure" `alloc` are sort of equivalent issues. However, I am concerned about the various ways it pulls us away from the spirit of this plan.
- If this plan goes through, it is my guess that the vast majority of allocation-needing crates will either need the pure `alloc` or all of `std`. `global_alloc` and its hooks would mainly exist for implementing `std`, and not for normal library or binary usage. The motivations for stabilizing `alloc` given in https://github.com/SimonSapin/rfcs/blob/liballoc/text/0000-liballoc.md are thus obviated.
- The name `alloc` implies it is the "final story" or "one-stop shop" on allocation, when in fact that is the pure `alloc` library, and `global_alloc` just provides various global/singleton hooks (mechanisms that really could be applied to just about any trait). Renaming it `global_alloc` makes that purpose clear, and "leaves room" for the pure `alloc` crate described here.
CC @SimonSapin @rkruppe @japaric @jethrogb @glandium @Amanieu @Haavy @eddyb @eternaleye
If this plan goes through, it is my guess that the vast majority of allocation-needing crates will either need the pure alloc or all of std.
I disagree, I expect that most allocation-needing crates will just want to use a `Vec` or `Box` with the global allocator (e.g. `regex`). People usually want something that "just works" out of the box (wink).
With that said, I do believe that there is some merit to your idea of having a crate with allocator-generic collections which do not depend on a global allocator. However I feel that we should keep the `alloc` crate as it is and move towards stabilizing it ASAP because of what it enables in the ecosystem. We can also add a `collections` crate later on with allocator-generic collections.
I’ll try to extract some high-level goals that seem to be discussed together here:
1. Collections and containers should be allocator-generic
2. It should be easier to contribute to the standard library
3. It should be possible to avoid relying implicitly on global state
@Ericson2314, I am not confident that this is an accurate representation of what you have in mind (especially for #3), so please try to present your goals in a succinct form similar to this.
I think it’s important to separate high-level goals from the way we can get there. Often, alternative solutions can turn up that are better than the first solution we think of.
While a single solution can sometimes achieve multiple goals, it’s valuable to talk separately about different goals. A many-comments thread can become hard to follow, and points can start being lost in the middle of other stuff. Conflating topics amplifies this problem. More bluntly: just because this is something that you also want doesn’t mean that it belongs in the same thread.
Meta-points aside, responding to the list above:
1. I absolutely agree, and I believe that there is already strong consensus around this goal. However it looks to me that much of what you discuss here does not affect this goal directly.
2. I also agree with this goal, but I believe that the solution discussed here (moving stuff to crates.io) both:
   - Would be difficult to achieve, because of reliance on unstable compiler details
   - May not achieve the goal. On the contrary, coordinating across multiple source repositories that each gate changes on passing tests can be a very real barrier: https://internals.rust-lang.org/t/the-current-submodule-setup-is-not-tenable/6593. Maybe some pieces of work can be done entirely within one repo, but not all.
3. This is where I’m mostly guessing, possibly because I don’t know why this is important.
Now, about the specifics:
If we leave it with the other items in alloc, it will always be encumbered by less pure things like GlobalAlloc, global OOM handling, etc.
This appears to be the core of the motivation, but I don’t think this “always” is accurate. liballoc already depends on libcore, so if for example `Vec` can be made to not assume a global allocator we could very well move it to libcore.
I think that what you mean by "pure" allocation library already exists at `core::alloc`, and it’s not clear to me what the benefit is of making it a separate crate. A move like this should not be a goal in itself, but a means to achieve some goal.
Many (most?) libraries need allocation, but couldn't care less about changing these hooks.
But as Amanieu wrote they do care about being able to use a global allocator without being allocator-generic themselves and passing around an allocator instance.
And consumers of those libraries really want them not to change any global state. If those libraries use the pure alloc, then there is a static guarantee that they cannot change the global state.
No, changing the global allocator is done through the `#[global_allocator]` attribute which is part of the language, not any crate. Looking at what crates a library uses cannot tell you whether it’s using `#[global_allocator]`.
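For readers who have not used the attribute, here is a minimal sketch of installing a global allocator. The `GlobalAlloc` trait, the `System` allocator, and `#[global_allocator]` are real and stable; the `Counting` wrapper, the `ALLOCATIONS` counter, and the `allocations` helper are hypothetical names for illustration.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts every allocation request that goes through the global allocator.
pub static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

/// Thin wrapper around the system allocator that bumps the counter.
pub struct Counting;

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        // SAFETY: the caller upholds the `GlobalAlloc::alloc` contract.
        unsafe { System.alloc(layout) }
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // SAFETY: `ptr` was returned by `System.alloc` with this `layout`.
        unsafe { System.dealloc(ptr, layout) }
    }
}

// The attribute on this static is what swaps the allocator for the whole program.
#[global_allocator]
static GLOBAL: Counting = Counting;

/// Number of allocation requests observed so far.
pub fn allocations() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}
```

The swap is carried by the attribute on the static, and nothing in this crate's dependency list reveals it, which is the point being made above.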
having more of std's implementation on crates.io is good for bringing in more developers and prototyping interfaces before their permanent stabilization in std
Moving stuff to crates.io appears to be separate from the rest of this proposal. But regardless, it’s very difficult in this case. In 1.27.0, liballoc declares using more than fifty different unstable features, so any given version of it likely only works with a very narrow set of rustc versions/commits because of changes in those features.
https://github.com/rust-lang/rust/blob/1.27.0/src/liballoc/lib.rs#L77-L130
adding an associated error type to the Alloc trait.
This is to allow that type to be `!` for infallible allocation, right? There is some more discussion of this at https://internals.rust-lang.org/t/pre-rfc-changing-the-alloc-trait/7487
Unlike the accepted try_reserve methods, it is race-free in that there's no additional method call, and the return value fully indicates whether the operation failed or succeeded.
What the Alloc trait(s) look like is separate from the API of collections. `Vec::push` is stable and does not return a Result, so it seems like Result-returning APIs on `Vec` have to be through new methods.
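As an illustration of the "new methods" route, the sketch below builds a fallible push out of `Vec::try_reserve`, which is stable in current Rust. The `try_push` helper is a hypothetical name and not a `Vec` method; this is the two-step reserve-then-push pattern, not a proposal for the final API.

```rust
use std::collections::TryReserveError;

/// Hypothetical helper: reserve fallibly, then push without allocating.
pub fn try_push<T>(v: &mut Vec<T>, value: T) -> Result<(), TryReserveError> {
    // Reports allocation failure or capacity overflow instead of aborting.
    v.try_reserve(1)?;
    // Capacity for one more element is now guaranteed, so this cannot allocate.
    v.push(value);
    Ok(())
}
```

Existing callers of `Vec::push` are unaffected, and code that wants to handle failure opts in by calling the new function.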
core::alloc is unstable
It is stable in 1.28.
alloc will be left with the pure code of the collections and the Alloc trait. global_alloc will contain the less pure stuff like GlobalAlloc, oom hook, etc.
In this vision, since the "pure" flavor of `Vec` does not have access to OOM handling, does it follow that it doesn’t have infallible methods like `push`? Does that make it a different type that cannot be used with a library that takes a `&mut std::vec::Vec<…>` parameter?
We probably should have a different mechanism to put unstable items in stable crates.io libraries, so as to avoid all-or-nothing stabilization.
That is a nice goal but I have no idea how it could be possible to achieve without giving up on the stability promise.
Renaming it global_alloc makes that purpose clear, and "leave room" for the pure alloc crate described here.
“Leaving room” is the reason for your objection to rust-lang/rfcs#2480. But again it’s not clear what the benefits are of a separate crate over the current `core::alloc` module.
Thank you, both of you. I started replying to this, but then thought more about the language issues and decided there is a good enough path forward there that we can avoid needing to split up crates. A few days later, the result is this RFC: rust-lang/rfcs#2492.
I moved the existing PR bullet points over to rust-lang/rust#42774 (comment). I'll update that with more as they're opened.