rust-lang/rust

Tracking issue for location of facade crates

alexcrichton opened this issue · 33 comments

We probably don't want to indefinitely have a large number of facade crates which are all unstable, so we should eventually stabilize these crates, fold them back into the standard library, or find a good middle ground in which they can all reside.

The current set of crates considered under this issue are:

  • rustc_unicode
  • libc
  • collections
  • alloc
  • rand

Update 2018-04-05:

  • libc and rand have moved to crates.io
  • collections was merged into alloc
  • rustc_unicode was renamed to std_unicode and later merged into core

This leaves only the alloc crate still unstable, which together with the stable core and std crates (plus arguably proc_macro) forms the standard library.


Update 2018-06-19:

RFC 2480 proposes stabilizing the alloc crate, which would close this issue.

It has been noted that the separation between alloc and collections is seemingly arbitrary. However, with a robust allocator API, it could be possible to specify collections without any allocator, so that e.g. no_std users could provide their own allocator and still use Vec.

Basically I don't think these should be stabilized at all in anything approaching the short term pending the ultimate allocator designs.

Found this through alloc documents.

Is there any news about this issue?
btw, I just want a RawVec for custom data-structures.

RawVec would be great. Any way to allocate memory other than Vec::with_capacity + mem::forget would be nice.
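
For reference, the Vec::with_capacity + mem::forget workaround mentioned above can be sketched roughly as follows. This is only an illustration on stable Rust; the helper names allocate and deallocate are made up here and are not part of any library API.

```rust
use std::mem;

/// Allocates uninitialized space for at least `capacity` values of `T`
/// by leaking a `Vec`. Returns the pointer and the real capacity, which
/// must be passed back to `deallocate` unchanged.
/// (`allocate` is a hypothetical helper name.)
fn allocate<T>(capacity: usize) -> (*mut T, usize) {
    let mut v: Vec<T> = Vec::with_capacity(capacity);
    let ptr = v.as_mut_ptr();
    let real_capacity = v.capacity();
    // Leak the Vec so its buffer is not freed when `v` goes out of scope.
    mem::forget(v);
    (ptr, real_capacity)
}

/// Frees a buffer obtained from `allocate`.
///
/// Safety: `ptr` and `capacity` must come from the same `allocate::<T>`
/// call, and the buffer must not be used afterwards.
unsafe fn deallocate<T>(ptr: *mut T, capacity: usize) {
    // Rebuild a Vec of length 0 so that dropping it frees the buffer
    // without running any element destructors.
    unsafe { drop(Vec::from_raw_parts(ptr, 0, capacity)) }
}
```

The alignment of the buffer is whatever Vec<T> uses for T, which is why this trick only helps when a suitable T exists.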

It would be nice if Rust officially supported some low-level memory layer (in a Rust way).

@chao787 that's the alloc crate.

I've found this issue through a compiler error message:

error: use of unstable library feature 'alloc_system': this library is unlikely to be stabilized in its current form or name (see issue #27783)

I'm just trying to opt out of jemalloc and get Rust to compile my binary with the system's default malloc implementation. Is there a way to do it that would not involve unstable language features, like a compiler switch to get jemalloc out of my binaries and off my lawn?

Rationale: jemalloc has abysmal fork performance (80x slowdown) on the default kernel configuration of recent Ubuntu. See issue #36705 for more info.

Opting out of jemalloc is also important for fuzzing, e.g. with afl-fuzz; even on jemalloc-friendly configurations using default malloc gets you 20% more fork performance which is significant for fuzzing workloads, and lets you substitute abusive memory allocators at runtime that catch more errors than default allocator or jemalloc but don't incur the performance penalty of DUMA or AddressSanitizer. See AFL's libdislocator as an example.

@Shnatsel yeah I'd love to explore the possibilities of stabilizing the "please give me the system allocator" intent. Right now it's unfortunately not possible to do that in stable Rust.

The tracking issue for global allocators in general is #27389, but it may be worth spawning off a separate thread of discussion for just declaring the intent to use the system allocator.
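
On later stable toolchains the "please give me the system allocator" intent can be written down directly. A minimal sketch, assuming Rust 1.28 or newer, where #[global_allocator] and std::alloc::System are stable; the names GLOBAL and sum_on_heap are made up for illustration:

```rust
use std::alloc::System;

// Route every heap allocation in this program through the platform's
// default allocator (malloc on most Unix systems).
#[global_allocator]
static GLOBAL: System = System;

/// Small helper (hypothetical) that allocates on the heap, so the
/// global allocator above is actually exercised.
fn sum_on_heap(n: u32) -> u32 {
    let values: Vec<u32> = (1..=n).collect();
    values.iter().sum()
}
```

Only one #[global_allocator] may be declared in the whole crate graph of a program.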

brson commented

FWIW I'm not in favor of merging the facade crates. Having std be decomposed into reusable components should be useful for those targeting more exotic systems, particularly if I complete my further ambitions to extract all system-specific components of std into their own crates. Ultimately people should be able to create custom stds for whatever weird systems they want by reusing our small self-contained building blocks.

A decomposed stdlib would be really useful for the asm.js target that I will need in the near future and really want to use Rust for.

@brson As a person targeting a mildly exotic system, I agree. For example, I would expect that a significant proportion of bare-metal use cases (like mine) would want alloc and collections but nothing beyond.

The current set of crates considered under this issue are:

  • rustc_unicode
  • libc
  • collections
  • alloc
  • rand

I surveyed what’s tracked by this issue when writing #39954. Unless I missed something:

  • libc and rand are now available on crates.io.
  • All functionality in collections is re-exported as stable in std, is deprecated, or has a more specific tracking issue.
  • Some non-deprecated functionality in std_unicode (formerly rustc_unicode) and alloc is re-exported as stable in std.
  • Functionality not available on stable Rust is:
    • alloc::heap
      • When a type T can be written such that mem::align_of::<T>() is the desired alignment, Vec + mem::forget can be used to re-implement allocate, deallocate, and reallocate.
      • As far as I know, no such work-around is possible for reallocate_inplace or usable_size.
      • EMPTY is “known” to be 1 as *mut (), but keeping a separate definition and assuming it keeps matching alloc’s seems fragile
    • alloc::oom
    • alloc::raw_vec
    • std_unicode::derived_property
    • std_unicode::property
    • std_unicode::str::is_utf16 #40190
    • std_unicode::str::utf8_char_width #40189
    • std_unicode::str::Utf16Encoder
  • Additional functionality not available on stable Rust with #![no_std]:
    • alloc::arc
    • alloc::boxed
    • alloc::rc
    • collections
    • std_unicode::char
    • std_unicode::str::UnicodeStr

Some environments might require #![no_std] because they lack threads for example, but still have a memory allocator available.

My subjective opinions:

  • derived_property, property, utf8_char_width, and raw_vec are public in order to be used in other crates. Arguably they’re not something we ever want to stabilize.
  • Utf16Encoder is almost in this category, but it is usefully more general than the stable str::encode_utf16 method since it works on arbitrary char iterators rather than just &str.
  • is_utf16 has never been used in the compiler or standard library since 47e7a05 added it in 2012 “for OS API interop”. It can be replaced with a one-liner: std::char::decode_utf16(s).all(|r| r.is_ok()). I propose removing it: #40190
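
The one-liner replacement for is_utf16 can be wrapped up as below. This is a sketch: the function name simply mirrors the removed API, and the slice parameter is an assumption about how callers hold their data.

```rust
/// Returns true if `units` is well-formed UTF-16, i.e. it contains no
/// unpaired surrogates.
fn is_utf16(units: &[u16]) -> bool {
    // `decode_utf16` yields `Err` for every unpaired surrogate.
    std::char::decode_utf16(units.iter().copied()).all(|r| r.is_ok())
}
```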

Looking at Utf16Encoder again, it’s based on char::encode_utf16 and so fairly easy to reproduce outside of std, so never mind. (Mostly, it’s only verbose because defining an iterator is verbose.)
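
A rough sketch of such a reimplementation on top of the stable char::encode_utf16. The names Utf16Units and utf16_units are hypothetical, chosen here only for illustration:

```rust
/// Iterator adapter that turns any `char` iterator into UTF-16 code units.
struct Utf16Units<I> {
    chars: I,
    /// Low surrogate waiting to be yielded after its high surrogate.
    pending: Option<u16>,
}

fn utf16_units<I: Iterator<Item = char>>(chars: I) -> Utf16Units<I> {
    Utf16Units { chars, pending: None }
}

impl<I: Iterator<Item = char>> Iterator for Utf16Units<I> {
    type Item = u16;

    fn next(&mut self) -> Option<u16> {
        // Finish a surrogate pair started on the previous call.
        if let Some(low) = self.pending.take() {
            return Some(low);
        }
        let c = self.chars.next()?;
        let mut buf = [0u16; 2];
        let encoded = c.encode_utf16(&mut buf);
        if encoded.len() == 2 {
            self.pending = Some(encoded[1]);
        }
        Some(encoded[0])
    }
}
```

Most of the length is indeed the iterator boilerplate; the encoding itself is a single call.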

I think this leaves two items in need of attention:

  • Stabilize raw memory allocation (alloc::heap, maybe also alloc::oom).
  • Figure out how to provide non-core functionality to #![no_std] crates.

I vaguely recall GC-related concerns about the former, but I think those would also apply to the Vec + mem::forget trick that is used on stable today. There’s even a crate for it: https://crates.io/crates/memalloc

Stabilize raw memory allocation

This is #27700, not sure how I missed that.

For the Unicode stuff, now that we can use crates.io crates in the rustc build system, could the crates.io crates be used?

The compiler depends on crates from crates.io. Does std? Should it?

Question: is there a reason why Box is re-exported in libcollections but Rc and Arc are not?

@alexcrichton might want to remove collections from the list in the description since it's gone as of #42648

@SimonSapin I think it would be a huge boost to the no-std ecosystem if std was allowed to depend on crates.io (at least official out-of-tree crates).

su8 commented

How do I install libc? I can't use the following:

extern crate libc;
use libc::{c_char, uint8_t};

@su8 libc is on crates.io now

@alexcrichton the OP should probably be updated

https://internals.rust-lang.org/t/a-vision-for-portability-in-rust/6719 proposes undoing the facade and unifying the standard library under one std crate.

The relevant crates are now:

  • std
  • core
  • alloc
  • std_unicode
  • proc_macro (maybe? It’s sort of separate, but it’s distributed with std and intended for "public consumption")

rust-lang-nursery/portability-wg#1 talks about what to do with std. I'd argue the goal for out-of-tree ports (among other things) actually pushes in the direction of making std more of a facade.

[As an aside, some of our language here is rather confusing. The "facade" is the exterior skin of the building, so std is the facade, not the crates it depends on. I'd rename this issue to start to clean things up.]

I’ve been using "the facade" as short for "the fact that std is (also) a facade for other crates, rather than a stand-alone crate."

#49698 proposes merging std_unicode into core.

OK, now that #49698 is merged, I'd like to reiterate my concerns about this general direction of fewer crates behind the facade.

  • There is some tentative consensus that multiple facade crates are good, even necessary, for portability. See rust-lang-nursery/portability-wg#1. This does not apply to the std_unicode merge.

  • I (and a handful of others) am concerned about too big/monolithic crates, and about keeping library code unnecessarily in https://github.com/rust-lang/rust/.

  • Code that doesn't use unstable language features can always be moved out in principle.

  • Moving code out is friendlier to alternative Rust implementations and, even for the official one, better formalizes which code the lang team doesn't need to worry extra about.

  • Smaller libraries are easier to read.

  • Independent repos foster parallel development.

  • core is getting quite big, especially. IMO it shouldn't contain all pure abstractions, but just the minimal amount to support lang items.

The second concern absolutely does apply to the std_unicode merge. std_unicode doesn't appear to, and nobody said anything to the contrary. It's much easier to merge crates than to split them apart, as things naturally grow entangled when nothing prevents that. So to the extent that the jury is still out on the facade, we should not be "speculatively" merging crates without fully-formed justification.

Also, on a process level, I'm concerned that rather than getting a simple "we disagree with your points", which would have been fine, I was told "your concerns don't apply to this PR", which just isn't true. This leaves me wondering who actually read the discussion before checking off the box, and whether the rules of the "final comment period" were followed.

Admittedly, I did open with the irrelevant portability concerns, unfortunately sowing confusion I did not intend, but I thought that was cleared up in the end, at least between @SimonSapin and me. @SimonSapin indeed said in his last comment to me #49698 (comment) "I don’t care as much about how source code is organized internally", which at least felt like a "this PR isn't trying to disagree with you", leaving open other resolutions to the issue. But then @withoutboats rejoined with another "[your concern] seems off topic from this PR" #49698 (comment), and @alexcrichton, who had not participated in that discussion since a LGTM before I raised my points, did the r+.

Again, I'm not demanding that everyone agree with me, just that they explicitly reject my points rather than leaving open the possibility that they were simply missed. If there is no process violation going on, either FCP is different for PRs than RFCs in ways I wasn't aware of, or I misunderstand FCP overall.

RFC 2480 proposes stabilizing the alloc crate, which would close this issue.

Related:

  • PR #51569: Make the public API of the alloc crate a subset of std
  • PR #51607: Move OOM handling to liballoc and remove the oom lang item
  • PR #51639: Update the error message for a missing global allocator

There is something questionable with using alloc in a no_std context currently: you automatically get a global allocator without adding a #[global_allocator], and that allocator is jemalloc. I would have expected something similar to panic: a compiler error about the missing panic_implementation (which currently still talks about the lang item, but you get the point).

@glandium #36963 should make this disappear, and it only happens for executables. A staticlib for example does get an error message: #51639, since the "default lib allocator" is not jemalloc and is defined in std.

But yeah it’s a good question whether we should change this in the meantime. Maybe it’s not much of a problem in practice?
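
To make the expectation concrete, here is a sketch of code that uses the alloc crate's Vec with an explicitly declared global allocator. It assumes a toolchain where `extern crate alloc` is stable. CountingAlloc, ALLOCATIONS, and squares are hypothetical names, and the wrapper delegates to std's System only as a stand-in: a real #![no_std] target would substitute its own platform allocator.

```rust
extern crate alloc;

use alloc::vec::Vec;
use core::alloc::{GlobalAlloc, Layout};
use core::sync::atomic::{AtomicUsize, Ordering};
// Stand-in backend for this sketch; not available under #![no_std].
use std::alloc::System;

/// Number of allocations served so far.
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

/// User-provided allocator: counts calls, then forwards to the backend.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// The explicit declaration discussed above.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Uses `alloc::vec::Vec` only, so it does not depend on std itself.
fn squares(n: u32) -> Vec<u32> {
    (0..n).map(|i| i * i).collect()
}
```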

How about adding getrandom to the OP list? (maybe instead of rand?) Introduction of a simple (overwritable) function for pulling entropy from system randomness source was previously briefly discussed here. See rust-random/getrandom for the current prototype.

RFC 2480 to stabilize the alloc crate was accepted. PR #59675 does so, and closes this issue since alloc was the last crate tracked here. (Others have been stabilized already, moved to crates.io, or merged into other crates.)