Allocator traits and std::heap

Question

Allocator traits and std::heap

nikomatsakis opened this issue 9 years ago · 471 comments

Answer 1 · 2016-04-11T03:07:19.000Z

I unfortunately wasn't paying close enough attention to mention this in the RFC discussion, but I think that realloc_in_place should be replaced by two functions, grow_in_place and shrink_in_place, for two reasons:

I can't think of a single use case (short of implementing realloc or realloc_in_place) where it is unknown whether the size of the allocation is increasing or decreasing. Using more specialized methods makes it slightly more clear what is going on.
The code paths for growing and shrinking allocations tend to be radically different - growing involves testing whether adjacent blocks of memory are free and claiming them, while shrinking involves carving off properly sized subblocks and freeing them. While the cost of a branch inside realloc_in_place is quite small, using grow and shrink better captures the distinct tasks that an allocator needs to perform.

Note that these can be added backwards-compatibly next to realloc_in_place, but this would constrain which functions would be by default implemented in terms of which others.

For consistency, realloc would probably also want to be split into grow and split, but the only advantage to having an overloadable realloc function that I know of is to be able to use mmap's remap option, which does not have such a distinction.

Answer 2 · 2016-04-11T03:12:08.000Z

Additionally, I think that the default implementations of realloc and realloc_in_place should be slightly adjusted - instead of checking against the usable_size, realloc should just first try to realloc_in_place. In turn, realloc_in_place should by default check against the usable size and return success in the case of a small change instead of universally returning failure.

This makes it easier to produce a high-performance implementation of realloc: all that is required is improving realloc_in_place. However, the default performance of realloc does not suffer, as the check against the usable_size is still performed.

Answer 3 · 2016-10-26T13:04:27.000Z

Another issue: The doc for fn realloc_in_place says that if it returns Ok, then one is assured that ptr now "fits" new_layout.

To me this implies that it must check that the alignment of the given address matches any constraint implied by new_layout.

However, I don't think the spec for the underlying fn reallocate_inplace function implies that it will perform any such check.

Furthermore, it seems reasonable that any client diving into using fn realloc_in_place will themselves be ensuring that the alignments work (in practice I suspect it means that the same alignment is required everywhere for the given use case...)

So, should the implementation of fn realloc_in_place really be burdened with checking that the alignment of the given ptr is compatible with that of new_layout? It is probably better in this case (of this one method) to push that requirement back to the caller...

Answer 4 · 2016-10-26T13:05:05.000Z

@gereeter you make good points; I will add them to the check list I am accumulating in the issue description.

Answer 5 · 2016-10-31T17:38:52.000Z

(at this point I am waiting for #[may_dangle] support to ride the train into the beta channel so that I will then be able to use it for std collections as part of allocator integration)

Answer 6 · 2017-01-04T20:12:58.000Z

I'm new to Rust, so forgive me if this has been discussed elsewhere.

Is there any thought on how to support object-specific allocators? Some allocators such as slab allocators and magazine allocators are bound to a particular type, and do the work of constructing new objects, caching constructed objects which have been "freed" (rather than actually dropping them), returning already-constructed cached objects, and dropping objects before freeing the underlying memory to an underlying allocator when required.

Currently, this proposal doesn't include anything along the lines of ObjectAllocator<T>, but it would be very helpful. In particular, I'm working on an implementation of a magazine allocator object-caching layer (link above), and while I can have this only wrap an Allocator and do the work of constructing and dropping objects in the caching layer itself, it'd be great if I could also have this wrap other object allocators (like a slab allocator) and truly be a generic caching layer.

Where would an object allocator type or trait fit into this proposal? Would it be left for a future RFC? Something else?

Answer 7 · 2017-01-04T20:22:53.000Z

I don't think this has been discussed yet.

You could write your own ObjectAllocator<T>, and then do impl<T: Allocator, U> ObjectAllocator<U> for T { .. }, so that every regular allocator can serve as an object-specific allocator for all objects.

Future work would be modifying collections to use your trait for their nodes, instead of plain ole' (generic) allocators directly.

Answer 8 · 2017-01-04T20:25:22.000Z

@pnkfelix

(at this point I am waiting for #[may_dangle] support to ride the train into the beta channel so that I will then be able to use it for std collections as part of allocator integration)

I guess this has happened?

Answer 9 · 2017-01-04T20:27:20.000Z

@Ericson2314 Yeah, writing my own is definitely an option for experimental purposes, but I think there'd be much more benefit to it being standardized in terms of interoperability (for example, I plan on also implementing a slab allocator, but it would be nice if a third-party user of my code could use somebody else's slab allocator with my magazine caching layer). My question is simply whether an ObjectAllocator<T> trait or something like it is worth discussing. Although it seems that it might be best for a different RFC? I'm not terribly familiar with the guidelines for how much belongs in a single RFC and when things belong in separate RFCs...

Answer 10 · 2017-01-04T20:42:32.000Z

@joshlf

Where would an object allocator type or trait fit into this proposal? Would it be left for a future RFC? Something else?

Yes, it would be another RFC.

I'm not terribly familiar with the guidelines for how much belongs in a single RFC and when things belong in separate RFCs...

that depends on the scope of the RFC itself, which is decided by the person who writes it, and then feedback is given by everyone.

But really, as this is a tracking issue for this already-accepted RFC, thinking about extensions and design changes isn't really for this thread; you should open a new one over on the RFCs repo.

Answer 11 · 2017-01-04T21:01:36.000Z

@joshlf Ah, I thought ObjectAllocator<T> was supposed to be a trait. I meant prototype the trait not a specific allocator. Yes that trait would merit its own RFC as @steveklabnik says.

@steveklabnik yeah now discussion would be better elsewhere. But @joshlf was also raising the issue lest it expose a hitherto unforeseen flaw in the accepted but unimplemented API design. In that sense it matches the earlier posts in this thread.

Answer 12 · 2017-01-04T21:27:36.000Z

@Ericson2314 Yeah, I thought that was what you meant. I think we're on the same page :)

@steveklabnik Sounds good; I'll poke around with my own implementation and submit an RFC if it ends up seeming like a good idea.

Answer 13 · 2017-01-04T21:54:03.000Z

@joshlf I don't any reason why custom allocators would go into the compiler or standard library. Once this RFC lands, you could easily publish your own crate that does an arbitrary sort of allocation (even a fully-fledged allocator like jemalloc could be custom-implemented!).

Answer 14 · 2017-01-04T21:58:58.000Z

@alexreg This isn't about a particular custom allocator, but rather a trait that specifies the type of all allocators which are parametric on a particular type. So just like RFC 1398 defines a trait (Allocator) that is the type of any low-level allocator, I'm asking about a trait (ObjectAllocator<T>) that is the type of any allocator which can allocate/deallocate and construct/drop objects of type T.

Answer 15 · 2017-01-04T22:01:33.000Z

@alexreg See my early point about using standard library collections with custom object-specific allocators.

Answer 16 · 2017-01-04T22:02:54.000Z

Sure, but I’m not sure that would belong in the standard library. Could easily go into another crate, with no loss of functionality or usability.

…

On 4 Jan 2017, at 21:59, Joshua Liebow-Feeser ***@***.***> wrote: @alexreg <https://github.com/alexreg> This isn't about a particular custom allocator, but rather a trait that specifies the type of all allocators which are parametric on a particular type. So just like RFC 1398 defines a trait (Allocator) that is the type of any low-level allocator, I'm asking about a trait (ObjectAllocator<T>) that is the type of any allocator which can allocate/deallocate and construct/drop objects of type T. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#32838 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAEF3IhyyPhFgu1EGHr_GM_Evsr0SRzIks5rPBZGgaJpZM4IDYUN>.

Answer 17 · 2017-01-04T22:03:49.000Z

I think you’d want to use standard-library collections (any heap-allocated value) with an *arbitrary* custom allocator; i.e. not limited to object-specific ones.

…

On 4 Jan 2017, at 22:01, John Ericson ***@***.***> wrote: @alexreg <https://github.com/alexreg> See my early point about using standard library collections with custom object-specific allocators. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#32838 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAEF3CrjYIXqcv8Aqvb4VTyPcajJozICks5rPBbOgaJpZM4IDYUN>.

Answer 18 · 2017-01-04T22:13:00.000Z

Sure, but I’m not sure that would belong in the standard library. Could easily go into another crate, with no loss of functionality or usability.

Yes but you probably want some standard library functionality to rely on it (such as what @Ericson2314 suggested).

I think you’d want to use standard-library collections (any heap-allocated value) with an arbitrary custom allocator; i.e. not limited to object-specific ones.

Ideally you'd want both - to accept either type of allocator. There are very significant benefits to using object-specific caching; for example, both slab allocation and magazine caching give very significant performance benefits - take a look at the papers I linked to above if you're curious.

Answer 19 · 2017-01-04T22:16:15.000Z

But the object allocator trait could simply be a subtrait of the general allocator trait. It’s as simple as that, as far as I’m concerned. Sure, certain types of allocators can be more efficient than general-purpose allocators, but neither the compiler nor the standard really need to (or indeed should) know about this.

…

On 4 Jan 2017, at 22:13, Joshua Liebow-Feeser ***@***.***> wrote: Sure, but I’m not sure that would belong in the standard library. Could easily go into another crate, with no loss of functionality or usability. Yes but you probably want some standard library functionality to rely on it (such as what @Ericson2314 <https://github.com/Ericson2314> suggested). I think you’d want to use standard-library collections (any heap-allocated value) with an arbitrary custom allocator; i.e. not limited to object-specific ones. Ideally you'd want both - to accept either type of allocator. There are very significant benefits to using object-specific caching; for example, both slab allocation and magazine caching give very significant performance benefits - take a look at the papers I linked to above if you're curious. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#32838 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAEF3L9F9r_0T5evOtt7Es92vw6gBxR9ks5rPBl9gaJpZM4IDYUN>.

Answer 20 · 2017-01-04T22:28:41.000Z

But the object allocator trait could simply be a subtrait of the general allocator trait. It’s as simple as that, as far as I’m concerned. Sure, certain types of allocators can be more efficient than general-purpose allocators, but neither the compiler nor the standard really need to (or indeed should) know about this.

Ah, so the problem is that the semantics are different. Allocator allocates and frees raw byte blobs. ObjectAllocator<T>, on the other hand, would allocate already-constructed objects and would also be responsible for dropping these objects (including being able to cache constructed objects which could be handed out later in leu of constructing a newly-allocated object, which is expensive). The trait would look something like this:

trait ObjectAllocator<T> {
    fn alloc() -> T;
    fn free(t T);
}

This is not compatible with Allocator, whose methods deal with raw pointers and have no notion of type. Additionally, with Allocators, it is the caller's responsibility to drop the object being freed first. This is really important - knowing about the type T allows ObjectAllocator<T> to do things like call T's drop method, and since free(t) moves t into free, the caller cannot drop t first - it is instead the ObjectAllocator<T>'s responsibility. Fundamentally, these two traits are incompatible with one another.

Answer 21 · 2017-01-05T00:16:00.000Z

Ah right, I see. I thought this proposal already included something like that, i.e. a “higher-level” allocator over the byte level. In that case, a perfectly fair proposal!

…

On 4 Jan 2017, at 22:29, Joshua Liebow-Feeser ***@***.***> wrote: But the object allocator trait could simply be a subtrait of the general allocator trait. It’s as simple as that, as far as I’m concerned. Sure, certain types of allocators can be more efficient than general-purpose allocators, but neither the compiler nor the standard really need to (or indeed should) know about this. Ah, so the problem is that the semantics are different. Allocator allocates and frees raw byte blobs. ObjectAllocator<T>, on the other hand, would allocate already-constructed objects and would also be responsible for dropping these objects (including being able to cache constructed objects which could be handed out later in leu of constructing a newly-allocated object, which is expensive). The trait would look something like this: trait ObjectAllocator<T> { fn alloc() -> T; fn free(t T); } This is not compatible with Allocator, whose methods deal with raw pointers and have no notion of type. Additionally, with Allocators, it is the caller's responsibility to drop the object being freed first. This is really important - knowing about the type T allows ObjectAllocator<T> to do things like call T's drop method, and since free(t) moves t into free, the caller cannot drop t first - it is instead the ObjectAllocator<T>'s responsibility. Fundamentally, these two traits are incompatible with one another. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#32838 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAEF3GViJBefuk8IWgPauPyL5tV78Fn5ks5rPB08gaJpZM4IDYUN>.

Answer 22 · 2017-01-05T00:53:18.000Z

@alexreg Ah yes, I was hoping so too :) Oh well - it'll have to wait for another RFC.

Answer 23 · 2017-01-05T00:55:05.000Z

Yes, do kick start that RFC, I’m sure it would get plenty of support! And thanks for the clarification (I hadn’t kept up with the details of this RFC at all).

…

On 5 Jan 2017, at 00:53, Joshua Liebow-Feeser ***@***.***> wrote: @alexreg <https://github.com/alexreg> Ah yes, I was hoping so too :) Oh well - it'll have to wait for another RFC. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#32838 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAEF3MQQeXhTliU5CBsoheBFL26Ee9WUks5rPD8RgaJpZM4IDYUN>.

Answer 24 · 2017-01-12T18:13:02.000Z

A crate for testing custom allocators would be useful.

Answer 25 · 2017-02-07T17:35:14.000Z

Forgive me if I'm missing something obvious, but is there a reason for the Layout trait described in this RFC not to implement Copy as well as Clone, since it's just POD?

Answer 26 · 2017-03-20T19:59:46.000Z

I can't think of any.

Answer 27 · 2017-04-05T23:15:35.000Z

Sorry to be bringing this up so late in the process, but...

Might it be worth adding support for a dealloc-like function that isn't a method, but rather a function? The idea would be to use alignment to be able to infer from a pointer where in memory its parent allocator is and thus be able to free without needing an explicit allocator reference.

This could be a big win for data structures that use custom allocators. This would allow them to not keep a reference to the allocator itself, but rather only need to be parametric on the type of the allocator to be able to call the right dealloc function. For example, if Box is eventually modified to support custom allocators, then it'd be able to keep being only a single word (just the pointer) as opposed to having to be expanded to two words to store a reference to the allocator as well.

On a related note, it might also be useful to support a non-method alloc function to allow for global allocators. This would compose nicely with a non-method dealloc function - for global allocators, there'd be no need to do any kind of pointer-to-allocator inference since there'd only be a single static instance of the allocator for the whole program.

Answer 28 · 2017-04-06T15:20:24.000Z

@joshlf The current design allows you to get that by just having your allocator be a (zero-sized) unit type - i.e. struct MyAlloc; that you then implement the Allocator trait on.
Storing references or nothing at all, always, is less general than storing the allocator by-value.

Answer 29 · 2017-04-06T16:17:36.000Z

I could see that being true for a directly-embedded type, but what about if a data structure decies to keep a reference instead? Does a reference to a zero-sized type take up zero space? That is, if I have:

struct Foo()

struct Blah{
    foo: &Foo,
}

Does Blah have zero size?

Answer 30 · 2017-04-06T16:21:08.000Z

Actually, even it's possible, you might not want your allocator to have zero size. For example, you might have an allocator with a non-zero size that you allocate from, but that has the ability to free objects w/o knowing about the original allocator. This would still be useful for making a Box take only a word. You'd have something like Box::new_from_allocator which would have to take an allocator as an argument - and it might be a nonzero-sized allocator - but if the allocator supported freeing without the original allocator reference, the returned Box<T> could avoid storing a reference to the allocator that was passed in the original Box::new_from_allocator call.

Answer 31 · 2017-04-06T17:49:19.000Z

For example, you might have an allocator with a non-zero size that you allocate from, but that has the ability to free objects w/o knowing about the original allocator.

I recall long, long, ago proposing factoring out separate allocator and deallocator traits (with associate types connecting the two) for basically this reason.

Answer 32 · 2017-04-06T18:14:37.000Z

Is/should the compiler be allowed to optimize away allocations with these allocators?

Answer 33 · 2017-04-06T20:30:49.000Z

Is/should the compiler be allowed to optimize away allocations with these allocators?

@Zoxc What do you mean?

I recall long, long, ago proposing factoring out separate allocator and deallocator traits (with associate types connecting the two) for basically this reason.

For posterity, let me clarify this statement (I talked to @Ericson2314 about it offline): The idea is that a Box could be parametric just on a deallocator. So you could have the following implementation:

trait Allocator {
    type D: Deallocator;
    
    fn get_deallocator(&self) -> Self::D;
}

trait Deallocator {}

struct Box<T, D: Deallocator> {
    ptr: *mut T,
    d: D,
}

impl<T, D: Deallocator> Box<T, D> {
    fn new_from_allocator<A: Allocator>(x: T, a: A) -> Box<T, A::D> {
        ...
        Box {
            ptr: ptr,
            d: a.get_deallocator()
        }
    }
}

This way, when calling new_from_allocator, if A::D is a zero-sized type, then the d field of Box<T, A::D> takes up zero size, and so the size of the resulting Box<T, A::D> is a single word.

Answer 34 · 2017-05-05T17:40:24.000Z

Is there a timeline for when this will land? I'm working on some allocator stuff, and it'd be nice if this stuff were there for me to build off of.

If there's interest, I'd be happy to lend some cycles to this, but I'm relatively new to Rust, so that might just create more work for the maintainers in terms of having to code review a newbie's code. I don't want to step on anybody's toes, and I don't want to make more work for people.

Answer 35 · 2017-05-19T16:52:25.000Z

Ok we've recently met to evaluate the state of allocators and I think there's some good news for this as well! It looks like support has not yet landed in libstd for these APIs, but everyone's still happy with them landing at any time!

One thing we discussed is that changing over all the libstd types may be a bit premature due to possible inference issues, but regardless of that it seems like a good idea to land the Allocator trait and the Layout type in the proposed std::heap module for experimentation elsewhere in the ecosystem!

@joshlf if you'd like to help out here I think that'd be more than welcome! The first piece will likely be landing the basic type/trait from this RFC into the standard library, and then from there we can start experimenting and playing around with collections in libstd as well.

Answer 36 · 2017-05-19T21:55:02.000Z

@alexcrichton I think your link is broken? It points back here.

One thing we discussed is that changing over all the libstd types may be a bit premature due to possible inference issues

Adding the trait is a good first step, but without refactoring existing APIs to use it they won't see much usage. In #27336 (comment) I propose we can refactor the crates behind the facade immediately, but add newtype wrappers in std. Annoying to do, but allows us to make progress.

Answer 37 · 2017-05-19T22:37:18.000Z

@alexcrichton What would be the process for getting object allocators in? My experiments so far (soon to be public; I can add you to the private GH repo if you're curious) and the discussion here have led me to believe that there's going to be a near-perfect symmetry between the allocator traits and object allocator traits. E.g., you'll have something like (I changed Address to *mut u8 for symmetry with *mut T from ObjectAllocator<T>; we'd probably end up with Address<T> or something like that):

unsafe trait Allocator {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<*mut u8, AllocErr>;
    unsafe fn dealloc(&mut self, ptr: *mut u8, layout: Layout);
}

unsafe trait ObjectAllocator<T> {
    unsafe fn alloc(&mut self) -> Result<*mut T, AllocErr>;
    unsafe fn dealloc(&mut self, ptr: *mut T);
}

Thus, I think experimenting with both allocators and object allocators at the same time could be useful. I'm not sure if this is the right place for that, though, or whether there should be another RFC or, at the very least, a separate PR.

Answer 38 · 2017-05-20T05:02:56.000Z

Oh I meant to link here which also has information about the global allocator. @joshlf is that what you're thinking?

Answer 39 · 2017-05-30T14:35:49.000Z

It sounds like @alexcrichton wants a PR that provides the Allocator trait and Layout type, even if its not integrated into any collection in libstd.

If I understand that correctly, then I can put up a PR for that. I had not done so because I keep trying to get at least integration with RawVec and Vec prototyped. (At this point I have RawVec done, but Vec is a bit more challenging due to the many other structures that build off of it, like Drain and IntoIter etc...)

Answer 40 · 2017-05-30T14:50:29.000Z

actually, my current branch seems like it might actually build (and the one test for integration withRawVec passed), so I went ahead and posted it: #42313

Answer 41 · 2017-05-30T14:58:05.000Z

@hawkw asked:

Forgive me if I'm missing something obvious, but is there a reason for the Layout trait described in this RFC not to implement Copy as well as Clone, since it's just POD?

The reason I made Layout only implement Clone and not Copy is that I wanted to leave open the possibility of adding more structure to the Layout type. In particular, I still am interested in trying to have the Layout attempt to track any type structure used to construct it (e.g. 16-array of struct { x: u8, y: [char; 215] }), so that allocators would have the option of exposing instrumentation routines that report on what types their current contents are composes from.

This would almost certainly have to be an optional feature, i.e. it seems like the tide is firmly against forcing developers to to use the type-enriched Layout constructors. so any instrumentation of this form would need to include something like an "unknown memory blocks" category to handle allocations that do not have the type information.

But nonetheless, features like this were the main reason why I did not opt to make Layout implement Copy; I basically figured that an implementation of Copy would be a premature constraint on Layout itself.

Answer 42 · 2017-06-06T04:00:12.000Z

@alexcrichton @pnkfelix

Looks like @pnkfelix has this one covered, and that PR is getting traction, so let's just go with that. I'm looking over it and making comments now, and it looks great!

Answer 43 · 2017-06-06T18:59:51.000Z

Currently the signature for Allocator::oom is:

    fn oom(&mut self, _: AllocErr) -> ! {
        unsafe { ::core::intrinsics::abort() }
    }

It was brought to my attention, though, that Gecko at least likes to know the allocation size as well on OOM. We may wish to consider this when stabilizing to perhaps add context like Option<Layout> for why OOM is happening.

Answer 44 · 2017-06-06T19:13:53.000Z

@alexcrichton
Might it be worth having either multiple oom_xxx variants or an enum of different argument types? There are a couple of different signatures for methods that could fail (e.g., alloc takes a layout, realloc takes a pointer, an original layout, and a new layout, etc), and there might be cases in which an oom-like method would want to know about all of them.

Answer 45 · 2017-06-06T22:21:32.000Z

@joshlf that's true, yes, but I'm not sure if it's useful. I wouldn't want to just add features because we can, they should continue to be well motivated.

A point for stabilization here is also "determine what the requirements are on the alignment provided to fn dealloc", and the current implementation of dealloc on Windows uses align to determine how to correctly free. @ruuda you may be interested in this fact.

Answer 46 · 2017-06-07T10:41:16.000Z

A point for stabilization here is also "determine what the requirements are on the alignment provided to fn dealloc", and the current implementation of dealloc on Windows uses align to determine how to correctly free.

Yes, I think this is how I initially ran into this; my program crashed on Windows because of this. As HeapAlloc makes no alignment guarantees, allocate allocates a bigger region and stores the original pointer in a header, but as an optimization this is avoided if the alignment requirements would be satisfied anyway. I wonder if there is a way to convert HeapAlloc into an alignment-aware allocator that does not require alignment on free, without losing this optimization.

Answer 47 · 2017-06-07T11:23:18.000Z

@ruuda

As HeapAlloc makes no alignment guarantees

It does provide a minimum alignment guarantee of 8 bytes for 32bit or 16 bytes for 64bit, it just doesn't provide any way to guarantee alignment higher than that.

The _aligned_malloc provided by the CRT on Windows can provide allocations of higher alignment, but notably it must be paired with _aligned_free, using free is illegal. So if you don't know whether an allocation was done via malloc or _aligned_malloc then you're stuck in the same conundrum that alloc_system is in on Windows if you don't know the alignment for deallocate. The CRT does not provide the standard aligned_alloc function which can be paired with free, so even Microsoft hasn't been able to solve this problem. (Although it is a C11 function and Microsoft doesn't support C11 so that's a weak argument.)

Do note that deallocate only cares about the alignment to know whether it is overaligned, the actual value itself is irrelevant. If you wanted a deallocate that was truly alignment independent, you could simply treat all allocations as overaligned, but you'd waste a lot of memory on small allocations.

Answer 48 · 2017-06-07T15:42:17.000Z

@alexcrichton wrote:

Currently the signature for Allocator::oom is:
    fn oom(&mut self, _: AllocErr) -> ! {
        unsafe { ::core::intrinsics::abort() }
    }
It was brought to my attention, though, that Gecko at least likes to know the allocation size as well on OOM. We may wish to consider this when stabilizing to perhaps add context like Option<Layout> for why OOM is happening.

The AllocErr already carries the Layout in the AllocErr::Exhausted variant. We could just add the Layout to the AllocErr::Unsupported variant as well, which I think would be simplest in terms of client expectations. (It does have the drawback of silghtly increasing the side of the AllocErr enum itself, but maybe we shouldn't worry about that...)

Answer 49 · 2017-06-07T15:45:12.000Z

Oh I suspect that's all that's needed, thanks for the correction @pnkfelix!

Answer 50 · 2017-06-20T14:37:59.000Z

I'm going to start repurposing this issue for the tracking issue for std::heap in general as it will be after #42727 lands. I'll be closing a few other related issues in favor of this.

Answer 51 · 2017-06-20T14:52:39.000Z

Is there a tracking issue for converting collections? Now that the PRs are merged, I'd like to

Discuss the associated error type
Discuss converting collections to use any local allocator, (especially leveraging associated error type)

Answer 52 · 2017-06-20T15:11:43.000Z

I've opened #42774 to track integration of Alloc into std collections. With historical discussion in the libs team that's likely to be on a separate track of stabilization than an initial pass of the std::heap module.

Answer 53 · 2017-06-20T17:22:16.000Z

While reviewing allocator-related issues I also came across #30170 which @pnkfelix awhile back. It looks like the OSX system allocator is buggy with high alignments and when running that program with jemalloc it's segfaulting during deallocation on Linux at least. Worth considering during stabilization!

Answer 54 · 2017-06-21T11:49:42.000Z

I've opened #42794 as a place to discuss the specific question of whether zero-sized allocations need to match their requested alignment.

Answer 55 · 2017-06-21T11:50:34.000Z

(oh wait, zero-sized allocations are illegal in user allocators!)

Answer 56 · 2017-07-08T08:33:00.000Z

Since the alloc::heap::allocate function and friends are now gone in Nightly, I’ve updated Servo to use this new API. This is part of the diff:

-use alloc::heap;
+use alloc::allocator::{Alloc, Layout};
+use alloc::heap::Heap;

-        let ptr = heap::allocate(req_size as usize, FT_ALIGNMENT) as *mut c_void;
+        let layout = Layout::from_size_align(req_size as usize, FT_ALIGNMENT).unwrap();
+        let ptr = Heap.alloc(layout).unwrap() as *mut c_void;

I feel the ergonomics are not great. We went from importing one item to importing three from two different modules.

Would it make sense to have a convenience method for allocator.alloc(Layout::from_size_align(…))?
Would it make sense to make <Heap as Alloc>::_ methods available as free functions or inherent methods? (To have one fewer item to import, the Alloc trait.)

Answer 57 · 2017-07-08T08:33:54.000Z

Alternatively, could the Alloc trait be in the prelude or is it too niche of a use case?

Answer 58 · 2017-07-08T08:38:10.000Z

@SimonSapin IMO there isn't much of a point in optimizing the ergonomics of such a low-level API.

Answer 59 · 2017-07-08T16:26:54.000Z

@SimonSapin

I feel the ergonomics are not great. We went from importing one item to importing three from two different modules.

I had the exact same feeling with my codebase - it's pretty clunky now.

Would it make sense to have a convenience method for allocator.alloc(Layout::from_size_align(…))?

Do you mean in the Alloc trait, or just for Heap? One thing to consider here is that there's now a third error condition: Layout::from_size_align returns an Option, so it could return None in addition to the normal errors you can get when allocating.

Alternatively, could the Alloc trait be in the prelude or is it too niche of a use case?

IMO there isn't much of a point in optimizing the ergonomics of such a low-level API.

I agree that it's probably too low-level to put in the prelude, but I still think that there's value in optimizing the ergonomics (selfishly, at least - that was a really annoying refactor 😝 ).

Answer 60 · 2017-07-08T18:27:59.000Z

@SimonSapin did you not handle OOM before? Also in std all three types are available in the std::heap module (they're supposed to be in one module). Also did you not handle the case of overflowing sizes before? Or zero-sized types?

Answer 61 · 2017-07-08T18:55:37.000Z

did you not handle OOM before?

When it existed the alloc::heap::allocate function returned a pointer without a Result and did not leave a choice in OOM handling. I think it aborted the process. Now I’ve added .unwrap() to panic the thread.

they're supposed to be in one module

I see now that heap.rs contains pub use allocator::*;. But when I clicked Alloc in the impl listed on the rustdoc page for Heap I was sent to alloc::allocator::Alloc.

As to the rest, I haven’t looked into it. I’m porting to a new compiler a big pile of code that was written years ago. I think these are callbacks for FreeType, a C library.

Answer 62 · 2017-07-09T00:20:27.000Z

When it existed the alloc::heap::allocate function returned a pointer without a Result and did not leave a choice in OOM handling.

It did give you a choice. The pointer it returned could have been a null pointer which would indicate the heap allocator failed to allocate. This is why I'm so glad it switched to Result so people don't forget to handle that case.

Answer 63 · 2017-07-09T06:24:48.000Z

Oh well, maybe the FreeType ended up doing a null check, I don’t know. Anyway, yes, returning a Result is good.

Answer 64 · 2017-07-13T11:05:53.000Z

Given #30170 and #43097, I am tempted to resolve the OS X issue with ridiculously big alignments by simply specifying that users cannot request alignments >= 1 << 32.

One very easy way to enforce this: Change the Layout interface so that align is denoted by a u32 instead of a usize.

@alexcrichton do you have thoughts on this? Should I just make a PR that does this?

Answer 65 · 2017-07-13T11:22:47.000Z

@pnkfelix Layout::from_size_align would still take usize and return an error on u32 overflow, right?

Answer 66 · 2017-07-13T11:29:14.000Z

@SimonSapin what reason is there to have it continue taking usize align, if a static precondition is that it is unsafe to pass a value >= 1 << 32 ?

Answer 67 · 2017-07-13T11:31:30.000Z

and if the answer is "well some allocators might support an alignment >= 1 << 32", then we're back to the status quo and you can disregard my suggestion. The point of my suggestion is basically a "+1" to comments like this one

Answer 68 · 2017-07-13T11:40:38.000Z

Because std::mem::align_of returns usize

Answer 69 · 2017-07-13T11:42:57.000Z

@SimonSapin ah, good old stable API's... sigh.

Answer 70 · 2017-07-13T14:22:58.000Z

@pnkfelix limiting to 1 << 32 seems reasonable to me!

Answer 71 · 2017-10-16T17:16:25.000Z

@rfcbot fcp merge

Ok this trait and its types have bake for awhile now and also been the underlying implementation of the standard collections since its inception. I would propose starting out with a particularly conservative initial offering, namely only stabilizing the following interface:

pub struct Layout { /* ... */ }

extern {
    pub type void;
}

impl Layout {
    pub fn from_size_align(size: usize, align: usize) -> Option<Layout>;
    pub unsafe fn from_size_align_unchecked(size: usize, align: usize) -> Layout;

    pub fn size(&self) -> usize;
    pub fn align(&self) -> usize;
}

pub unsafe trait Alloc {
    unsafe fn alloc(&mut self, layout: Layout) -> *mut void;
    unsafe fn alloc_zeroed(&mut self, layout: Layout) -> *mut void;
    unsafe fn dealloc(&mut self, ptr: *mut void, layout: Layout);

    // all other methods are default and unstable
}

/// The global default allocator
pub struct Heap;

impl Alloc for Heap {
    // ...
}

impl<'a> Alloc for &'a Heap {
    // ...
}

/// The "system" allocator
pub struct System;

impl Alloc for System {
    // ...
}

impl<'a> Alloc for &'a System {
    // ...
}

Original proposal

pub struct Layout { /* ... */ }

impl Layout {
    pub fn from_size_align(size: usize, align: usize) -> Option<Layout>;
    pub unsafe fn from_size_align_unchecked(size: usize, align: usize) -> Layout;

    pub fn size(&self) -> usize;
    pub fn align(&self) -> usize;
}

// renamed from AllocErr today
pub struct Error {
    // ...
}

impl Error {
    pub fn oom() -> Self;
}

pub unsafe trait Alloc {
    unsafe fn alloc(&mut self, layout: Layout) -> Result<*mut u8, Error>;
    unsafe fn dealloc(&mut self, ptr: *mut u8, layout: Layout);

    // all other methods are default and unstable
}

/// The global default allocator
pub struct Heap;

impl Alloc for Heap {
    // ...
}

impl<'a> Alloc for &'a Heap {
    // ...
}

/// The "system" allocator
pub struct System;

impl Alloc for System {
    // ...
}

impl<'a> Alloc for &'a System {
    // ...
}

Notably:

Only stabilizing alloc, alloc_zeroed, and dealloc methods on the Alloc trait for now. I think this solves the most pressing issue we have today, defining a custom global allocator.
Remove the Error type in favor of just using raw pointers.
Change the u8 type in the interface to void
A stripped down version of the Layout type.

There are still open questions such as what to do with dealloc and alignment (precise alignment? fits? unsure?), but I'm hoping we can resolve them during FCP as it likely won't be an API-breaking change.

Answer 72 · 2017-10-16T17:51:18.000Z

+1 to getting something stabilized!

Renames AllocErr to Error and moves the interface to be a bit more conservative.

Does this eliminate the option for allocators to specify Unsupported? At the risk of harping more on something I've been harping on a lot, I think that #44557 is still an issue.

Layout

It looks like you've removed some of the methods from Layout. Did you mean to have the ones you left out actually removed, or just left as unstable?

Answer 73 · 2017-10-16T18:15:37.000Z

impl Error {
    pub fn oom() -> Self;
}

Is this a constructor for what’s today AllocErr::Exhausted? If so, shouldn’t it have a Layout parameter?

Answer 74 · 2017-10-16T18:16:41.000Z

Team member @alexcrichton has proposed to merge this. The next step is review by the rest of the tagged teams:

Concerns:

*mut u8 resolved by #32838 (comment)

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

Answer 75 · 2017-10-16T19:00:13.000Z

I'm really excited about getting to stabilize some of this work!

One question: in the above thread, @joshlf and @Ericson2314 raised an interesting point about the possibility of separating the Alloc and Dealloc traits in order to optimize for cases in which alloc requires some data, but dealloc requires no extra information, so the Dealloc type can be zero-sized.

Was this question ever resolved? What are the disadvantages of separating the two traits?

Answer 76 · 2017-10-16T19:05:14.000Z

@joshlf

Does this eliminate the option for allocators to specify Unsupported?

Yes and no, it would mean that such an operation is not supported in stable rust immediately, but we could continue to support it in unstable Rust.

Did you mean to have the ones you left out actually removed, or just left as unstable?

Indeed! Again though I'm just proposing a stable API surface area, we can leave all the other methods as unstable. Over time we can continue to stabilize more of the functionality. I think it's best to start as conservative as we can.

@SimonSapin

Is this a constructor for what’s today AllocErr::Exhausted? If so, shouldn’t it have a Layout parameter?

Aha good point! I sort of wanted to leave the possibility though to make Error a zero-sized type if we really needed it, but we can of course keep the layout-taking methods in unstable and stabilize them if necessary. Or do you think that the layout-preserving Error should be stabilized in the first pass?

@cramertj

I hadn't personally seen such a question/concern yet (I think I missed it!), but I wouldn't personally see it as being worth it. Two traits is twice the boilerplate in general, as now everyone would have to type Alloc + Dealloc in collections for example. I would expect that such a specialized use would not want to inform the interface all other users end up using, personally.

Answer 77 · 2017-10-16T19:41:52.000Z

@cramertj @alexcrichton

I hadn't personally seen such a question/concern yet (I think I missed it!), but I wouldn't personally see it as being worth it.

In general I agree that it's not worth it with one glaring exception: Box. Box<T, A: Alloc> would, given the current definition of Alloc, have to be at least two words large (the pointer that it already has and a reference to an Alloc at the very least) except in the case of global singletons (which can be implemented as ZSTs). A 2x (or more) blowup in the space required to store such a common and fundamental type is concerning to me.

Answer 78 · 2017-10-16T20:38:13.000Z

@alexcrichton

as now everyone would have to type Alloc + Dealloc in collections for example

We could add something like this:

trait Allocator: Alloc + Dealloc {}
impl<T> Allocator for T where T: Alloc + Dealloc {}

Answer 79 · 2017-10-16T21:03:19.000Z

A 2x (or more) blowup in the space required to store such a common and fundamental type

Only when you use a custom allocator that is not process-global. std::heap::Heap (the default) is zero-size.

Answer 80 · 2017-10-16T21:07:44.000Z

Or do you think that the layout-preserving Error should be stabilized in the first pass?

@alexcrichton I don’t really understand why this proposed first pass is at it is at all. There’s barely more than could already be done by abusing Vec, and not enough for example to use https://crates.io/crates/jemallocator.

What still needs to be resolved to stabilize the whole thing?

Answer 81 · 2017-10-16T21:31:45.000Z

Only when you use a custom allocator that is not process-global. std::heap::Heap (the default) is zero-size.

That seems like the primary use case of having parametric allocators, no? Imagine the following simple definition of a tree:

struct Node<T, A: Alloc> {
    t: T,
    left: Option<Box<Node<T, A>>>,
    right: Option<Box<Node<T, A>>>,
}

A tree constructed from those with a 1-word Alloc would have a ~1.7x blowup in size for the whole data structure compared to a ZST Alloc. That seems pretty bad to me, and these kinds of applications are sort of the whole point of having Alloc be a trait.

Answer 82 · 2017-10-16T21:40:32.000Z

@cramertj

We could add something like this:

We're also going to have actual trait aliases :) #41517

Answer 83 · 2017-10-17T00:05:02.000Z

@glaebhoerl Yeah, but stabilization still seems a ways off since there isn't an implementation yet. If we disable manual impls of Allocator I think we can switch to trait-aliases backwards-compatibly when they arrive ;)

Answer 84 · 2017-10-17T20:29:40.000Z

@joshlf

A 2x (or more) blowup in the space required to store such a common and fundamental type is concerning to me.

I'd imagine all implementations today are just a zero-size type or a pointer large, right? Isn't the possible optimization that some pointer-size-types can be zero sized? (or something like that?)

@cramertj

We could add something like this:

Indeed! We've then taken one trait to three though. In the past we've never had a great experience with such traits. For example Box<Both> does not cast to Box<OnlyOneTrait>. I'm sure we could wait for language features to smooth all this out, but it seems like those are a long way off, at best.

@SimonSapin

What still needs to be resolved to stabilize the whole thing?

I don't know. I wanted to start with the absolute smallest thing so there would be less debate.

Answer 85 · 2017-10-17T20:33:18.000Z

I'd imagine all implementations today are just a zero-size type or a pointer large, right? Isn't the possible optimization that some pointer-size-types can be zero sized? (or something like that?)

Yeah, the idea is that, given a pointer to an object allocated from your your type of allocator, you can figure out which instance it came from (e.g., using inline metadata). Thus, the only information you need to deallocate is type information, not runtime information.

Answer 86 · 2017-10-17T20:39:28.000Z

To come back to alignment on deallocate, I see two ways forward:

Stabilize as proposed (with alignment on deallocate). Giving away ownership of manually allocated memory would be impossible unless the Layout is included. In particular, it is impossible to build a Vec or Box or String or other std container with a stricter alignment than required (for instance because you don't want the boxed element to straddle a cache line), without deconstructing and deallocating it manually later (which is not always an option). Another example of something that would be impossible, is filling a Vec using simd operations, and then giving it away.
Do not require alignment on deallocate, and remove the small-allocation optimization from Windows’ HeapAlloc-based alloc_system. Always store the alignment. @alexcrichton, as you committed that code, do you remember why it was put there in the first place? Do we have any evidence that it saves a significant amount of memory for real-world applications? (With microbenchmarks it is possible to make the results come out either way depending the allocation size — unless HeapAlloc was rounding up sizes anyway.)

In any case, this is a very difficult trade off to make; the memory and performance impact will depend highly on the type of application, and which one to optimize for is application-specific as well.

Answer 87 · 2017-10-17T20:51:17.000Z

I think we may actually be Just Fine (TM). Quoting the Alloc docs:

Some of the methods require that a layout fit a memory block.
What it means for a layout to "fit" a memory block means (or
equivalently, for a memory block to "fit" a layout) is that the
following two conditions must hold:

The block's starting address must be aligned to layout.align().

The block's size must fall in the range [use_min, use_max], where:

use_min is self.usable_size(layout).0, and

use_max is the capacity that was (or would have been)
returned when (if) the block was allocated via a call to
alloc_excess or realloc_excess.

Note that:

the size of the layout most recently used to allocate the block
is guaranteed to be in the range [use_min, use_max], and

a lower-bound on use_max can be safely approximated by a call to
usable_size.

if a layout k fits a memory block (denoted by ptr)
currently allocated via an allocator a, then it is legal to
use that layout to deallocate it, i.e. a.dealloc(ptr, k);.

Note that last bullet. If I allocate with a layout with alignment a, then it should be legal for me to deallocate with alignment b < a because an object which is aligned to a is also aligned to b, and thus a layout with alignment b fits an object allocated with a layout with alignment a (and with the same size).

What this means is that you should be able to allocate with an alignment which is greater than the minimum alignment required for a particular type and then allow some other code to deallocate with the minimum alignment, and it should work.

Answer 88 · 2017-10-17T23:19:57.000Z

Isn't the possible optimization that some pointer-size-types can be zero sized? (or something like that?)

There was an RFC for this recently and it seems very unlikely that it could be done due to compatibility concerns: rust-lang/rfcs#2040

For example Box<Both> does not cast to Box<OnlyOneTrait>. I'm sure we could wait for language features to smooth all this out, but it seems like those are a long way off, at best.

Trait object upcasting on the other hand seems uncontroversially desirable, and mostly a question of effort / bandwidth / willpower to get it implemented. There was a thread recently: https://internals.rust-lang.org/t/trait-upcasting/5970

Answer 89 · 2017-10-18T03:13:40.000Z

@ruuda I was the one who wrote that alloc_system implementation originally. alexcrichton merely moved it around during the great allocator refactors of <time period>.

The current implementation requires that you deallocate with the same alignment specified that you allocated a given memory block with. Regardless of what the documentation may claim, this is the current reality that everyone must abide by until alloc_system on Windows is changed.

Allocations on Windows always use a multiple of MEMORY_ALLOCATION_ALIGNMENT (although they remember the size you allocated them with to the byte). MEMORY_ALLOCATION_ALIGNMENT is 8 on 32bit and 16 on 64bit. For overaligned types, because the alignment is greater than MEMORY_ALLOCATION_ALIGNMENT, the overhead caused by alloc_system is consistently the amount of alignment specified, so a 64byte aligned allocation would have 64 bytes of overhead.

If we decided to extend that overaligned trick to all allocations (which would get rid of the requirement to deallocate with the same alignment that you specified when allocating), then more allocations would have overhead. Allocations whose alignments are identical to MEMORY_ALLOCATION_ALIGNMENT will suffer a constant overhead of MEMORY_ALLOCATION_ALIGNMENT bytes. Allocations whose alignments are less than MEMORY_ALLOCATION_ALIGNMENT will suffer an overhead of MEMORY_ALLOCATION_ALIGNMENT bytes approximately half of the time. If the size of the allocation rounded up to MEMORY_ALLOCATION_ALIGNMENT is greater than or equal to the size of the allocation plus the size of a pointer, then there is no overhead, otherwise there is. Considering 99.99% of allocations will not be overaligned, do you really want to incur that sort of overhead on all those allocations?

Answer 90 · 2017-10-18T14:28:20.000Z

@ruuda

I personally feel that the implementation of alloc_system today on Windows is a bigger benefit than having the ability to relinquish ownership of an allocation to another container like Vec. AFAIK though there's no data to measure the impact of always padding with the alignment and not requiring an alignment on deallocation.

@joshlf

I think that comment is wrong, as pointed out alloc_system on Windows relies on the same alignment being passed to deallocation as was passed on allocation.

Answer 91 · 2017-10-18T17:44:28.000Z

Considering 99.99% of allocations will not be overaligned, do you really want to incur that sort of overhead on all those allocations?

It depends on the application whether the overhead is significant, and whether to optimize for memory or performance. My suspicion is that for most applications either is fine, but a small minority cares deeply about memory, and they really cannot afford those extra bytes. And another small minority needs control over alignment, and they really need it.

Answer 92 · 2017-10-18T18:00:16.000Z

@alexcrichton

I think that comment is wrong, as pointed out alloc_system on Windows relies on the same alignment being passed to deallocation as was passed on allocation.

Doesn't that imply that alloc_system on Windows doesn't actually properly implement the Alloc trait (and thus maybe we ought to change the requirements of the Alloc trait)?

@retep998

If I'm reading your comment correctly, isn't that alignment overhead present for all allocations regardless of whether we need to be able to deallocate with a different alignment? That is, if I allocate 64 bytes with 64 byte alignment and I also deallocate with 64 byte alignment, the overhead you described is still present. Thus, it's not a feature of being able to deallocate with different alignments so much as it is a feature of requesting larger-than-normal alignments.

Answer 93 · 2017-10-18T23:38:43.000Z

@joshlf The overhead caused by alloc_system currently is due to requesting larger-than-normal alignments. If your alignment is less than or equal to MEMORY_ALLOCATION_ALIGNMENT, then there is no overhead caused by alloc_system.

However, if we changed the implementation to allow deallocating with different alignments, then the overhead would apply to nearly all allocations, regardless of alignment.

Answer 94 · 2017-10-18T23:46:55.000Z

Ah I see; makes sense.

Answer 95 · 2017-10-19T05:39:20.000Z

What is the meaning of implementing Alloc for both Heap and &Heap? In what cases would the user use one of those impls vs the other?

Is this the first standard library API in which *mut u8 would mean "pointer to whatever"? There is String::from_raw_parts but that one really does mean pointer to bytes. I am not a fan of *mut u8 meaning "pointer to whatever" -- even C does better. What are some other options? Maybe a pointer to opaque type would be more meaningful.

@rfcbot concern *mut u8

Answer 96 · 2017-10-19T14:01:19.000Z

@dtolnay Alloc for Heap is sort of "standard" and Alloc for &Heap is like Write for &T where the trait requires &mut self but the implementation does not. Notably that means that types like Heap and System are threadsafe and do not need to be synchronized when allocating.

More importantly, though, usage of #[global_allocator] requires that the static it's attached to , which has type T, to have Alloc for &T. (aka all global allocators must be threadsafe)

For *mut u8 I think that *mut () may be interesting, but I don't personally feel too compelled to "get this right" here per se.

Answer 97 · 2017-10-19T15:44:28.000Z

The main advantage of *mut u8 is that it is very convenient to use .offset with byte offsets.

Answer 98 · 2017-10-19T20:30:31.000Z

For *mut u8 I think that *mut () may be interesting, but I don't personally feel too compelled to "get this right" here per se.

If we go with *mut u8 in a stable interface, aren't we locking ourselves in? In other words, once we stabilize this, we won't have a chance to "get this right" in the future.

Also, *mut () seems a bit dangerous to me in case we ever make an optimization like RFC 2040 in the future.

Answer 99 · 2017-10-19T20:33:21.000Z

The main advantage of *mut u8 is that it is very convenient to use .offset with byte offsets.

True, but you could easily do let ptr = (foo as *mut u8) and then go about your merry way. That doesn't seem like enough of a motivation to stick with *mut u8 in the API if there are compelling alternatives (which, to be fair, I'm not sure there are).

Answer 100 · 2017-10-19T20:36:41.000Z

Also, *mut () seems a bit dangerous to me in case we ever make an optimization like RFC 2040 in the future.

That optimization will already probably never happen-- it'd break too much existing code. Even if it did, it would be applied to &() and &mut (), not *mut ().