nodejs/node

Feature: Immutable Buffer `buffer.readonly()`

mikeal opened this issue · 44 comments

Is your feature request related to a problem? Please describe.

I have an object with a buffer attached and I need to be able to pass around that Buffer but ensure that it won’t mutate.

Describe the solution you'd like

A method on Buffer instances to put it into a “read only mode” would be ideal. I’d prefer it not be in the constructor so that I don’t have to perform a memcopy in order to get it.

Describe alternatives you've considered

There isn’t much you can do except force a full copy every time you pass it around, which is pretty bad. Object.freeze() won’t work because all the mutations happen through methods that are effectively invisible to Object.freeze().

However, Object.freeze() has some negative performance implications while implementing this in the Buffer object itself would not have the same problems. This would be a nice features of Node.js Core.

I think this would have to be a language feature, because the only difference between Buffer and Uint8Array is that the former has a few more prototype methods.

/cc @nodejs/open-standards (I guess?)

This would definitely be interesting but I agree we might want to push to tc39 first

You can also define a proxy over the buffer though that would probably be expensive since it'd be hit on every access. I wonder if a proxy with only a set trap (and not a get trap) would be slow or not.

you'd also want to intercept .buffer I assume

Yeah, that’s a good point to figure out when bringing this to TC39 – would we want read-only ArrayBufferViews, or read-only ArrayBuffers, or both?

I think read-only ArrayBuffers would be a useful feature to represent memory that should not or cannot be written to. However, if we still need to write to the memory through a different object, we might need two ArrayBuffers, which would technically act like a View I guess?

Borrowing ArrayBuffer.prototype methods and .calling them on a Proxy would throw, though, wouldn’t it? (since internal slots don’t tunnel)

This is an interesting problem, and I agree that it would probably be best solved at the engine level and across Javascript as a whole.

Is the idea for the API that you would start with a mutable buffer and at some point freeze it?

Do you want a read-only view on something which may still change out from under you? If so, a Proxy-based solution may work, if the performance can be good enough. Otherwise, we need to freeze the underlying ArrayBuffer somehow.

For context, would it be reasonable to copy the Buffer into an immutable structure, or would that be too slow for your needs?

Copying would certainly be too slow, at least for the cases where I think this would be useful. A read only view on an otherwise mutable structure would work for most cases I could imagine but @mikeal may have other requirements. The most common case I would imagine for this is some bit of code that creates/mutates the buffer then passes it off to some other context, after which it would no longer be modified.

I agree with @jasnell. Would it be possible to allow freezing both buffers and views? For example,

  • ArrayBufferView.freeze() marks the view as "frozen", preventing all future write accesses through the view.
  • ArrayBuffer.freeze() marks the buffer as "frozen", preventing all future write accesses to the buffer, including write accesses through views.

Would this work and cover all use cases? It would certainly cause some overhead, but still much less than copying the data into a new buffer.

The name couldn't be "freeze" as that already has a meaning; and I suspect TC39 would likely be more interested in a generic mechanism to make internal slots immutable rather than one-offs on specific builtins.

A read-only view would be fine, as long as it didn't cost much and looked to the reader like any other binary type.

Read-only views might not be enough, though, e.g., to represent memory that must not or cannot be written to. In that case, bypassing the read-only behavior via .buffer should not be possible.

The name couldn't be "freeze" as that already has a meaning

Fair point. I chose the name for lack of a better idea and I think it resembles the semantics quite well.

I suspect TC39 would likely be more interested in a generic mechanism to make internal slots immutable rather than one-offs on specific builtins.

Maybe, I don't know. By the way, the type of [[ArrayBufferData]] is currently "a distinct and mutable sequence of byte-sized (8 bit) numeric values", so making it immutable would be a change of the type, not of the internal field itself as far as I understand it.

Read-only views might not be enough, though, e.g., to represent memory that must not or cannot be written to.

Even if it could be worked around by accessing the underlying ArrayBuffer, making just a Buffer read-only would already be of great benefit to prevent accidental modification of buffers that are intended to be immutable (due to bugs or misunderstanding about a module's API), where modification results in code breakage.

Though I agree that read-only ArrayBuffers to represent read-only external data (which might even segfault if modification is attempted) would also be useful, that's more for benefit of embedders and C++ module authors rather than javascript programmers.

The two seem fairly orthogonal to me: support for read-only external ArrayBuffers would obviously result in read-only ArrayBufferViews, but wouldn't necessarily give you the ability to make read-only views of normal ArrayBuffers, let alone the ability to make an existing ArrayBufferView read-only.
Conversely, read-only views don't require support for read-only ArrayBuffers.

Okay, so assuming that what we’d ideally want is:

  1. Read-only ArrayBuffers (presumably r/o from the start, not explicitly frozen?), which can only have read-only ArrayBufferViews
  2. Read-only ArrayBufferViews, over both writable and read-only ArrayBuffers

Is that something that somebody would be willing to try and work out in TC39?

Because otherwise I don’t think that this issue is particularly actionable for Node.js…

@littledan

I'd be happy to author such a proposal but I'd need a champion.

Because otherwise I don’t think that this issue is particularly actionable for Node.js…

Well, let's not forget the original question was for Buffers, which is also the only case I personally care about, and Buffer is a node.js specific class that happens to be implemented as a subclass of Uint8Array. If the original request, the ability to mark a Buffer as read-only (or at least get a read-only slice of it), can be implemented by nodejs without requiring getting a language change through a committee, that would satisfy me just fine (and I'd presumably not be alone in that).

Getting read-only ArrayBuffers or ArrayBufferViews in general may be nice to have, but I wouldn't have any immediate application for them myself, whereas I've felt the desire for readonly Buffers quite often.

There is already a Stage 1 proposal for introducing read-only records and tuples (https://github.com/tc39/proposal-record-tuple). A read-only ArrayBuffer is essentially a Tuple as defined by the proposal. I think what would be potentially feasible at the language level is something like Uint8Array.from(#[1,2,3]) (that is, allow TypedArray.prototype.from()` to take a readonly tuple as input to produce a typed and still read-only view over the top.

I think immutability should be a property of the object itself, not of where it took the data from (or something like Uint8Array.from(Uint8Array.from(#[1])) might be a case of confusion). Also, that proposal is not near to being completed, and it may change significantly as it goes through further design.

Fair, but anything we may do here should be aligned or done along with that work as it progresses.

I will celebrate the day we get tuples in javascript, however I don't see it being helpful here. Even if read-only ArrayBuffers could be created, I would be extremely surprised if passing a tuple to Uint8Array.from would yield a read-only buffer, I'd expect it to be treated as any other iterable. It also doesn't sound great if making a read-only buffer would require first making a tuple of the data (resulting in data copy) and then making a read-only buffer from that (almost certainly resulting in a second data copy)

If the original request, the ability to mark a Buffer as read-only (or at least get a read-only slice of it), can be implemented by nodejs without requiring getting a language change through a committee, that would satisfy me just fine (and I'd presumably not be alone in that).

Fwiw, I wouldn’t see how that’s doable without non-trivial performance impact.

Fwiw, I wouldn’t see how that’s doable without non-trivial performance impact.

Yeah this is something I can't judge. E.g. I don't know if node can somehow make a ReadOnlyBuffer class that overrides and blocks element-setting and the various mutator methods on Uint8Array and Buffer. Obviously this could be bypassed by someone intent on doing so, but it seems perfectly adequate to catch bugs and accidents.

If the performance impact would be limited to this ReadOnlyBuffer it might still be acceptable, then if read-only ArrayBufferViews ever land in javascript it could be updated to use those.

bmeck commented

It might be more prudent to first go through TC53 instead of TC39]first which likely has a bigger interest in things like this for putting values into ROM as much as possible unlike most desktop runtimes. That might at least be a first step to gather implementer interest before presenting TC39 a proposal.

which likely has a bigger interest in things like this for putting values into ROM as much as possible unlike most desktop runtimes

Fwiw, I don’t think that’s the major motivation here. Making read-only memory available to JS seems to be a secondary concern, the primary one being the ability to prevent mutations from accidental programming errors.

A good way to drive an eventual standard anywhere would be to build something in Node.js.

Buffer was built before we had binary in the language, this is an area Node.js has lead standards before, I don’t see why it can’t lead again.

@mikeal Well, the reason that doing this inside of Node.js core is tricky is that we have switched Buffer to being based the language-provided feature, rather than keeping our own.

The only thing I could see happening from a technical perspective is providing a read-only Buffer variant that is essentially a Proxy for a “real” Buffer. But that doesn’t seem to be what you had in mind?

If it performed well that would be sufficient, but the last time I looked at Proxy objects this was problematic.

As the .buffer property of Uint8Array houses the underlying memory, that can be used by other API's [1], it would seem prudent that this is a feature of ArrayBuffer rather than Buffer or Uint8Array.

Since ArrayBuffer can not be written by itself, it might be a good idea to either have a one-time mutating operation arrayBuffer.lock() or a means to retrieve a read-only adapter: new Uint8Array(arrayBuffer.readonly()) or new Uint8Array(new ReadonlyArrayBuffer(arrayBuffer)).

[1]

const a = new Uint8Array(8)
const b = new Uint8Array(a.buffer)
b[0] = 1
a[0] === 1

On a different note: Is this related to having "secure" buffers? I.e. https://libsodium.gitbook.io/doc/memory_management#guarded-heap-allocations

I would be very happy for a tc39 proposal 😊

See also https://github.com/tc39/proposal-readonly-collections and https://github.com/domenic/proposal-arraybuffer-transfer as potentially related. If such a feature were to exist, I would expect it to have a high level of synergy with the read-only collections proposal.

I'd like to ask the champions of the readonly collections proposal, @erights @phoddie : is anything that Node contributors could do that would be of use to you in developing your proposal in this area? It's hard for me to tell how it's been going since it was first presented at TC39.

Good question, @littledan. We do seem a bit stalled on this proposal. Let me share a few thoughts.

The read-only collections proposal combines immutable collections with some clever methods for working with them. Having read-only collections with well defined behaviors is essential for embedded work. We use them in XS today, but they have non-normative behaviors. I believe these are the fundamental questions that need to be addressed to allow for conforming implementations:

  • How does a collection transition from mutable to immutable?
  • What happens when you try to mutate an immutable collection?
  • How do you detect a collection is immutable?

While ArrayBuffer isn't strictly a collection, it needs to be addressed as backing store for both TypedArrays and DataViews.

The methods described in the proposal should work well for embedded (note: we have not implemented theme yet in XS). They do provide more functionality than we strictly need, however. A discussion of what is minimally necessary -- particularly from a security perspective -- would be valuable to help focus the immutable collection proposal and move it forward.

How does a collection transition from mutable to immutable?

IMO, immutable types should be immutable when instantiated. Given the perf issues of Object.freeze() I’m a bit sour on dynamically making collections immutable and none of the use cases I have require that I move a collection from mutable to immutable, they can all be handled by instantiating them in an immutable form.

What happens when you try to mutate an immutable collection?

Throw an exception.

How do you detect a collection is immutable?

Something along the lines of Object.isMutable().

@mikeal - I'm not certain whether the performance issues of Object.freeze in some engines are inherent in the feature or a consequence of frozen objects being a relatively unused feature on the web and hence not well optimized yet. If you know, I would be interested.

...none of the use cases I have require that I move a collection from mutable to immutable

It is pretty common for us to create a collection and then want to make it immutable. Requiring a collection to be immutable only at creation implies cloning a mutable collection into an immutable one. Given that we do this at build time, the performance and memory hit isn't a concern. For other uses, it could be.

Throw an exception.

I tend to agree. Which exception is less obvious. JavaScript seems to prefer reusing existing exceptions but here I think a new exception is needed for detection to be reliable.

Something along the lines of Object.isMutable()

Putting it on object makes some sense but also seems to imply an expanded scope. (A better alternative doesn't leap to mind, alas.)

This feature indeed would be very useful. Can someone recap where we are at please? Is there an official proposal already?

There has been no activity on this feature request for 5 months and it is unlikely to be implemented. It will be closed 6 months after the last non-automated comment.

For more information on how the project manages feature requests, please consult the feature request management document.

Please keep this open.

There has been no activity on this feature request for 5 months and it is unlikely to be implemented. It will be closed 6 months after the last non-automated comment.

For more information on how the project manages feature requests, please consult the feature request management document.

I certainly will use this feature a lot, when reading from cached buffers, I likely use subarray for getting an view and will be very useful if this subarray are only read only to not modify the cached buffer

There has been no activity on this feature request for 5 months and it is unlikely to be implemented. It will be closed 6 months after the last non-automated comment.

For more information on how the project manages feature requests, please consult the feature request management document.

To recap: this needs someone to champion a proposal at TC39.

I'm going to close the issue because, while the discussion is interesting, it's inactionable for Node.js.

Just FYI – there is a proposal at TC39 for read-only collections which provides for immutable ArrayBuffers. The champions are @erights and me. It hasn't been very active, so it may still be appropriate to close this here.