Make paging state strongly typed

Question

Opened this issue 2 months ago · 12 comments

Currently, paging state is represented both in the driver's internals and in the public API as a Bytes blob. As it's no use allowing the users to construct paging state on their own, I suggest creating an opaque newtype for it, with a private constructor.

Additionally, the core benefit that Bytes brings - being able to subslice a bigger allocation - does not apply in the case of paging state. I believe that paging state could be better represented as Arc<[u8]>.
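A minimal sketch of what such an opaque newtype could look like. The module name `driver`, the type name `PagingState`, and every method name here are hypothetical and chosen only for illustration; this is not the driver's actual API.

```rust
mod driver {
    use std::sync::Arc;

    /// Hypothetical opaque wrapper around the raw paging state bytes.
    /// The inner field is private, so code outside this module cannot
    /// build or modify a paging state directly.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct PagingState(Arc<[u8]>);

    impl PagingState {
        /// Private constructor: callable only from inside this module.
        fn new(raw: &[u8]) -> Self {
            // Copies the bytes into their own exactly-sized allocation.
            PagingState(Arc::from(raw))
        }

        /// Read-only access to the raw bytes.
        pub fn as_bytes(&self) -> &[u8] {
            &self.0
        }
    }

    /// Stand-in for the driver code that extracts paging state
    /// from a response (assumed here to be a plain byte slice).
    pub fn paging_state_from_response(raw: &[u8]) -> PagingState {
        PagingState::new(raw)
    }
}
```

Cloning such a value only bumps a reference count, which is the cheap-copy property that Bytes also offers.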

Answer 1 · 2024-04-24T08:44:36.000Z

> As it's no use allowing the users to construct paging state on their own, I suggest creating an opaque newtype for it, with a private constructor.

They may want to serialize paging state and deserialize it later.

Answer 2 · 2024-04-24T09:14:24.000Z

> > As it's no use allowing the users to construct paging state on their own, I suggest creating an opaque newtype for it, with a private constructor.

> They may want to serialize paging state and deserialize it later.

Can't this be said about any struct that we intentionally restrict access to?

Answer 3 · 2024-04-24T09:53:23.000Z

No, this is an actual, existing use case: it's how you can implement paging in, e.g., a web service.
Even the Java Driver docs mention this use case: https://java-driver.docs.scylladb.com/scylla-3.11.2.x/manual/paging/index.html
It's hardly comparable to serializing any other random struct.
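To make the web-service use case concrete, here is a rough sketch of turning raw paging state bytes into a text token that a client can send back with its next request. The function names `encode_token` and `decode_token` and the choice of hex encoding are assumptions made for this example; they are not part of any driver.

```rust
/// Encodes raw paging state bytes as a lowercase hex string that can be
/// embedded in a URL or a JSON response.
pub fn encode_token(raw: &[u8]) -> String {
    raw.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Decodes a token produced by `encode_token`.
/// Returns `None` if the token is not well-formed hex.
pub fn decode_token(token: &str) -> Option<Vec<u8>> {
    // Reject odd lengths and any non-hex character up front; this also
    // guarantees the string is ASCII, so slicing by byte index is safe.
    if token.len() % 2 != 0 || !token.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..token.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&token[i..i + 2], 16).ok())
        .collect()
}
```

The service would hand the encoded token to the client along with a page of results, then decode it on the follow-up request to resume the query where it left off.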

Answer 4 · 2024-04-24T10:18:14.000Z

Then it would make sense to expose a constructor (from_raw) and an accessor (into_raw/into_inner).
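One possible shape for that pair of functions, as a sketch only. The type name `PagingState` and the exact signatures are assumptions; the comment above names the functions but does not specify their parameter or return types.

```rust
use std::sync::Arc;

/// Hypothetical paging state newtype with a public raw round trip.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PagingState(Arc<[u8]>);

impl PagingState {
    /// Rebuilds a paging state from bytes that were stored or
    /// transmitted earlier. Accepts `Vec<u8>`, `&[u8]`, or `Arc<[u8]>`.
    pub fn from_raw(raw: impl Into<Arc<[u8]>>) -> Self {
        PagingState(raw.into())
    }

    /// Consumes the paging state and returns the underlying bytes.
    pub fn into_raw(self) -> Arc<[u8]> {
        self.0
    }
}
```

With these two functions the type stays distinct from arbitrary byte buffers in signatures, while users keep an explicit way in and out for persistence.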

Answer 5 · 2024-04-24T10:25:54.000Z

What is the problem with the current version (just using Bytes)?
A custom struct will force users to create their own newtype wrapper to be able to serialize it; that's two more levels of abstraction, for which I don't see a compelling benefit.

Answer 6 · 2024-04-24T10:30:03.000Z

> What is the problem with the current version (just using Bytes)?

The answer is:

> Additionally, the core benefit that Bytes brings - being able to subslice a bigger allocation - does not apply in the case of paging state. I believe that paging state could be better represented as Arc<[u8]>.

When I see Bytes, I expect some shared ownership of the whole frame. This is misleading, because paging state's Bytes are currently created from an exclusively owned buffer that is distinct from the frame Bytes.

Answer 7 · 2024-04-24T10:31:26.000Z

> A custom struct will force users to create their own newtype wrapper to be able to serialize it

Not if we provide the two functions I mentioned:

> Then it would make sense to expose a constructor (from_raw) and an accessor (into_raw/into_inner).

Answer 8 · 2024-04-24T10:54:07.000Z

> > A custom struct will force users to create their own newtype wrapper to be able to serialize it

> Not if we provide the two functions I mentioned:

Yes, even then, because people use frameworks for serialization (serde, rkyv, etc.).

> > Then it would make sense to expose a constructor (from_raw) and an accessor (into_raw/into_inner).

Answer 9 · 2024-04-24T10:56:26.000Z

> > What is the problem with the current version (just using Bytes)?

> The answer is:

> > Additionally, the core benefit that Bytes brings - being able to subslice a bigger allocation - does not apply in the case of paging state. I believe that paging state could be better represented as Arc<[u8]>.

> When I see Bytes, I expect some shared ownership of the whole frame. This is misleading, because paging state's Bytes are currently created from an exclusively owned buffer that is distinct from the frame Bytes.

Now I don't understand at all. You don't want Bytes because it implies shared ownership (it doesn't; it can be used here without any problem), and you propose Arc<[u8]>, which is definitely shared ownership. I'm lost.

Answer 10 · 2024-04-24T11:03:45.000Z

> I expect some shared ownership of the whole frame.

Perhaps I should have made this bold:

I expect some shared ownership of the whole frame.

Bytes enables subslicing while retaining shared ownership of the underlying buffer; with an Arc<[u8]> in hand, by contrast, you know the exact size of the allocation.

This makes a huge difference when you want control over allocated memory:

- when you hold a Bytes that refers to the result frame, it keeps the frame allocated the whole time;
- when you hold a Bytes pointing to the paging state only, you might wonder whether it keeps the frame allocated;
- when you hold an Arc<[u8]> pointing to the paging state only, you have no such worries.

Such a precaution (about using Bytes carefully in deserialization) was included in the deserialization framework refactor by @piodul.
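The three cases above can be illustrated with a small standard-library-only sketch. `SharedSlice` is a hypothetical stand-in for a Bytes-like handle (a shared buffer plus a range); the real Bytes type has different internals and does not expose a `retained_len` method, so treat this purely as a model of the ownership concern.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Hypothetical Bytes-like handle: a view into a shared buffer.
pub struct SharedSlice {
    buf: Arc<[u8]>,
    range: Range<usize>,
}

impl SharedSlice {
    /// The bytes this handle exposes.
    pub fn as_slice(&self) -> &[u8] {
        &self.buf[self.range.clone()]
    }

    /// Size of the allocation this handle keeps alive, which may be
    /// much larger than the exposed bytes.
    pub fn retained_len(&self) -> usize {
        self.buf.len()
    }
}

/// Subslices the frame without copying: the whole frame stays
/// allocated for as long as the returned handle lives.
pub fn subslice(frame: &Arc<[u8]>, range: Range<usize>) -> SharedSlice {
    SharedSlice { buf: Arc::clone(frame), range }
}

/// Copies the range into its own allocation: the result owns exactly
/// `range.len()` bytes and holds no reference to the frame.
pub fn copy_out(frame: &[u8], range: Range<usize>) -> Arc<[u8]> {
    Arc::from(&frame[range])
}
```

With a 1024-byte frame and a 4-byte paging state, the handle from `subslice` exposes 4 bytes but retains all 1024, whereas the `Arc<[u8]>` from `copy_out` has a length of exactly 4.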

Answer 11 · 2024-04-24T11:10:28.000Z

A user of the driver doesn't care about the frame and has no reason to think about paging state in terms of the result frame. This is something only a maintainer of this driver would do.

Answer 12 · 2024-04-24T11:36:50.000Z

As discussed face to face, the deserialization refactor is going to let the user deserialize frame subslices into Bytes, so the user will have to care about the result frame.
I, as the maintainer, get confused about Bytes being used there.