IntersectMBO/cardano-base

Convert from Hash to/from PinnedSizedBytes safely

lehins opened this issue ยท 9 comments

There is a way to implement a total function that does this conversion because we now statically both of their sizes. This came up in #289

There seems to be an odd distinction between PackedBytes and PinnedSizedBytes at play here. Is there a reason we can't just use the same type for both?

@kozross They are quite different underneath. PackedBytes are backed by Word64s, while PinnedSizedBytes are backed by an actual ByteArray

I understand they are implemented quite differently: why can't we unify their representations? I don't really see why they need to be separate in the first place.

It is right in the name: one is pinned another one is not ๐Ÿ˜„

Yeah, I guess that's true, but at the same time, I feel that making both things use ByteArray underneath would make such conversions easier. We could also share APIs in most cases, provided we don't accidentally mix pinned and unpinned things (which these wrappers would prevent).

I'm also wondering if there'd ever come a time when we might want to generate hashes via the FFI: we don't need this currently, but we might one day.

I feel that making both things use ByteArray underneath would make such conversions easier.

I see what you are trying to say. Putting pinnedness aside, the whole point why PackedBytes exist is because this:

data Hash32 = Hash32 {-# UNPACK #-} !Word64 {-# UNPACK #-} !Word64 {-# UNPACK #-} !Word64 {-# UNPACK #-} !Word64

will use 30% less memory than

newtype Hash32 = Hash32 (PinnedSizedBytes 32)

We store a lot of hashes in memory and it makes a difference.

I'm also wondering if there'd ever come a time when we might want to generate hashes via the FFI: we don't need this currently, but we might one day.

We are already producing hashes over ffi with libsodium, we just immediately convert them to PackedBytes, so they don't contribute to memory fragmentation and consume less memory, as mentioned above.

Interesting - I've learned something new today. In that case, the difference in representation definitely makes sense. It would still be nice to have a safe conversion though.

It would still be nice to have a safe conversion though.

That's exactly what this ticket is about ๐Ÿ˜‰