kvark/copyless

Question: Why mem::replace?

Opened this issue · 7 comments

I just ran across this and wanted to understand how this crate avoids memcpy. One thing I stumbled upon is this:

let ptr = mem::replace(&mut self.0, ptr::null_mut());

Why is this using mem::replace, which seems like it'd do an unnecessary write, as opposed to let ptr = &mut *(self.0);?

Here it is with just

let ptr = self.0;

https://play.rust-lang.org/?version=stable&mode=release&edition=2018&gist=e6d56b0b8587b398020085b9ec9c5130

You can see that because we never wrote null into self.0, the Drop implementation deallocates the pointer we just passed on to a Box. In this case, x in Foo::Small(x) gets set to 0 given the surrounding allocations, indicating corruption. Eventually there will be a second dealloc on the same pointer.

You could also write out literally what mem::replace effectively does, but I'm not sure how that would interact with the optimizer. You still avoid memcpy with this:

let ptr = self.0;
self.0 = ptr::null_mut();
playground::foo:
	pushq	%rax
	movl	$804, %edi
	movl	$4, %esi
	callq	*__rust_alloc@GOTPCREL(%rip)
	movw	$1024, (%rax)
	popq	%rcx
	retq

That's why I think Option< NonNull<T>> should be used instead of *mut T. I'd rather have the type checker remind me of null / None handling (and in the example .take() the pointer) than doing C-like programming: see #1

Although it does generate very slightly worse code, because the NonNull constructor checks its invariant. If you're optimizing away a big memcpy that's not a huge problem.

@@ -3,6 +3,8 @@ playground::foo:
        movl    $804, %edi
        movl    $4, %esi
        callq   *__rust_alloc@GOTPCREL(%rip)
+       testq   %rax, %rax
+       je      .LBB9_1
        movw    $1024, (%rax)
        popq    %rcx
        retq

Oh d'oh, I entirely missed the connection with drop. Sorry! If you don't object I could submit a PR to add a comment along those lines.

However, if it's only about drop, then why not let ptr = self.0; mem::forget(self)? Then you could even remove the conditional from drop.

@cormacrelf yep, there has to be an out-of-memory check somewhere within an allocation, that's why my allocation code aborts on the none case. NonNull::new should be a no-op (simple cast) assembly-wise, and the check should come from my checking if _.is_none() { abort() } the OOM, which has to be there

The code could indeed be further improved as @RalfJung suggested:
replace Option<NonNull<T>> by NonNull<T> altogether (thanks to the abort on null from OOM-handling), and then, on drop, use the pointer, unconditionally freeing it, thanks to a mem::forget on the init() function.

kvark commented

Hi everyone! Thank you for fruitful discussion here!
Apparently, my watch setting was off by default, so I'm only now discovering what happened here :)