phpv8/v8js

Another few segfaults :-(

redbullmarky opened this issue · 6 comments

The following script:

<?php
ini_set('v8js.flags', '--trace-gc');
class my8 extends \V8Js
{
	public function getthing() {
		return function() {
		};
	}
}

$j = new my8();
for($i = 0; $i < 20000; $i++) {
	$r = $j->executeString('
	(() => {
		var a = PHP.getthing()
		
	});
	', null, V8Js::FLAG_FORCE_ARRAY);

	$r();
 	echo "done {$i}\n";
	gc_collect_cycles();
}

gives me:

done 1534
PHP Warning:  Fatal V8 error in v8::Isolate::Dispose(): Disposing the isolate that is entered by a thread. in t3.php on line 22

and regardless of how many times i run it, it ALWAYS stops on 1534, unless i take out the blank newline in the JS block (after var a = PHP.getthing()).

Similarly, this script:

<?php
ini_set('v8js.flags', '--trace-gc');
class my8 extends \V8Js
{
	public function getthing() {
		return new class() {
			public function testfunc() {
			}
		};
	}
}

$j = new my8();
for($i = 0; $i < 20000; $i++) {
	$r = $j->executeString('
	(() => {
		var a = PHP.getthing()

		a.testfunc();
	});
	', null, V8Js::FLAG_FORCE_ARRAY);

	$r();
 	echo "done {$i}\n";
	gc_collect_cycles();
}

gives me:

done 4037
[64458:0x55d180949ac0]     1115 ms: Scavenge 1.3 (1.9) -> 0.8 (1.9) MB, 0.4 / 0.1 ms  (average mu = 1.000, current mu = 1.000) allocation failure
Segmentation fault (core dumped)

and again, it's always very consistent with how many iterations it manages. Anyone have any suggestions as to what might be causing this? And whether there are any workarounds I can try?

These pair of scripts do complete when gc_collect_cycles is removed, but they're both to replicate something that's happening on a project that DOESN'T use gc_collect_cycles

my hunch after a little trial-and-error debugging is that perhaps v8js_weak_object_callback needs to do more. if i comment out the zval_ptr_dtor(&value); line or the ctx->weak_objects.erase(object); line, neither of my examples above crash. but i can't just do that as i don't know what the full side-effects would be.

@stesie is it possible that either/both of those things might need conditions of some sort around them?

EDIT: commenting out Z_ADDREF_P(&value); of v8js_construct_callback also allows both to pass.

@redbullmarky have you tried compiling/linking with address sanitation? (plus php's reference count debugging)
I wonder whether it'd complain that you are leaking memory with the zval_ptr_dtor commented out.

Also I wonder why it stops crashing if you remove the Z_ADDREF_P since I'd expect it to access memory after free even earlier...

(however this is without having tried it out/debugged myself so far)

Maybe it helps to provide some details as to why the code was added in the past (I'm not up to date with php 8 internals, hence it indeed may need to be different nowadays):

  • PHP does reference counting on objects, and every time the reference counter goes back to zero, the object is freed/destroyed
  • so if you'll call a function, which creates an object, then exports it to V8 and afterwards exits the function (so the php variables gets out of scope), ... then without the Z_ADDREF_P above the reference count should go back to zero leaving the function, and the object will be freed
  • if then JavaScript code uses the reference afterwards, strange things are about to happen, since it'll again use the object which actually has been freed before
  • that's why php-v8js itself increases the reference count and asks V8 to call back if the JavaScript variable (holding the reference to the PHP object) is cleared
  • hence if you'll overwrite the V8 variable plus V8's garbage collector kicks in, then the weak object callback should be called, decreasing the reference counter (possibly to zero) and hence making PHP delete the object

@stesie this is for 7.4, though 8.0 and 8.1 both also segfault for the same reasons.

I even got as far as debugging the PHP source and there seems to be an entry on the gc stack where zend_ref counted is a pointer to 0x11 - hence the segfault

nothing I do with the PHP code above helps either; I’ve tried holding references (in PHP) to every object and structure that is created, but still it goes wrong. Adding a newline to the JS code affects the number of iterations which is odd, but I guess it’s just moving the point at which the V8 gc kicks in, either delaying or bringing forward the issue.

commenting out Z_ADDREF_P(&value); addresses this issue, but a whole bunch of tests (including the original one written for #6 when that was added) fail, so it’s certainly not the right thing to do. If i copy the line and repeat it (so it adds two refs) it passes too, but that obviously isn’t right either 😂

All these things lead me to believe that perhaps V8Js is doing something with respect to references that breaks GC. But I dont understand the internals (or C++/PHP internals) enough yet to get a solution.

I sent you an email with respect to getting some additional support, but anything that can be done to help with these two cases would be huge for us.

@stesie could it be that, whilst the zval_ptr_dtr does indeed do the right thing, the erase(object) line is actually destroying the object in some way entirely such that the GC can’t deal with it properly?

I don’t know when PHP gc kicks in if I’m honest, and from what I’ve seen, debugging this code only ever shows a refcount of 1 which kind of contradicts my thoughts.

But yeah, I’m at a loss with this one :(

@redbullmarky thanks for pointing out this is about 7.4 ... got a bit confused here :)

As mentioned in the referenced PR I think this is caused by V8's GC kicking in while PHP's GC is running... and only then.

I'm still not 100% sure, but if this happens it seems like PHP gets confused about its objects and possibly freeing the wrong ones... and if it happens to "randomly" free the V8Js object (i.e. my8 in your example) then it starts shutting down V8 while it's running. Which is why you see V8 crying Disposing the isolate that is entered by a thread

... due to some weird call path from v8js_weak_object_callback to v8js_free_storage (which frees all left overs).
And then everything is of course nuts ..

@stesie i'll try and write a few formal tests around these (and the november one) and get them committed, but having tried a bunch of tests at my end, your fix appears to have resolved both issues for me also! both of the examples above now run to completion every time.

thank you so much for your help!