Circular dependency between `ReflectionMethod` and `ReflectionClass`
staabm opened this issue · 4 comments
today I was looking into why the unit test-suite requires 3-4GB of RAM on my windows machine.
phpunit is getting inefficient because it needs to export/serialize all data of data-providers.
while looking deeper I noticed that the current class hierarchy contains a Circular dependency between ReflectionMethod
and ReflectionClass
.
my thesis is, that a circular reference in this repos classes will also translate to inefficiences at PHP runtime, because the garbage collector need to do some extra work to free memory.
I want to leave this issue here to discuss whether we can do something about that.
This is also true for reflection itself: the PHP engine has gc_collect_cycles()
for this, but reflection as a unidirectional graph requires caching in the reflector itself, plus more computational complexity, IMO.
We did try (as much as possible) to remove all AST from the library memory profile, and then we tried making all reflection symbols as immutable as possible.
Making the graph without references requires to keep all references in the reflector, and making the API of all symbols a bit awkward to use (must pass the reflector at every call).
today I was looking into why the unit test-suite requires 3-4GB of RAM on my windows machine.
I think the initial problem I found will be fixed with sebastianbergmann/phpunit#5774
(regarding resource consumption/performance in the test-suite; the PR improves the test-suite runtime by 6x and reduces memory footprint to a third)
This is also true for reflection itself: the PHP engine has
gc_collect_cycles()
for this, but reflection as a unidirectional graph requires caching in the reflector itself, plus more computational complexity, IMO.
I wonder whether there is a way, which would help to get an idea whether the idea is worth exploring.
so maybe some kind of benchmark which would proof that the circulat design actually has negative runtime impact
The general problem is API.
Before: ReflectionMethod#getClass(): ReflectionClass
After: ReflectionMethod#getClass(Reflector $reflector): ReflectionClass
This effectively completely breaks the internal references, which are instead asked to the Reflector
each time. All in-memory caching can then be done in the Reflector
. On the other end, it makes the API worse to traverse.
This set of problems is similar to what you'd encounter with doctrine/orm
: each "unit of work" is a graph of potentially bi-directional objects, which comes with some advantages and some disadvantages. This allows business logic to operate with the pure-looking (they aren't, but that's the idea) in-memory objects, without worrying too much about traversal.
In a pure functional world, the reflector would contain the I/O, and the reflected symbols wouldn't have any I/O.
I think it's worth exploring this idea for a new major version, but we'd need to identify all methods that perform this sort of traversal, such as (for example):
- parent / child in an inheritance traversal
- symbol declaring another symbol (finding the interface/trait/class where a method, constant or property is declared in)
- constant expression evaluation
- etc.
We'd then be able to also declare many of our classes completely @immutable
, whilst having all traversal methods be impure and declared as such, for example:
/** @immutable */
final class ReflectionClass
{
// this is pure, by object definition
public function getName(): string { /* nothing magic here */ }
// this is not pure, but also not really part of the object
public static function getParent(self $self, Reflector $reflector): ReflectionClass { /* use the reflector for lookups here */ }
BTW, the reflection adapters that keep BC with ext-reflection
would not change: they'd still be affected by GC loop issues.