krakjoe/apcu

Idea: Avoid fragmentation/cache clears internally

TysonAndre opened this issue · 0 comments

Related to #369

Making a note of approaches I'd thought of for avoiding fragmentation, to refer to later.

  • If apc_sma_malloc() fails in apc_persist(), maybe it could retry with the fallback serializer when that serialization turns out to be significantly smaller (see the sketch below).

    Also see #472 - igbinary should result in smaller objects than PHP's serialize() due to the binary representation and deduplication.
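
    A minimal sketch of that retry, assuming the allocation failure is detectable at this point. serialized_t, serialize_default(), serialize_fallback() and sma_malloc() are hypothetical stand-ins, not APCu's real internals:

	#include <stddef.h>
	#include <string.h>

	/* Hypothetical stand-ins for the configured serializer, the fallback
	 * serializer (e.g. igbinary), and apc_sma_malloc(). */
	typedef struct { const char *buf; size_t len; } serialized_t;
	extern serialized_t serialize_default(const void *value);
	extern serialized_t serialize_fallback(const void *value);
	extern void *sma_malloc(size_t size);

	void *persist(const void *value) {
		serialized_t s = serialize_default(value);
		void *block = sma_malloc(s.len);
		if (block == NULL) {
			serialized_t f = serialize_fallback(value);
			/* Retry only when the fallback output is significantly
			 * smaller, e.g. under 75% of the first attempt. */
			if (f.len * 4 < s.len * 3) {
				s = f;
				block = sma_malloc(s.len);
			}
		}
		if (block != NULL) {
			memcpy(block, s.buf, s.len);
		}
		return block;
	}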

  • #403 - if objects are over a certain size, it might make sense to compress their *serialize() output with a string compression algorithm when a large enough block can't be allocated (or to do so unconditionally for large enough values, if configured by the user). A sketch of the compression step follows below.
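
    A minimal sketch of that compression step, assuming zlib is an acceptable dependency; the 4KB threshold is an arbitrary assumption, not a tuned value:

	#include <stdlib.h>
	#include <string.h>
	#include <zlib.h>

	#define COMPRESS_THRESHOLD 4096  /* assumed cutoff, would need tuning */

	/* Returns a malloc'd buffer holding either a compressed copy of the
	 * serialized data (when that actually helps) or a plain copy;
	 * *out_len receives the stored length. The caller must record the
	 * original length so the entry can be uncompress()ed on fetch. */
	unsigned char *maybe_compress(const unsigned char *data, size_t len, size_t *out_len) {
		if (len >= COMPRESS_THRESHOLD) {
			uLongf bound = compressBound(len);
			unsigned char *out = malloc(bound);
			if (out != NULL && compress2(out, &bound, data, len, Z_BEST_SPEED) == Z_OK && bound < len) {
				*out_len = bound;
				return out;
			}
			free(out);  /* compression failed or didn't shrink the data */
		}
		unsigned char *copy = malloc(len);
		if (copy != NULL) {
			memcpy(copy, data, len);
		}
		*out_len = len;
		return copy;
	}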

  • BEST_FIT_LIMIT is hardcoded at 3. Could it be made configurable, allowing anything from 3...100? (A higher limit might be tolerable in applications that are mostly reads.)

  • In the event of ties, it might make sense to prefer allocating smaller blocks at lower addresses, so that small blocks end up close together and fragmentation is reduced - e.g. try to minimize (double)(address - start + TOTAL_SHARED_MEMORY_SIZE) * (cur->size - realsize), but only when realsize is under some threshold (see the sketch below).
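
    A minimal sketch of that scoring, with a block_t mirroring apc_sma.c's header layout; the 512-byte threshold is an assumption:

	#include <stddef.h>

	/* Minimal mirror of apc_sma.c's block header (field names match upstream). */
	typedef struct block_t {
		size_t size;
		size_t prev_size;
		size_t fnext;
		size_t fprev;
	} block_t;

	#define SMALL_REQUEST_THRESHOLD 512  /* assumed cutoff, would need tuning */

	/* Lower score wins among blocks known to fit. For large requests this
	 * degrades to plain best fit (least waste); for small requests,
	 * equal-waste candidates at lower offsets score better, clustering
	 * small blocks at low addresses. */
	static double fit_score(const block_t *cur, size_t realsize,
	                        size_t offset, size_t total_shm_size) {
		double waste = (double)(cur->size - realsize);
		if (realsize >= SMALL_REQUEST_THRESHOLD) {
			return waste;
		}
		/* Adding total_shm_size keeps the multiplier positive at offset 0,
		 * so waste still dominates the comparison there. */
		return (double)(offset + total_shm_size) * waste;
	}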

  • The logic around MINBLOCKSIZE is out of sync with apcu's use case for the allocator - raise it to the calculated smallest entry for the configuration? #define MINBLOCKSIZE (ALIGNWORD(1) + ALIGNWORD(sizeof(block_t))) allows a minimum block size of 40. But apcu_store('', true); results in a mem_size of 120 on 64-bit platforms (to track metadata, create the extra fields of zend_string, etc.) according to apcu_cache_info(). **So it seems possible to split off wasted blocks of 40-119 bytes that would never get used if an application allocates variable-size small objects, worsening fragmentation.**

	if (cur->size == realsize || (cur->size > realsize && cur->size < (realsize + (MINBLOCKSIZE + fragment)))) {
		/* cur is big enough for realsize, but too small to split - unlink it */
		*(allocated) = cur->size - block_size;
		prv = BLOCKAT(cur->fprev);
		prv->fnext = cur->fnext;
		BLOCKAT(cur->fnext)->fprev = OFFSET(prv);
		NEXT_SBLOCK(cur)->prev_size = 0;  /* block is alloc'd */
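
    A worked example of the 40-byte figure, assuming ALIGNWORD rounds up to a multiple of sizeof(void *) and block_t holds four size_t fields as in apc_sma.c:

	#include <stdio.h>
	#include <stddef.h>

	/* Assumed to match apc_sma.c: round n up to a multiple of the word size. */
	#define ALIGNWORD(n) (((n) + sizeof(void *) - 1) & ~(sizeof(void *) - 1))

	typedef struct { size_t size, prev_size, fnext, fprev; } block_t;

	#define MINBLOCKSIZE (ALIGNWORD(1) + ALIGNWORD(sizeof(block_t)))

	int main(void) {
		/* On a 64-bit platform: ALIGNWORD(1) = 8 and ALIGNWORD(32) = 32,
		 * so MINBLOCKSIZE = 40 - far below the ~120 bytes that even the
		 * smallest real entry occupies, which is how splitting can leave
		 * unusable 40-119 byte leftovers. */
		printf("MINBLOCKSIZE = %zu\n", (size_t)MINBLOCKSIZE);
		return 0;
	}
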
  • Maybe APCu could use a slab allocator keyed on the size of the data + entry, similar to PHP's emalloc in php-src/Zend/zend_alloc.c (aligned to a page size, 4KB in emalloc's case, with fixed sizes for segments) or https://github.com/memcached/memcached/blob/master/slabs.c

    A simple approach that might reduce the number of very small objects would be to allocate large blocks (e.g. 64KB long) for data less than 4KB long, and use the original approach for allocating small nodes within those large blocks (see the sketch at the end of this item).

    (So instead of a hundred 160-byte objects contributing to fragmentation, there would be one 16KB block mixed in with other normal blocks of size 4KB or larger.)

    Reusing something that already exists might be better, though APCu may lose some introspection capabilities. https://www.boost.org/doc/libs/1_80_0/doc/html/interprocess/allocators_containers.html#interprocess.allocators_containers.allocator_introduction.allocator was a project I saw, though I'm not familiar with it, and I assume C++ isn't practical here.

    For use cases with millions of keys, or heavy writes/deletes/expires with variable-size keys (I don't use apcu that way), worst-case performance would improve by switching to a slab allocator (similar to zend_alloc.c), since it avoids walking a potentially large number of tiny free-list nodes in sma_allocate to find one that's large enough.
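
    A minimal sketch of the size-class idea from above (64KB slabs for sub-4KB data). This is hypothetical code, not APCu's - a real shared-memory version would also need offsets instead of pointers, locking, and a per-size-class lookup:

	#include <stddef.h>

	#define SLAB_SIZE   (64 * 1024)  /* carve 64KB slabs out of shared memory */
	#define SMALL_LIMIT (4 * 1024)   /* requests >= 4KB keep using the free list */

	typedef struct chunk { struct chunk *next; } chunk;

	typedef struct slab_class {
		size_t chunk_size;  /* e.g. 64, 128, 256, ... 4096 */
		chunk *free_list;   /* freed chunks, reused before carving new ones */
		char  *bump;        /* next unused byte in the current slab */
		char  *end;         /* end of the current slab */
	} slab_class;

	extern void *sma_allocate_block(size_t size);  /* stand-in for the existing allocator */

	static void *slab_alloc(slab_class *cls) {
		if (cls->free_list != NULL) {  /* O(1): reuse a freed chunk */
			chunk *c = cls->free_list;
			cls->free_list = c->next;
			return c;
		}
		if (cls->bump == NULL || cls->bump + cls->chunk_size > cls->end) {
			/* One large allocation replaces many tiny ones. */
			char *slab = sma_allocate_block(SLAB_SIZE);
			if (slab == NULL) return NULL;
			cls->bump = slab;
			cls->end  = slab + SLAB_SIZE;
		}
		void *p = cls->bump;
		cls->bump += cls->chunk_size;
		return p;
	}

	static void slab_free(slab_class *cls, void *p) {
		chunk *c = p;  /* O(1): push back onto the class free list */
		c->next = cls->free_list;
		cls->free_list = c;
	}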


Userland workarounds:

  • Raise apc.shm_size. The default of 32M is way too small for modern servers

  • Install igbinary built with APCu support and set apc.serializer=igbinary in the INI settings to make cache entries smaller

  • Maybe use APCuIterator from an admin-only PHP endpoint to manually delete entries that are past their expiry date or no longer needed, delete small keys that ended up at high addresses when the reported fragmentation is large enough, or delete keys that weren't accessed recently

    (I'm not too familiar with this, but I believe expired keys only get evicted under certain circumstances, such as an apcu_store() affecting the same cache slot and encountering the expired node during the linked-list traversal)

  • If you do have a lot of keys and run into fragmentation in your use case, set up memcached/redis servers running alongside your web servers and use the memcached/redis PECLs to read/write to them on 127.0.0.1