krakjoe/apcu

Preloading with `apc.preload_path` ignores `apc.serializer=igbinary`

tPl0ch opened this issue · 2 comments

Dockerfile to replicate

FROM php:8.1-cli-alpine

# https://pecl.php.net/package/apcu
ARG PHP_APCU_VERSION=5.1.21

# https://pecl.php.net/package/igbinary
ARG PHP_IGBINARY_VERSION=3.2.7

RUN apk add --no-cache --virtual .build-deps $PHPIZE_DEPS \
    && pecl install apcu-${PHP_APCU_VERSION} igbinary-${PHP_IGBINARY_VERSION} \
    && docker-php-ext-enable apcu igbinary \
    && apk del .build-deps

RUN mkdir -p /usr/local/etc/php/preload.d/

# Works as expected
RUN php -r 'file_put_contents("/usr/local/etc/php/preload.d/test.data", serialize(["foo", "bar"]));'
RUN php -d apc.serializer=php -d apc.enable_cli=1 -d apc.preload_path=/usr/local/etc/php/preload.d -r 'var_export(apcu_key_info("test"));'

# Will not preload
RUN php -r 'file_put_contents("/usr/local/etc/php/preload.d/test.data", igbinary_serialize(["foo", "bar"]));'
RUN php -d apc.serializer=igbinary -d apc.enable_cli=1 -d apc.preload_path=/usr/local/etc/php/preload.d -r 'var_export(apcu_key_info("test"));'

APCu's preloading only supports the format produced by PHP's serialize() (https://www.php.net/manual/en/function.serialize.php). It does the equivalent of unserialize() on each file and does not support other formats.

All past/current/future releases of igbinary start their serialized output with a 4-byte header holding the version number. Normally that's \x00\x00\x00\x02, but I'm working on a version 3 (\x00\x00\x00\x03), with the bytes in that order at offsets 0..3. (I'm a maintainer of igbinary.)

An approach used in other igbinary integrations is to check for those 2 values and that the length is 5 or more (there must be at least one byte of data after the header). To do that, APCu would also have to check whether the igbinary serializer is loaded, using the normal steps to find a serializer (adapted for the module init phase, MINIT). That seems doable if the APCu maintainers are interested.

  • This is only available if igbinary.so is loaded before apcu.so in php ini files, I believe?
// relevant snippet from php_apc.c
		/* Avoid race conditions between MINIT of apc and serializer exts like igbinary */
		apc_cache_serializer(apc_user_cache, APCG(serializer_name));
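
A minimal sketch of the header check described above, in plain C. The function name `looks_like_igbinary` is hypothetical and is not part of APCu or igbinary. The accepted version values (2 and 3) are taken from the comment above, and what other integrations actually accept is an assumption here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an APCu or igbinary API): returns true when the
 * buffer starts with one of the two igbinary header values named above and
 * carries at least one byte of payload after the 4-byte header. */
static bool looks_like_igbinary(const unsigned char *buf, size_t len)
{
    /* Assumed header values: version 2 (current) and version 3 (in progress). */
    static const unsigned char v2[4] = {0x00, 0x00, 0x00, 0x02};
    static const unsigned char v3[4] = {0x00, 0x00, 0x00, 0x03};

    /* 4 header bytes plus at least 1 data byte. */
    if (buf == NULL || len < 5) {
        return false;
    }
    return memcmp(buf, v2, 4) == 0 || memcmp(buf, v3, 4) == 0;
}
```

Output from PHP's serialize() for an array starts with `a:`, so it never matches this check and could still fall through to the existing unserialize() path.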

`static zval data_unserialize(const char *filename)` and `static int apc_load_data(apc_cache_t* cache, const char *data_file)` are the relevant parts of the code.

I'd expect some memory savings from igbinary's string deduplication when there are repeated strings/arrays within individual entries (APCu caches each entry separately, by design).

An alternative to consider might be to assume that foo.igbinary.data refers to the igbinary serializer and try that if igbinary is registered. This would also work for alternative serializers, such as foo.msgpack.data for msgpack. (Continue to default to php unserialize() for unrecognized names)
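
The filename-based alternative could be sketched roughly as follows. Everything here is hypothetical: `split_preload_name`, its parameters, and the `known` list standing in for "serializers that are registered" are illustrative names, not APCu APIs, and this is not how `apc_load_data` currently derives keys.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits a preload file's base name into a cache key and
 * a serializer name. "foo.igbinary.data" gives key "foo" with serializer
 * "igbinary"; unrecognized names keep the default "php".
 * `known` is a NULL-terminated list of serializer names the caller considers
 * registered. Returns 0 on success, -1 if the name is not usable. */
static int split_preload_name(const char *base, const char *const *known,
                              char *key, size_t key_size,
                              const char **serializer)
{
    const char *suffix = ".data";
    size_t suffix_len = strlen(suffix);
    size_t len = strlen(base);
    size_t stem_len, i;

    /* Require a non-empty stem followed by ".data". */
    if (len <= suffix_len || strcmp(base + len - suffix_len, suffix) != 0) {
        return -1;
    }
    stem_len = len - suffix_len;
    *serializer = "php"; /* default: PHP unserialize() */

    for (i = 0; known[i] != NULL; i++) {
        size_t n = strlen(known[i]);
        /* Need at least one key byte, then '.', then the serializer name. */
        if (stem_len > n + 1 &&
            base[stem_len - n - 1] == '.' &&
            strncmp(base + stem_len - n, known[i], n) == 0) {
            *serializer = known[i];
            stem_len -= n + 1;
            break;
        }
    }

    if (stem_len + 1 > key_size) {
        return -1; /* key does not fit in the caller's buffer */
    }
    memcpy(key, base, stem_len);
    key[stem_len] = '\0';
    return 0;
}
```

With `known = {"igbinary", "msgpack", NULL}`, the name `test.data` keeps the key `test` and the default serializer, so existing preload directories would behave as before. A name such as `foo.yaml.data` is treated as key `foo.yaml` with the default serializer, which is one possible reading of "unrecognized names".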