newrelic/newrelic-php-agent

Thousands of E_WARNING "Failed to open stream: No such file or directory" after upgrading from 10.17.0.7 to 10.18.0.8

ruudk opened this issue ยท 26 comments

ruudk commented

Description

We bumped from 10.17.0.7 to 10.18.0.8 and noticed lots of E_WARNING errors being reported:

include(/var/www/app/var/cache/prod/pools/system/9HGRhXbDyv/N/J/lJlB08wWBKnILB1B6L9w): Failed to open stream: No such file or directory

Steps to Reproduce

I don't know

Expected Behavior

No warning.

Relevant Logs / Console output

include(/var/www/app/var/cache/prod/pools/system/9HGRhXbDyv/N/J/lJlB08wWBKnILB1B6L9w): Failed to open stream: No such file or directory

 in Symfony\Component\Cache\Adapter\PhpFilesAdapter::doFetch called at /var/www/app/vendor/symfony/cache/Traits/AbstractAdapterTrait.php (193)
 in Symfony\Component\Cache\Adapter\AbstractAdapter::getItem called at /var/www/app/vendor/doctrine/persistence/src/Persistence/Mapping/AbstractClassMetadataFactory.php (200)
 in Doctrine\Persistence\Mapping\AbstractClassMetadataFactory::getMetadataFor called at /var/www/app/vendor/doctrine/orm/src/EntityManager.php (318)
 in Doctrine\ORM\EntityManager::getClassMetadata called at /var/www/app/vendor/doctrine/orm/src/UnitOfWork.php (3410)
 in Doctrine\ORM\UnitOfWork::getSingleIdentifierValue called at /var/www/app/vendor/doctrine/orm/src/AbstractQuery.php (434)
 in Doctrine\ORM\AbstractQuery::processParameterValue called at /var/www/app/vendor/doctrine/orm/src/AbstractQuery.php (479)
 in Doctrine\ORM\AbstractQuery::processArrayParameterValue called at /var/www/app/vendor/doctrine/orm/src/AbstractQuery.php (415)
 in Doctrine\ORM\AbstractQuery::processParameterValue called at /var/www/app/vendor/doctrine/orm/src/Query.php (458)
 in Doctrine\ORM\Query::resolveParameterValue called at /var/www/app/vendor/doctrine/orm/src/Query.php (402)
 in Doctrine\ORM\Query::processParameterMappings called at /var/www/app/vendor/doctrine/orm/src/Query.php (318)
 in Doctrine\ORM\Query::_doExecute called at /var/www/app/vendor/doctrine/orm/src/AbstractQuery.php (1212)
 in Doctrine\ORM\AbstractQuery::executeIgnoreQueryCache called at /var/www/app/vendor/doctrine/orm/src/AbstractQuery.php (1166)
 in Doctrine\ORM\AbstractQuery::execute called at /var/www/app/vendor/doctrine/orm/src/AbstractQuery.php (901)
 in Doctrine\ORM\AbstractQuery::getResult called at /var/www/app/src/Shared/SomeRepository.php (35)
 [truncated]]
 in Symfony\Component\HttpKernel\HttpKernel::handleRaw called at /var/www/app/vendor/symfony/http-kernel/HttpKernel.php (76)

Your Environment

Symfony v7.0
PHP 8.3.2
Debian 12

ruudk commented

I have the feeling the SDK is somehow hooking into either Doctrine ORM, DBAL or Symfony Cache.

ruudk commented

We have caching enabled in Doctrine, it seems to be related with that:

    $configurator->extension(
        'framework',
        [
            'cache' => [
                'default_redis_provider' => 'redis_connection',
                'pools' => [
                    'doctrine_metadata.cache' => [
                        'adapter' => 'cache.system',
                    ],
                    'doctrine_query.cache' => [
                        'adapter' => 'cache.system',
                    ],
                    'cache.local' => [
                        'adapter' => 'cache.adapter.filesystem',
                    ],
                ],
            ],
        ],
    );


$configurator->extension(
        'doctrine',
        [
            'orm' => [
                'entity_managers' => [
                    'default' => [
                        'metadata_cache_driver' => [
                            'type' => 'pool',
                            'pool' => 'doctrine_metadata.cache',
                        ],
                        'query_cache_driver' => [
                            'type' => 'pool',
                            'pool' => 'doctrine_query.cache',
                        ],
                        'result_cache_driver' => [
                            'type' => 'service',
                            // This is a custom result cache, that extends from \Doctrine\Common\Cache\Cache and stores in Redis.
                            'id' => CachePoolName::internalServiceId(CachePoolName::DOCTRINE_RESULT),
                        ],
                        'second_level_cache' => [
                            'enabled' => true,
                            'log_enabled' => false,
                        ],
                    ],
                ],
            ],
        ],
    );

Hi @ruudk , thanks for engaging here.

JIT in enabled automatically by PHP. Previously, PHP would disable it when it detected extensions hooking in to PHP the way that our agent did; however, now it no longer disables JIT with PHP agent 10.18.

Would you be able to try your setup with 10.18 agent and with PHP JIT disabled to rule that out as a factor?

I deployed 10.18.0.8 together with opcache.jit = 0 in php.ini, but still the same problem. Had to rollback immediately.

@ruudk Thanks for trying that out and updating us.
We'll continue to investigate on our end.
What version of Doctrine do you have?

Are you able to provide your opcache settings?

doctrine/dbal                         3.8.3
doctrine/orm                          2.19.0
[opcache]
opcache.enable = 1
opcache.memory_consumption = 192
opcache.max_accelerated_files = 99999
opcache.validate_timestamps = 0
opcache.revalidate_freq = 1
opcache.interned_strings_buffer = 32
opcache.enable_file_override = On
opcache.use_cwd = Off
opcache.file_update_protection = 0

No additional opcache jit settings? They are all default?

No other opcache settings, we take the default settings from php:8.3.3-fpm-bookworm

Ok, thank you.

Same problem here, I think the problem is when the error handler gets overridden, like in https://github.com/symfony/symfony/blob/7.1/src/Symfony/Component/Cache/Adapter/PhpFilesAdapter.php#L131 newrelic still reports these errors.

@mvanduijker
Correct insight and this is the line of investigation we were pursuing.

Prior to 10.18 any errors handled by custom error handlers like a framework's error handlers were hidden from the agent. Now PHP agent 10.18 sees all errors, regardless of what hander handled them (an enhancement in error detection) but does result in more errors being reported.

Hi. Is NR treating this as a bug, or a new behaviour we need to get used to? Or is this undecided?

Many Thanks

@zsistla in this case the framework tries to suppress the errors / warnings because they are expected and then handled by themselves. How should frameworks do that otherwise? How can we ignore those messages if this is a feature and not a bug?

@zsistla in this case the framework tries to suppress the errors / warnings because they are expected and then handled by themselves. How should frameworks do that otherwise? How can we ignore those messages if this is a feature and not a bug?

Hi @mvanduijker, thank you for the question.
It is a little more nuanced then the framework tries to suppress because they are expected.
First regardless of what error handler is used, these errors will always be handled by PHPs standard error handler:

[E_ERROR](https://www.php.net/manual/en/errorfunc.constants.php#constant.e-error), [E_PARSE](https://www.php.net/manual/en/errorfunc.constants.php#constant.e-parse), [E_CORE_ERROR](https://www.php.net/manual/en/errorfunc.constants.php#constant.e-core-error), [E_CORE_WARNING](https://www.php.net/manual/en/errorfunc.constants.php#constant.e-core-warning), [E_COMPILE_ERROR](https://www.php.net/manual/en/errorfunc.constants.php#constant.e-compile-error), [E_COMPILE_WARNING](https://www.php.net/manual/en/errorfunc.constants.php#constant.e-compile-warning)

Now let's take symfony as an example as it's not suppressing so much as handling them based on loglevel.
It uses a loglevel to determine whether to record an error or not. For instance, if the loglevel were WARNING, E_USER_DEPRECATED and E_DEPRECATED would not be recorded by the symfony handler, but all the following would:

        \E_NOTICE => [null, LogLevel::WARNING],
        \E_USER_NOTICE => [null, LogLevel::WARNING],
        \E_STRICT => [null, LogLevel::WARNING],
        \E_WARNING => [null, LogLevel::WARNING],
        \E_USER_WARNING => [null, LogLevel::WARNING],
        \E_COMPILE_WARNING => [null, LogLevel::WARNING],
        \E_CORE_WARNING => [null, LogLevel::WARNING],
        \E_USER_ERROR => [null, LogLevel::CRITICAL],
        \E_RECOVERABLE_ERROR => [null, LogLevel::CRITICAL],
        \E_COMPILE_ERROR => [null, LogLevel::CRITICAL],
        \E_PARSE => [null, LogLevel::CRITICAL],
        \E_ERROR => [null, LogLevel::CRITICAL],
        \E_CORE_ERROR => [null, LogLevel::CRITICAL],

So the above ERRORS were all considered worthy of recording; however, since they are handled by the symfony error handler none of them would be passed back to PHPs standard error handler aside from the errors that are always passed back to PHPs standard error handler (E_ERROR, E_PARSE, E_CORE_ERROR, E_CORE_WARNING, E_COMPILE_ERROR, E_COMPILE_WARNING). Previous to 10.18, if they were not handled by PHPs standard error handler, they were hidden from the agent, even the ones the symfony error handler found worthy of recording.

With PHP agent 10.18+, all errors are visible to the agent, but the side effect you are seeing is that more errors are being recorded. The agent does have an INI setting that specify what errors to record.
https://docs.newrelic.com/docs/apm/agents/php-agent/configuration/php-agent-configuration/#inivar-err-ignore-errors
Sets the error levels for the PHP agent to ignore. The value for this setting uses similar syntax to PHP's error_reporting option.
Please refer to the above link for examples on how to set this. For example, you could configure the PHP agent to ignore E_WARNING and E_DEPRECATION level errors.

So what does that mean, that all NR PHP SDK customers need to manually exclude these errors that are normally not visible?

I believe NR should handle this in the Symfony integration. And not bother users with it.

I'm seeing the same issue with the Smarty Template Engine. I don't want to set newrelic.error_collector.ignore_errors to exclude all E_WARNING. I wish there was at least a way to ignore errors if the error originated from a specific file, but I don't see an option for that. Ideally, the agent would revert back to the old behavior of not reporting errors that other error handlers handled/suppressed.

We're having the same issue with Laravel framework, since the upgrade we're receiving E_WARNING AND E_DEPRECATION even though they are set to be ignored in the php.ini.

Already tried to suppress it from the framework side but no success.

Differently than @ssigwart ๐Ÿ˜… I just set newrelic.error_collector.ignore_errors = E_DEPRECATED, as my main problem was E_DEPRECATED being reported as errors

@zsistla I think I was not really clear.
Some parts of a framework or library, a temporary error handler is created to catch and suppress expected warnings or errors.
I think this happens quite often with caching libraries and specific for file adapters. They will directly try to load a cache file without checking if the file exists (you will save a stat on a filesystem, this will increase performance) but will throw a warning when not exists (this is expected) and will be handled by the temporary custom error handler. We are now flooded by these warnings with this version of the agent.

One thing that I started to wonder is, @zsistla says

Prior to 10.18 any errors handled by custom error handlers like a framework's error handlers were hidden from the agent. Now PHP agent 10.18 sees all errors, regardless of what hander handled them (an enhancement in error detection) but does result in more errors being reported.

But shouldn't new relic respect the configuration on php.ini?
Because this is the current error_reporting configuration error_reporting = E_ALL & ~E_DEPRECATED & ~E_STRICT and deprecations still being reported to New Relic after the update.

We've downgraded to the 10.17.0.7 agent for now, having handled warnings recorded as errors is quite a nuisance and unwanted behaviour.
We would still like to see warnings in our logs, but not when they are expected and handled by the framework.

For Red Hat (yum) based distros you can use the versionlock plugin to lock the agent to 10.17.0.7:

yum install -y yum-plugin-versionlock
yum versionlock newrelic-daemon-10.17.0.7-1.x86_64 \
  newrelic-php5-10.17.0.7-1.x86_64 \
  newrelic-php5-common-10.17.0.7-1.noarch

But shouldn't new relic respect the configuration on php.ini?
Because this is the current error_reporting configuration error_reporting = E_ALL & ~E_DEPRECATED & ~E_STRICT and deprecations still being reported to New Relic after the update.

I agree, but also think it should also take into account things custom error handles suppress. I have error_reporting set to E_ALL because I want to know about errors, but then there are places where I want to suppress errors (e.g. from frameworks). If a framework was hiding errors from NewRelic, there's probably a good reason to do so because it's not actually an issue. If there are frameworks that are hiding true issues in the error handler, I think the error handler should be updated to handle it better (e.g. allow error handling to continue after the current handler or have it use the NewRelic API to report the error).

Thanks for all the feedback in this thread. We are continuing to investigate the issues and usage cases that are detailed here and the more details you provide the more we can work to resolve the problems.

A few notes:

  1. @airtonzanon regarding:

Differently than @ssigwart ๐Ÿ˜… I just set newrelic.error_collector.ignore_errors = E_DEPRECATED, as my main problem was E_DEPRECATED being reported as errors

Yes, this is a separate issue with a fix that will go out in the next release

  1. @airtonzanon
    We check error_reporting to see if we should/shouldn't record here:
    https://github.com/newrelic/newrelic-php-agent/blob/main/agent/php_error.c#L501
    Can you provide any more details of your setup (PHP version, framework version, etc) where you see the error_reporting getting ignored?

Just tried it out and it does indeed fix the problem. Thanks everyone ๐Ÿ’ช ๐Ÿ’™

I just tried it out too, but I'm still seeing lots of reported errors from the Smarty template engine.

The biggest sources of errors are :

  • Attempt to read property "value" on null
  • Trying to access array offset on value of type null
  • Undefined array key "..."

The library has this error handler that should be suppressing these:
https://github.com/smarty-php/smarty/blob/82397ec7f074c56844d2e489ac1c40ca2ca3e756/src/ErrorHandler.php#L70-L96

Sorry for the confusion. This did fix it. Thank you. It looks like the yum update didn't see the new package on a handful of our servers and those were the ones I was seeing errors from.