stackkit/laravel-google-cloud-tasks-queue

Log errors on gcr - Uncaught signal 14 (Alarm clock)

Closed this issue · 10 comments

Laravel Version: 11.34.2
PHP Version: 8.2
stackkit/laravel-google-cloud-tasks-queue Version: 4.2.1

Hi, on gcr I'm seeing errors like the following, related to the POST call on /handle-task:

Uncaught signal: 14, pid=43, tid=43, fault_addr=0.
[core:notice] [pid 1:tid 1] AH00052: child pid 3563 exit signal Alarm clock (14)

They seem to be related to timeout management (SIGALRM, signal 14), which is controlled by the jobs' $timeout parameter (Laravel default: 60s).
The jobs are processed correctly, and so are the related responses to the scheduler, but it seems that the process remains active for the duration of the timeout and then exits with an error.
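
For context, an alarm-based job timeout in PHP usually looks roughly like the sketch below. This is a minimal illustration only, not the package's or Laravel's actual code, and `runWithAlarm` is a hypothetical helper name. It arms an alarm before the job and clears it afterwards. If an alarm is still pending once no handler is installed, the default action for SIGALRM is to terminate the process, which would look like the "Alarm clock (14)" line above. Whether that is what happens on Cloud Run is an assumption, not something confirmed here.

```php
<?php
// Hypothetical helper: run a job with an alarm-based timeout (sketch only).
function runWithAlarm(callable $job, int $timeoutSeconds): mixed
{
    if (!extension_loaded('pcntl')) {
        // Without pcntl there is no alarm to arm; just run the job.
        return $job();
    }

    // Deliver signals asynchronously instead of requiring declare(ticks=1).
    pcntl_async_signals(true);
    pcntl_signal(SIGALRM, function () use ($timeoutSeconds): void {
        throw new RuntimeException("Job exceeded {$timeoutSeconds}s");
    });

    // Ask the kernel to send SIGALRM after $timeoutSeconds.
    pcntl_alarm($timeoutSeconds);
    try {
        return $job();
    } finally {
        // Cancel any pending alarm so it cannot fire after the job is done.
        pcntl_alarm(0);
    }
}

echo runWithAlarm(fn () => 21 * 2, 5), PHP_EOL; // prints 42
```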

Do you also see this behavior or could it be something related to the docker image? The pcntl extension appears to be activated correctly.

Thanks.

jfradj commented

Hello,

I'm seeing the same errors on my Google App Engine application:
SIGALRM errors

I can't say what the precise issue is, but I can affirm with 100% certainty that it is related to the library.
Indeed, I've created Cloud Tasks by hand through an API (instead of using job dispatch and the library) and everything works as intended. Tasks are processed successfully and no App Engine instances are killed.
So probably not related to docker, image or pcntl.

@marickvantuil any idea 🙏 ?

Thanks

Thanks for the report. I've been debugging this issue for the past few days and I've come to the conclusion that the pcntl functions don't play nice with the Cloud Run/App Engine environment. I've tested this on Cloud Run with both this package and a dummy application that uses phpunit/php-invoker (a package with similar timeout management that also uses the pcntl extension), and both have the same uncaught signal issue on Cloud Run.

Apparently the pcntl extension is more meant for cli processes and not so much for web requests. So knowing that, I'm currently working on an alternative solution based on set_time_limit. It needs some more testing but my initial testing looked promising. In case you're willing to help me verify, version v4.2.3-beta.5 can be used to test (the previous beta versions were attempts to make pcntl work 🙃).
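
As a rough idea of what a set_time_limit-based approach can look like (a sketch under assumptions, not the code in the beta release; `runWithTimeLimit` is a hypothetical name):

```php
<?php
// Hypothetical helper: bound a job with PHP's own execution time limit.
function runWithTimeLimit(callable $job, int $timeoutSeconds): mixed
{
    // Remember the configured limit so it can be restored afterwards.
    $previous = (int) ini_get('max_execution_time');

    // Restart the timeout counter with the job's own limit.
    set_time_limit($timeoutSeconds);
    try {
        return $job();
    } finally {
        // Only reached when the job finishes or throws. If the limit is
        // hit, PHP raises a fatal error and this block does not run.
        set_time_limit($previous);
    }
}

echo runWithTimeLimit(fn () => 'done', 5), PHP_EOL; // prints done
```

Unlike an alarm handler, exceeding the limit here ends the request with a fatal error rather than a catchable exception, so any cleanup would have to happen in a shutdown function.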

Thanks @marickvantuil, I can test around January 7th.
Thanks again for the help!

@marickvantuil I finally came to the conclusion that trying to make Google Cloud Tasks a Laravel job provider is like trying to fit a square peg into a round hole. They're way too different.
I spent almost a full week on this subject, trying every open-source library without getting it to work or feeling confident about the quality and coherence.
That's why I finally coded my own library, which is heavily inspired by Laravel's jobs but doesn't rely on them at all.

I can create and push a task just like that:

ProcessStripeWebhook::dispatch($request->all())
    ->withTaskId($signature)
    ->onQueue('stripe')
    ->delay(10);

I'm in a big rush right now, but I plan to create an open-source library and write a Medium post to explain my point of view and how I built it.

Hi @marickvantuil, about set_time_limit:

Apparently the pcntl extension is more meant for cli processes and not so much for web requests. So knowing that, I'm currently working on an alternative solution based on set_time_limit.

I fear that it is not the most suitable solution, as I read in the PHP documentation:

The set_time_limit() function and the configuration directive max_execution_time only affect the execution time of the script itself. Any time spent on activity that happens outside the execution of the script such as system calls using system(), stream operations, database queries, sleep, etc. is not included when determining the maximum time that the script has been running.

I'll try to verify the issue more thoroughly, but if this is the case I find it quite problematic in some scenarios.
What do you think?
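
To make the concern concrete, the sketch below compares wall-clock time with the CPU time the script itself consumed while it sleeps. It does not call set_time_limit; it only illustrates why a limit that counts "the execution time of the script itself" would barely notice time spent waiting. `cpuSeconds` and `measure` are hypothetical helper names.

```php
<?php
// Hypothetical helper: total CPU time (user + system) used by this process.
function cpuSeconds(): float
{
    $u = getrusage();
    return $u['ru_utime.tv_sec'] + $u['ru_utime.tv_usec'] / 1e6
         + $u['ru_stime.tv_sec'] + $u['ru_stime.tv_usec'] / 1e6;
}

// Hypothetical helper: measure wall-clock and CPU time of a callable.
function measure(callable $work): array
{
    $cpuStart = cpuSeconds();
    $wallStart = microtime(true);
    $work();
    return [
        'wall' => microtime(true) - $wallStart,
        'cpu'  => cpuSeconds() - $cpuStart,
    ];
}

// Sleeping takes wall-clock time but almost no CPU time.
$timing = measure(fn () => usleep(300_000));
printf("wall: %.3f s, cpu: %.3f s\n", $timing['wall'], $timing['cpu']);
```

A job that spends most of its time in database queries or HTTP calls behaves like the sleep here, so it could run well past its nominal limit when only the script's own execution time is counted.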

I have done a little more reading and, to my surprise, PHP's timeout handling has been changed so that time spent on stream operations, database queries, sleep, etc. is taken into account. It should already work with PHP 8.1+ with the flag enabled, and with 8.3+ with ZTS enabled:

php/php-src#6504
php/php-src#10141

I do wish there were more and better options of solving this, because I have a feeling many applications don't have this enabled and it's currently unclear when this will be enabled for non-ZTS builds. Therefore I'm not 100% satisfied with the proposed solution! Having said that, I do think that having something is better than nothing in this case, and this might just be the only currently feasible option.

But any other ideas and solutions are much appreciated :-). If nothing comes up, I will merge the PR.
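
One way to check whether a given runtime has this enabled is to look at the general section of phpinfo(). The sketch below assumes that recent PHP versions print a "Zend Max Execution Timers" row there; that label is an assumption based on current phpinfo output, and the function (a hypothetical helper) returns "unknown" when the row is absent.

```php
<?php
// Hypothetical helper: report "enabled", "disabled" or "unknown".
function zendMaxExecutionTimersStatus(): string
{
    // Capture phpinfo() output instead of printing it.
    ob_start();
    phpinfo(INFO_GENERAL);
    $info = (string) ob_get_clean();

    // Matches the CLI form ("label => value") and the HTML table form.
    $pattern = '/Zend Max Execution Timers\s*(?:=>|<\/td><td[^>]*>)\s*(\w+)/i';
    if (preg_match($pattern, $info, $m)) {
        return strtolower($m[1]);
    }
    return 'unknown';
}

printf(
    "PHP %s, ZTS: %s, Zend Max Execution Timers: %s\n",
    PHP_VERSION,
    PHP_ZTS ? 'yes' : 'no',
    zendMaxExecutionTimersStatus()
);
```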

Great catch! I didn't notice this flag. At this point I agree that it's better to have something than nothing unless another solution comes to mind.
I'll check it out with my colleagues on Monday, in the meantime thanks.

Hi @marickvantuil, it wasn't straightforward to set up, since it required compiling a ZTS PHP 8.4 image with the --enable-zts and --disable-zend-signals flags (moreover, I still don't understand why this change and configuration haven't been documented; it seems that since PHP 8.3 Zend Max Execution Timers is enabled by default for ZTS builds on Linux), but in the first tests we've run so far everything seems to work properly.
Let's try to delve deeper with some other tests.
@jfradj Do you have any feedback or observations on this?
Thanks

Hi @marickvantuil, sorry for the delay. In the tests we have done we haven't found any problems, so in my opinion it could be released. What do you think?
Thanks.

Thanks @illambo for testing, and no worries about the delay. I've released this fix in v4.3.0.