Azure/azure-functions-host

Flex Consumption timer functions are killed

Opened this issue · 11 comments

Investigative information

Please provide the following:

  • Timestamp: 2024-10-16T10:00:00.0021088Z
  • Function App version: 4
  • Function App name: flexconsumptiontimer-test-func
  • Function name(s) (as appropriate): TimerFunction
  • Invocation ID: c2c3130e-7709-4710-af5c-ee2b4c024f95
  • Region: northeurope

Repro steps

  1. Create a simple timer-triggered C# function like this:
Function source code
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

namespace FlexConsumptionTimer
{
    public class TimerFunction
    {
        private readonly ILogger _logger;

        public TimerFunction(ILoggerFactory loggerFactory)
        {
            _logger = loggerFactory.CreateLogger<TimerFunction>();
        }

        [Function("TimerFunction")]
        public async Task Run([TimerTrigger("0 0 10 * * *")] TimerInfo myTimer, CancellationToken cancellationToken)
        {
            _logger.LogInformation("C# Timer trigger function executed at: {time}", DateTime.Now);

            // Register a callback to log when the function is cancelled
            // so that we can track if Functions worker is cancelling the function
            using var reg = cancellationToken.Register(() =>
            {
                _logger.LogWarning(
                    "Someone (user or Azure Functions host) requested to cancel the job! Trace: {trace}",
                    Environment.StackTrace);
            });

            await Task.Delay(1000 * 60 * 5, cancellationToken); // Wait for 5 minutes

            if (myTimer.ScheduleStatus is not null)
            {
                _logger.LogInformation("Next timer schedule at: {time}", myTimer.ScheduleStatus.Next);
            }
        }
    }
}
Program.cs
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

var host = new HostBuilder()
    .ConfigureFunctionsWebApplication()
    .ConfigureServices(services =>
    {
        services.AddApplicationInsightsTelemetryWorkerService();
        services.ConfigureFunctionsApplicationInsights();
    })
    .Build();

host.Run();
Package references
<PackageReference Include="Microsoft.Azure.Functions.Worker" Version="1.23.0" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.Http" Version="3.2.0" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.Http.AspNetCore" Version="1.3.2" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.Timer" Version="4.3.1" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.Sdk" Version="1.18.1" />
<PackageReference Include="Microsoft.ApplicationInsights.WorkerService" Version="2.22.0" />
<PackageReference Include="Microsoft.Azure.Functions.Worker.ApplicationInsights" Version="1.4.0" />
  1. Create a new Azure Functions instance and necessary dependencies:
az group create --name "<RG name>" --location "northeurope"
az storage account create --name "<Storage Account name>" --location "northeurope" --resource-group "<RG name>" --sku Standard_LRS --allow-blob-public-access false --https-only
az functionapp create --resource-group "<RG name>" --name "flexconsumptiontimer-test-func" --storage-account "<Storage Account name>" --flexconsumption-location "northeurope" --runtime dotnet-isolated --runtime-version 8.0
  1. Deploy the previously created function:
func azure functionapp publish flexconsumptiontimer-test-func

Expected behavior

Function execution finishes successfully.

Actual behavior

Function execution is being killed by the host within a minute of starting the function.
It looks like there is some scaling down involved (that's my guess based on DrainMode mention in logs), which kills the ongoing execution of the function?

Known workarounds

Manual rerun of the function seems to work.

Related information

We had this issue earlier too (back in August) - then a simple package update seemed to help as the functions were working fine till the beginning of October.

This issue is not occurring on the Consumption plan.

Programming language used: C#
Bindings used: Timer

Logs
10/16/2024, 10:00:00.002 AM Executing 'Functions.TimerFunction' (Reason='Timer fired at 2024-10-16T09:59:59.9998519+00:00', Id=c2c3130e-7709-4710-af5c-ee2b4c024f95)
10/16/2024, 10:00:25.346 AM DrainMode mode enabled
10/16/2024, 10:00:25.346 AM Calling StopAsync on the registered listeners
10/16/2024, 10:00:25.347 AM Stopping the listener 'Microsoft.Azure.WebJobs.Host.Listeners.SingletonListener' for function 'TimerFunction'
10/16/2024, 10:00:25.356 AM Someone (user or Azure Functions host) requested to cancel the job! Trace: at System.Environment.get_StackTrace() at FlexConsumptionTimer.TimerFunction.<Run>b__2_0() in <...>\FlexConsumptionTimer\TimerFunction.cs:line 24 at System.Threading.CancellationTokenSource.Invoke(Delegate d, Object state, CancellationTokenSource source) at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state) at System.Threading.CancellationTokenSource.ExecuteCallbackHandlers(Boolean throwOnFirstException) at Microsoft.Azure.Functions.Worker.Handlers.InvocationHandler.TryCancel(String invocationId) in D:\a\_work\1\s\src\DotNetWorker.Grpc\Handlers\InvocationHandler.cs:line 147 at Microsoft.Azure.Functions.Worker.GrpcWorker.InvocationCancelRequestHandler(InvocationCancel request) in D:\a\_work\1\s\src\DotNetWorker.Grpc\GrpcWorker.cs:line 127 at Microsoft.Azure.Functions.Worker.GrpcWorker.ProcessRequestCoreAsync(StreamingMessage request) in D:\a\_work\1\s\src\DotNetWorker.Grpc\GrpcWorker.cs:line 109 at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[TStateMachine](TStateMachine& stateMachine) at Microsoft.Azure.Functions.Worker.GrpcWorker.ProcessRequestCoreAsync(StreamingMessage request) at Microsoft.Azure.Functions.Worker.GrpcWorker.<>c__DisplayClass11_0.<Microsoft.Azure.Functions.Worker.Grpc.IMessageProcessor.ProcessMessageAsync>b__0() in D:\a\_work\1\s\src\DotNetWorker.Grpc\GrpcWorker.cs:line 66 at System.Threading.Tasks.Task`1.InnerInvoke() at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state) at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread) at System.Threading.ThreadPoolWorkQueue.Dispatch() at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart()
10/16/2024, 10:00:25.380 AM Executed 'Functions.TimerFunction' (Failed, Id=c2c3130e-7709-4710-af5c-ee2b4c024f95, Duration=25380ms)
10/16/2024, 10:00:25.414 AM Stopped the listener 'Microsoft.Azure.WebJobs.Host.Listeners.SingletonListener' for function 'TimerFunction'
10/16/2024, 10:00:25.415 AM Call to StopAsync complete, registered listeners are now stopped

Hi @konrad11901 Thanks for reporting I have checked and created same for run every min same funct it working fine. locally.can you check it again.Image

@bhagyshricompany I checked that and locally it seems to be working fine. So it looks like the issue occurs only after deployment to Azure.

Hi @konrad11901 Thanks for your patience can you check it again it seems working.Image

@bhagyshricompany It's still not working for me for timer-triggered runs. The function starts, but it's being killed <1min after.
Image

can you try with some other plan like consumption and different region.please.

@bhagyshricompany Sorry for a late response. A few days ago I deployed a new Flex Consumption function app in Sweden Central and it seems to work just fine (last 5 daily executions finished successfully).
Today I created a brand new North Europe-based Flex Consumption function (used the same template as for creating an app in Sweden Central) and the issue still occurs. It also still occurs on the original function app that I deployed in North Europe (there hasn't been a single successfully finished execution).
So the problem might occur only in North Europe...? I haven't tested other regions yet.

Other plans (Consumption, Premium, App Service Plan) were working always fine for us, this issue is specifically about Flex Consumption.

For reference, here are the details about the Sweden Central app:

  • Function App name: flexconsumptiontimer2-test-func
  • Function name(s) (as appropriate): TimerFunction
  • Invocation ID: 5572594e-159b-462c-9260-a07d4200e19e (last run's ID)

@satvu please comment.Thanks

@pragnagopa and @paulbatum are investigating.

I am having this issue on East US with a .NET 8 Isolated Flex Consumption function app that only has a single Timer function (there should only ever be 1 instance running).

This is the order of operations of an unsuccessful run. First the Drain Mode message appears, then it calls StopAsync on the SingletonListener which cancels the running application. The order of logs including DrainMode, StopAsync, etc, is the same exact order as when a run is successful and the SingletonListener.StopAsync is subsequently called. Maybe it thinks the run completed when it hasn't? I'm using an async function returning a task with its CancellationToken passed through the whole application. It is not a durable function. Most of the runs are unsuccessful, but I'd say about 1/3 of them complete. The runs only take up to 30-40 seconds to complete, but it can be canceled as early as 5 seconds in.

Here's an unsuccessful run. The timestamps read bottom to top.

Image

Here's a successful run:

Image

We are having the same issues with both Queue and Timer triggers:
Azure/azure-functions-dotnet-worker#2900

All invocations get cancelled after a few seconds or sometimes even a minute, essentially our function apps are unusable.

same here - it is durable function, net-isolated, sometimes killed while executing any of the app activities
noticed that if the first log message is
POST /AzureFunctionsRpcMessages.FunctionRpc/EventStream
then in between every Executing Executed messages there is "Invoke" dependency message in the log
in this case the app is NOT prematurely killed

but when function wake up and above message is missing
then there is no "Invoke" dependency message in the log in between every Executing Executed messages
and the app is killed in 2-5 sec

it starts again shortly but this redundant restart complicates my life

Consumption AU-East