Azure/azure-functions-dotnet-worker

Worker Process startup timeout (30 sec) prevents reliably debugging HostBuilder startup code

david-peden-q2 opened this issue ยท 47 comments

When launching an azure function with func start --dotnet-isolated-debug, visual studio pauses for a period of time to allow you to attach the debugger to the newly created process. This works fine.

What doesn't work well is that if you set a breakpoint anywhere in your HostBuilder code and cross the 30 second mark, the underlying process recycles because of a timeout which kills your debug session. It makes it exceedingly difficult, if not impossible, to debug anything non-trivial.

image

Thank you for opening this!

This is an instance where our debug behavior doesn't seem to be properly applied.

@ankitkumarr this would require a host change to correctly identify the debug scenario and ensure those timeouts are not applied. We do that for a number of things and looks like this was just missed. I'll loop in additional folks as this is likely impacting other languages as well.

Discussed in triage. Some design work will be required to settle on the right behavior, we'll then open the required issues in the appropriate repos.

Yes, same issue, can no longer effectively debug and develop azure function 5. It really seems like a downgrade and tooling is always a 2nd thought. So broken.

We've just recently starting investigating.NET 5 for our stack which is exclusively built on top of functions currently .net core 3.1, wanting to take advantage of all the perf and language improvements we want to start moving toward .NET 5 ahead of the release of .NET 6 in November.

So far it is nearly impossible to get a function instance running locally so that one can debug, there are some changes that the only way to know if your function runs is either debug or going log crazy, I think you know which we all prefer.

What's the outlook on the tooling to get functions on .NET 5 reliably debuggable?

@RiannVR -- what version of Visual Studio are you using? With 16.10, hitting F5 should launch the debugger and attach correctly: https://docs.microsoft.com/en-us/azure/azure-functions/dotnet-isolated-process-developer-howtos?pivots=development-environment-vs&tabs=browser#prerequisites.

Debugging startup code has an issue, but setting breakpoints in the function itself should work as expected.

I have an example visual studio .net project, with a basic set of timer functions that F5 will not work properly. #fail
FunctionApp2.zip

I'm suspecting the issue above is related to the monitor for the timer. @jwisener do you have a storage account and/or emulator connection string setup when debugging?

I just created a new azure function project in visual studio 2019 v 10.16.01, and setup 14 timer functions. So it's using the default, which is local storage emulator. The project is pretty basic (no added dependencies outside of what visual studio creates). It's pretty straight forward.

image

That was an example project to reproduce the issue we are having on an actual project. Our actual project, uses an azure storage account and it does the same thing. so it doesn't seem to matter if it's an actual azure "storage account" or the emulator.

Thanks. We'll take a look to see what is going on here. Can you confirm that running standalone (outside of VS/No debugger) runs as expected?

Yeah, it runs outside visual studio (in our azure environment fine). It's our locally debugging abilities that have been devastated. Well on our actual project, running outside of "visual studio" (via command line) and attaching the debugger, the issue we have: it's not loading the appsettings files we have properly. (aka running from the command line func host start --dotnet-isolated-debug and then attaching via the debugger). So we have to have visual studio launch it as an "executable" and then attach the debugger. (it's a pain), but that is the only way I could get the appsetting files to loaded properly. But, if you set a break in the startup, you've only got 30 seconds before the debugger drops and a new process is started.

We are not only finding this issue when debugging locally, but also when deployed to Azure. We have this deployed to a Windows Functions App Service Plan (Consumption model) and have the app settings set as below. Please can anyone advise what might be the issue here? The errors appear to be the same as in this issue and when debugging TimerTriggers locally in VS 2019. The deployed functions are seen as .NET 5 functions by the portal.
image
image

@brettsam I tried both with VS 16.9.4 and with Rider 2021.1.3, Rider never attaches due to the func startup not returning the worker PID and VS simply starts up, and then doesn't hit any breakpoint even though they exist in the program.cs, I've also successfully attached by starting the func in cmd and then a Debugger.Launch which starts up VS.

@DavidGodbold it seems you're getting function timeouts https://docs.microsoft.com/en-us/azure/azure-functions/functions-scale#timeout the default is 5 minutes, it seems to be cancellable the current execution and then logging the end of the previous execution.

One thing that I found 'odd' was that with dotnet-isolated you don't get cancellation token support so how does it cancel really then if you are doing an outbound call?

Thanks @RiaanVR, but these TimerTrigger functions work OK if I run the code locally using the Core Tools invoked from Powershell, but suffer the timeout issue when using the VS native debugging. However the real blocker is the fact that our functions suffer the same timeout issue even when deployed to Azure.

@brettsam I'm disappointed to have issue #488 merged into this issue as we are suffering the same timertrigger function timeout issue even when the functions are deployed to an Azure functions instance. These functions run fine from my local machine when running func.exe using the Core Tools v3 though.

@DavidGodbold I want to make sure we're not missing something about your scenario. #488 is about a VS debugging issue and you're talking about running into this with a deployed App. Are you running into problems with a remote debugging scenario? Or purely running into timeouts outside of debugging scenarios?

I'm also having this issue now, locally. I'd been debugging TimerTriggers locally a couple of months ago so I'm not sure what changed. My functions work in Azure.

@DavidGodbold I want to make sure we're not missing something about your scenario. #488 is about a VS debugging issue and you're talking about running into this with a deployed App. Are you running into problems with a remote debugging scenario? Or purely running into timeouts outside of debugging scenarios?

@fabiocav I am having issues with running into timeouts outside of debugging scenarios (as well as experiencing the issues raised within this issue). Please see my comment here

Just installed 16.10.2 released today and a new Functions project is still broken out of the box. What do you need from us to help you find and fix this issue?

image

Just a quick update. Creating a function app using the Azure CLI and deploying using the Azure Pipelines seemed to work and the functions work as expected. It probably needs to be made clear that creating a function app using the Portal and changing the app settings to run dotnet-isolated will not work. This was using a consumption plan.

az functionapp create --name $functionAppName --storage-account $storageName --consumption-plan-location $region --resource-group $resourceGroup --subscription $subscription --runtime dotnet-isolated --runtime-version 5.0 --functions-version 3

The fix has now been merged into the host. Just pending deployment at this point.

Keeping this open until the Core Tools version with the changes is available.

What will be the steps to configure timeout? Is it per-app or global for the (isolated) worker?

You shouldn't have to touch anything here. When debugging, the tooling will extend the worker timeout automatically.

The fix has now been merged into the host. Just pending deployment at this point.

Keeping this open until the Core Tools version with the changes is available.

Do you know when the fix will be released and be integrated into VS?

@brettsam - Could this also be related to the same issue? https://developercommunity.visualstudio.com/t/Azure-functions-debug-fails-to-start-on-/1474373

There is a sample repo available in the devcom issue.

@vijayrkn it's a slightly different issue, but this fix will help with it. Though, the repro shared there seems to require additional (unexpected) steps, so I'd like to validate that. There's some additional work we're considering in the WebJobs SDK/Host layer to address the timer initialization issue

@ThreeSevenths the Core Tools update will happen once global deployment of the changes completes, which we estimate will happen at the end of next week (by July 30th), due to deployment restrictions. We can, however, share details on how you can use the version of Core Tools that targets that version before then, if needed.

I am still getting the timeout error but only for Timer functions. When we comment out the timer function and leave the others running, debug tools work fine. Do we have a fix to this problem yet?
image

I am having the same issue as @ImranShafqat-123... I confirmed that I am running the latest versions of VS (16.11.1), Microsoft.Azure.Functions.Worker (1.5.1), and Microsoft.Azure.Functions.Worker.Extensions.Timer (4.1.0). The confusing part is that I was able to get my app with a queue trigger function and a timer trigger function to work end of last week but am not able to do so this week. Any help on this would be greatly appreciated!

I can get around the issue by changing a bit the ncrontab value. Say going from "0 0 3 * * 0" to "0 0 4 * * 0" and F5 it again it works. BTW, I trigger the timer manually using postman trick.

I just encountered this same issue with a timer function. I found it seems to occur when it tries to fire a past due timer on startup.

For example, if I have an ncrontab expression of "0 */1 * * * *" and run it initially it runs fine. If I give it a minute and let it pass the time it should have fired I see the follow details in the debug:

image

It's consistent for me that every time I let enough time to by to where it's firing an IsPastDue execution Visual Studio hangs for a bit and the worker process start fails. I frequently see the below window while it's trying to fire up:

image

I can also confirm that @cgronseth's suggestion of changing the ncrontab expression makes it work again. I'm guessing behind the scenes this clears out the past due details so that it no longer thinks it has any past due executions.

Once you get to the point where it would need to run again, though, it begins erroring out again.

Running into the same issue here. Azure Function app in Visual studio 16.11.3 with .Net5 gets the "Starting worker process failed" when trying to debug the next day or after the scheduled event is past due.
All dependencies are at the latest version
image

Workaround as mentioned (@cgronseth) is to change the cron schedule slightly.
Pretty horrible debugging experience at the moment, is there a better work-around or any ETA on a fix?

The workaround for me was to clear the blob service in the AzureStorageEmulator.exe, but still waiting for a definitely solution.

clear blob

I'm suffering the same problem in VS2019 v16.11.3 running my .NET 5 isolated function, whether I run against the Azure Storage Emulator or Azurite.

Managed to work around it using VSCode with the latest version of the Azure Functions Core Tools (3.0.3734). This lets me debug the function just fine and, so far, has never hung.
Not the ideal experience but gets things going until this is resolved.

Can confirm that a past due timer interval also causes debugging to hang when trying to use the direct from VS launch option. Very annoying, causes the development experience for our team to be a mess.

Visual Studio 2022 Preview 4.1 works just fine. VS 2019 Version 16.11.4 isn't working. Same problem as everybody!
Azure Function, Timer Trigger, .NET 5.0 Just the template, nothing else

I tried some things here

Loaded the project started in 2022 into 2019 no luck
Loaded the project started in 2019 into 2022 no luck
Started new project using .NET Framework rather than .NET 5 in 2019 no luck
Modified the crontab no luck
Added RunOnStartup no luck

The workaround for me was to clear the blob service in the AzureStorageEmulator.exe, but still waiting for a definitely solution.

clear blob

Our current work around in VS2019 is that we added a Post-Build event that clears the blob section of the Storage Emulator as mentioned by Flavio.
Works well for us and dramatic improvement in the debugging experience for the developers.

Sadly, nothing works. We changed for a Worker Service and now things are perfectly working.

satvu commented

Part 2 of the fix has been merged into core tools. Pending deployment.

Note on how time triggers are related to the initial issues with debugging.

@david-peden-q2 / @jwisener what version of Core Tools are you running? Are you still seeing this with the latest version of the tools?

For v3.0, this was fixed in Core Tools 3.0.3904

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment.

@msftbot - Like many, I started looking at Functions in the simplest way possible - let me run something 1 minute into the future. On .NET3.1, this worked great. I then went to add EF Core, which needed to update logging to 5, and created a nightmare. The solution, it seemed, was to update to .net5, and use the new isolated process. As this thread attests, that has never worked. Like the original author, and countless others, I came to this thread to help the team debug the issue. I have been frustrated to the point of not using functions in any project by the lack of any progress to fix this problem. The salt is your bot automatically marking an issue that you have not fixed stale. It's not for lack of community involvement.

-Trevor

I am using Core Tools 3.0.3904 and the issue has been resolved for my timer-triggered function that was encountering the timeout problem. Thanks!

@fabiocav - It is fixed for me finally! I am running 3.0.3904 and I can debug again! Thank you all.

@fabiocav it seems to be fixed. I did have to clear my storage acct, before I could get it resolved.

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment.