Azure Container Apps Restarts every 30 seconds
I have an Azure Container App that's based on the hosted BackgroundService model. It's essentially just a long-running console app that overrides the BackgroundService.ExecuteAsync method and waits for the stop signal (via the passed cancellation token). When I run it locally in Docker, it's perfect: everything runs as expected. When I deploy it as an Azure Container App, it deploys and runs (although I had to manually set the scale minimum to 1 to get it to run at all), but it restarts every 30 seconds or so, which is obviously not ideal. My guess is that the Azure Container Apps Docker host is somehow checking my instance for health, isn't satisfied, and so tries to restart it. Just a guess. What am I missing?

using FR911.DataAccess.Repository;
using FR911.Infrastructure.Commands;
using FR911.Utils;
using FR911.Utils.Extensions;
using SimpleInjector;

IHost host = Host.CreateDefaultBuilder(args)
    .ConfigureServices(services =>
    {
        services.AddFR911Log4NetConfig();        
        services.AddTransient<ICommandProcessor, CommandProcessor>();
        Container container = new Container();
        container.Register(typeof(ICommandHandler<,>), new List<Type>()
            {
                //typeof(CacheSyncCommandHandler),
            });

#if DEBUG
        container.Verify();
#endif

        services.AddSingleton<Container>(container);
        services.AddHostedService<Worker>();
    })
    .Build();

await host.RunAsync();

    public class Worker : BackgroundService
    {
        private readonly ILogger<Worker> _logger;
        private ICommandProcessor _commandProcessor;

        public Worker(ILogger<Worker> logger, ICommandProcessor cmdProcessor)
        {
            _logger = logger;            
            _commandProcessor = cmdProcessor;
        }
        
        protected override async Task ExecuteAsync(CancellationToken stoppingToken)
        {
            _logger.LogInformation("Worker starting at: {time}", DateTimeOffset.Now);

            DateTime? lastGC = null;
            while (!stoppingToken.IsCancellationRequested)
            {
                _logger.LogInformation("Worker running at: {time}", DateTimeOffset.Now);
                await Task.Delay(1000, stoppingToken);
            }
            _logger.LogInformation("Worker stopping at: {time}", DateTimeOffset.Now);
        }
    }
24 May 2022 12:10:46.573 2022-05-24 12:10:46,248 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker starting at: 05/24/2022 12:10:46 +00:00
24 May 2022 12:10:46.573 2022-05-24 12:10:46,249 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:10:46 +00:00
24 May 2022 12:10:46.573 2022-05-24 12:10:46,251 Microsoft.Hosting.Lifetime fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Application started. Press Ctrl+C to shut down.
24 May 2022 12:10:46.573 2022-05-24 12:10:46,252 Microsoft.Hosting.Lifetime fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Hosting environment: Production
24 May 2022 12:10:46.573 2022-05-24 12:10:46,336 Microsoft.Hosting.Lifetime fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Content root path: /app
24 May 2022 12:10:47.640 2022-05-24 12:10:47,637 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:10:47 +00:00
24 May 2022 12:10:48.640 2022-05-24 12:10:48,638 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:10:48 +00:00
24 May 2022 12:10:49.639 2022-05-24 12:10:49,637 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:10:49 +00:00
24 May 2022 12:10:50.643 2022-05-24 12:10:50,638 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:10:50 +00:00
24 May 2022 12:10:51.642 2022-05-24 12:10:51,638 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:10:51 +00:00
24 May 2022 12:10:52.641 2022-05-24 12:10:52,638 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:10:52 +00:00
24 May 2022 12:10:53.662 2022-05-24 12:10:53,637 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:10:53 +00:00
24 May 2022 12:10:54.640 2022-05-24 12:10:54,638 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:10:54 +00:00
24 May 2022 12:10:55.638 2022-05-24 12:10:55,636 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:10:55 +00:00
24 May 2022 12:10:56.639 2022-05-24 12:10:56,637 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:10:56 +00:00
24 May 2022 12:10:57.640 2022-05-24 12:10:57,638 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:10:57 +00:00
24 May 2022 12:10:58.640 2022-05-24 12:10:58,638 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:10:58 +00:00
24 May 2022 12:10:59.640 2022-05-24 12:10:59,638 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:10:59 +00:00
24 May 2022 12:11:00.640 2022-05-24 12:11:00,637 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:11:00 +00:00
24 May 2022 12:11:01.643 2022-05-24 12:11:01,638 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:11:01 +00:00
24 May 2022 12:11:02.639 2022-05-24 12:11:02,637 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:11:02 +00:00
24 May 2022 12:11:03.640 2022-05-24 12:11:03,638 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:11:03 +00:00
24 May 2022 12:11:04.641 2022-05-24 12:11:04,637 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:11:04 +00:00
24 May 2022 12:11:05.649 2022-05-24 12:11:05,636 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:11:05 +00:00
24 May 2022 12:11:06.664 2022-05-24 12:11:06,638 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:11:06 +00:00
24 May 2022 12:11:07.639 2022-05-24 12:11:07,637 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:11:07 +00:00
24 May 2022 12:11:08.640 2022-05-24 12:11:08,638 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:11:08 +00:00
24 May 2022 12:11:09.640 2022-05-24 12:11:09,637 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:11:09 +00:00
24 May 2022 12:11:10.641 2022-05-24 12:11:10,637 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:11:10 +00:00
24 May 2022 12:11:11.639 2022-05-24 12:11:11,638 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:11:11 +00:00
24 May 2022 12:11:12.640 2022-05-24 12:11:12,637 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:11:12 +00:00
24 May 2022 12:11:13.640 2022-05-24 12:11:13,638 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:11:13 +00:00
24 May 2022 12:11:14.639 2022-05-24 12:11:14,636 FR911.Worker.Worker fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Worker running at: 05/24/2022 12:11:14 +00:00
24 May 2022 12:11:14.931 2022-05-24 12:11:14,930 Microsoft.Hosting.Lifetime fr911worker-app-20--vki2kmn-cf5bff474-5w6mh INFO Application is shutting down...
Umberto answered 24/5, 2022 at 12:22 Comment(0)
19

I'm an Engineering Manager in Container Apps.

Your Container App was being restarted because it was failing the readiness probes.

If your Container App’s HTTP ingress is set to ‘Enabled’, the platform will try to ping it on the specified Target port (80 by default). If the platform can’t successfully ping it, it will be considered 'unhealthy' and will be restarted. Please refer to Health probes in Azure Container Apps to learn about the default health probes and how to specify your own settings.

If your Container App is not listening on the specified ingress port (for example, if your app is processing messages from a queue and not expecting external HTTP requests), set HTTP ingress to ‘Disabled’. When HTTP ingress is set to ‘Disabled’, health probes won't be configured, and your app won't be pinged.

If your Container App is listening on the specified Target port but requires a longer startup time, you can define a longer initial delay and/or a longer period between pings.

Also, make sure that the Target port specified in the HTTP configuration is the same port that is EXPOSEd in the Dockerfile of your Container App.
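
If HTTP ingress has to stay enabled, another possibility is to have the worker process itself listen on the Target port so that the platform's pings get an answer. The following is only a rough sketch and not something taken from the answer above: it assumes the project is switched to the Microsoft.NET.Sdk.Web SDK (ASP.NET Core 6 or later) with implicit usings enabled, the /healthz path and port 8080 are hypothetical choices, and the Worker class is a stripped-down stand-in for the one in the question.

```csharp
// Program.cs: a minimal sketch, assuming Microsoft.NET.Sdk.Web with implicit usings.
var builder = WebApplication.CreateBuilder(args);

// Keep the existing background worker running inside the same host.
builder.Services.AddHostedService<Worker>();

var app = builder.Build();

// Hypothetical health endpoint: it exists only so that something answers
// on the ingress Target port when the platform checks the container.
app.MapGet("/healthz", () => Results.Ok("healthy"));

// Assumption: 8080 matches both the Target port in the ingress settings
// and the EXPOSE line in the Dockerfile.
app.Run("http://0.0.0.0:8080");

// Stand-in for the Worker class from the question.
public class Worker : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            // The real work would go here; this just idles until shutdown.
            await Task.Delay(1000, stoppingToken);
        }
    }
}
```

With something like this in place, the Target port, the EXPOSE line, and the port the app binds to would all need to agree. Whether the default probes are then satisfied still depends on the probe settings described in the health probes documentation.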

Vonnie answered 27/5, 2022 at 21:18 Comment(3)
Thanks Vini. Jeff Hollan had indicated the same to me in the Discord channel, although this is a bit confusing because the portal showed no Health Probes at all, even though I understand the default. The doc learn.microsoft.com/en-us/azure/container-apps/… doesn't clearly say much about it except for "java", although the config does, and the portal clearly does not. Maybe some friction you can remove? Thanks for the detailed response! - Umberto
One side effect of turning off ingress so that the health probes don't fail is that you can't currently update a containerapp via publish. The error is: Error code ContainerAppInvalidIngressTargetPort Message Ingress section of the container app must have a targetPort specified. TargetPort must be in the range of [1,65535]. - Umberto
Turning off ingress did resolve this for me as well. However, it also caused issues with publish. There are no errors during publish, but it's partial: 1) publish will push the image to the repo, 2) but not switch to it, 3) so you have to manually switch to it in Revision Management (Azure web) by adding a new revision and then deleting the older one, 4) which will then shut the old one down. Not great, but it's recoverable. Hopefully I can find a better solution. - Cown
0

Turning off Ingress for my service fixed the problem. Having it turned on and the service not providing any accessible endpoints seemed to be the problem.


Umberto answered 24/5, 2022 at 22:36 Comment(2)
typo: You meant turning on Ingress and having it off was the issue since health probes were not accessible. - Halfhardy
No, turning off Ingress fixed my problem. Having Ingress on made the service probe for a response on the default ingress port. I did not have anything listening on that port, so the Container service thought my instance was unhealthy and continuously restarted it. With ingress off now it runs fine. - Umberto
-1

In my case, a temporary workaround if you need to access your services from outside: I changed my Python application's port number to 80, added EXPOSE 80 to the Dockerfile, left Ingress enabled, and set the target port to 80 in the Ingress settings on Azure. But I still don't know what to do if you need your app on a different port without running into the failed health/readiness probe issue :(

    Connecting... (this error happens ONLY when the app is on another port instead of port 80!)
{"TimeStamp":"2023-07-21T12:13:58Z","Type":"Normal","ContainerAppName":null,"RevisionName":null,"ReplicaName":null,"Msg":"Connecting to the events collector...","Reason":"StartingGettingEvents","EventSource":"ContainerAppController","Count":1}
{"TimeStamp":"2023-07-21T12:13:59Z","Type":"Normal","ContainerAppName":null,"RevisionName":null,"ReplicaName":null,"Msg":"Successfully connected to events server","Reason":"ConnectedToEventsServer","EventSource":"ContainerAppController","Count":1}
{"TimeStamp":"2023-07-21 12:03:46 \u002B0000 UTC","Type":"Warning","ContainerAppName":"sample3","RevisionName":"sample3--v2","ReplicaName":"sample3--v2-5c979f6568-vkm2c","Msg":"readiness probe failed: connection refused","Reason":"ReplicaUnhealthy","EventSource":"ContainerAppController","Count":3}


Ferrocyanide answered 21/7, 2023 at 12:57 Comment(0)
