No such file or directory: mod_wsgi : Unable to connect to WSGI daemon process 'web2py' on '/var/run/apache2/wsgi.30303.0.1.sock'
Asked Answered
O

1

7

The webapp is deployed on EC2 and following error is faced randomly once or twice a day making the webapp inaccessible for some period of time. It is automatically corrected after some time.

(2)No such file or directory: [client xxx.xx.xx.xxx:xxxxx] mod_wsgi (pid=xxxxx): Unable to connect to WSGI daemon process 'web2py' on '/var/run/apache2/wsgi.30303.0.1.sock'.

Application Stack web2py mod_wsgi Apache2

The logs are different every time before the error:

[Thu Sep 28 06:25:01.528334 2017] [mpm_event:notice] [pid 30303:tid 140438078609280] AH00493: SIGUSR1 received.  Doing graceful restart
[Thu Sep 28 06:25:02.318551 2017] [ssl:warn] [pid 30303:tid 140438078609280] AH01906: ip-172-31-0-91.eu-west-1.compute.internal:443:0 server certificate is a CA certificate (BasicConstraints: CA == TRUE !?)
[Thu Sep 28 06:25:02.318574 2017] [ssl:warn] [pid 30303:tid 140438078609280] AH01909: ip-172-31-0-91.eu-west-1.compute.internal:443:0 server certificate does NOT include an ID which matches the server name
[Thu Sep 28 06:25:02.318664 2017] [wsgi:warn] [pid 30303:tid 140438078609280] mod_wsgi: Compiled for Python/2.7.11.
[Thu Sep 28 06:25:02.318669 2017] [wsgi:warn] [pid 30303:tid 140438078609280] mod_wsgi: Runtime using Python/2.7.12.
[Thu Sep 28 06:25:02.319205 2017] [mpm_event:notice] [pid 30303:tid 140438078609280] AH00489: Apache/2.4.18 (Ubuntu) OpenSSL/1.0.2g mod_wsgi/4.3.0 Python/2.7.12 configured -- resuming normal operations
[Thu Sep 28 06:25:02.319225 2017] [core:notice] [pid 30303:tid 140438078609280] AH00094: Command line: '/usr/sbin/apache2'
[Thu Sep 28 06:25:09.327495 2017] [mpm_event:error] [pid 30303:tid 140438078609280] AH00485: scoreboard is full, not at MaxRequestWorkers
[Thu Sep 28 06:28:39.560285 2017] [mpm_event:error] [pid 30303:tid 140438078609280] AH00485: scoreboard is full, not at MaxRequestWorkers
[Thu Sep 28 06:45:27.583870 2017] [wsgi:error] [pid 30307:tid 140437629064960] (2)No such file or directory: [client 172.31.32.163:24210] mod_wsgi (pid=30307): Unable to connect to WSGI daemon process 'web2py' on '/var/run/apache2/wsgi.30303.0.1.sock'.
[Thu Sep 28 06:49:14.503732 2017] [wsgi:error] [pid 30307:tid 140437603886848] (2)No such file or directory: [client 172.31.14.173:37726] mod_wsgi (pid=30307): Unable to connect to WSGI daemon process 'web2py' on '/var/run/apache2/wsgi.30303.0.1.sock'.

Let me know if more information is required.

Ophiolatry answered 28/9, 2017 at 7:20 Comment(0)
T
6

This is caused by doing a graceful restart of Apache when HTTP clients are using keep alive connections and issuing multiple requests over the same connection.

The problem is that the way Apache manages the mod_wsgi daemon process means they are shutdown straight away still even if is a graceful restart. In the meantime, the Apache child worker processes which accept the requests initially and proxy to the mod_wsgi daemon processes will keep running until all client connections drop. This means that when have keep alive connections and subsequent request over same client connection needs to go to the WSGI application, that it will fail as the prior incarnation of the mod_wsgi daemon processes are now gone.

In this situation one can't allow the old Apache child worker process to connect to new mod_wsgi daemon processes as the reason for a restart may have been a configuration change and allowing old child worker process to connect to new instances of daemon processes, could introduce a security problem if under the new configuration the request being handled in that way was not allowed.

Do accept this is a rare scenario, and likelihood of a security issue arising is slim. It is probably reasonable to consider a new option to mod_wsgi to say that connecting to newer daemon process in this case is okay, and not rotate the listener socket for the daemon processes on any restart.

That this can occur has been known all along (10 years), but an issue for it has been created on GitHub against mod_wsgi to consider such an option.

Tryout answered 28/9, 2017 at 7:46 Comment(5)
Update. Issue closed in mod_wsgi 4.6.0 thanks to the WSGISocketRotation directive.Trio
I have those errors and they can last for hours after the logrotate, with hundreds of error lines still being written in the old log file. I'm serving a REST API. I guess the clients keep their connection alive a long time. I saw this other answer from @Gragam-Dumpleton and set KeepAlive Off in apache config.Trio
It happened again today, despite KeepAlive being set to Off in the VirtualHost config. I don't get it. (Using mod_wsgi 4.5.11)Trio
Also, using the same browser instance (same firefox tag) as a client, I sometimes get a 503, sometimes the expected response. Is the browser maintaining several open connections in parallel?Trio
Looks like WSGISocketRotation does not work with mod_wsgi 4.6.5. I added WSGISocketRotation Off to the server.conf but wsgi deamon is always startet with other Port. Error 503 still occurs after reload. Any suggestions?Certitude

© 2022 - 2024 — McMap. All rights reserved.