504 Gateway Timeout - Two EC2 instances with load balancer
B

12

26

This might be the impossible issue. I've tried everything. I feel like there's a guy at a switchboard somewhere, twirling his mustache.

The problem:

I have Amazon EC2 running an application. It functions without issue when there is only one instance and no load balancer.

But in my production environment I have two identical instances running behind one load balancer. When performing certain tasks, such as a feature that generates a PDF and attaches it to an email, nothing happens at all. In the Network tab of Google Developer Tools I get the error "504 Gateway Timeout" once the timeout hits (I have it set at 30 seconds).

My Database is external, on Amazon RDS.

I think... if I could force a client to stay connected to the server they initially logged in at, this problem would be solved, because it's my understanding that the 504 Gateway Timeout is happening when instance-1 tries to reach out to instance-2 to perform the task.

This happens ONLY WHEN using Load Balancing, but never when connecting straight to one of my two servers.

Load Balancer Settings:

  • The load balancer has a CNAME record at my registrar so that app.myapplication.com points to myloadbalancerDNSname.elb.amazonaws.com
  • The load balancer has 2 healthy instances, each in the same region but they are in different availability zones.
  • The load balancer is using the same Security Groups as the Instances (allow ALL IPs on ports 22, 80, and 443)
  • The load balancer has cross-zone load balancing turned on.
  • CORS (in Amazon S3) is enabled to GET, POST, PUT, DELETE from * to * (I have no idea how this is associated with my instances but anyway I did it as the instructions said)
  • The load balancer has listeners configured as such:
    • Load Balancer Protocol: HTTP, Load Balancer Port: 80, Instance Protocol: HTTP, Instance Port: 80
    • Load Balancer Protocol: HTTPS, Load Balancer Port: 443, Instance Protocol: HTTP, Instance Port: 80 (cipher chosen correctly per my cert provider, and SSL fields 100% surely correct)

Some more ideas:

That being said, I'm not testing with HTTPS, but with normal HTTP instead. I'm not convinced SSL is set up properly even though my certificate provider said it is. The reason I'm suspicious is that when I try to key in https://app.myapplication.com I get the error "(failed) net::ERR_CONNECTION_CLOSED" in the Network tab of Google Developer Tools. But this should not matter here, because I'm having the problem even when using regular HTTP. I can troubleshoot SSL later.

So to reiterate, I get the "504 Gateway Timeout" error when using some functions, and also occasionally (but rarely) at random when a page should load. This 504 problem happens ONLY WHEN using Load Balancing, but never when connecting straight to one of my two instances.

I don't know which question to ask, because I've followed every document to the T, double- and triple-checked all suggestions all over the web, and NOTHING.

Boiney answered 24/10, 2014 at 6:56 Comment(0)
B
4

In my case, it turns out that there was no problem with the load balancer. The final solution ended up being Ubuntu's hosts file, in which there was an inexplicable entry mapping my application's host name to some mystery IP. So, during the process of creating the PDF, paths were getting rewritten by the PDF generator to point at the mystery server, hence the Gateway Timeout issues. I have no idea why it was occasionally working rather than failing every time.

127.0.0.1 localhost
127.0.1.1 ubuntu-server
42.139.126.191 app.myapp.com

This is what it looked like, so I removed that third line and all the gears started turning again. :P
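A quick way to spot this kind of override is to scan the hosts file for the application's host name. The following is only a rough sketch: the function name is made up for illustration, and the sample text is the hosts file shown above. On a real server you would read /etc/hosts instead.

```python
import ipaddress


def suspicious_hosts_entries(hosts_text, hostname):
    """Return (ip, line) pairs where `hostname` is pinned to a non-loopback IP.

    Hypothetical helper for illustration; not part of any library.
    """
    findings = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if hostname not in names:
            continue
        try:
            addr = ipaddress.ip_address(ip)
        except ValueError:
            continue  # not a valid IP, skip the malformed line
        if not addr.is_loopback:
            findings.append((ip, line))
    return findings


# Sample content copied from the hosts file in this answer.
SAMPLE = """127.0.0.1 localhost
127.0.1.1 ubuntu-server
42.139.126.191 app.myapp.com
"""

if __name__ == "__main__":
    for ip, line in suspicious_hosts_entries(SAMPLE, "app.myapp.com"):
        print(f"{ip} overrides DNS for app.myapp.com: {line}")
```

Any hit means the server resolves its own public host name to that IP instead of asking DNS, which is worth questioning on an instance behind a load balancer.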

Boiney answered 5/1, 2015 at 9:33 Comment(0)
F
13

What web server are you using? I had a very similar issue with nginx and AWS load balancing. I added keepalive_timeout 75s; to the http block in my nginx config file and haven't seen the issue since.

Make sure you restart nginx after you add and save that line. On Ubuntu, run sudo service nginx restart. On Red Hat, stop nginx with /path/to/nginx/executable -s stop, then run /path/to/nginx/executable to start it up again.

This fix was recommended by AWS on their help page, "AWS Load balancer troubleshooting".
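The idea behind the 75s value is that the web server should hold idle connections open longer than the load balancer's idle timeout (60 seconds by default, as mentioned elsewhere in this thread). Here is a small sketch of that comparison, assuming the timeout is written in plain seconds (75 or 75s). The function names are illustrative only, and other nginx time units are deliberately not handled.

```python
import re

ELB_IDLE_TIMEOUT_S = 60  # default idle timeout quoted in this thread


def nginx_keepalive_seconds(conf_text):
    """Return the first keepalive_timeout value in seconds, or None.

    Assumption: the value is given in plain seconds, e.g. `75` or `75s`.
    Values with other units (such as `2m`) are not parsed and yield None.
    """
    match = re.search(r"^\s*keepalive_timeout\s+(\d+)s?\b", conf_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def backend_outlasts_elb(conf_text, elb_idle_timeout_s=ELB_IDLE_TIMEOUT_S):
    """True if the backend keeps idle connections longer than the ELB does."""
    keepalive = nginx_keepalive_seconds(conf_text)
    return keepalive is not None and keepalive > elb_idle_timeout_s


if __name__ == "__main__":
    sample_conf = "http {\n    keepalive_timeout 75s;\n}\n"
    print(backend_outlasts_elb(sample_conf))
```

If the check fails, the backend may close a connection that the load balancer still considers usable, which is one way to end up with gateway errors.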

Fauver answered 31/12, 2014 at 22:55 Comment(1)
A similar fix for Apache is KeepAlive On and KeepAliveTimeout 75. - Ious
L
8

First, what is the Idle Timeout for your ELB set to? You'll find it at the very bottom of the "Description" tab for your load balancer. You can read more about the idle timeout here in the ELB documentation. The default is 60 seconds. You should also consider setting or increasing Keep-alive in your web server. How you do that will depend on what web server you are using.

Second, if you think it's due to the client being switched from one instance to the other then you should enable session stickiness in the ELB. This will ensure that a client is always directed to the same back-end instance by the load balancer. To enable this, again go to the "Description" tab then click on the Edit link next to each entry in the Port Configuration section. You'll likely want to choose the "Enable Load Balancer Generated Cookie Stickiness" option since that will tell the ELB to manage all aspects of the stickiness.

Leanto answered 24/10, 2014 at 14:46 Comment(3)
Idle Timeout on the ELB is the default value, 60 seconds. On my 2 instances (inside the configuration file /etc/apache2/apache2.conf, since the web server is Apache, so I hope this is the relevant file) I also found that KeepAlive is On, MaxKeepAliveRequests is 100, and KeepAliveTimeout is 5. I also edited Session Stickiness on the ELB to "Enable Load Balancer Generated Cookie Stickiness", so thanks for that. But the end result is still the same, "504 Gateway Timeout". - Boiney
In that case you might want to go through the list of possibilities here: docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/… - Leanto
I still get the feeling there's a dude at a control board twirling his mustache, getting a rise out of my troubles. I've been through that list a few times and all are accounted for. This must be SSL related (it's the only thing I haven't fully explored), so I'll troubleshoot and iron out my SSL installation in the next 2 weeks and post another response here. Thanks for your time, and I marked your answer as useful. - Boiney
B
4

We use Amazon EC2 instances behind an Amazon ELB and we were getting 504 GATEWAY_TIMEOUT errors. We use Apache and PHP on Ubuntu web servers.

In our case, the error was due to the servers running out of memory. We didn't see the "out of memory" in our Apache error logs. There was a 504 line entry in the Apache access logs. We confirmed the "out of memory" by looking into the syslog file ( /var/log/syslog ) and fixed the memory issue.

This resolved the 504 error for us.
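For anyone wanting to run the same check, here is a minimal sketch that filters log lines for the kernel's out-of-memory messages. It assumes the kernel logs contain the phrases "Out of memory" or "oom-killer". The sample lines below are invented for illustration and are not taken from our servers.

```python
import re

# Phrases the Linux kernel typically logs when the OOM killer runs.
OOM_PATTERN = re.compile(r"out of memory|oom-killer", re.IGNORECASE)


def find_oom_lines(log_lines):
    """Return the log lines that mention an out-of-memory event."""
    return [line.rstrip("\n") for line in log_lines if OOM_PATTERN.search(line)]


# Invented sample lines, only to show the shape of the input.
SAMPLE_LOG = [
    "Oct 16 04:01:12 web1 kernel: apache2 invoked oom-killer: gfp_mask=0x201da, order=0\n",
    "Oct 16 04:01:12 web1 kernel: Out of memory: Kill process 2211 (apache2) score 301 or sacrifice child\n",
    "Oct 16 04:01:13 web1 CRON[2301]: (root) CMD (run-parts /etc/cron.hourly)\n",
]

if __name__ == "__main__":
    for hit in find_oom_lines(SAMPLE_LOG):
        print(hit)
    # On a real server, something like:
    # with open("/var/log/syslog", errors="replace") as f:
    #     print(find_oom_lines(f))
```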

Belden answered 16/10, 2015 at 4:35 Comment(0)
J
2

Most probably the idle timeout is the culprit; the default value is 60 seconds. (AWS ALB)

Jipijapa answered 24/12, 2018 at 16:43 Comment(0)
C
1

Check the security group settings. Access to port 80 may be restricted.

Cardon answered 5/11, 2017 at 7:2 Comment(2)
Absolutely not. Port 80 is how most of the world talks to each other. - Asiaasian
Thanks. As little as this is, my life is new again... gosh! - Quinquennial
E
1

as of December 2022

For me it was a security group configuration. In my case, I had the following configuration:

Internet (request) -> Load balancer on HTTP:80 -> (forward to) -> EC2 Application on HTTP:3501

(i.e. the load balancer listening on port 80 was forwarding traffic to a service that was listening on port 3501.)

I changed the security group to allow incoming TCP requests on port 3501, coming from inside the security group itself. This allowed the forwarded traffic to reach the application. I assume setting it to a less restrictive source, such as 0.0.0.0/0 (or ::/0 for IPv6), should also work, but it's less secure.
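To confirm whether the application port is actually reachable, a plain TCP connect from another machine in the same security group can help. This is only a sketch: the function name and the private IP in the usage comment are hypothetical. A blocked security group usually shows up as a timeout rather than an immediate refusal, and both cases return False here.

```python
import socket


def port_is_reachable(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout.

    Timeouts (typical for a blocking security group) and refused
    connections both count as unreachable.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


# Usage (hypothetical private IP of the EC2 instance, run from inside the VPC):
# print(port_is_reachable("10.0.1.25", 3501))
```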


Emblements answered 29/11, 2022 at 0:14 Comment(0)
T
0

In my case, I edited the inbound security group rules. Go to:

EC2 --> Security Groups --> Edit Inbound Rules for the corresponding security group, and make sure that the source is correctly selected (for me, selecting "Anywhere" solved the problem).

Theseus answered 24/2, 2021 at 10:49 Comment(0)
D
0

In my case, I had removed the All Traffic rule from the outbound rules. Adding the All Traffic rule back and allowing traffic to anywhere solved my problem.

Drisko answered 26/7, 2022 at 6:18 Comment(0)
F
0

Another reason for an intermittent 504 Gateway Time-out might be that the gateway cannot reach one of the instances because the Network ACL is missing a rule, which blocks cross-AZ traffic (AWS doc). The default/custom ACL doesn't take care of cross-AZ traffic.

How to resolve? Say the union of the AZ networks is the local range 172.20.0.0/16; then you should allow this range, with the relevant port(s), in the network ACL. Note that if you're now getting CORS errors ("Access to XMLHttpRequest at '...' from origin '...' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.") then you probably didn't add all the relevant ports.
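As a sanity check on the range itself, you can verify that every instance's private IP actually falls inside the CIDR you allow. A minimal sketch follows; the function name and the instance IPs are made up for illustration, and only 172.20.0.0/16 comes from the example above.

```python
import ipaddress


def hosts_outside_range(cidr, host_ips):
    """Return the IPs from host_ips that are NOT covered by the CIDR range."""
    network = ipaddress.ip_network(cidr)
    return [ip for ip in host_ips if ipaddress.ip_address(ip) not in network]


if __name__ == "__main__":
    # Hypothetical private IPs of instances in different AZ subnets.
    instance_ips = ["172.20.1.10", "172.20.130.7", "172.21.0.5"]
    print(hosts_outside_range("172.20.0.0/16", instance_ips))
```

Any IP reported here would not be matched by an ACL rule written for that range.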

If all of the above still doesn't help, consider consulting a cloud networking SME, or troubleshoot using built-in tools such as AWS Network Manager / Reachability Analyzer to validate that the gateway ENI can access the relevant instance(s).

Fortnight answered 4/10, 2023 at 23:52 Comment(0)
T
0

Also make sure to check your application's proxy settings. If you're using Docker, there's a high possibility that you have directed traffic to a container IP or host IP which might have changed.

Teresitateressa answered 6/5, 2024 at 1:9 Comment(0)
C
-1

Hi, check your connection and application latency; this is a timeout error.

Cyprinid answered 10/12, 2022 at 8:14 Comment(1)
As it's currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center. - Lazo
