Rails 4 - Ruby 2.2.2 - Amazon AWS S3 - dragonfly 1.0.12 - dragonfly-s3_data_store 1.2 - fog-aws 0.10.0
Around 99% of the time we have no issues. The issue usually only happens during times when usage is high but I noticed it happen when there were almost no users as well. The line that throws the error:
# excon/lib/excon/socket.rb
# line 100 inside the connection method.
addrinfo = ::Socket.getaddrinfo(*args)
The error happens everywhere in the application. Sometimes the error is seen when there is not a remote connection. - I am no longer able to verify this.
I used Rails loggers to capture the arguments being passed in and there is seemingly no difference between a pass and a fail. Here are some examples:
# PASS
["s3.amazonaws.com", 443, 0, 1, nil, nil, false]
["mybucket.s3.amazonaws.com", 443, 0, 1, nil, nil, false]
# FAIL
["mybucket.s3-us-west-1.amazonaws.com", 443, 0, 1, nil, nil, false]
I came across several forums that lead me to believe an update was needed to the excon gem. I upgraded the Excon gem from 0.45.4 to 0.51.0. In addition to that I also updated the Fog gem from 1.36.0 to 1.38.0.
After upgrading the error went from "getaddrinfo: Name or service not known (SocketError)" to "Excon::Error::Socket: getaddrinfo: No address associated with hostname (SocketError)"
The url captured for a failed response is different than one of the urls that passes. I will look in to this further.
UPDATE:
The dragonfly initializer specifies the same path as the one that fails and because url_host overrides the default functionality I decided to remove it.
# myapp/config/initializers/dragonfly.rb
...
url_host: 'mybucket.s3-us-west-1.amazonaws.com'
This resulted in no change. The same url is still used and is the only one that fails.
getaddrinfo()
is about DNS and it might fail for various reasons (so maybe there is a need to retry). – Inveighsysctl net.ipv4.ip_local_port_range
? – Plataea