rails can't reconnect to postgres database
Asked Answered
P

0

9

I have a rails app (4.1.5) backed by a postgres database (9.2.13) using the pg gem (0.17.1), unicorn server (1.1.0) with 2 worker processes.

The rails app runs jobs using sidekiq (2.17.7)

At some point the postgres db went into recovery mode. The following error was thrown by multiple jobs:

PG::ConnectionBad:FATAL:  the database system is in recovery mode
FATAL:  the database system is in recovery mode

The database came back up but jobs continued to throw the following two errors:

PG::Error:invalid encoding name: utf8
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/postgresql_adapter.rb:908:in `set_client_encoding'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/postgresql_adapter.rb:908:in `configure_connection'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/postgresql_adapter.rb:603:in `reconnect!'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract_adapter.rb:313:in `verify!'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:453:in `block in checkout_and_verify'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activesupport-4.1.5/lib/active_support/callbacks.rb:82:in `run_callbacks'

and

ActiveRecord::ConnectionTimeoutError:could not obtain a database connection within 5.000 seconds (waited 5.000 seconds)
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:190:in `block in wait_poll'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:181:in `loop'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:181:in `wait_poll'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:136:in `block in poll'
/home/ruby/.rvm/rubies/ruby-2.2.2/lib/ruby/2.2.0/monitor.rb:211:in `mon_synchronize'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:146:in `synchronize'
/home/ruby/data42/shared/bundle/ruby/2.2.0/gems/activerecord-4.1.5/lib/active_record/connection_adapters/abstract/connection_pool.rb:134:in `poll'

It looks to me like rails notices the connection isn't active and tries to reset the connection. In active record's postgresql_adapter.rb the following method is getting called:

  # Close then reopen the connection.
  def reconnect!
    super
    @connection.reset
    configure_connection
  end

my guess is that the connection.reset is not actually working, so when the pg gem goes to set the encoding (the first part of the configure_connection method) it masks the fact that there is no connection by throwing the encoding specific error.

The following is a method from the pg gem (.17.1) ext/pg_connection.c/2804

/*
* call-seq:


*    conn.set_client_encoding( encoding )
 *
 * Sets the client encoding to the _encoding_ String.
 */
static VALUE
pgconn_set_client_encoding(VALUE self, VALUE str)
{
    PGconn *conn = pg_get_pgconn( self );

    Check_Type(str, T_STRING);

    if ( (PQsetClientEncoding(conn, StringValuePtr(str))) == -1 ) {
        rb_raise(rb_ePGerror, "invalid encoding name: %s",StringValuePtr(str));
    }

    return Qnil;
}

So if those guesses are correct, why isn't the connection reset working?
Restarting the app re-establishes the connection to the database correctly, but I'm hoping for a non-manual solution to this problem.

Phlebitis answered 29/12, 2015 at 15:45 Comment(11)
Did you tried upgrading pg gem?Thunell
you can increase the timeout time by ActiveRecord::Base.connection.execute('set statement_timeout to 10000') or in database.yml variables: statement_timeout: 1000000Sikorski
@Thunell I haven't tried upgrading yet. I was going to reserve that as more of a last ditch effort. As far as I can tell the implementation of reset hasn't changed in more recent versions.Phlebitis
i would recommend opening an issue on the rails issue tracker. especially if you can provide a minimal reproducible scenario.Bestead
It doesn't look like a problem with Rails. You should try to figure out why your database switched to recovery mode.Valerievalerio
I have the same error. Any news on this one?Pilewort
If you are hosting the postgres server, look at: dba.stackexchange.com/questions/61650/…. It's a response about postgres databses stucked in recovery mode. Hope it helps.Otis
Seems like we could use some more info. Were you able to connect to the database outside of rails? There are plenty of folks who have a setup like yours - this seems like it probably is not an issue with rails or postgres - probably something else to do with your app or usage. Leaking connections?Amicable
recovery mode means it's a PG issue. Try this : dba.stackexchange.com/questions/61650/…Suzisuzie
This looks like pg gem bug. We experience PostgreSQL disconnection problems a couple times a month and it always recover automatically. I suggest upgrading pg gem, I know this was suggested before but that was 4 years ago.Adelina
According to this fix bitbucket.org/ged/ruby-pg/commits/a7f98864a7d6 it could be encoding or connection problem. Not the encoding only.Okra

© 2022 - 2024 — McMap. All rights reserved.