MySql Replication - slave lagging behind master
Asked Answered
S

10

13

I have a master/slave replication on my MySql DB.

my slave DB was down for a few hours and is back up again (master was up all the time), when issuing show slave status I can see that the slave is X seconds behind the master.

the problem is that the slave dont seem to catch up with the master, the X seconds behind master dont seem to drop...

any ideas on how I can help the slave catch up?

Suppliant answered 17/12, 2011 at 20:50 Comment(8)
you have locking tables?Cabalism
eventually the slave will catch up, unless you have tons of queries like updates and inserts on the master. do you have a lot of queries coming from the server?Cabalism
the master was busy with inserts/updates while the slave was down, currently the diff between the two gets bigger...Suppliant
is the master still inserting/updating records? how many seconds the replication is behind?Cabalism
the master is still "working hard" and the slave is about 7000 seconds behindSuppliant
if the master is working hard the slave will have hard time to catch up because the binlog will send evens that the slave will not process fast enough, can you slow down the load on the master?Cabalism
do you think restarting the slave will help free it up?Suppliant
here's a link that might help you dba.stackexchange.com/questions/8236/…Cabalism
F
20

Here is an idea

In order for you to know that MySQL is fully processing the SQL from the relay logs. Try the following:

STOP SLAVE IO_THREAD;

This will stop replication from downloading new entries from the master into its relay logs.

The other thread, known as the SQL thread, will continue processing the SQL statements it downloaded from the master.

When you run SHOW SLAVE STATUS\G, keep your eye on Exec_Master_Log_Pos. Run SHOW SLAVE STATUS\G again. If Exec_Master_Log_Pos does not move after a minute, you can go ahead run START SLAVE IO_THREAD;. This may reduce the number of Seconds_Behind_Master.

Other than that, there is really nothing you can do except to:

  • Trust Replication
  • Monitor Seconds_Behind_Master
  • Monitor Exec_Master_Log_Pos
  • Run SHOW PROCESSLIST;, take note of the SQL thread to see if it is processing long running queries.

BTW Keep in mind that when you run SHOW PROCESSLIST; with replication running, there should be two DB Connections whose user name is system user. One of those DB Connections will have the current SQL statement being processed by replication. As long as a different SQL statement is visible each time you run SHOW PROCESSLIST;, you can trust mysql is still replicating properly.

Forspent answered 17/12, 2011 at 22:44 Comment(2)
Kind of weird but stopping the threads didn't help me, instead monitoring Exec_Master_Log_Pos and the two conections from system user allow me to not freak out. After restarting the slave, everything comes to normal. Thanks Rolando.Redford
since this is received as accepted answer but unfortunately this solution is not worked for me even after restarting the slave. please check my answer that seems could help someone with this issue and that's worked for me.Hubie
T
7

What binary log format are you using ? Are you using ROW or STATEMENT ?

SHOW GLOBAL VARIABLES LIKE 'binlog_format';

If you are using ROW as a binlog format make sure that all your tables has Primary or Unique Key:

SELECT t.table_schema,t.table_name,engine
FROM information_schema.tables t
INNER JOIN information_schema .columns c
on t.table_schema=c.table_schema
and t.table_name=c.table_name
and t.table_schema not in ('performance_schema','information_schema','mysql')
GROUP BY t.table_schema,t.table_name
HAVING sum(if(column_key in ('PRI','UNI'), 1,0)) =0;

If you execute e.g. one delete statement on the master to delete 1 million records on a table without a PK or unique key then only one full table scan will take place on the master's side, which is not the case on the slave.

When ROW binlog_format is being used, MySQL writes the rows changes to the binary logs (not as a statement like STATEMENT binlog_format) and that change will be applied on the slave's side row by row, which means a 1 million full table scan will take place on the slave's to reflect only one delete statement on the master and that is causing slave lagging problem.

Tat answered 4/7, 2014 at 15:44 Comment(0)
U
3

"seconds behind" isn't a very good tool to find out how much behind the master you really is. What it says is "the query I just executed was executed X seconds ago on the master". That doesn't mean that you will catch up and be right behind the master the next second.

If your slave is normally not lagging behind and the work load on the master is roughly constant you will catch up, but it might take some time, it might even take "forever" if the slave is normally just barely keeping up with the master. Slaves operate on one single thread so it is by design much slower than the master, also if there are some queries that take a while on the master they will block replication while running on the slave.

Uncircumcised answered 17/12, 2011 at 21:31 Comment(0)
K
1

Just check if you have same time and timezones on both the servers, i.e., Master as well as Slave.

Kiosk answered 2/8, 2013 at 11:33 Comment(0)
D
1

If you are using INNODB tables, check that you have innodb_flush_log_at_trx_commit to a value different that 0 at SLAVE.

http://dev.mysql.com/doc/refman/4.1/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit

Denny answered 22/8, 2014 at 8:25 Comment(0)
B
0

We had exactly the same issue after setting up our slave from a recent backup.

We had changed the configuration of our slave to be more crash-safe:

sync_binlog = 1
sync_master_info = 1
relay_log_info_repository = TABLE
relay_log_recovery = 1

I think that especially the sync_binlog = 1 causes the problem, as the specs of this slave is not so fast as in the master. This config option forces the slave to store every transaction in the binary lo before they are executed (instead of the default every 10k transactions).

After disabling these config options again to their default values I see that the slave is catching up again.

Ballesteros answered 24/1, 2016 at 8:44 Comment(0)
V
0

Just to add the findings in my similar case.

There were few bulk temporary table insert/update/delete were happening in master which occupied most of the space from relay log in slave. And in Mysql 5.5, since being single threaded, CPU was always in 100% and took lot of time to process these records.

All I did was to add these line in mysql cnf file

replicate-ignore-table=<dbname>.<temptablename1>
replicate-ignore-table=<dbname>.<temptablename2>

and everything became smooth again.

Inorder to figure out which tables are taking more space in relay log, try the following command and then open in a text editor. You may get some hints

cd /var/lib/mysql
mysqlbinlog relay-bin.000010 > /root/RelayQueries.txt
less /root/RelayQueries.txt
Verdha answered 21/6, 2016 at 18:16 Comment(0)
D
0

If u have multiple schema's consider using multi threaded slave replication.This is relatively new feature.

This can be done dynamically without stopping server.Just stop the slave sql thread.

STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_threads = 4;
START SLAVE SQL_THREAD;
Dugald answered 21/11, 2017 at 15:9 Comment(2)
Is this applicable in MariaDB or in Oracle MySQL as well ?Frankfrankalmoign
multi-threaded replication appears to have been added in version 5.6Eccles
H
0

I have an issue similar to this. and both of my MySQL server hosted on AWS EC2 (master and replication). by increasing EBS disk size (which automatically increased IOPS) for MySQL slave server, its turned out the solution for me. R/W Throughput and bandwidth is increased R/W latency were decreased.

 R/W Throughput and bandwidth is increased R/W latency were decreased, lagging behind master, Seconds_Behind_Master increased

now my MySQL database replication is catching up to the master. and Seconds_Behind_Master was decreased (it was got increased from day to day).

so if you have MySQL hosted on EC2. I suggest you tried to increase EBS disk size or its IOPS on the slave.

Hubie answered 18/2, 2020 at 23:36 Comment(0)
I
0

I know it's been a while since OP asked but it would have helped me to read the following answer.

In /etc/mysql/mysql.cnf :

    [mysql]
    disable_log_bin
    innodb_flush_log_at_trx_commit=2
    innodb_doublewrite = 0
    sync_binlog=0

disable_log_bin REALLY carried the trick for me.

Inhume answered 4/4, 2022 at 13:2 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.