Apache Solr: Slave replicates 10+ times every time it polls (excessive commits?)

We're using Apache Solr (3.1.0) to index a large number of articles written for multiple sites. We have a master/slave setup (replication config at the bottom), where server 1 indexes the articles and server 2 replicates the index. The slave should poll the master every 60 seconds, but instead we see anywhere from 10 to 75 consecutive /replication calls nearly every time it polls.

Each Solr core (${solr.core.name} in the slave config) represents a different site. The /replication calls I see most often are tied to the biggest site. One of the cores only got 1 call per minute, and I was able to reproduce the behaviour there by calling update?commit=true a few times, which leads me to think it's related to the number of commits the master performs.

So my question is, how do I stop the Solr slave from replicating the index dozens of times and force it to replicate just once per minute? I've tried playing with the commitReserveDuration parameter in the master config, but I don't really see any difference.

master replication config:

 <requestHandler name="/replication" class="solr.ReplicationHandler" >
   <lst name="master">
     <str name="replicateAfter">commit</str>
     <str name="replicateAfter">startup</str>
   </lst>
 </requestHandler>

slave replication config:

 <requestHandler name="/replication" class="solr.ReplicationHandler" >
   <lst name="slave">
     <str name="masterUrl">http://${solr.master.server}/search/${solr.core.name}/replication</str>
     <str name="pollInterval">00:00:60</str>
   </lst>
 </requestHandler>
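
One way to test the theory that commits on the master drive the extra calls is to sample the master's replicable index version over one poll interval and count how often it changes. This is only a minimal sketch: it assumes the ReplicationHandler answers `command=indexversion` with `wt=json` and returns `indexversion` and `generation` fields, and the helper names and any host or core name passed in are hypothetical.

```python
import json
import time
import urllib.request


def indexversion_url(core_url):
    """Build the ReplicationHandler URL that reports the replicable index version."""
    return core_url.rstrip("/") + "/replication?command=indexversion&wt=json"


def parse_indexversion(body):
    """Extract (indexversion, generation) from a JSON response body (assumed field names)."""
    data = json.loads(body)
    return data["indexversion"], data["generation"]


def count_changes(samples):
    """Count how many times consecutive samples differ.

    Several commits that land between two samples are counted as one change.
    """
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


def sample_master(core_url, seconds=60, step=1.0):
    """Query the master every `step` seconds for `seconds` seconds and return the samples."""
    samples = []
    deadline = time.time() + seconds
    while time.time() < deadline:
        with urllib.request.urlopen(indexversion_url(core_url)) as resp:
            samples.append(parse_indexversion(resp.read().decode("utf-8")))
        time.sleep(step)
    return samples
```

For example, `count_changes(sample_master("http://master.example.com/search/bigsite"))` (with a made-up host and core) gives a rough count of new commit points seen within one 60-second poll interval.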
Metaphysical answered 26/2, 2016 at 13:53 Comment(7)
I would try to disable pollInterval (specify no pollInterval) and execute replication by api call triggered by a cron job. Does it help? wiki.apache.org/solr/…Antler
Thanks for the reply. I tried this and calling /replication?command=fetchindex once triggers a lot of /replication calls on the master... I don't see any difference between this and keeping the pollInterval in the config. To be honest, this could be perfectly normal behaviour, but I just can't find any docs describing it.Metaphysical
That was just an idea to track the problem. Sorry, I can't help you further.Antler
@Gurpreet Singh Where can I check this? I haven't seen the amount of commits anywhere yet.Metaphysical
I am facing the same issue in Solr 6.6. While a full import is running, replication happens on every poll before the commit (or possibly because of auto commit, I'm not sure), and select?q=*:* returns different data from the master and the slaves until the import finishes. It works fine when I disable polling. Does anyone have a clue?Heartrending
I left this job a few months after I asked this question and I never found the answer. Can't help you here, sorry.Metaphysical
Is just disabling polling an option? Since you're saying it works fine then.Metaphysical
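
It may also be worth checking whether the burst is many replications or one replication made up of many requests. The ReplicationHandler serves several commands on the same /replication path (indexversion, filelist and filecontent among them), and a slave downloads each index file with its own filecontent request. Below is a rough sketch that tallies the calls by command. It assumes the master's Solr log prints request lines in the `path=/replication params={...}` form; the function name and the exact log layout are assumptions for illustration.

```python
import re
from collections import Counter
from urllib.parse import parse_qs

# Assumed Solr request-log fragment: path=/replication params={command=...&wt=...}
REPLICATION_RE = re.compile(r"path=/replication params=\{([^}]*)\}")


def tally_replication_commands(log_lines):
    """Count /replication requests per `command` parameter."""
    counts = Counter()
    for line in log_lines:
        match = REPLICATION_RE.search(line)
        if not match:
            continue  # not a /replication request
        params = parse_qs(match.group(1))
        command = params.get("command", ["(none)"])[0]
        counts[command] += 1
    return counts
```

If most of the calls in a burst turn out to be filecontent, the burst may be a single index fetch downloading many files rather than dozens of separate replications.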

In the config you set replicateAfter to commit, so if your code issues commits very frequently, each of them will trigger replication. I would suggest changing replicateAfter to optimize instead of commit. This should solve your problem. Here is the link which gives more details on the replicateAfter settings.

Rosariarosario answered 18/3, 2016 at 10:15 Comment(1)
Thanks for your comment. When I change commit to optimize, the slave appears to be out of sync with the master for 5+ minutes, which is way too long. Thanks though, I'll try to find a way to call the optimization from the code and see if that works.Metaphysical
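
For calling the optimization from code, a minimal sketch follows. It assumes the core's update handler lives at /update under the same base URL used for update?commit=true, and that it accepts optimize=true as a request parameter. The helper names, the timeout and any host or core name are hypothetical.

```python
import urllib.request


def optimize_url(core_url):
    """Build the update URL that asks the core to optimize its index."""
    return core_url.rstrip("/") + "/update?optimize=true"


def optimize_core(core_url, timeout=600):
    """Send the optimize request and return the HTTP status code.

    Optimizing a large index can take a while, hence the generous timeout.
    """
    with urllib.request.urlopen(optimize_url(core_url), timeout=timeout) as resp:
        return resp.status
```

With replicateAfter set to optimize, calling something like `optimize_core("http://master.example.com/search/bigsite")` on a schedule would decide when a new index version becomes available to the slave, so the interval between calls bounds how far the slave can lag behind.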
