Using main and delta indexes in sphinx

I'm switching full-text searching on my site to Sphinx. I'm going to be using SphinxSE to perform the searches.

I created 2 indexes, as specified in the manual: http://www.sphinxsearch.com/docs/manual-0.9.9.html#live-updates

It seems to work, and each index holds different content, but I'm somewhat confused about how I should handle index updating, merging, and rebuilding.

The way I understand it, I cron "indexer delta --rotate" to run every 5 minutes or so, which adds new submissions to the index. Then, once a day, I merge the delta index into the main index by running "indexer main delta --rotate". Then, once a month or so, I run "indexer --all" to rebuild all indexes.

Am I doing this right, or am I missing something?

Bendicta answered 2/10, 2010 at 23:15 Comment(4)
For the record - that's pretty much my setup, all via cron. +1 for asking though, as I've been sketched on how it is currently running. Let's hear those best practices! - Versatile
It's just that each time you run any of those commands, wouldn't the search stop working while it's running? - Bendicta
Well, in my case, indexer --all --rotate --config /path/to/sphinx.conf executes in 0.024 seconds (75k docs per second, running 5 indexes for 4 domains). If my indexes grow significantly, I'd have a problem. - Versatile
This question is tagged incorrectly. thinking-sphinx should just be sphinx. - Cleliaclellan

--rotate just builds the new index alongside the existing one (so you need the extra disk space), then signals searchd to switch over to it once it's done.

As for the delta, you need a pre-query (sql_query_pre) to compute the "limit", max(id): main indexes the ids up to that limit, and delta indexes the ids above it.

If you have a timestamp column (indexed, if possible), you can use that instead:

main -> where timefile < today()
delta -> where timefile >= today()
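
A minimal sketch of the max(id) variant, adapted from the live-updates example in the Sphinx manual linked in the question. The sph_counter helper table (columns counter_id and max_doc_id), the documents table and its columns, and the /path/to/... index paths are placeholders for illustration, not something taken from the asker's setup:

```
source main
{
    # ... database connection settings go here
    sql_query_pre = SET NAMES utf8
    # Remember the highest id that main is about to cover.
    sql_query_pre = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM documents
    sql_query     = SELECT id, title, body FROM documents \
        WHERE id <= ( SELECT max_doc_id FROM sph_counter WHERE counter_id=1 )
}

source delta : main
{
    # Overrides the inherited pre-queries, so indexing delta
    # does not move the stored limit.
    sql_query_pre = SET NAMES utf8
    sql_query     = SELECT id, title, body FROM documents \
        WHERE id > ( SELECT max_doc_id FROM sph_counter WHERE counter_id=1 )
}

index main
{
    source = main
    path   = /path/to/main
}

index delta : main
{
    source = delta
    path   = /path/to/delta
}
```

The point to watch is that delta redefines sql_query_pre: if it inherited the REPLACE statement from main, every delta run would raise the limit and delta would end up with nothing to index.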

Washin answered 5/10, 2010 at 19:44 Comment(0)

Sounds pretty much like the setup I did for a customer. And no, the search won't stop working during updates. From the Sphinx docs:

--rotate is used for rotating indexes. Unless you have the situation where you can take the search function offline without troubling users, you will almost certainly need to keep search running whilst indexing new documents. --rotate creates a second index, parallel to the first (in the same place, simply including .new in the filenames). Once complete, indexer notifies searchd via sending the SIGHUP signal, and searchd will attempt to rename the indexes (renaming the existing ones to include .old and renaming the .new to replace them), and then start serving from the newer files. Depending on the setting of seamless_rotate, there may be a slight delay in being able to search the newer indexes.
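
As a rough sketch, the schedule from the question could look like this in a crontab. The times and the /path/to/sphinx.conf path are placeholders, and the index names main and delta are the ones from the question. Note that merging goes through indexer's --merge switch, with the destination index first and the source index second:

```
# Every 5 minutes: rebuild delta and let searchd pick up the new files.
*/5 * * * *  indexer --config /path/to/sphinx.conf --rotate delta

# Once a day (03:32 here, an arbitrary time): merge delta into main.
32 3 * * *   indexer --config /path/to/sphinx.conf --rotate --merge main delta

# On the 1st of each month (04:32, also arbitrary): rebuild all indexes.
32 4 1 * *   indexer --config /path/to/sphinx.conf --rotate --all
```

This leaves out any locking, so a slow delta run could still overlap with the merge or the full rebuild. If indexing ever takes more than a few seconds, it is probably worth wrapping the jobs in a lock or spacing them further apart.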

Subcontraoctave answered 3/10, 2010 at 0:6 Comment(0)
