Updating a Haystack search index with Django + Celery

4

31

In my Django project I am using Celery. I switched a command over from crontab to a periodic task and it works well, but it only calls a method on a model. Is it possible to update my Haystack index from a periodic task as well? Has anyone done this?

./manage.py update_index

That's the command to update the index from the Haystack documentation, but I'm not sure how to call it from a task.

Klapp answered 5/12, 2010 at 12:28 Comment(4)
Management commands should optimally just be a thin wrapper around a public API, but sadly that doesn't seem to be the case here; github.com/toastdriven/django-haystack/blob/master/haystack/… - Allpowerful
As a hack you can use django.core.management.call_command("update_index"), but I would rather copy and paste the code linked above so it works independently. - Allpowerful
@asksol, thanks for the reply. Why is call_command considered a hack? It seems simpler to do that than to copy and paste that whole command. - Klapp
Ah, not the whole command. Just the part that actually does the indexing, without the Django command stuff. - Allpowerful
30

The easiest way to do this is probably to run the management command directly from Python inside your task:

from haystack.management.commands import update_index
update_index.Command().handle()
Environ answered 7/12, 2010 at 0:34 Comment(8)
This worked. Could you explain why this is better than using django.core.management.call_command("update_index")? - Klapp
You're right, that should work just as well. I didn't know about that function :) - Environ
I've found that you need to import app.search_indexes somewhere or it won't work. The models must be registered or they will be skipped. - Sherasherar
This no longer works on the 2.0.0 beta. It raises "ImproperlyConfigured: The key 'None' isn't an available connection." - Sennit
You can pass command parameters like so: update_index.Command().handle(age=1) - Hannahhannan
django.core.management.call_command("update_index") throws an IO error for me in prod. However, this command works perfectly! - Hallmark
@Hannahhannan What does age=1 mean: is that 1 day or 1 hour? How do I pass a "from" date? - Polyclinic
My rebuild_index asks a yes/no question when run in the terminal. - Maybellemayberry
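Pulling the suggestions in this thread together, a periodic task body could look roughly like the sketch below. It is only a sketch under assumptions: the function name update_search_index and the runner parameter are hypothetical (runner exists so the function can be exercised without a configured Django project). On the age=1 question above: Haystack's documentation describes --age as the number of hours back to consider objects new.

```python
def update_search_index(age=None, remove=False, runner=None):
    """Run Haystack's update_index command and return the options used.

    age    -- hours back to consider objects new (Haystack's --age option)
    remove -- also drop index entries whose objects no longer exist
    runner -- hypothetical injection point; defaults to Django's call_command
    """
    if runner is None:
        # Imported lazily so this module can be imported without Django set up.
        from django.core.management import call_command
        runner = call_command

    options = {"remove": remove}
    if age is not None:
        options["age"] = age

    # Equivalent to: ./manage.py update_index [--age N] [--remove]
    runner("update_index", **options)
    return options
```

In a Celery project this function would typically be registered as a task (for example with celery.shared_task) and scheduled as a periodic task; that wiring is left out here because it depends on the project's Celery setup.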
12

As of version 2.0.0 beta of Haystack, this code should work:

from haystack.management.commands import update_index
update_index.Command().handle(using='default')
Vlada answered 29/6, 2012 at 11:8 Comment(2)
You can also use the 'remove' option there to remove non-existing entries: update_index.Command().handle(using='default', remove=True) - Uhf
As of Haystack 2.1.0, using accepts a list of backends, so it should be handle(using=['default']) - Missus
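The version differences described in this answer and its comments can be captured in a small helper. This is a hedged sketch: the function name update_index_options is hypothetical, and the installed Haystack version is passed in as a tuple because how the package exposes its version is not covered in this thread.

```python
def update_index_options(haystack_version, alias="default", remove=False):
    """Build keyword options for update_index.Command().handle().

    haystack_version -- tuple such as (2, 0, 0) or (2, 1, 0); supplied by the
                        caller (an assumption, see the note above)
    alias            -- connection alias to index into
    remove           -- also remove entries for objects that no longer exist
    """
    if haystack_version >= (2, 1, 0):
        # Per the comment above: 2.1.0 and later expect a list of backends.
        using = [alias]
    else:
        # Per the answer above: the 2.0.0 beta expects a single alias.
        using = alias
    return {"using": using, "remove": remove}
```

The result would then be passed along the lines of update_index.Command().handle(**update_index_options((2, 1, 0))).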
7

https://github.com/django-haystack/celery-haystack

I find this package to be a great, easy plug-in app to provide haystack indexing via celery. I used it in a few projects.

Reynoso answered 29/8, 2012 at 1:49 Comment(0)
7

Also, since version 2 of Haystack you can rebuild the index from Python as follows:

from haystack.management.commands import rebuild_index
rebuild_index.Command().handle(interactive=False)

Passing interactive=False prevents Haystack from asking whether you really want to rebuild the index. It is equivalent to the --noinput command-line option.

If you use Xapian as the full-text search backend, remember that concurrent updates to the index result in a database write lock. The celery-haystack package attempts to spread index updates across multiple workers (multiple threads), which triggers that lock with Xapian.

Dempsey answered 6/7, 2014 at 7:54 Comment(0)
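One way to keep index writers from colliding is to let only one of them run at a time. The sketch below is an assumption-laden illustration, not something from Haystack or celery-haystack: it uses an advisory file lock (fcntl.flock), so it only works on Unix-like systems and only among workers on the same host, and the names index_write_lock, run_exclusively, and DEFAULT_LOCK_PATH are hypothetical.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock-file location; any path shared by the workers on one host works.
DEFAULT_LOCK_PATH = os.path.join(tempfile.gettempdir(), "haystack_index.lock")


@contextmanager
def index_write_lock(path=DEFAULT_LOCK_PATH):
    """Yield True if this caller holds the exclusive lock, False otherwise."""
    handle = open(path, "w")
    try:
        try:
            # Non-blocking: fail immediately if another writer holds the lock.
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            acquired = False
        else:
            acquired = True
        try:
            yield acquired
        finally:
            if acquired:
                fcntl.flock(handle, fcntl.LOCK_UN)
    finally:
        handle.close()


def run_exclusively(update, path=DEFAULT_LOCK_PATH):
    """Call update() only if no other index writer is active; report whether it ran."""
    with index_write_lock(path) as acquired:
        if acquired:
            update()
        return acquired
```

A task could then wrap the rebuild from this answer, for example run_exclusively(lambda: rebuild_index.Command().handle(interactive=False)), and simply skip the run when the function returns False.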
