Elasticsearch, Tire, and Nested queries / associations with ActiveRecord

I'm using ElasticSearch with Tire to index and search some ActiveRecord models, and I've been searching for the "right" way to index and search associations. I haven't found what seems like a best practice for this, so I wanted to ask if anyone has an approach that they think works really well.

As an example setup (this is made up but illustrates the problem), let's say we have a book, with chapters. Each book has a title and author, and a bunch of chapters. Each chapter has text. We want to index the book's fields and the chapters' text so you can search for a book by author, or for any book with certain words in it.

class Book < ActiveRecord::Base
  include Tire::Model::Search
  include Tire::Model::Callbacks

  has_many :chapters

  mapping do
    indexes :title, :analyzer => 'snowball', :boost => 100
    indexes :author, :analyzer => 'snowball'
    indexes :chapters, type: 'object', properties: {
      chapter_text: { type: 'string', analyzer: 'snowball' }
    }
  end
end

class Chapter < ActiveRecord::Base
  belongs_to :book
end

So then I do the search with:

s = Book.search do
  query { string query_string }
end

That doesn't work, even though it seems like that indexing should do it. If instead I index:

indexes :chapters, :as => "chapters.map { |c| c.chapter_text }.join('|')", :analyzer => 'snowball'

That makes the text searchable, but obviously it's not a nice hack and it loses the actual associated object. I've tried variations of the searching, like:

s = Book.search do
  query do
    boolean do
      should { string query_string }
      should { string "chapters.chapter_text:#{query_string}" }
    end
  end
end

With no luck there, either. If anyone has a good, clear example of indexing and searching associated ActiveRecord objects using Tire, it seems like that would be a really good addition to the knowledge base here.

Thanks for any ideas and contributions.

Chellman answered 27/7, 2012 at 17:17 Comment(1)
It took me a bit to test and confirm (got pulled onto a different project) but yes, your answer is great, thanks -- I'm sure it will be helpful to a lot of people just getting started. -- Chellman

Support for ActiveRecord associations in Tire works, but requires a couple of tweaks inside your application. There's no question the library should do a better job here, and in the future it certainly will.

That said, here is a full-fledged example of Tire configuration to work with Rails' associations in elasticsearch: active_record_associations.rb

Let me highlight a couple of things here.

Touching the parent

First, you have to make sure the parent model is notified about changes to its associated records.

Given we have a Chapter model, which “belongs to” a Book, we need to do:

class Chapter < ActiveRecord::Base
  belongs_to :book, touch: true
end

In this way, when we do something like:

book.chapters.create text: "Lorem ipsum...."

The book instance is notified about the added chapter.

Responding to touches

With this part sorted, we need to notify Tire about the change, and update the elasticsearch index accordingly:

class Book < ActiveRecord::Base
  has_many :chapters
  after_touch() { tire.update_index }
end

(There's no question Tire should intercept after_touch notifications by itself, and not force you to do this. It is, on the other hand, a testament to how easy it is to work your way around the library's limitations in a manner which does not hurt your eyes.)
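
To make the chain of events concrete, here is a minimal plain-Ruby sketch with no Rails or Tire involved. The FakeBook and FakeChapter classes and the reindexed counter are hypothetical stand-ins: they only mimic how touch: true forwards a child's save to the parent, and how an after_touch hook would then trigger a reindex.

```ruby
# Hypothetical stand-ins for the ActiveRecord models; nothing here talks
# to a database or to elasticsearch.
class FakeBook
  attr_reader :chapters, :reindexed

  def initialize
    @chapters  = []
    @reindexed = 0
  end

  # Plays the role of ActiveRecord's `touch`, followed by the
  # `after_touch { tire.update_index }` callback.
  def touch
    update_index
  end

  # Stand-in for `tire.update_index`: we only count the calls.
  def update_index
    @reindexed += 1
  end
end

class FakeChapter
  attr_reader :text

  def initialize(book, text)
    @book = book
    @text = text
  end

  # Plays the role of `belongs_to :book, touch: true`: saving the child
  # touches the parent.
  def save
    @book.chapters << self unless @book.chapters.include?(self)
    @book.touch
    true
  end
end

book = FakeBook.new
FakeChapter.new(book, "Lorem ipsum....").save
FakeChapter.new(book, "Dolor sit amet").save

puts book.reindexed      # one reindex per saved chapter
puts book.chapters.size
```

Without the touch step the parent never hears about the new chapter, so its indexed document would go stale until the book itself is saved again.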

Proper JSON serialization in Rails < 3.1

Although the README mentions that you have to disable the automatic addition of a root key to JSON in Rails < 3.1, many people forget it, so include it in the class definition as well:

self.include_root_in_json = false
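
The difference this setting makes can be shown with plain hashes and the standard json library. This is only a sketch of the two JSON shapes, not Rails code, and the attribute values are made up for illustration.

```ruby
require 'json'

# Made-up attributes standing in for a Book record.
attributes = { "id" => 1, "title" => "Sample title" }

# With the root key included, the attributes are wrapped in a key named
# after the model, so `title` ends up one level down, as `book.title`:
with_root = { "book" => attributes }.to_json

# With include_root_in_json = false, the attributes sit at the top level,
# where a mapping for `title` would look for them:
without_root = attributes.to_json

puts with_root
puts without_root
```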

Proper mapping for elasticsearch

Now comes the meat of our work -- defining proper mapping for our documents (models):

mapping do
  indexes :title,      type: 'string', boost: 10, analyzer: 'snowball'
  indexes :created_at, type: 'date'

  indexes :chapters do
    indexes :text, analyzer: 'snowball'
  end
end

Notice we index title with boosting, created_at as "date", and chapter text from the associated model. All the data are effectively “de-normalized” into a single document in elasticsearch (if such a term makes any sense).
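
For orientation, the mapping block should correspond roughly to the following elasticsearch mapping, written out here as a plain Ruby hash. This is a hand-written approximation, not output captured from Tire. It assumes that the document type is named book, that a nested indexes block becomes an object field with properties, and that fields without an explicit type default to string.

```ruby
require 'json'

# Hand-written approximation of the mapping; the `book` type name and the
# defaulted `string` type on `text` are assumptions.
book_mapping = {
  book: {
    properties: {
      title:      { type: 'string', boost: 10, analyzer: 'snowball' },
      created_at: { type: 'date' },
      chapters:   {
        type: 'object',
        properties: {
          text: { type: 'string', analyzer: 'snowball' }
        }
      }
    }
  }
}

puts JSON.pretty_generate(book_mapping)
```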

Proper document JSON serialization

As the last step, we have to properly serialize the document in the elasticsearch index. Notice how we can leverage the convenient to_json method from ActiveRecord:

def to_indexed_json
  to_json( include: { chapters: { only: [:text] } } )
end
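
To see what kind of document this produces, here is a plain-Ruby sketch that imitates the serialization without ActiveRecord. SketchBook and SketchChapter are hypothetical Struct stand-ins, and the hand-built hash only approximates what to_json with the include option would emit for a book with two chapters.

```ruby
require 'json'

# Hypothetical stand-ins for the ActiveRecord models.
SketchChapter = Struct.new(:id, :book_id, :text)
SketchBook    = Struct.new(:id, :title, :chapters) do
  # Imitates to_json(include: { chapters: { only: [:text] } }):
  # the book's own attributes, plus only `text` from each chapter.
  def to_indexed_json
    {
      id:       id,
      title:    title,
      chapters: chapters.map { |c| { text: c.text } }
    }.to_json
  end
end

book = SketchBook.new(1, "Sample title", [
  SketchChapter.new(10, 1, "Lorem ipsum...."),
  SketchChapter.new(11, 1, "Dolor sit amet")
])

puts book.to_indexed_json
```

The chapter ids and foreign keys are left out on purpose: only the fields named in the mapping need to reach the index.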

With all this setup in place, we can search in properties in both the Book and the Chapter parts of our document.
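
As a rough illustration, these are the request bodies such searches come down to once the DSL is stripped away. The sketch builds them by hand with the standard json library rather than through Tire; query_string_body is a hypothetical helper, and the first query relies on the assumption that elasticsearch's default _all field is enabled.

```ruby
require 'json'

# Hypothetical helper: builds the body of a query_string search.
def query_string_body(query)
  { query: { query_string: { query: query } } }
end

# No field given: matches the book's own fields and the de-normalized
# chapter text alike (assuming the default _all field is enabled).
any_field = query_string_body("lorem")

# Field path given: restricts the match to the chapter text.
chapters_only = query_string_body("chapters.text:lorem")

puts any_field.to_json
puts chapters_only.to_json
```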

Please run the active_record_associations.rb Ruby file linked at the beginning to see the full picture.

For further information, please refer to these resources:

See this StackOverflow answer: ElasticSearch & Tire: Using Mapping and to_indexed_json for more information about mapping / to_indexed_json interplay.

See this StackOverflow answer: Index the results of a method in ElasticSearch (Tire + ActiveRecord) to see how to fight n+1 queries when indexing models with associations.

Heisler answered 29/7, 2012 at 17:42 Comment(6)
I had a chance to try this out, and it works; thanks very much for the completeness of your answer (and the great work on Tire as well of course). In the meantime I sort of had it working by creating a method that did a chapters.map() to gather the text and then called that method in to_indexed_json() but that's obviously a bit of a hack. This approach is much cleaner. Thanks again. -- Chellman
I too was looking for a "cleaner" way of doing associations. Thanks for the thoroughness of your answer. The tire gem is very cool. Thanks for the hard work. -- Brookebrooker
That's interesting -- care to elaborate more on the "cleaner" way? Here or at Github Issues, mail, etc? -- Heisler
Is this affected by a many-to-many relationship, using :through => ? I have Resource, Tag, and ResourceTag. So confusing :S. I'm not sure what I should add :touch onto. -- Nonobservance
To be honest, I don't even really need this stuff indexed, I'm just trying to get it working so I don't have to use load: true. -- Nonobservance
@Heisler If a book had many categories, how would I search for books of a particular category (given its id)? Book.search { query { term 'categories.id', 1234 } }? I need some help on this one: #14579892 -- Puffin

I created this as a solution in one of my applications, which indexes a deeply nested set of models:

https://gist.github.com/paulnsorensen/4744475

UPDATE: I have now released a gem that does this: https://github.com/paulnsorensen/lifesaver

Insufflate answered 9/2, 2013 at 7:26 Comment(2)
Is your gem still working? I see that it hasn't been updated in a long time. -- Iover
I haven't touched it in a long time, since elasticsearch released its own rails gem. I would suggest making a specific job for each document you're indexing and explicitly enqueuing the job in your model/service callbacks. -- Insufflate
