Rails caching a paginated collection

Just doing some research on the best way to cache a paginated collection of items. Currently using jbuilder to output JSON and have been playing with various cache_key options.

The best example I've seen uses the latest record's updated_at plus the number of items in the collection:

def cache_key
  pluck("COUNT(*)", "MAX(updated_at)").flatten.map(&:to_i).join("-")
end

defined here: https://gist.github.com/aaronjensen/6062912
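For what it's worth, a sketch of how that relation-level key might plug into a jbuilder view (the view path and the 'v1' prefix are just placeholders):

# app/views/items/index.json.jbuilder (hypothetical path; assumes the gist's Relation#cache_key patch is loaded)
json.cache! ['v1', @items.cache_key] do
  json.array! @items, partial: 'items/item', as: :item
end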

However, this won't work for a paginated collection, where I always have 10 items per page.

Are there any workarounds for this?

Volscian asked 13/2, 2014 at 23:35 Comment(0)

With a paginated collection, you're just getting an array. Any attempt to monkey patch Array to include a cache key would be a bit convoluted. Your best bet is just to use the cache method to generate a key on a collection-by-collection basis.

You can pass plenty of things to the cache method to generate a key. If you always have 10 items per page, I don't think the count is very valuable. However, the page number and the last updated item would be.

cache ["v1/items_list/page-#{params[:page]}", @items.maximum('updated_at')] do

would generate a cache key like

v1/items_list/page-3/20140124164356774568000

With Russian doll caching, you should also cache each item in the list:

# index.html.erb
<%= cache ["v1/items_list/page-#{params[:page]}", @items.maximum('updated_at')] do %>
  <!-- v1/items_list/page-3/20140124164356774568000 -->
  <%= render @items %>
<% end %>

# _item.html.erb
<%= cache ['v1', item] do %>
  <!-- v1/items/15-20140124164356774568000 -->
  <!-- render item -->
<% end %>
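Since you're rendering JSON with jbuilder, roughly the same key structure carries over to a JSON view. This is only a sketch; the file names and rendered fields are illustrative:

# index.json.jbuilder
json.cache! ["v1/items_list/page-#{params[:page]}", @items.maximum('updated_at')] do
  json.array! @items, partial: 'items/item', as: :item
end

# _item.json.jbuilder
json.cache! ['v1', item] do
  json.extract! item, :id, :updated_at  # plus whatever attributes you expose
end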
Epidemiology answered 21/2, 2014 at 14:28 Comment(8)
You might also add a session key, user_id, etc. and hash the key into a digest. - Loaf
What if an item in the middle gets updated? The parent container in index.html.erb won't know about it, right? - Volscian
That's part of Russian doll caching. DHH talks about it here in #5. - Epidemiology
Hi @JacobEvanShreve, what if the item doesn't belong to anything? Like a list of paginated users, for example. - Volscian
That's what @items.max_by(&:updated_at) is for. If any of the users in that particular paginated segment have been updated, the collection will re-render. This does, though, also depend on the type of paginated list. For example, if this is a search based on an infinitely unique string, you may not want to cache the collection, but rather just the individual items. - Epidemiology
It works well only with simple lists. When we have a list of products whose images come from a child model, and we want to expire a product's cache when its image is updated, this approach can take a long piece of code :) - Boisterous
Actually, the count is invaluable: imagine an item being destroyed. That would not change the maximum of updated_at, yet at least one of the pages just became invalid. Have a look at cache_key_for_products on guides.rubyonrails.org/caching_with_rails.html#fragment-caching - Indescribable
@items.max_by(&:updated_at) will fetch the items from the database anyway and will also process them one by one in Ruby. I think it would be better to use @items.maximum('updated_at'), which fetches only the maximum directly from the database. - Ragwort
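Following up on the comments about the count: a minimal helper in the spirit of the cache_key_for_products example from the Rails guides, adapted to a single page (the method and key names here are illustrative, not from the answer):

# app/helpers/items_helper.rb (hypothetical)
def cache_key_for_items_page(items, page)
  count          = items.size
  max_updated_at = items.maximum(:updated_at).try(:utc).try(:to_s, :number)
  "v1/items_list/page-#{page}-#{count}-#{max_updated_at}"
end

# usage in the view: <%= cache cache_key_for_items_page(@items, params[:page]) do %> ... <% end %>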

Caching paginated collections is tricky. The usual trick of using the collection count and max updated_at mostly does not apply here!

As you said, the collection count is a given and therefore mostly useless, unless you allow dynamic per_page values.

The latest updated_at is totally dependent on the sorting of your collection.

Imagine that a new record is added and ends up on page one. This means that one record previously on page 1 now moves to page 2, and one record previously on page 2 now moves to page 3. If the new page 2 record was not updated more recently than the previous max, the cache key stays the same but the collection does not! The same happens when a record is deleted.

Only if you can guarantee that new records always end up on the last page, and that no records are ever deleted, is using the max updated_at a solid way to go.

As a solution, you could include the total record count and the total max updated_at in the cache key, in addition to the page number and the per-page value. This will require extra queries, but it can be worth it, depending on your database configuration and record count.
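A sketch of that idea, assuming you still have access to the unpaginated scope (all names below are illustrative):

# Builds a key from the page and per_page plus the total count and total max updated_at,
# so an insert or delete anywhere in the collection invalidates every page.
def paginated_cache_key(unpaginated_scope, page, per_page)
  total = unpaginated_scope.count
  stamp = unpaginated_scope.maximum(:updated_at).to_i
  "v1/items_list/page-#{page}-per-#{per_page}-#{total}-#{stamp}"
end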

Another solution is using a key that takes into account some reduced form of the actual collection content, for example all record ids.
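For example, a sketch that reduces the ids and timestamps of the records on the page into a digest (again, the names are illustrative):

require 'digest/md5'

# Any change to the page's membership, or to a member's updated_at, changes the key.
def page_content_cache_key(items)
  digest = Digest::MD5.hexdigest(items.map { |i| "#{i.id}-#{i.updated_at.to_i}" }.join('/'))
  "v1/items_list/#{digest}"
end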

If you are using Postgres as your database, this gem might help you, though I've never used it myself: https://github.com/cmer/scope_cache_key

And the Rails 4 fork: https://github.com/joshblour/scope_cache_key

Faux answered 29/6, 2015 at 14:52 Comment(0)
