Rails 3, will_paginate, random, repeating records, Postgres, setseed failure
Asked Answered
C

2

6

I have a database of movies with attributes. I'd like to return a queried batch of those movies in random order to a template with pagination. I'm using will_paginate. I tried the following:

## MoviesController

movies = Movie.get_movies(query_string)   # a method in Movie model that takes in 
                                          # a query_string and fetches movies 
                                          # with user-set params

@movies = movies.order('random()').page(params[:page]).per_page(16)

This works nicely, except movies are repeated from page to page. Solutions to this problem have been posted here and here. Those links explain that, because the random() seed is reset from page to page, there is no consistency and OFFSET is rendered useless. They offer great solutions for MySQL users, as its rand(n) function takes a seed n. Postgres, however, doesn't do this. You have to declare setseed(n) in SELECT before issuing random() in ORDER.

So I tried the postgres way to set the seed:

@movies = movies.select('setseed(.5)').order('random()').page(params[:page]).per_page(16)

Curiously, that returned Movie objects with absolutely no attributes. The following was raised from the template:

ActiveModel::MissingAttributeError in Movies#action

missing attribute: some_movie_attribute

I debugged this and, once the stack reached action_controller/metal, @movies contained:

[#<Movie >, #<Movie >, #<Movie >, #<Movie >, #<Movie >, #<Movie >, #<Movie >, #<Movie >, #<Movie >, #<Movie >, #<Movie >, #<Movie >, #<Movie >, #<Movie >, #<Movie >, #<Movie >]

That number of Movie objects (18) corresponds to the number of movies returned from the query.

I also tried the following to see if setseed(n) was the problem by removing the random order method:

@movies = movies.select('setseed(.5)').page(params[:page]).per_page(16)

This returned the same non-attribute Movie objects as layed out above. So it appears that setseed(n) is indeed the issue.

I tried a couple of workarounds, like:

# MoviesController

movies = Movie.get_movies(query_string)
shuf_movs = movies.shuffle  ## effectively turns shuf_movs into an array

@movies = shuf_movs.paginate(:page => params[:page], :per_page => 16)

That also returned pages with repeating movies. I thought this was because pagination needed the objects to be ordered by something, and you can't set seeds in Array#shuffle. So I tried writing my own randomization code that would temporarily store an id to be ordered by in the Movie objects. (Please excuse the sloppy code):

# Movie model

attr_accessor :rand_id

# MoviesController

movies = get_movies(query_string)
movies_count = movies.count
r = Random.new
nums = []
rand_movs = []
id = 1
while nums.count != movies_count
  num = r.rand(0..movies_count - 1)
  if !(nums.include?(num))
    movie = movies[num]
    movie.rand_id = id
    rand_movs << movie
    nums      << num
    id += 1
  end
end

@movies = rand_movs.sort_by { |a| a.rand_id }.paginate(:page => params[:page], :per_page => 16)

That still produced repeating movies in different pages. At this point I realize will_paginate doesn't take in what you've sorted by before paginate is called. So I tried this:

@movies = rand_movs.paginate(:order => 'rand_id', :page => params[:page], :per_page => 16)

That still repeats records. :order -> 'rand_id' is ignored because :order only matters when you're dealing with ActiveRecord objects, not arrays.

So setseed(n) appears to be my only hope for achieving randomized non-repeating records using will_paginate. Any ideas?

Thanks!

Cilium answered 28/4, 2012 at 21:29 Comment(1)
The select specifies the columns in the query, and you specified none, only the seed. Not sure if it will work, but try this: .select('*, setseed(1)')Puerperal
F
20

Not a postgres person but ... I'd try

Movie.connection.execute "select setseed(0.5)"
Movie.where(...).order('random()').page(params[:page]).per_page(15)

With regards to Array#shuffle not taking a seed, it uses Kernel.rand so you can seed it using Kernel.srand

Faso answered 29/4, 2012 at 8:4 Comment(4)
Thanks, Frederick. This did the trick. Also, thanks for the heads up re: seeding Array#shuffle with Kernel.Cilium
I am trying to achieve the same. This gives me the same random order on each reload, but how do I get a new random order for each user visiting the site?Libretto
You'd want to pass a different seed for each userFaso
We had lots of issues about setseed randomly not returning the same random order. We found that switching to hash function would lead to more predictable results as mentioned in this solution https://mcmap.net/q/1631345/-do-random-sort-with-seed-values-return-same-set-again-even-some-rows-are-deletedShoffner
K
1

try passing an array of fields to select:

@movies = movies.select(['setseed(.5)', 'some_movie_attribute']).order('random()').page(params[:page]).per_page(16)

Some result is using some_movie_attribute, which isn't being selected by your query, and so isn't available. Adding it as one of the select fields should resolve this one.

Karyn answered 29/4, 2012 at 1:37 Comment(2)
Thanks much. I tried movies.select(['setseed(.5)', '*']).order('random()').page(params[:page]).per_page(16) Passing an array like you suggested, but rather with '*' to call all Movie fields (my template calls a bunch of attributes), renders the template (hooray!). But now with the seed set, the records don't order randomly - they're ordered by id no matter what seed is passed in. :-(Cilium
.select, at least in Rails/ActiveRecord 5.0.7.2 and 6.1.4.1 I tested (and probably some earlier versions as well), filters out the usage of setseed, but the approved answer works well (as it uses .execute straightforward)Hyperaemia

© 2022 - 2024 — McMap. All rights reserved.