Rails: Avoiding duplication errors in Factory Girl...am I doing it wrong?

Suppose I have a User model with a uniqueness constraint on the email field.

If I call Factory(:user) once all is well, but if I call it a second time it'll fail with an "entry already exists" error.

I'm currently using a simple helper to search for an existing entry in the DB before creating the factory...and calling any factory I make through that helper.

It works, but it's not entirely elegant, and considering how common I assume this problem must be, I'm guessing there's a better solution. So, is there a built-in way in Factory Girl to return_or_create a record, instead of just charging ahead with create()? If not, how do most folk avoid duplicate entries with their factories?

Lund answered 16/8, 2011 at 0:5 Comment(2)
I'm having this problem too. Did you add a sequence to the email field so that, in theory, it changes each time you call Factory(:user)? I have that in place and am still running into the problem you have. - Class
I had the same issue. I noticed that FactoryGirl had left some bad data in my test database from an earlier test that had failed miserably and maybe thrown an exception (possibly skipping cleanup). I fixed it by running, in order: RAILS_ENV=test bin/rake db:drop; RAILS_ENV=test bin/rake db:create; RAILS_ENV=test bin/rake db:migrate. This cleared out all the old data. Hope this helps, @Class. - Martinet

Simple answer: use factory.sequence

If you have a field that needs to be unique you can add a sequence in factory_girl to ensure that it is never the same:

Factory.define :user do |user|
  user.sequence(:email) { |n| "user#{n}@factory.com" }
  user.password { "secret" }
end

This will increment n each time in order to produce a unique email address such as user1@factory.com. (See https://github.com/thoughtbot/factory_girl/wiki/Usage for more info.)
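
As a rough mental model, a sequence is a counter that hands n to a block and then increments. The sketch below is plain Ruby, not factory_girl's actual implementation, and the SimpleSequence class is hypothetical. Because the counter lives in memory, it starts over whenever the process restarts:

```ruby
# Hypothetical stand-in for a factory sequence (not factory_girl's own code).
class SimpleSequence
  def initialize(start = 1, &formatter)
    @value = start
    @formatter = formatter
  end

  # Returns the formatted value for the current n, then advances n.
  def next
    result = @formatter.call(@value)
    @value += 1
    result
  end
end

email_sequence = SimpleSequence.new { |n| "user#{n}@factory.com" }
first_email = email_sequence.next   # => "user1@factory.com"
second_email = email_sequence.next  # => "user2@factory.com"
```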

However this isn't always great in Rails.env.development...

Over time I have found that this is not actually the most useful way to create unique email addresses. The sequence is always unique within your test environment, but not within your development environment, because n resets every time you restart the environment. In :test this isn't a problem because the database is wiped, but in :development you tend to keep the same data for a while.

You then get collisions and find yourself having to manually override the email with something you know is unique, which is annoying.

Often more useful: use a random number

Since I regularly call u = Factory :user from the console, I generate a random number instead. You're not guaranteed to avoid collisions, but in practice they hardly ever happen:

Factory.define :user do |user|
  user.email { "user_#{Random.rand(100_000_000)}@factory.com" }
  user.password { "secret" }
end

N.B. You have to use Random.rand rather than rand() because of a collision (bug?) in FactoryGirl (see https://github.com/thoughtbot/factory_girl/issues/219).

This frees you to create users at will from the command line regardless of whether there are already factory generated users in the database.
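
For a rough sense of how rare collisions are, the chance of at least one can be estimated with the usual birthday-problem calculation. This is a standalone sketch: collision_probability is a hypothetical helper, and it assumes every email number is drawn uniformly at random from the same range.

```ruby
# Probability that at least two of `draws` values collide when each is
# picked uniformly at random from `range` possibilities.
def collision_probability(draws, range)
  return 1.0 if draws > range # pigeonhole: a repeat is unavoidable

  # Chance that every draw misses all of the earlier ones.
  no_collision = (0...draws).reduce(1.0) do |probability, taken|
    probability * (range - taken) / range.to_f
  end
  1.0 - no_collision
end

thousand_users_large_range = collision_probability(1_000, 100_000_000)
fifty_users_small_range = collision_probability(50, 1_000)
```

With a range of 100,000,000 the risk stays well under 1% for a thousand users. With a range of only 1,000, a collision becomes more likely than not within a few dozen users.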

Optional extra for making email testing easier

When you get into email testing you often want to verify that an action by a particular user triggered an email to another user.

You log in as Robin Hood, send an email to Maid Marion and then go to your inbox to verify it. What you see in your inbox is something from an address like user_<random number>@factory.com. Who the hell is that?

You need to go back to your database to check whether the email was sent / received by whomever you expected it to be. Again this is a bit of a pain.

What I like to do instead is to generate the email using the name of the Factory user combined with a random number. This makes it far easier to check who things are coming from (and also makes collisions vanishingly unlikely). Using the Faker gem (http://faker.rubyforge.org/) to create the names we get:

Factory.define :user do |user|
  user.first_name { Faker::Name.first_name }
  user.last_name { Faker::Name.last_name }
  user.email { |u| "#{u.first_name}_#{u.last_name}_#{Random.rand(1000)}@factory.com" }
end

Finally, since Faker sometimes generates names that aren't email-friendly (Mike O'Donnell), we need to whitelist acceptable characters: .gsub(/[^a-zA-Z0-9]/, '')

Factory.define :user do |user|
  user.first_name { Faker::Name.first_name }
  user.last_name { Faker::Name.last_name }
  # Strip everything except ASCII letters and digits from each name part
  user.email { |u| "#{u.first_name.gsub(/[^a-zA-Z0-9]/, '')}_#{u.last_name.gsub(/[^a-zA-Z0-9]/, '')}_#{Random.rand(1000)}@factory.com" }
end

This gives us personable but unique emails of the form Firstname_Lastname_123@factory.com.
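
The name clean-up can be tried on its own without Faker or a factory. A minimal sketch, assuming a hypothetical factory_email helper that is not part of factory_girl or Faker; the suffix parameter is only there so the result can be pinned down in an example:

```ruby
# Hypothetical helper, not part of factory_girl or Faker.
def factory_email(first_name, last_name, suffix = Random.rand(1000))
  # Keep only ASCII letters and digits from each name part.
  clean = [first_name, last_name].map { |part| part.gsub(/[^a-zA-Z0-9]/, "") }
  "#{clean.join('_')}_#{suffix}@factory.com"
end

email = factory_email("Mike", "O'Donnell", 42)
# => "Mike_ODonnell_42@factory.com"
```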

Saltcellar answered 17/9, 2011 at 12:34 Comment(8)
...or just use Faker::Internet.email for the e-mail address. - Panek
However, that has the drawback that the e-mail address may not be like the name. I see now what you're trying to do here. Also, ffaker is faster and better-behaved than classic Faker, FWIW. - Panek
Then use Faker::Internet.email("#{first_name} #{last_name}") to have the email match the name. - Volteface
I just wanted to emphasize that passing a block, as opposed to a plain value, is necessary to avoid one-time generation of model attributes when the factory is defined. For instance, in user.email {"user_#{rand(1000).to_s}@factory.com" }; user.password{ "secret" } the {} around the email are necessary to ensure a new random string gets generated every time the factory builds a user. Without the braces the same random string would keep getting reused. OTOH the braces are not needed around the password attribute. - Reproduction
Using user.email {"user_#{rand(1000).to_s}@factory.com" } will throw an error: undefined method `rand='. Instead, use: user.email {"user_#{Random.rand(1000)}@factory.com" }. Also, #{} is string interpolation, so the .to_s in the example is superfluous. - Goggles
@MarnenLaibow-Koser These days ffaker isn't faster. From their readme: "Since those days faker has also been rewritten and the "speed" factor is probably irrelevant now." - Aniela
There is a drawback to the random number approach: it doesn't actually guarantee uniqueness. I know it's 1 in a thousand in this case, but if you have thousands of tests, it could occasionally collide. You can use a bigger random number to make the probability vanishingly small, but I've taken to just appending a timestamp (Time.now.to_f) on strings that need to be unique. Simple, ugly, and it just works. - Altocumulus
About "Mike O'Donnell" supposedly not being e-mail-friendly: the apostrophe is a valid character in e-mail addresses (see RFC 5322), and I know people whose e-mail addresses include them. - Panek

Here's what I do to force the 'n' in my factory girl sequence to be the same as that object's id, and thereby avoid collisions:

First, I define a method that finds what the next id should be in app/models/user.rb:

def self.next_id
  # One past the highest existing id, or 1 when the table is empty
  (maximum(:id) || 0) + 1
end

Then I call User.next_id from spec/factories.rb to start the sequence:

factory :user do
  association(:demo)
  association(:location)
  password  "password"
  sequence(:email, User.next_id) {|n| "darth_#{n}@sunni.ru" }
end
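
One thing to keep in mind with this approach: the second argument to sequence is an ordinary method argument, so User.next_id is evaluated once, when the factory definition is loaded. The plain-Ruby sketch below illustrates that without Rails; existing_ids is a made-up stand-in for the ids already in the users table.

```ruby
# Made-up stand-in for the ids already present in the users table.
existing_ids = [1, 2, 7]

# Same rule as User.next_id: one past the highest id, or 1 if there are none.
start = (existing_ids.max || 0) + 1

# The counter is seeded once and then only counts upward in memory.
counter = start
emails = Array.new(3) do
  email = "darth_#{counter}@sunni.ru"
  counter += 1
  email
end
# => ["darth_8@sunni.ru", "darth_9@sunni.ru", "darth_10@sunni.ru"]
```

Rows added to the table by other means after the factories load are not reflected in the counter, so n and the record id can drift apart.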
Barret answered 15/9, 2012 at 13:29 Comment(0)

I found this a nice way to be sure the tests will always pass. Otherwise, you cannot be 100% sure that every email you create is unique.

FactoryGirl.define do
  factory :user do
    name { Faker::Company.name }
    email { generate(:email) }
  end
  sequence(:email) do
    gen = "user_#{rand(1000)}@factory.com"
    while User.where(email: gen).exists?
      gen = "user_#{rand(1000)}@factory.com"
    end
    gen
  end
end
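
The retry loop can be tried outside Rails with a Set standing in for the users table. This is only a sketch: unique_email and its range and max_attempts parameters are hypothetical. The attempt cap is an addition to the loop above, because with only 1,000 possible addresses the unbounded while loop would never finish once all of them are taken.

```ruby
require "set"

# Hypothetical helper: keeps drawing random emails until one is not in `taken`.
# `taken` stands in for the User.where(email: ...).exists? lookup.
def unique_email(taken, range: 1000, max_attempts: 10_000)
  max_attempts.times do
    candidate = "user_#{rand(range)}@factory.com"
    return candidate unless taken.include?(candidate)
  end
  raise "no free email found after #{max_attempts} attempts"
end

taken = Set.new
emails = Array.new(50) do
  email = unique_email(taken)
  taken << email # record it, as saving the user would
  email
end
```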
Acoustics answered 25/2, 2016 at 12:6 Comment(0)

If you only need to generate a few values for attributes, you can also add a method to String, which keeps track of the prior strings used for an attribute. You could then do something like this:

factory :user do
  fullname { Faker::Name.name.unique('user_fullname') }
end

I use this approach for seeding. I wanted to avoid sequence numbers, because they do not look realistic.

Here is the String extension that makes this happen:

class String
  # Makes sure that the current string instance is unique for the given id.
  # If you call unique multiple times on equal strings, this method suffixes them with an incrementing number.
  # Example:
  #     puts "abc".unique("some_attribute") #=> "abc"
  #     puts "abc".unique("some_attribute") #=> "abc-1"
  #     puts "abc".unique("some_attribute") #=> "abc-2"
  #     puts "abc".unique("other") #=> "abc"
  #
  # Internal:
  #  We keep a data structure of the following format:
  #     @@unique_values = {
  #       "some_for_id" => { "used_string_1" => 1, "used_string_2" => 2 } # the numbers are the counters to use as the suffix for the next item
  #     }
  def unique(for_id)
    @@unique_values ||= {} # initialize structure in case this method was never called before
    @@unique_values[for_id] ||= {} # initialize structure in case we have not seen this id yet
    counter = @@unique_values[for_id][self] || 0
    result = (counter == 0) ? self : "#{self}-#{counter}"
    counter += 1
    @@unique_values[for_id][self] = counter
    result
  end
end

Caution: This should not be used for lots of attributes, since we track all prior strings (optimizations possible).
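
If reopening String feels too invasive, the same bookkeeping can live in a small object instead. This is an alternative sketch, not the approach used above: UniqueRegistry is a hypothetical class, and like the String version it remembers every string it has seen.

```ruby
# Hypothetical alternative to monkeypatching String.
class UniqueRegistry
  def initialize
    # scope => { base string => number of times seen so far }
    @seen = Hash.new { |hash, scope| hash[scope] = Hash.new(0) }
  end

  # Returns `value` the first time it is seen in `scope`,
  # then "value-1", "value-2", ... on later calls.
  def unique(value, scope)
    count = @seen[scope][value]
    @seen[scope][value] = count + 1
    count.zero? ? value : "#{value}-#{count}"
  end
end

registry = UniqueRegistry.new
results = [
  registry.unique("abc", "some_attribute"),
  registry.unique("abc", "some_attribute"),
  registry.unique("abc", "some_attribute"),
  registry.unique("abc", "other")
]
# => ["abc", "abc-1", "abc-2", "abc"]
```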

Jeer answered 27/4, 2016 at 11:10 Comment(0)
