Avoiding duplicates with factory_boy factories
W

6

23

I'm using factory_boy to create test fixtures. I've got two simple factories, backed by SQLAlchemy models (simplified below).

I'd like to be able to call AddressFactory.create() multiple times, and have it create a Country if it doesn't already exist, otherwise I want it to re-use the existing record.

class CountryFactory(factory.Factory):
    FACTORY_FOR = Country

    cc = "US"
    name = "United States"


class AddressFactory(factory.Factory):
    FACTORY_FOR = Address

    name = "Joe User"
    city = "Seven Mile Beach"
    country = factory.SubFactory(CountryFactory, cc="KY", name="Cayman Islands")

My question is: how can I set up these factories so that factory_boy doesn't try to create a new Country every time it creates an Address?

Wheelwork answered 2/10, 2013 at 21:26 Comment(4)
Did you take a look at factory.alchemy? - Ararat
Not sure what you're referring to in that link; there's nothing in that specific file that seems helpful. I've looked at the docs for factory_boy and the SQLAlchemy factory in particular, but I haven't seen anything about re-using records. Basically looking for a "find or create" type functionality. - Wheelwork
After more research into this, the short answer is that you can't do it. There's support for get-or-create with Django models, but not SQLAlchemy. I'm leaving this question open because I'm hoping to add SQLAlchemy support for this one of these days if no one beats me to it. - Wheelwork
Since version 3.0.0, there is an sqlalchemy_get_or_create option as well. - Aetolia
A
1

Since version 3.0.0, SQLAlchemy factories support the sqlalchemy_get_or_create option. As the documentation says, "Fields whose name are passed in this list will be used to perform a Model.query.one_or_none() or the usual Session.add()".

Using the example from the docs:

class UserFactory(factory.alchemy.SQLAlchemyModelFactory):
    class Meta:
        model = User
        sqlalchemy_session = session
        sqlalchemy_get_or_create = ('username',)

    username = 'john'

>>> User.query.all()
[]
>>> UserFactory()                   # Creates a new user
<User: john>
>>> User.query.all()
[<User: john>]

>>> UserFactory()                   # Fetches the existing user
<User: john>
>>> User.query.all()                # No new user!
[<User: john>]

>>> UserFactory(username='jack')    # Creates another user
<User: jack>
>>> User.query.all()
[<User: john>, <User: jack>]

Note that when sqlalchemy_get_or_create is used and a matching record already exists, any other values passed to the factory are NOT used to update the existing model.

Aetolia answered 1/3 at 18:56 Comment(0)
E
11

In factory-boy==2.3.1 (the latest release at the time of writing) you can add FACTORY_DJANGO_GET_OR_CREATE:

class CountryFactory(factory.django.DjangoModelFactory):
    FACTORY_FOR = 'appname.Country'
    FACTORY_DJANGO_GET_OR_CREATE = ('cc',)

    cc = "US"
    name = "United States"

This assumes that the cc field is the unique identifier.

Entasis answered 2/4, 2014 at 6:57 Comment(3)
As I mentioned in my question and followup comment above, I'm using SQLAlchemy. I know this exists for Django, but that doesn't help me. The functionality I'm looking for doesn't exist in factory boy, and I still haven't had the time to add it myself. - Wheelwork
Doesn't work in factory_boy==2.10.0 (not sure if this applies to SQLAlchemy or not). - Fingertip
This is now django_get_or_create as of 2.4.0: factoryboy.readthedocs.io/en/latest/… - Permenter
L
4

While you're right that there's no get_or_create function for SQLAlchemy-based factories, if the objects you want to use as a foreign key already exist, you can iterate through them:

http://factoryboy.readthedocs.org/en/latest/recipes.html#choosing-from-a-populated-table

So you could conceivably hack together a solution in your factory with a lazy attribute that first checks whether the object exists in the database. If it does, the attribute uses this method to iterate through the existing objects; if it doesn't, it calls a SubFactory to create the object first.

Latta answered 7/8, 2015 at 1:48 Comment(1)
This is definitely a hacky solution, much better if you submitted a PR adding the get_or_create ability for SQLAlchemy ;-) - Latta
F
1

For SQLAlchemy you can try this. It also acts as a caching factory:

class StaticFactory(factory.alchemy.SQLAlchemyModelFactory):
    __static_exclude = ('__static_exclude', '__static_cache',)
    __static_cache = {}
 
    @classmethod
    def _create(cls, model_class, *args, **kwargs):
        """Create the instance, or return the cached one to avoid duplicates."""
 
        # Exclude static cache
        cls._meta.exclude += cls.__static_exclude
 
        _unique_key = None
 
        # Use the first primary-key or unique column value found in kwargs as the cache key.
        for col in model_class.__table__.columns:
            if any([col.primary_key, col.unique]):
                _unique_key = kwargs.get(col.name)
                if _unique_key:
                    break
 
        _instance = cls.__static_cache.get(_unique_key)
        if _instance:
            return _instance
 
        _session = cls._meta.sqlalchemy_session
        with _session.no_autoflush:
            obj = model_class(*args, **kwargs)
            _session.add(obj)
            cls.__static_cache.update({_unique_key: obj})
            return obj

class LanguageFactory(StaticFactory):
    class Meta:
        model = Language
        exclude = ('lang',)
Fullrigged answered 9/5, 2019 at 10:13 Comment(1)
Thanks for mentioning _create. I was trying to use SubFactory, which bypasses both build and create, so overriding _create worked for me! - Patentor
S
0

Another hacky solution is to override the create method of the factory so that it queries for existing objects, caches the results, and returns them in turn.

This simple example does no filtering on the **kwargs though:

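from factory.alchemy import SQLAlchemyModelFactory


class StaticFactory(SQLAlchemyModelFactory):

    counter = 0
    cache = []
    model = None  # set this to the SQLAlchemy model in a subclass

    @classmethod
    def create(cls, **kwargs):
        # your_session is a placeholder for your SQLAlchemy session.
        if not cls.cache:
            cls.cache = your_session.query(cls.model).all()
        # Cycle through the cached rows instead of creating new ones.
        instance = cls.cache[cls.counter]
        cls.counter = (cls.counter + 1) % len(cls.cache)
        return instance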
Sprain answered 31/3, 2017 at 7:51 Comment(1)
This won't work if you're using subfactories, i.e. static = SubFactory(StaticFactory) on another factory. For whatever reason, build and create aren't called on the subfactory. _create will, however, be called. - Patentor
N
0

We can create a new address instance that uses an already existing country instance with factory.Iterator:

import factory
import factory.django
from . import models


class CountryFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = models.Country

    cc = "US"
    name = "United States"


class AddressFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = models.Address

    name = "Joe User"
    city = "Seven Mile Beach"
    # Cycle through the Country rows that already exist in the database.
    country = factory.Iterator(models.Country.objects.all())

Here, factory.Iterator cycles through the Country instances already in the database and passes one to the country attribute of AddressFactory, so each new address reuses an existing country instead of creating one. The Country table must already be populated for this to work.

Nena answered 23/6, 2020 at 18:22 Comment(0)
