One to many relation with Factory Boy

Asked 15/8, 2019 at 6:47 Answered 16/8, 2019 at 6:58

I have a many-to-one relationship in my SQLAlchemy models. One report has many samples (simplified for brevity):

class Sample(db.Model, CRUDMixin):
    sample_id = Column(Integer, primary_key=True)
    report_id = Column(Integer, ForeignKey('report.report_id', ondelete='CASCADE'), index=True, nullable=False)
    report = relationship('Report', back_populates='samples')

class Report(db.Model, CRUDMixin):
    report_id = Column(Integer, primary_key=True)
    samples = relationship('Sample', back_populates='report')

Now in my tests, I want to be able to generate a Sample instance, or a Report instance, and fill in the missing relationships.

class ReportFactory(BaseFactory):
    class Meta:
        model = models.Report
    report_id = Faker('pyint')
    samples = RelatedFactoryList('tests.factories.SampleFactory', size=3)

class SampleFactory(BaseFactory):
    class Meta:
        model = models.Sample
    sample_id = Faker('pyint')
    report = SubFactory(ReportFactory)

When I go to create an instance of these, the factories get stuck in an infinite loop:

RecursionError: maximum recursion depth exceeded in comparison
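
The loop can be reproduced without factory_boy. The sketch below is a plain-Python stand-in, not factory_boy code: `build_report`, `build_sample` and `recursion_detected` are hypothetical names. It assumes the behaviour described above, where each new report asks for three samples without telling them which report they belong to, so each sample asks for a fresh report of its own.

```python
def build_report():
    """Stand-in for ReportFactory: a report plus three generated samples."""
    report = {"samples": []}
    # The samples are not told which report they belong to ...
    report["samples"] = [build_sample() for _ in range(3)]
    return report


def build_sample():
    """Stand-in for SampleFactory: a sample plus a generated parent report."""
    # ... so every sample builds a brand-new report, which builds new samples.
    return {"report": build_report()}


def recursion_detected():
    """Return True if building a report never terminates on its own."""
    try:
        build_report()
    except RecursionError:
        return True
    return False
```

Neither function ever reaches its `return` statement, because each one needs the other to finish first.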

However, if I try to use SelfAttribute to stop the infinite loop, I end up with a report without any samples:

class ReportFactory(BaseFactory):
    samples = RelatedFactoryList('tests.factories.SampleFactory', size=3, report_id=SelfAttribute('..report_id'))

class SampleFactory(BaseFactory):
    report = SubFactory(ReportFactory, samples=[])

report = factories.ReportFactory()
l = len(report.samples) # 0

However, if I generate a Sample with SampleFactory(), it correctly has a Report object.

How should I design my factories so that SampleFactory() generates a Sample with an associated Report, and ReportFactory() generates a Report with 3 associated Samples, without infinite loops?

Chiton asked 15/8, 2019 at 6:47 Comment(0)

My final solution was actually a lot simpler than I thought:

class ReportFactory(BaseFactory):
    class Meta:
        model = models.Report

    samples = RelatedFactoryList('tests.factories.SampleFactory', 'report', size=3)


class SampleFactory(BaseFactory):
    class Meta:
        model = models.Sample

    report = SubFactory(ReportFactory, samples=[])

The key thing was using the second argument to RelatedFactoryList, which has to correspond to the parent link on the child, in this case 'report'. In addition, I used SubFactory(ReportFactory, samples=[]), which ensures that no extra samples are created on the parent if I construct a single sample.

With this setup, I can construct a sample that will have a Report associated with it, and that report only has 1 child Sample. Conversely, I can construct a Report that will automatically be populated with 3 child Samples.
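
The two ideas above (hand the parent to each generated child, and suppress child generation when the parent is built on behalf of a single child) can be modelled in plain Python. This is only a sketch of the control flow, not factory_boy's implementation: `build_report`, `build_sample` and the `_GENERATE` sentinel are hypothetical names, and the `append` in `Sample.__init__` is an assumed stand-in for what SQLAlchemy's `back_populates` does for the real models.

```python
class Report:
    def __init__(self):
        self.samples = []


class Sample:
    def __init__(self, report):
        self.report = report
        # Assumed stand-in for the ORM keeping both sides of the relationship in sync.
        report.samples.append(self)


# Sentinel meaning "the caller did not pass samples, so generate them".
_GENERATE = object()


def build_report(samples=_GENERATE, size=3):
    """Stand-in for ReportFactory with RelatedFactoryList(..., 'report', size=3)."""
    report = Report()
    if samples is _GENERATE:
        for _ in range(size):
            # Passing the parent here is what the 'report' argument achieves:
            # the child no longer needs to build a parent of its own.
            build_sample(report=report)
    return report


def build_sample(report=None):
    """Stand-in for SampleFactory with SubFactory(ReportFactory, samples=[])."""
    if report is None:
        # samples=[] tells the parent not to generate extra children.
        report = build_report(samples=[])
    return Sample(report)
```

Calling `build_report()` yields a report with three samples that all point back at it, while `build_sample()` yields a sample whose report contains only that sample.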

I don't think there's any need to generate the actual model IDs, because SQLAlchemy will do that automatically once the models are actually inserted into the database. However, if you want to do that without using the database, I think @Xelnor's solution of report_id = factory.SelfAttribute('report.id') will work.

The only outstanding issue I have is with overriding the list of samples on the Report (e.g. ReportFactory(samples = [SampleFactory()])), but I've opened an issue documenting this bug: https://github.com/FactoryBoy/factory_boy/issues/636

Chiton answered 16/8, 2019 at 6:58 Comment(2)

This works too, but only due to the underlying features of your ORM: when you read report.samples, SQLAlchemy will dynamically fetch a list of Sample objects in the DB (or session) pointing to that specific Report. If you're not working with an ORM, you have to link them manually. – Rehabilitate 23/8, 2019 at 8:10

Thanks for the clarification. I did mention SQLAlchemy in the question, though. – Chiton 24/8, 2019 at 8:25

The RelatedFactory declaration is evaluated once the instance has been created:

1. The Report is instantiated
2. 3 calls to SampleFactory are performed
3. The Report instantiated in step 1 is returned

In order to populate the field on the Report instances, you have to link the Sample instances to the Report at step 2.

A possible implementation would be:

class SampleFactory(BaseFactory):
    class Meta:
        model = Sample

    @classmethod
    def _after_postgeneration(cls, instance, create, results=None):
        if instance.report is not None and instance not in instance.report.samples:
            instance.report.samples.append(instance)

    id = factory.Faker('pyint')
    # Enforce `post_samples = None` to prevent creating additional samples
    report = factory.SubFactory('example.ReportFactory', samples=[], post_samples=None)
    report_id = factory.SelfAttribute('report.id')

class ReportFactory(factory.Factory):
    class Meta:
        model = Report

    id = factory.Faker('pyint')
    # Set samples = [] if needed by `Report.__init__`
    samples = []
    # Named `post_samples` to mark that they are instantiated
    # *after* the `Report` is ready (and never passed to the `samples` kwarg)
    post_samples = factory.RelatedFactoryList(SampleFactory, 'report', size=3)

With that code, when you call ReportFactory, you:

1. Generate a Report without any samples
2. Generate 3 samples, passing them a reference to the just-generated report
3. Upon creation, those Sample instances attach themselves to Report.samples
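
The three steps can be traced with plain objects and no ORM. The sketch below is a hypothetical model rather than factory_boy code: `attach_to_report` plays the role of the `_after_postgeneration` hook, and the `run_hook` flag exists only to show what happens when nothing links the objects.

```python
class Report:
    def __init__(self):
        self.samples = []


class Sample:
    def __init__(self, report):
        # No ORM here, so nothing back-populates report.samples.
        self.report = report


def attach_to_report(sample):
    """Stand-in for the _after_postgeneration hook: link the child to its parent."""
    if sample.report is not None and sample not in sample.report.samples:
        sample.report.samples.append(sample)


def build_report(size=3, run_hook=True):
    report = Report()                 # step 1: a report without any samples
    for _ in range(size):
        sample = Sample(report)       # step 2: each sample receives the report
        if run_hook:
            attach_to_report(sample)  # step 3: the sample attaches itself
    return report
```

With the hook enabled the report ends up with three samples. With `run_hook=False` the samples still reference the report, but `report.samples` stays empty, which is the situation the hook is meant to prevent.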

Rehabilitate answered 15/8, 2019 at 9:43 Comment(2)

But since you named the RelatedFactoryList post_samples, won't the generated Report have no Samples in it? – Chiton 16/8, 2019 at 1:0

No, because we're attaching each Sample to its Report manually when it is created, in the _after_postgeneration hook. – Rehabilitate 23/8, 2019 at 8:9
