Quick Note: Setting Up Pytest Model Factory Fixtures for Django

17 min read

Update: 01.03.2021

Recently, I found out that this can be implemented easily with factory_boy and pytest-factoryboy.

For each factory you register, pytest-factoryboy dynamically generates a model instance fixture and a factory fixture. Say you have a model called User and have registered its factory with pytest-factoryboy: you can request the user fixture to get an instance, or the user_factory fixture to create instances inside the test. You can also parametrize the instance fixtures to keep your tests cleaner.

What follows is a way to get the same behavior without factory_boy and pytest-factoryboy. I wasn't aware they existed back then.


Generating model instances in Django tests is vital, and it needs to be quick and painless. In this article, I'd like to present a pattern that I've settled on.

Requirements

The methods below were tested in the following environment:

  • Python >= 3.5
  • Django >= 2.2
  • pytest-django

So I assume you have the same dependencies in your environment.

Factory Fixtures

I'd like to discuss factory fixtures first. These are fixtures that simply return functions. To give an example:

import pytest

@pytest.fixture
def foo_factory():
    def factory(number):
        return "foo{}".format(number)
    return factory

This way we can simply inject them into our tests and invoke them on the fly.

def test_foo(foo_factory):
    assert foo_factory(1) == "foo1"

This is useful when you'd like to generate instances based on some context and especially useful when creating model instances.

Model Instance Factories

Assuming we have a model as below:

class Person(models.Model):
    name = models.CharField(max_length=128)
    surname = models.CharField(max_length=128)

We can construct a factory fixture as below:

@pytest.fixture
def person_factory(db):  # mind "db"
    # db is a fixture provided by pytest-django
    # requesting it enables database access for the test
    def factory(**kwargs):
        return Person.objects.create(**kwargs)
    return factory

So now we can create as many instances as we need in a test:

def test_person(person_factory):
    person = person_factory(name="Eray", surname="Erdin")
    assert (person.name, person.surname) == ("Eray", "Erdin")

While this might seem good, it is still not perfect. We would probably like to create an instance quickly, without providing any data, like this:

def test_person(person_factory):
    person = person_factory()  # what we'd like to do
    assert (person.name, person.surname) == ("Eray", "Erdin")

This will probably fail because name and surname fields are implicitly NOT NULL.

IntegrityError: NOT NULL constraint failed: app_person.name

Setting Defaults

What we'd like to do is write our fixture in such a way that it supplies a default value for every field that is not provided. We pass the values through kwargs, which is a dict of keyword arguments. Since it is a plain dict, it has a helper method called setdefault. This method adds the key and value only if the key does not exist yet. With this in mind, see the example below:

@pytest.fixture
def person_factory(db):
    def factory(**kwargs):
        kwargs.setdefault("name", "Eray")
        kwargs.setdefault("surname", "Erdin")
        return Person.objects.create(**kwargs)
    return factory

So name and surname now have default values and no longer need to be provided.

def test_person(person_factory):
    person = person_factory(name="Şenay")  # now this won't error
    assert (person.name, person.surname) == ("Şenay", "Erdin")  # `surname` defaults to `"Erdin"` in this case
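
To see the behavior of setdefault in isolation, here is a minimal sketch that reproduces the factory logic with a plain dict and no database. build_person_kwargs is a hypothetical helper used only for illustration; it is not part of the fixture above.

```python
def build_person_kwargs(**kwargs):
    """Fill in defaults the same way the factory does, without a database."""
    # setdefault only writes a key when it is missing,
    # so values passed by the caller always win.
    kwargs.setdefault("name", "Eray")
    kwargs.setdefault("surname", "Erdin")
    return kwargs


print(build_person_kwargs())              # both defaults are used
print(build_person_kwargs(name="Şenay"))  # only surname falls back to its default
```

Keep in mind that an explicitly passed None still counts as "provided", so setdefault will not replace it.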

Unique Constraint

However, this is not the end yet. So far, neither the name nor the surname field has a unique constraint. Let's add a field with unique=True to our model.

class Person(models.Model):
    name = models.CharField(max_length=128)
    surname = models.CharField(max_length=128)
    phone = models.CharField(max_length=128, unique=True)  # new field

⚠️ Warning

In this example, I do not really care about the formatting and validation of phone field. It's added only for the sake of example. I will count "1", "2", "3" and such values as valid.

And we should update our related factory fixture accordingly:

@pytest.fixture
def person_factory(db):
    def factory(**kwargs):
        kwargs.setdefault("name", "Eray")
        kwargs.setdefault("surname", "Erdin")
        kwargs.setdefault("phone", "1")  # See warning above.
        return Person.objects.create(**kwargs)
    return factory

This is okay with one instance in a test, but it will fail as soon as a second instance is created.

def test_person(person_factory):
    person1 = person_factory()
    person2 = person_factory()  # will fail here
    assert (person1.name, person1.surname) == ("Eray", "Erdin")
    assert (person2.name, person2.surname) == ("Eray", "Erdin")

This will fail saying:

django.db.utils.IntegrityError: UNIQUE constraint failed: app_person.phone

This is because the first time the factory is invoked the phone is "1" and the second time it is the same, yet we explicitly defined phone field to be unique.

These kinds of errors usually require different solutions, but the foundation is the same: we need to set the unique field to a different value each time the factory is invoked. In this example, I will use the current count of Person instances to set the phone field.

@pytest.fixture
def person_factory(db):
    def factory(**kwargs):
        kwargs.setdefault("name", "Eray")
        kwargs.setdefault("surname", "Erdin")
        kwargs.setdefault("phone", str(Person.objects.count()))  # person instance count as phone
        return Person.objects.create(**kwargs)
    return factory

ℹ️ Tip

You can also format strings as below:

"foo{}".format(bar)

This is what I usually use with CharFields and the like.

Now we will not get those IntegrityErrors because the phone will be different in each invocation.

def test_person(person_factory):
    person1 = person_factory()  # phone: "0"
    person2 = person_factory()  # phone: "1"
    assert (person1.name, person1.surname) == ("Eray", "Erdin")
    assert (person2.name, person2.surname) == ("Eray", "Erdin")
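
Counting rows works as long as nothing is deleted during the test and no explicit value happens to equal a later count. If that ever becomes a problem, one possible alternative is a counter that never goes backwards. The sketch below is only an illustration under that assumption: it uses itertools.count, and a plain set stands in for the UNIQUE constraint. make_phone_source and FakeTable are hypothetical names, not part of Django or pytest.

```python
import itertools


def make_phone_source():
    """Return a function that yields a new phone string on every call."""
    counter = itertools.count()  # 0, 1, 2, ... and never repeats

    def next_phone():
        return str(next(counter))

    return next_phone


class FakeTable:
    """Stand-in for a table with a UNIQUE column, for illustration only."""

    def __init__(self):
        self.phones = set()

    def create(self, phone):
        if phone in self.phones:
            raise ValueError("UNIQUE constraint failed: phone")
        self.phones.add(phone)
        return phone

    def delete(self, phone):
        self.phones.remove(phone)


table = FakeTable()
next_phone = make_phone_source()

first = table.create(next_phone())
second = table.create(next_phone())
table.delete(first)                 # the row count drops back to 1
third = table.create(next_phone())  # a count-based value would collide with `second` here
```

In a real fixture, the counter would be created inside the fixture function so that each test starts from zero again.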

Relations

This pattern also lets us create instances with relations pretty easily. This time, let's use a different kind of example with two models where they are bound by a ForeignKey.

class Category(models.Model):
    name = models.CharField(max_length=128)
    # other fields if necessary ...

class Product(models.Model):
    price = models.DecimalField(max_digits=10, decimal_places=2)  # both arguments are required
    category = models.ForeignKey(Category, on_delete=models.CASCADE, related_name="products")
    # other fields if needed ...

⚠️ Warning

In this example, I assume that each Product will have only one Category and a Category is required.

In this case, if I'd like to create a Product instance, I will definitely need a Category instance as well. First, write up our standard Category factory fixture:

@pytest.fixture
def category_factory(db):
    def factory(**kwargs):
        kwargs.setdefault("name", "foo")
        # same for other fields
        return Category.objects.create(**kwargs)
    return factory

And the factory fixture for Product. Remember we will need a Category instance for that.

@pytest.fixture
def product_factory(db, category_factory):  # see how i injected `category_factory`
    def factory(**kwargs):
        kwargs.setdefault("price", 1.5)
        if "category" not in kwargs:  # only create a Category when none was passed
            kwargs["category"] = category_factory()  # and i invoked `category_factory` here
        # same for other fields
        return Product.objects.create(**kwargs)
    return factory

Finally, whenever I need a Product instance in a test, I only have to inject product_factory and invoke it. It will implicitly create a Category instance for itself.

def test_product(product_factory):  # a test, so no fixture decorator here
    product = product_factory()
    assert product.price == 1.5
    assert product.category.name == "foo"
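
One detail to keep in mind when a default comes from another factory: Python evaluates the arguments of a call before making the call. In an expression of the form kwargs.setdefault("category", category_factory()), the default is therefore built even when the caller already passed a category, which means an extra row in the database. The sketch below shows the effect with a plain list instead of a model; make_category, eager and lazy are hypothetical names used only for this illustration.

```python
created = []  # records every "category" that gets built


def make_category():
    """Stand-in for category_factory(); appends instead of hitting a database."""
    created.append("category")
    return "default-category"


def eager(**kwargs):
    # make_category() runs before setdefault even looks at the key
    kwargs.setdefault("category", make_category())
    return kwargs


def lazy(**kwargs):
    # make_category() runs only when the caller gave no category
    if "category" not in kwargs:
        kwargs["category"] = make_category()
    return kwargs


eager(category="mine")
eager_calls = len(created)  # a category was built and then thrown away

created.clear()
lazy(category="mine")
lazy_calls = len(created)   # nothing was built
```

For cheap constant defaults such as strings or numbers this makes no difference; it only matters when producing the default has a side effect.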

Options

Let's take a look at the example below:

# models.py

class Post(models.Model):  # a blog post
    title = models.CharField(max_length=128)
    content = models.TextField()
    share_rss = models.BooleanField()
    share_facebook = models.BooleanField()
    share_twitter = models.BooleanField()
    share_telegram = models.BooleanField()
    share_reddit = models.BooleanField()

# fixtures.py
@pytest.fixture
def post_factory(db):
    def factory(**kwargs):
        kwargs.setdefault("title", "Foo Bar Baz")
        kwargs.setdefault("content", "lorem ipsum dolor sit amet")
        kwargs.setdefault("share_rss", False)
        kwargs.setdefault("share_facebook", False)
        kwargs.setdefault("share_twitter", False)
        kwargs.setdefault("share_telegram", False)
        kwargs.setdefault("share_reddit", False)

        return Post.objects.create(**kwargs)
    return factory

Let's assume we have a test case in which the share_* fields need values other than their defaults. To do that, we would set them in the test manually, one by one, as below:

def test_not_share(post_factory):
    post_factory(
        share_rss=True,
        share_facebook=False,
        share_twitter=True,
        share_telegram=True,
        share_reddit=False,
    )
    # ... your test logic here ...

What if we could pass some extra arguments to the post_factory fixture and have it set all of them to True or False in a shorthand way? Here *args can help. Let's refactor post_factory:

@pytest.fixture
def post_factory(db):
    def factory(*args, **kwargs):  # notice i've added args
        # setting default values
        kwargs.setdefault("title", "Foo Bar Baz")
        kwargs.setdefault("content", "lorem ipsum dolor sit amet")
        kwargs.setdefault("share_rss", False)
        kwargs.setdefault("share_facebook", False)
        kwargs.setdefault("share_twitter", False)
        kwargs.setdefault("share_telegram", False)
        kwargs.setdefault("share_reddit", False)

        # option defaults
        opts = dict(next(iter(args), {}))
        # here i take the first element of *args
        # if there is no such thing (no first element)
        # then it will default to an empty dict
        # (it is copied so the caller's dict is left untouched)

        # do the same thing to opts as we do to **kwargs
        opts.setdefault("share_all", None)
        # if True, will set all share_* fields to True
        # if False, will set all share_* fields to False
        # else (which is None or sth else), will use **kwargs

        share_all = opts["share_all"]

        # options logic
        # only a real bool counts; 0 and 1 would pass an `in (True, False)` check
        if isinstance(share_all, bool):
            kwargs.update(
                share_rss=share_all,
                share_facebook=share_all,
                share_twitter=share_all,
                share_telegram=share_all,
                share_reddit=share_all,
            )
        # if it is None, it will use **kwargs default or the one you provide in **kwargs

        return Post.objects.create(**kwargs)
    return factory

So, what did we achieve? Let's see it in test:

class TestShare:
    def test_share_all(self, post_factory):
        post_factory({"share_all": True})
        # sets all share_* fields to True
        # even if we pass share_facebook=False,
        # share_facebook will be True

        # ... your test logic ...

    def test_share_none(self, post_factory):
        post_factory({"share_all": False})
        # the reverse of test_share_all

        # ... your test logic ...

    def test_share_some(self, post_factory):
        post_factory(share_twitter=True)
        # rest will be False

        # ... your test logic ...

This way, we can easily define custom behavior or quickly assign multiple fields at once.
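
The options logic itself does not depend on the database, so it can also be pulled out into a plain function and checked on its own. The sketch below is one way to do that, assuming the five share_* fields of the Post model above; SHARE_FIELDS and apply_share_all are hypothetical names introduced only for this illustration.

```python
SHARE_FIELDS = (
    "share_rss",
    "share_facebook",
    "share_twitter",
    "share_telegram",
    "share_reddit",
)


def apply_share_all(kwargs, opts=None):
    """Return a copy of kwargs with the share_* fields resolved from opts."""
    fields = dict(kwargs)  # copy, so the caller's dict is not modified
    for name in SHARE_FIELDS:
        fields.setdefault(name, False)

    share_all = (opts or {}).get("share_all")
    # only a real bool overrides the fields; None or anything else is ignored
    if isinstance(share_all, bool):
        for name in SHARE_FIELDS:
            fields[name] = share_all
    return fields


everything = apply_share_all({"share_facebook": False}, {"share_all": True})
some = apply_share_all({"share_twitter": True})
```

Here everything has all five fields set to True even though share_facebook=False was passed, and some has only share_twitter enabled.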

Now you might ask why I used *args for this instead of putting everything into **kwargs. With this method, I separated the purposes of the two: *args is for options and **kwargs is for fields. You might go further and say, "You could have swapped them: *args for fields and **kwargs for options." This is purely a design choice. In my case, I find it easier to write and read:

post_factory({"share_all": True}, title="foo", content="lorem")

...than the monstrosity below:

post_factory({"title": "foo", "content": "lorem"}, share_all=True)

The decision is up to you, however.

Final Words

This is how I deal with generating model instances in tests. I do it for each model, and I update the factory fixtures whenever I update the models. Below are some questions and thoughts that might be on your mind, in case you are interested.

User is a built-in model in Django. Do you have a factory for the User model?

Ah, yes. I can even give the code away.

# from django.contrib.auth.models import User

@pytest.fixture
def user_factory(db):
    def factory(**kwargs):
        total_instances = User.objects.count()
        username = f"user{total_instances}"  # (i)
        email = f"foo{total_instances}@bar.baz"

        kwargs.setdefault("username", username)
        kwargs.setdefault("password", "111111")
        kwargs.setdefault("email", email)

        return User.objects.create_user(**kwargs)  # (ii)

    return factory

As you can probably see, (i) I generate the username from the number of existing User instances and (ii) I use create_user instead of create, mainly because it is the dedicated method on User's manager: it hashes the password, which is what lets a test log in with it later.

Nice, now I can generate a lot of instances with faker (or alike).

While I find faker useful in some cases, I recommend not overusing it. I do not usually use it because, in my opinion, you usually test behavior rather than data.

⚠️ Warning

In the example below, I assume you have already written required models, views, serializers and factory fixtures.

To give a more concrete example (also assuming it's going to be an API endpoint built with Django REST Framework), let's say you have an endpoint /whatever/ that returns a list of, well, whatevers, and those objects are related to Users.

# model of it
class Whatever(models.Model):
    # ...
    user = models.ForeignKey(User, on_delete=models.CASCADE)  # on_delete is required
    # ...

So, if user1 is logged in, he will see his whatevers and not user2's whatevers.

In this case, it is just enough to create 2 instances of Whatever for user1 and 1 instance of Whatever for user2 and do the test.

def test_whatever_list_endpoint(client, user_factory, whatever_factory):  # assuming you've written factory for it
    user1 = user_factory()  # (ii)
    user2 = user_factory()  # (ii)

    for i in range(2):  # 2 whatevers for user1 # (i)
        whatever_factory(user=user1)

    whatever_factory(user=user2)   # 1 whatever for user2 # (i)

    url = reverse("whatever-list")  # from django.urls import reverse

    # whatever list of user1
    client.login(username=user1.username, password="111111")  # (ii)
    data1 = client.get(url).json()
    client.logout()

    # whatever list of user2
    client.login(username=user2.username, password="111111")  # (ii)
    data2 = client.get(url).json()

    # assertions
    assert len(data1["results"]) == 2  # because we've created 2 for user1 # (i)
    assert len(data2["results"]) == 1  # because we've created 1 for user2 # (i)

In this context, you can notice many things:

  • (i) I never needed to test the content (the other fields and values, or the serializer's output) of the Whatever instances. I do not need to, because they will most likely render the same. You should test data if your data changes as a result of some behavior. In the case above, the data does not matter anyway, because there is no logic that changes it.
  • (ii) By not using faker, I could keep the content of User instances predictable and consistent. I know the password for every testing user will be "111111" and I know I would not (and you should never attempt to) test the password anyway.