Setting default/empty attributes for user classes in __init__ [closed]

Asked 22/4, 2019 at 19:47 Answered 5/7, 2022 at 20:9

Solved python class instance instance-variables python-attrs

When I am creating a new class, should I set all instance attributes in __init__, even if they are None and in fact later assigned values in class methods?

See example below for the attribute results of MyClass:

class MyClass:
    def __init__(self, df):
        self.df = df
        self.results = None

    def results(self, df_results):
        # Imagine some calculations here or something
        self.results = df_results

I have found in other projects that instance attributes can get buried when they only appear in class methods and there is a lot going on.

So to an experienced professional programmer what is standard practice for this? Would you define all instance attributes in __init__ for readability?

Windpollinated answered 22/4, 2019 at 19:47 Comment(5)

I would initialise everything in __init__, even if None initially. It makes it clear what the instance data attributes are, and prevents AttributeErrors on self when using the instance (though of course other exceptions are still possible). – Taconite 22/4, 2019 at 19:52

Building on first comment: if you don't do this, you may later be wondering, has this been initialized? when looking at one of your attributes and wondering if you can read from it without presence checking in some method. If everything's in __init__, you know (a.) it's all there and (b.) it's been initialized in the most sensible place, where you'd look first. – Employment 22/4, 2019 at 19:58

As an alternate perspective: it sounds like you're defining a class of objects where some attributes are not valid on some objects in that class. That's problematic from an OO perspective. If you have a method that calculates results, perhaps it should return an object representing the results rather than mutating the current object? If you avoid the need for attributes defined outside of __init__ then this problem disappears. – Sirreverence 23/4, 2019 at 9:16

Thanks for the comment Daniel but I'm not doing that. All attributes will be valid on all instances, it's just that some are assigned values later on through methods rather than on init – Windpollinated 23/4, 2019 at 9:31

@Andy: That's what I meant by "valid". If an attribute doesn't have a value assigned yet, then it's not "valid" in the sense that it can't be used. Yes, you can use a "placeholder" value, and None is the normal choice for such a placeholder, but a better design would be to avoid the placeholder if possible. – Sirreverence 23/4, 2019 at 13:49

Following considerable research and discussions with experienced programmers please see below what I believe is the most Pythonic solution to this question. I have included the updated code first and then a narrative:

class MyClass:
    def __init__(self, df):
        self.df = df
        self._results = None

    @property
    def results(self):
        if self._results is None:
            raise Exception('results is None; call generate_results() first')
        return self._results

    def generate_results(self, df_results):
        # Imagine some calculations here or something
        self._results = df_results

Description of what I learnt, changed and why:

All instance attributes should be set in the __init__ (initialiser) method. This is to ensure readability and aid debugging.
The first issue is that you cannot create private attributes in Python. Everything is public, so any partially initialised attributes (such as results being set to None) can be accessed. The convention to indicate a private attribute is to place a leading underscore at the front, so in this case I changed it from self.results to self._results.

Keep in mind this is only convention, and self._results can still be directly accessed. However, this is the Pythonic way to handle what are pseudo-private attributes.
The second issue is having a partly initialised attribute which is set to None. Because it is set to None, as @jferard below explains, we have now lost a fail-fast hint and have added a layer of obfuscation for debugging the code.

To resolve this we add a getter method. This can be seen above as the function results() which has the @property decorator above.

This is a function that, when invoked, checks if self._results is None. If so, it will raise an exception (a fail-fast hint); otherwise it will return the object. The @property decorator changes the invocation style from a function to an attribute, so all the user has to use on an instance of MyClass is .results, just like any other attribute.

(I changed the name of the method that sets the results to generate_results() to avoid confusion and free up .results for the getter method)
If you then have other methods within the class that need to use self._results, but only when properly assigned, you can use self.results, and that way the fail-fast hint is baked in as above.
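As a minimal sketch of that last point: the summary() method below is a hypothetical helper (it is not part of my original class), and I raise AttributeError rather than a bare Exception purely as an assumption about what a caller would want to catch. Because summary() reads self.results instead of self._results, it inherits the check for free.

```python
class MyClass:
    def __init__(self, df):
        self.df = df
        self._results = None  # placeholder until generate_results() runs

    @property
    def results(self):
        # Fail fast if the results were never generated.
        if self._results is None:
            raise AttributeError(
                "results is None; call generate_results() first"
            )
        return self._results

    def generate_results(self, df_results):
        self._results = df_results

    def summary(self):
        # Hypothetical helper: goes through the property, not self._results,
        # so it raises the same error when results are missing.
        return f"{len(self.results)} result rows"
```

Calling summary() before generate_results() raises the AttributeError from the property; afterwards it works as expected.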

I recommend also reading @jferard's answer to this question. He goes into depth about the problems and some of the solutions. The reason I added my answer is that I think for a lot of cases the above is all you need (and the Pythonic way of doing it).

Windpollinated answered 18/9, 2020 at 10:57 Comment(2)

This is a nice solution to the problem (and very helpful for a problem I'm currently struggling with, so thank you for posting it). One suggestion for improving the solution: raise a more specific exception than just Exception. If you raise a generic Exception, then you have to catch all kinds of errors in a try/except block when you're retrieving the attribute somewhere else. If you raise a more specific exception such as AttributeError, it will be much easier to work with. – Angio 12/9, 2021 at 17:8

I'm afraid your very first point is wrong. First, a pythonic way is to write documentation on methods/attributes. Then you don't make readers read your init code. Second, you can test whether the attribute was set by try/except block (a pythonic way). If you get an AttributeError, then your field was not set. Also, for me None is a magic constant. You have to read another method to understand that that is an unacceptable value. – Agatha 29/3 at 15:46

I think you should avoid both solutions, simply because you should avoid creating uninitialized or partially initialized objects, except in one case I will outline later.

Look at two slightly modified versions of your class, with a setter and a getter:

class MyClass1:
    def __init__(self, df):
          self.df = df
          self.results = None

    def set_results(self, df_results):
         self.results = df_results

    def get_results(self):
         return self.results

And

class MyClass2:
    def __init__(self, df):
          self.df = df

    def set_results(self, df_results):
         self.results = df_results

    def get_results(self):
         return self.results

The only difference between MyClass1 and MyClass2 is that the first one initializes results in the constructor while the second does it in set_results. Here comes the user of your class (usually you, but not always). Everyone knows you can't trust the user (even if it's you):

MyClass1("df").get_results()
# returns None

MyClass2("df").get_results()
# Traceback (most recent call last):
# ...
# AttributeError: 'MyClass2' object has no attribute 'results'

You might think that the first case is better because it does not fail, but I do not agree. I would like the program to fail fast in this case, rather than do a long debugging session to find what happened. Hence, the first part of my answer is: do not set the uninitialized fields to None, because you lose a fail-fast hint.

But that's not the whole answer. Whichever version you choose, you have an issue: the object was used and it shouldn't have been, because it was not fully initialized. You can add a docstring to get_results: """Always use set_results **BEFORE** this method""". Unfortunately the user doesn't read docstrings either.

You have two main reasons for uninitialized fields in your object: 1. you don't know (for now) the value of the field; 2. you want to avoid an expensive operation (computation, file access, network, ...), aka "lazy initialization". Both situations occur in the real world, and collide with the need to use only fully initialized objects.
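For the second reason (lazy initialization), one possible sketch uses functools.cached_property from the standard library (Python 3.8+). The Report class and its compute_calls counter are hypothetical names used only for illustration. The point is that the expensive value is computed on first access, so callers never observe a placeholder such as None.

```python
from functools import cached_property


class Report:
    def __init__(self, df):
        self.df = df
        self.compute_calls = 0  # only here to show the work happens once

    @cached_property
    def results(self):
        # Stand-in for an expensive computation; runs on first access only,
        # then the value is cached on the instance.
        self.compute_calls += 1
        return sum(self.df)


report = Report([1, 2, 3])
print(report.results)        # computed now
print(report.results)        # served from the cache
print(report.compute_calls)  # the computation ran a single time
```

This only covers the case where the object already holds everything it needs to compute the value; it does not help when the value is genuinely unknown at construction time.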

Happily, there is a well documented solution to this problem: Design Patterns, and more precisely Creational patterns. In your case, the Factory pattern or the Builder pattern might be the answer. E.g.:

class MyClassBuilder:
    def __init__(self, df):
          self._df = df # df is known immediately
          # GIVE A DEFAULT VALUE TO OTHER FIELDS to avoid the possibility of a partially uninitialized object.
          # The default value should be either:
          # * a value passed as a parameter of the constructor ;
          # * a sensible value (eg. an empty list, 0, etc.)

    def results(self, df_results):
         self._results = df_results
         return self # for fluent style
         
    ... other field initializers

    def build(self):
        return MyClass(self._df, self._results, ...)

class MyClass:
    def __init__(self, df, results, ...):
          self.df = df
          self.results = results
          ...
          
    def get_results(self):
         return self.results
    
    ... other getters

(You can use a Factory too, but I find the Builder more flexible). Let's give a second chance to the user:

>>> b = MyClassBuilder("df").build()
Traceback (most recent call last):
...
AttributeError: 'MyClassBuilder' object has no attribute '_results'
>>> b = MyClassBuilder("df")
>>> b.results("r")
... other fields initialization
>>> x = b.build()
>>> x
<__main__.MyClass object at ...>
>>> x.get_results()
'r'

The advantages are clear:

It's easier to detect and fix a creation failure than a late use failure;
You do not release into the wild an uninitialized (and thus potentially damaging) version of your object.

The presence of uninitialized fields in the Builder is not a contradiction: those fields are uninitialized by design, because the Builder's role is to initialize them. (Actually, those fields are some kind of foreign fields to the Builder.) This is the case I was talking about in my introduction. They should, in my mind, be set to a default value (if it exists) or left uninitialized to raise an exception if you try to create an incomplete object.

Second part of my answer: use a Creational pattern to ensure the object is correctly initialized.
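Here is one way the Builder idea could be made fully runnable. It is a sketch under assumptions, not the only way to do it: the _MISSING sentinel, the explicit check in build() and the frozen dataclass are my choices for illustration, and the example keeps just the two fields df and results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MyClass:
    # frozen=True makes instances read-only after construction
    df: object
    results: object


class MyClassBuilder:
    _MISSING = object()  # sentinel meaning "not provided yet"

    def __init__(self, df):
        self._df = df
        self._results = self._MISSING

    def results(self, df_results):
        self._results = df_results
        return self  # for fluent style

    def build(self):
        # Fail at creation time rather than at first use.
        if self._results is self._MISSING:
            raise ValueError("results must be set before build()")
        return MyClass(self._df, self._results)


x = MyClassBuilder("df").results("r").build()
print(x.results)
```

With this variant every attribute of the builder is assigned in __init__, and an incomplete build() still fails immediately, now with an explicit message.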

Side note: I'm very suspicious when I see a class with getters and setters. My rule of thumb is: always try to separate them because when they meet, objects become unstable.

Treadwell answered 23/4, 2019 at 14:56 Comment(7)

Thanks @jferard, a really helpful run through. On your final side note, why don't you like a class with both getters and setters? I thought that was how most people applied them. How do you separate them? – Windpollinated 10/9, 2020 at 9:35

@Windpollinated I guess it was because of this remark that this answer was downvoted, hence I will try to make it clear. The idea is that it is easier to understand (and test) a program when most of the objects are immutable. If you have getters and setters, objects are basically mutable, and their current state is often uncertain (it is worse if your program is concurrent). – Treadwell 10/9, 2020 at 19:52

Sometimes, you really need mutable objects, but most of the time, you need the setters to initialize the object and then the getters to use the object. In this case, a creational pattern will isolate the setters (in a builder for instance) from the getters and the created object will be immutable, as in the given example. This removes the risk of late initialization or unwanted mutation of the object and makes the tests easy. – Treadwell 10/9, 2020 at 19:52

Thanks @Treadwell for the follow up. I need to reflect on this a bit longer. I thought one of the core powers of OOP is modifying the attributes of instantiated objects to achieve the objective of the program i.e. that they are mutable. I understand that debugging is easier if your objects are immutable, but then surely your coding style is becoming more similar to a functional language? Please excuse my ignorance if my comment here is very far off the mark! – Windpollinated 12/9, 2020 at 11:19

@Windpollinated Of course, mutable objects are very useful. If you have a BankAccount instance, you want the deposit/withdrawal methods to change the balance. In this case, you will have a getter, but no setter, for balance. When you have getters and setters, it's very often because you are dealing with a simple data structure dressed up like a class. I really prefer to handle immutable data structures, that's why I'm cautious with classes that have getters and setters. – Treadwell 12/9, 2020 at 18:40

@Treadwell PyCharm gives the following warning for MyClassBuilder: Instance attribute _results defined outside __init__ . What are your thoughts on that? – Ration 6/8, 2021 at 13:48

@Ration PyCharm is right. Note that I wrote as a comment # give a default value to other fields if possible. I should have used a stronger wording. If you do not give a default value to all the fields, you get this warning because the object may be uninitialized. MyClassBuilder().build() should return a valid object (as a default constructor would do). See my edit. – Treadwell 17/8, 2021 at 7:51

To understand the importance (or not) of initializing attributes in __init__, let's take a modified version of your class MyClass as an example. The purpose of the class is to compute the grade for a subject, given the student name and score. You may follow along in a Python interpreter.

>>> class MyClass:
...     def __init__(self,name,score):
...         self.name = name
...         self.score = score
...         self.grade = None
...
...     def results(self, subject=None):
...         if self.score >= 70:
...             self.grade = 'A'
...         elif 50 <= self.score < 70:
...             self.grade = 'B'
...         else:
...             self.grade = 'C'
...         return self.grade

This class requires two positional arguments name and score. These arguments must be provided to initialize a class instance. Without these, the class object x cannot be instantiated and a TypeError will be raised:

>>> x = MyClass()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() missing 2 required positional arguments: 'name' and 'score'

At this point, we understand that we must provide the name of the student and a score for a subject as a minimum, but the grade is not important right now because that will be computed later on, in the results method. So, we just use self.grade = None and don't define it as a positional arg. Let's initialize a class instance(object):

>>> x = MyClass(name='John', score=70)
>>> x
<__main__.MyClass object at 0x000002491F0AE898>

The <__main__.MyClass object at 0x000002491F0AE898> confirms that the class object x was successfully created at the given memory location. Now, Python provides some useful built-in ways to view the attributes of the created object. One of them is the __dict__ attribute:

>>> x.__dict__
{'name': 'John', 'score': 70, 'grade': None}

This clearly gives a dict view of all the initial attributes and their values. Notice, that grade has a None value as assigned in __init__.

Let's take a moment to understand what __init__ does. There are many answers and online resources available to explain what this method does but I'll summarize:

Like __init__, Python has another built-in method called __new__(). When you create a class object like this x = MyClass(name='John', score=70), Python internally calls __new__() first to create a new instance of the class MyClass and then calls __init__ to initialize the attributes name and score. Of course, in these internal calls when Python does not find the values for the required positional args, it raises an error as we've seen above. In other words, __init__ initializes the attributes. You can assign new initial values for name and score like this:

>>> x.__init__(name='Tim', score=50)
>>> x.__dict__
{'name': 'Tim', 'score': 50, 'grade': None}
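If you want to see the order of those internal calls for yourself, here is a small sketch. The Tracer class and the calls list are hypothetical names introduced only for this demonstration; they are not part of the example class above.

```python
calls = []  # records which special method ran, in order


class Tracer:
    def __new__(cls, *args, **kwargs):
        calls.append("__new__")
        # object.__new__ only needs the class; the arguments go to __init__
        return super().__new__(cls)

    def __init__(self, name):
        calls.append("__init__")
        self.name = name


t = Tracer("John")
print(calls)
print(t.name)
```

The list shows __new__ being recorded before __init__, which is the sequence described above.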

It is also possible to access individual attributes like below. x.grade displays nothing because its value is None, which the interactive interpreter does not echo.

>>> x.name
'Tim'
>>> x.score
50
>>> x.grade
>>>

In the results method, you will notice that subject is a keyword parameter with a default value of None. It is local to this method only. For the purposes of demonstration, I define subject as a parameter of this method, but it could have been initialized in __init__ too. But what if I try to access it with my object:

>>> x.subject
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'MyClass' object has no attribute 'subject'

Python raises an AttributeError when it cannot locate an attribute on the instance or its class. If you do not initialize attributes in __init__, you may encounter this error when you access a name that exists only as a local variable inside one of the class's methods. In this example, defining subject inside __init__ would have avoided the confusion, and it would have been perfectly normal to do so, as it is not required for any computation either.

Now, let's call results and see what we get:

>>> x.results()
'B'
>>> x.__dict__
{'name': 'Tim', 'score': 50, 'grade': 'B'}

This prints the grade for the score and notice when we view the attributes, the grade has also been updated. Right from the start, we had a clear view of the initial attributes and how their values have changed.

But what about subject? If I want to know how much Tim scored in Math and what the grade was, I can easily access the score and the grade as we've seen before, but how do I know the subject? Since the subject variable is local to the results method, we could just return the value of subject. Change the return statement in the results method:

def results(self, subject=None):
    #<---code--->
    return self.grade, subject

Let's call results() again. We get a tuple with the grade and subject as expected.

>>> x.results(subject='Math')
('B', 'Math')

To access the values in the tuple, let's assign them to variables. In Python, it is possible to assign values from a collection to multiple variables in the same expression, provided that the number of variables is equal to the length of the collection. Here, the length is just two, so we can have two variables to the left of the expression:

>>> grade, subject = x.results(subject='Math')
>>> subject
'Math'

So, there we have it, though it needed a few extra lines of code to get the subject. It would be more intuitive to access all of them at once using just the dot operator to access the attributes with x.<attribute>, but this is just an example and you could try it with subject initialized in __init__.
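For comparison, this is roughly what the class could look like with subject initialized in __init__. Treat it as a sketch: moving subject into the constructor as an optional keyword argument is my assumption about how you would want to pass it in.

```python
class MyClass:
    def __init__(self, name, score, subject=None):
        self.name = name
        self.score = score
        self.subject = subject  # now an instance attribute
        self.grade = None       # computed later by results()

    def results(self):
        if self.score >= 70:
            self.grade = 'A'
        elif 50 <= self.score < 70:
            self.grade = 'B'
        else:
            self.grade = 'C'
        return self.grade


x = MyClass(name='Tim', score=50, subject='Math')
x.results()
print(x.subject, x.grade)
print(x.__dict__)
```

Everything, including the subject, is now reachable through the dot operator and shows up in __dict__.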

Next, consider that there are many students (say 3) and we want the names, scores and grades for Math. Except for the subject, all the others must be some sort of collection data type, like a list, that can store all the names, scores and grades. We could just initialize like this:

>>> x = MyClass(name=['John', 'Tom', 'Sean'], score=[70, 55, 40])
>>> x.name
['John', 'Tom', 'Sean']
>>> x.score
[70, 55, 40]

This seems fine at first sight, but when you (or some other programmer) take another look at the initialization of name, score and grade in __init__, there is no way to tell that they need a collection data type. The variables also have singular names, which suggests that each holds just one value. The purpose of programmers should be to make the intent as clear as possible, by way of descriptive variable naming, type declarations, code comments and so on. With this in mind, let's change the attribute declarations in __init__. Before we settle for a well-behaved, well-defined declaration, we must take care of how we declare default arguments.

Edit: Problems with mutable default arguments:

Now, there are some 'gotchas' that we must be aware of while declaring default args. Consider the following declaration that initializes names and appends a random name on object creation. Recall that lists are mutable objects in Python.

#Not recommended
class MyClass:
    def __init__(self,names=[]):
        self.names = names
        self.names.append('Random_name')

Let's see what happens when we create objects from this class:

>>> x = MyClass()
>>> x.names
['Random_name']
>>> y = MyClass()
>>> y.names
['Random_name', 'Random_name']

The list continues to grow with every new object creation. The reason is that default values are evaluated only once, when the function is defined, not each time __init__ is called. Every call that omits names therefore receives the same list object, and each append adds to it. You can verify this yourself, as the id remains the same for every object creation.

>>> id(x.names)
2513077313800
>>> id(y.names)
2513077313800
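Another way to see the shared default is to look at the function's __defaults__ attribute, which holds the default values created at definition time. The append_name function below is a hypothetical stand-alone example, used so the effect is visible without a class.

```python
def append_name(names=[]):  # not recommended: mutable default
    names.append('Random_name')
    return names


first = append_name()
second = append_name()

print(first is second)           # True: both calls returned the same list
print(append_name.__defaults__)  # the stored default has grown with each call
```

The tuple in __defaults__ contains the very list that both calls appended to.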

So, what is the correct way of defining default args while also being explicit about the data type the attribute supports? The safest option is to set default args to None and initialize to an empty list when the arg values are None. The following is a recommended way to declare default args:

#Recommended
>>> class MyClass:
...     def __init__(self,names=None):
...         self.names = names if names else []
...         self.names.append('Random_name')

Let's examine the behavior:

>>> x = MyClass()
>>> x.names
['Random_name']
>>> y = MyClass()
>>> y.names
['Random_name']

Now, this behavior is what we are looking for. The object does not "carry over" old baggage and re-initializes to an empty list whenever no values are passed to names. If we pass some valid names (as a list of course) to the names arg for the y object, Random_name will simply be appended to this list. And again, the x object values will not be affected:

>>> y = MyClass(names=['Viky','Sam'])
>>> y.names
['Viky', 'Sam', 'Random_name']
>>> x.names
['Random_name']
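One caveat with the version above: it stores the caller's own list, so the append also changes the list that was passed in, and the truthiness test treats an empty list the same as None. A possible variant (my suggestion, not a rule) checks for None explicitly and copies the argument:

```python
class MyClass:
    def __init__(self, names=None):
        # Copy the input so the caller's list is never modified,
        # and test against None rather than relying on truthiness.
        self.names = list(names) if names is not None else []
        self.names.append('Random_name')


shared = ['Viky', 'Sam']
y = MyClass(names=shared)
print(y.names)
print(shared)  # unchanged
```

Whether you want a copy or a reference depends on the problem; the copy simply avoids surprises when the same list is reused elsewhere.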

Perhaps the simplest explanation of this concept can be found on the Effbot website. If you'd like to read some excellent answers, see: “Least Astonishment” and the Mutable Default Argument.

Based on the brief discussion on default args, our class declarations will be modified to:

class MyClass:
    def __init__(self,names=None, scores=None):
        self.names = names if names else []
        self.scores = scores if scores else []
        self.grades = []
#<---code------>

This makes more sense: all variables have plural names and are initialized to empty lists on object creation. We get similar results as before:

>>> x.names
['John', 'Tom', 'Sean']
>>> x.grades
[]

grades is an empty list, making it clear that the grades will be computed for multiple students when results() is called. Therefore, our results method should also be modified. The comparisons should now be made between the score thresholds (70, 50 etc.) and the items in the self.scores list, and as they are made, the self.grades list should be updated with the individual grades. Change the results method to:

def results(self, subject=None):
    #Grade calculator 
    for i in self.scores:
        if i >= 70:
            self.grades.append('A')
        elif 50 <= i < 70:
            self.grades.append('B')
        else:
            self.grades.append('C')
    return self.grades, subject

We should now get the grades as a list when we call results():

>>> x.results(subject='Math')
(['A', 'B', 'C'], 'Math')
>>> x.grades
['A', 'B', 'C']
>>> x.names
['John', 'Tom', 'Sean']
>>> x.scores
[70, 55, 40]

This looks good, but imagine the lists were large: figuring out whose score/grade belongs to whom would be an absolute nightmare. This is where it is important to initialize the attributes with the correct data type that can store all of these items in a way that they are easily accessible as well as clearly show their relationships. The best choice here is a dictionary.

We can have a dictionary with names and scores defined initially and the results function should put together everything into a new dictionary that has all the scores, grades etc. We should also comment the code properly and explicitly define args in the method wherever possible. Lastly, we may not require self.grades anymore in __init__ because as you will see the grades are not being appended to a list but explicitly assigned. This is totally dependent upon the requirements of the problem.

The final code:

class MyClass:
    """A class that computes the final results for students"""

    def __init__(self, names_scores=None):
        """initialize student names and scores

        :param names_scores: accepts key/value pairs of names/scores
                             E.g.: {'John': 70}"""
        self.names_scores = names_scores if names_scores else {}

    def results(self, _final_results=None, subject=None):
        """Assign grades and collect final results into a dictionary.

        :param _final_results: an internal arg that will store the final results as dict.
                               This is just to give a meaningful variable name for the final results."""
        # Default to None and create the dict here, so calls never share one mutable default
        self._final_results = _final_results if _final_results is not None else {}
        for key, value in self.names_scores.items():
            if value >= 70:
                grade = 'A'
            elif 50 <= value < 70:
                grade = 'B'
            else:
                grade = 'C'
            # names_scores keeps the plain scores, so results() can safely be called again
            self._final_results[key] = [value, subject, grade]
        return self._final_results

Please note _final_results is just an internal arg that holds the dictionary of final results. The purpose is to return a more meaningful variable from the function that clearly informs the intent. The _ at the beginning of this variable indicates that it is an internal variable, as per convention.

Let's give this a final run:

>>> x = MyClass(names_scores={'John':70, 'Tom':50, 'Sean':40})
>>> x.results(subject='Math')
{'John': [70, 'Math', 'A'],
 'Tom': [50, 'Math', 'B'],
 'Sean': [40, 'Math', 'C']}

This gives a much clearer view of the results for each student. It is now easy to access the grades/scores for any student:

>>> y = x.results(subject='Math')
>>> y['John']
[70, 'Math', 'A']

Conclusion:

While the final code needed some extra hard work, it was worth it. The output is more precise and gives clear information about each student's results. The code is more readable and clearly informs the reader about the intent of creating the class, methods, & variables. The following are the key takeaways from this discussion:

The variables (attributes) that are expected to be shared amongst class methods should be defined in __init__. In our example, names, scores and possibly subject were required by results(). These attributes could be shared by another method, say average, that computes the average of the scores.
The attributes should be initialized with the appropriate data type. This should be decided beforehand, before venturing into a class-based design for a problem.
Care must be taken while declaring attributes with default args. A mutable default arg is shared between calls, so if __init__ mutates the attribute, the change shows up in every later instance that relies on the default. It is safest to declare default args as None and initialize to an empty mutable collection inside __init__ whenever the value is None.
The attribute names should be unambiguous and follow PEP 8 guidelines.
Some variables should be initialized within the scope of the class method only. These could be, for example, internal variables that are required for computations or variables that don't need to be shared with other methods.
Another compelling reason to define attributes in __init__ is to avoid possible AttributeErrors that may occur due to accessing attributes that were never set. The __dict__ attribute provides a view of the attributes initialized here.
While assigning values to attributes (positional args) on class instantiation, the argument names should be passed explicitly. For instance:
```
x = MyClass('John', 70)  #not explicit
x = MyClass(name='John', score=70) #explicit
```
Finally, the aim should be to communicate the intent as clearly as possible with comments. The class, its methods and attributes should be well commented. For all attributes, a short description along with an example is quite useful for a new programmer who encounters your class and its attributes for the first time.
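To illustrate the first takeaway, here is a sketch of the average method mentioned above. It is hypothetical (the answer never defines it), and raising ValueError for an empty dictionary is my assumption about sensible behaviour. Because names_scores is set in __init__, the method can rely on it being present.

```python
from statistics import mean


class MyClass:
    """A class that computes the final results for students"""

    def __init__(self, names_scores=None):
        self.names_scores = names_scores if names_scores else {}

    def average(self):
        """Return the average of all scores (hypothetical extra method)."""
        if not self.names_scores:
            raise ValueError("no scores available")
        return mean(self.names_scores.values())


x = MyClass(names_scores={'John': 70, 'Tom': 50, 'Sean': 60})
print(x.average())
```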

Cuckoo answered 22/4, 2019 at 19:47 Comment(6)

This is a thorough write-up, but I can't upvote it because you're encouraging the use of mutable default arguments without explaining how problematic they are. – Sirreverence 23/4, 2019 at 9:11

Daniel could you elaborate a little on what you mean by 'encouraging the use of mutable default arguments'? – Windpollinated 23/4, 2019 at 9:43

@DanielPryden, thanks for pointing this out. I'll update the answer soon. This is one of the 'gotchas' in Python that I've begun to understand now. – Cuckoo 23/4, 2019 at 10:24

@DanielPryden, I've just updated the answer with some useful information on the problems with mutable default arguments and also edited the code accordingly. Please do let me know if the answer can be improved in any way. – Cuckoo 24/4, 2019 at 20:52

If you use from pystrict import strict \n @strict \n class Firebird: ..., then it will be a runtime error to create attrs outside of init. – Quinsy 22/11, 2019 at 14:26

"This class requires two positional arguments name and score." - is it really needed information here? Why don't you remove those several paragraphs. It would definitely shorten the answer (and probably in other places too). – Agatha 29/3 at 15:59

It's good practice to set sane default values in most applications (this solves errors with possible missing values) - so you only have to worry about data validation.

In Python 3.7+ you can use dataclasses to set default values. Python generates the special methods (such as __init__) under the hood, so the class is easy to read.
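As a small illustration (the Student class and its fields are made-up names, not taken from my app), a dataclass declares every attribute with its default in one place. Mutable defaults go through field(default_factory=...), so each instance gets its own list.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Student:
    name: str
    score: int
    grade: Optional[str] = None                   # filled in later
    subjects: list = field(default_factory=list)  # fresh list per instance


a = Student("John", 70)
b = Student("Tom", 55)
a.subjects.append("Math")

print(a)
print(b)  # b.subjects is still empty
```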

It's also good practice to write & comment your code so it can be easily followed by others.

In an app which reads user config from yaml I used a variation of this answer to solve possible missing configuration values:

from dataclasses import dataclass


class Settings():

    def __init__(self):
        """ read values from the 'Default' dataclass &
            subsequently overwrite with values from YAML.
        """
        # set default values
        self.set_defaults()

        # overwrite defaults with values from yaml
        config = self.get_config()

        # read a dict into class attributes
        for key, value in config.items():
            setattr(self, key, value)


    def set_defaults(self):
        """ sets default application values from dataclass
        """
        for name, field in self.Default.__dataclass_fields__.items():
            setattr(self, name, field.default)


    # nested class with default values
    # dataclasses require python 3.7
    @dataclass
    class Default:
        """ Stores default values for the app.
            Called by main class: 'Settings'
        """
        cache_dir: bool = False
        cleanup: bool = True
        # ..... (further defaults elided)


    def get_config(self):
        """ read config file """
        ...

In the final code I also made the main class a singleton as only one copy of the object needs to exist to store configuration settings. Credit to this answer for inspiration.

Fortran answered 5/7, 2022 at 20:9 Comment(2)

As I understand from your link, the purpose of dataclass is to create an init function for your object. In your variant, you already have an init method, so probably you don't need a separate dataclass here. – Agatha 29/3 at 15:56

@YaroslavNikitenko - In __init__() I set default values from the dataclass & then override them with the users settings - github.com/itoffshore/distrobuilder-menu/blob/main/src/… - to ensure the app has no missing values set. It seems to work quite well (no bugs) – Fortran 29/3 at 16:52
