Non-member vs member functions in Python
L

6

19

I'm relatively new to Python and struggling to reconcile features of the language with habits I've picked up from my background in C++ and Java.

The latest issue I'm having has to do with encapsulation, specifically an idea best summed up by Item 23 of Scott Meyers' "Effective C++":

Prefer non-member non-friend functions to member functions.

Ignoring the lack of a friend mechanism for a moment, are non-member functions considered preferable to member functions in Python, too?

An obligatory, asinine example:

class Vector(object):
    def __init__(self, dX, dY):
        self.dX = dX
        self.dY = dY

    def __str__(self):
        return "->(" + str(self.dX) + ", " + str(self.dY) + ")"

    def scale(self, scalar):
        self.dX *= scalar
        self.dY *= scalar

def scale(vector, scalar):
    vector.dX *= scalar
    vector.dY *= scalar

Given v = Vector(10, 20), we can now either call v.scale(2) or scale(v, 2) to double the magnitude of the vector.

Considering that we're working with plain public attributes in this case, which of the two options, if any, is better, and why?

Louanneloucks answered 9/4, 2012 at 10:58 Comment(1)
I feel that this simply isn't true in Python. The arguments don't really sit with Python where you can modify classes so easily. Python also focuses on readability, and I feel that v.scale(2) is so much clearer than scale(v, 2). If you look in the standard library, all but the most general functions are kept as members rather than builtins.Demilitarize
P
18

Interesting question.

You're starting from a different place than most questions coming from Java programmers, which tend to assume that you need classes when you mostly don't. Generally, in Python there's no point in having classes unless you're specifically doing data encapsulation.

Of course, here in your example you are actually doing that, so the use of classes is justified. Personally, I'd say that since you do have a class, then the member function is the best way to go: you're specifically doing an operation on that particular vector instance, so it makes sense for the function to be a method on Vector.

Where you might want to make it a standalone function (we don't really use the word "member" or "non-member") is if you need to make it work with multiple classes which don't necessarily inherit from each other or a common base. Thanks to duck-typing, it's fairly common practice to do this: specify that your function expects an object with a particular set of attributes or methods, and do something with those.
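As a minimal sketch of that duck-typing case (the Velocity class is hypothetical and only stands in for "some unrelated class with the same attributes"):

```python
# Two unrelated classes: neither inherits from the other or from a shared base.
class Vector:
    def __init__(self, dX, dY):
        self.dX = dX
        self.dY = dY


class Velocity:  # hypothetical class, not part of the question
    def __init__(self, dX, dY):
        self.dX = dX
        self.dY = dY


def scale(obj, scalar):
    """Scale any object that exposes numeric dX and dY attributes."""
    obj.dX *= scalar
    obj.dY *= scalar


v = Vector(10, 20)
w = Velocity(1.5, -2.0)
scale(v, 2)
scale(w, 2)
print(v.dX, v.dY)  # 20 40
print(w.dX, w.dY)  # 3.0 -4.0
```

The function documents its expectation (dX and dY attributes) rather than a concrete type, which is what makes it reusable across both classes.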

Pointtopoint answered 9/4, 2012 at 11:5 Comment(3)
While we're here, a somewhat related note - a difference from C++ is that Python has no concept of public or private. The _underscore syntax is the conventional way to provide a hint that something is private, but it's not enforced.Bently
I don't agree with this answer. It tries to justify the method solution without even trying to identify why Scott Meyers' book recommends against it.Ruthenious
I read Scott Meyers' recommendation again. He recommends not using methods in this situation. But this answer says the method is "justified", "the best way" and that "it makes sense". This is the opposite of Scott Meyers' recommendation, and no reason is given. Scott Meyers makes this recommendation because most developers have a mistaken intuition that leads them to get this decision wrong. It is exactly this sort of answer that prompted the recommendation.Ruthenious
M
4

Look at your own example - the non-member function has to access the data members of the Vector class. This is not a win for encapsulation. This is especially so as it changes data members of the object passed in. In this case, it might be better to return a scaled vector, and leave the original unchanged.
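A rough sketch of the non-mutating variant suggested here; the method name scaled is my own choice for illustration, not something from the question:

```python
class Vector:
    def __init__(self, dX, dY):
        self.dX = dX
        self.dY = dY

    def scaled(self, scalar):
        """Return a new, scaled Vector; the receiver is left unchanged."""
        return Vector(self.dX * scalar, self.dY * scalar)


v = Vector(10, 20)
w = v.scaled(2)
print(v.dX, v.dY)  # 10 20 (original untouched)
print(w.dX, w.dY)  # 20 40
```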

Additionally, you will not realise any benefits of class polymorphism using the non-member function. For example, in this case, it can still only cope with vectors of two components. It would be better if it made use of a vector multiplication capability, or used a method to iterate over the components.

In summary:

  1. Use member functions to operate on objects of classes you control.
  2. Use non-member functions to perform purely generic operations which are implemented in terms of methods and operators which are themselves polymorphic.
  3. It is probably better to keep object mutation in methods.
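
To illustrate points 1 and 2 together, here is a small sketch under my own assumptions: the Vector below stores an arbitrary number of components (unlike the one in the question), and the free function relies only on the * operator.

```python
class Vector:
    def __init__(self, *components):
        self.components = tuple(components)

    def __mul__(self, scalar):
        # The class owns the knowledge of how its components are multiplied.
        return Vector(*(c * scalar for c in self.components))


def scale(value, factor):
    """Generic helper: uses only the * operator, never attribute names."""
    return value * factor


print(scale(Vector(1, 2, 3), 2).components)  # (2, 4, 6)
print(scale(2.5, 2))                         # 5.0
```

Because scale never touches components directly, it works for any type that defines multiplication, regardless of dimension.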
Middlebrooks answered 9/4, 2012 at 11:4 Comment(5)
The members which are accessed (x and y component) have to be public anyway for the object to be useful. Considering vectors with different dimensions, you have a point about polymorphism though.Parch
@delnan It's not that I consider the standalone function so much a violation of encapsulation, rather that it very obviously does not improve encapsulation in any way. I do disapprove of its lack of genericity - it can only cope with vectors of two elements. I also kind of disapprove of the fact that it changes the vector's data.Middlebrooks
@Middlebrooks - The names of the objects in the example are irrelevant, and indeed, in my defence, it was labelled as being asinine. The question asks for comments on the relative merits of member/non-member functions and not for suggestions on how to make arbitrary examples more generic.Louanneloucks
@Louanneloucks I have not commented on your choice of names. Please identify which parts of this answer you think are not on-point: every part relates to the choice of methods vs general functions.Middlebrooks
@Middlebrooks - No parts of the answer are off-topic - your comment of "lack of genericity" was what I was referring to. And when I say the names are not important, I mean I could have equally called the class "Apple" and my question would still stand. Thanks for your answer, though.Louanneloucks
F
4

A free function gives you the flexibility to use duck-typing for that first parameter as well.

A member function gives you the expressiveness of associating the functionality with the class.

Choose accordingly. Generally, functions are created equal, so they should all have the same assumptions about the interface of a class. Once you publish a free function scale, you are effectively advertising that .dX and .dY are part of the public interface of Vector. That is probably not what you want. You are doing this in exchange for the ability to reuse the same function with other objects that have a .dX and .dY. That is probably not going to be valuable to you. So in this case I would certainly prefer the member function.

For good examples of preferring a free function, we need look no further than the standard library: sorted is a free function, and not a member function of list, because conceptually you ought to be able to create the list that results from sorting any iterable sequence.
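A short illustration of that difference, using only built-ins:

```python
numbers = (3, 1, 2)                        # a tuple has no .sort() method
print(sorted(numbers))                     # [1, 2, 3]
print(sorted({"b": 1, "a": 2}))            # ['a', 'b'] (iterating a dict yields its keys)
print(sorted(n * n for n in (3, -2, 1)))   # [1, 4, 9] (works on a generator too)

items = [3, 1, 2]
result = items.sort()                      # the list method sorts in place...
print(items, result)                       # [1, 2, 3] None (...and returns None)
```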

Florencia answered 9/4, 2012 at 12:0 Comment(1)
I don't agree with this answer. It is not true that a free function reveals the internals of the class. Why, then, does Scott Meyers recommend not using the method solution?Ruthenious
V
3

Prefer non-member non-friend functions to member functions

This is a design philosophy that can and should be extended to all object-oriented programming languages. Once you understand the essence of it, the concept is clear.

If a function can do without private/protected access to the members of a class, your design has no reason to make that function a member of the class. To think of it the other way around: when designing a class, after you have enumerated all its properties, you need to determine the minimal set of behaviors that is sufficient to make the class. Any function that you can write using only the available public methods/member functions should be made a non-member function.

How much is this applicable in Python

To some extent, if you are careful. Python supports weaker encapsulation than other OOP languages (like Java/C++), notably because there are no truly private members. (By convention, a programmer marks an attribute as internal by prefixing its name with a single '_'. Prefixing it with two underscores, as in '__name', additionally triggers name mangling, which makes the attribute class-private in a loose sense.) So if we adopt Scott Meyers' advice literally, we must bear in mind that there is only a thin line between what should be accessed from within the class and what should be accessed from outside. It is best left to the designer/programmer to decide whether a function should be an integral part of the class or not. One design principle we can easily adopt: "Unless your function needs to access any of the properties of the class, you can make it a non-member function."
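
As a small sketch of the two underscore conventions (the attribute names are only for illustration): a single leading underscore is a hint to readers, while two leading underscores trigger name mangling inside the class body.

```python
class Vector:
    def __init__(self, dX, dY):
        self._dX = dX    # single underscore: "internal" by convention only
        self.__dY = dY   # double underscore: stored as _Vector__dY


v = Vector(10, 20)
print(v._dX)                 # 10 (nothing stops outside access)
print(v._Vector__dY)         # 20 (the mangled name is still reachable)
print(hasattr(v, "__dY"))    # False (no attribute exists under the unmangled name)
```

Neither form is enforced privacy; mangling mainly prevents accidental name clashes in subclasses.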

Vola answered 9/4, 2012 at 12:21 Comment(0)
A
2

As scale relies on member-wise multiplication of a vector, I would consider implementing multiplication as a method and defining scale to be more general:

class Vector(object):
    def __init__(self, dX, dY):
        self._dX = dX
        self._dY = dY

    def __str__(self):
        return "->(" + str(self._dX) + ", " + str(self._dY) + ")"

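    def __imul__(self, other):
        # isinstance() checks the type of `other`; `other is Vector` would
        # only be true if the Vector class itself were passed in.
        if isinstance(other, Vector):
            # Component-wise multiplication by another vector.
            self._dX *= other._dX
            self._dY *= other._dY
        else:
            # Multiplication by a scalar.
            self._dX *= other
            self._dY *= other

        # In-place operators must return the object to rebind the name to.
        return self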

def scale(vector, scalar):
    vector *= scalar

Thus, the class interface is rich and streamlined while encapsulation is maintained.

Arethaarethusa answered 7/5, 2014 at 18:59 Comment(3)
I think this is definitely an interesting approach to my specific example.Louanneloucks
Why would you define such a scale function if you already have the operator that you can use?Ruthenious
@Ruthenious really just for completeness, and to show application of the __imul__ function, though frankly, I think I prefer @marcin's answer anyway!Arethaarethusa
R
0

One point has not yet been made. Assume you have a class Vector and a class Matrix, and both should be scalable.

With non-member functions, you have to define two scale functions: one for Vector and one for Matrix. But Python is dynamically typed and does not overload functions by argument type. That means you must give the two functions different names, like scale_vector and scale_matrix, and you always have to call one of them explicitly. If you name them identically, the later definition simply replaces the earlier one, so Python cannot pick between them.
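
A quick sketch of what happens when both functions share a name (the bodies are placeholders that only report which version ran):

```python
def scale(vector, scalar):
    return ("vector version", scalar)


def scale(matrix, scalar):  # same name: this def rebinds `scale`
    return ("matrix version", scalar)


# Only the second definition survives, whatever argument is passed.
print(scale(object(), 2))  # ('matrix version', 2)
```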

This is a significant difference between the two options. If you know in advance (statically) what types you have, the solution with different names will work. But if you don't know this in advance, methods are the way the core language provides 'dynamic dispatching'.

Often 'dynamic dispatching' is exactly what you want, in order to achieve dynamic behavior and a desired level of abstraction. Interestingly, classes and objects are the mechanism through which object-oriented languages (Python included) provide 'dynamic dispatching' in the core language. And here Scott Meyers' recommendation also fits Python. It can be read as: use methods only when you really need dynamic dispatching. This is not a question of which syntax you like more.
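
A minimal sketch of method-based dispatch, assuming a hypothetical Matrix class that stores its values as a list of rows:

```python
class Vector:
    def __init__(self, dX, dY):
        self.dX, self.dY = dX, dY

    def scale(self, scalar):
        self.dX *= scalar
        self.dY *= scalar


class Matrix:  # hypothetical; not defined anywhere in the question
    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def scale(self, scalar):
        self.rows = [[x * scalar for x in row] for row in self.rows]


things = [Vector(1, 2), Matrix([[1, 2], [3, 4]])]
for thing in things:
    # The object's class decides at run time which scale() runs.
    thing.scale(10)

print(things[0].dX, things[0].dY)  # 10 20
print(things[1].rows)              # [[10, 20], [30, 40]]
```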

Although you need methods often (or feel like you need them often), this answer is not a recommendation for the method approach. If you don't plan to use things like inheritance, polymorphism or operators, the non-member solution is better: it can be implemented without mutation (in case you prefer that style), it is easier to test, it has lower complexity (as it does not bring all the object-oriented mechanisms into play), it is easier to read, and it is more flexible (as it can be used for anything that has two components dx and dy).

However, as a disclaimer, I must say that general discussions about encapsulation (e.g. private data) or static typing are not always helpful when working with Python. It is a design decision that these concepts do not exist in the language. It is also a decision of the language designers that you don't have to define everything in advance. Bringing such concepts into your Python code is sometimes seen as controversial, and leads to code that people might call unpythonic. Not everybody likes the object-oriented programming style, and it is not always appropriate. Python gives you the freedom to use object-oriented programming only when you want to. The lack of a non-object-oriented dispatch mechanism in the core language does not mean that you have no alternatives.
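
One such alternative, sketched here under my own assumptions, is functools.singledispatch from the standard library, which selects an implementation based on the type of the first argument. The Matrix class and the helper names _scale_vector and _scale_matrix are hypothetical.

```python
from functools import singledispatch


class Vector:
    def __init__(self, dX, dY):
        self.dX, self.dY = dX, dY


class Matrix:  # hypothetical; stores values as a list of rows
    def __init__(self, rows):
        self.rows = [list(row) for row in rows]


@singledispatch
def scale(obj, scalar):
    # Fallback for types without a registered implementation.
    raise TypeError(f"cannot scale {type(obj).__name__}")


@scale.register(Vector)
def _scale_vector(vector, scalar):
    return Vector(vector.dX * scalar, vector.dY * scalar)


@scale.register(Matrix)
def _scale_matrix(matrix, scalar):
    return Matrix([[x * scalar for x in row] for row in matrix.rows])


v = scale(Vector(1, 2), 3)
m = scale(Matrix([[1, 2], [3, 4]]), 3)
print(v.dX, v.dY)  # 3 6
print(m.rows)      # [[3, 6], [9, 12]]
```

Callers use the single name scale, while each type's implementation stays outside the class.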

Ruthenious answered 8/6, 2023 at 22:16 Comment(3)
Your point is valid about scale needing to be renamed and it's because scale is so vague of a name but its true purpose is quite specific and only applies to a 2d vector. This, in my opinion, is a huge problem with stand-alone functions—it's rare for a function to provide such general utility that a stand-alone function is appropriate. If you have to pass class data to a function, I think it's most likely that such a function is going to be specifically tuned to that class and a stand-alone, generic function isn't appropriate.Ingram
Yes, of course. Having two functions scale_vector and scale_matrix is normally not what you want. In extreme cases you would end up with hundreds of function names for all combinations of arguments. I did not want to value this in my answer, because it is up to the reader to make this decision. Defining two standalone functions is better in general. True! But only when the programming language is able to pick the right one (by looking at the type signature). This is not how Python works.Ruthenious
You said "generic function isn't appropriate". I fully agree. This is the problem with the other answers. Of course it would be nice to have a generic duck-type-y perfect function. But in practice you normally have concrete problems and the effort to find generic mechanisms often does not pay off. So in practice you need something that just works without much thinking and speculating what theoretically could happen in the future.Ruthenious
