Is there any case where len(someObj) does not call someObj's __len__ function?

5

15

Is there any case where len(someObj) does not call someObj's __len__ function?

I recently replaced the former with the latter in a (successful) effort to speed up some code. I want to make sure there isn't some edge case where len(someObj) is not the same as someObj.__len__().

Holdup answered 30/1, 2009 at 15:58 Comment(0)
18

If __len__ returns a length over sys.maxsize, len() will raise an exception. This isn't true of calling __len__ directly. (In fact, you could return any object at all from __len__, and nothing would catch it unless the call goes through len().)
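To make that concrete, here is a minimal sketch. The classes Huge and NotAnInt and the helper len_error are made up for illustration, and the exception types noted in the comments are what CPython 3 raises; other implementations may differ.

```python
import sys

class Huge:
    """Hypothetical class whose __len__ reports more than sys.maxsize."""
    def __len__(self):
        return sys.maxsize + 1

class NotAnInt:
    """Hypothetical class whose __len__ returns a non-integer."""
    def __len__(self):
        return "three"

# Direct calls hand back whatever the method returned, unchecked.
direct_huge = Huge().__len__()
direct_str = NotAnInt().__len__()

def len_error(obj):
    """Return the name of the exception len(obj) raises, or None."""
    try:
        len(obj)
    except Exception as exc:
        return type(exc).__name__
    return None

huge_error = len_error(Huge())      # OverflowError on CPython
str_error = len_error(NotAnInt())   # TypeError
```

So the direct call quietly returns a value that len() would have rejected.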

Loveridge answered 30/1, 2009 at 20:21 Comment(4)
Note that, since len is supposed to return the number of elements in a collection, returning something bigger than sys.maxsize is almost certainly nonsense. - Mendez
@Mike In theory you could have an object like Python 3's range that doesn't store all its elements in memory and calculates its __len__ using math. range.__len__ itself raises an error in that situation: range(sys.maxsize+1).__len__() gives OverflowError: Python int too large to convert to C ssize_t - Catchup
Note that the above is only true in 2.x. In 3.6, for example, I get len(range(1000000000000)) -> 1000000000000, and (worryingly) range(1000000000000).__len__() -> -727379968. Although this result still shows why you shouldn't call __len__ yourself! - Amarillis
len also raises an exception if the value returned by __len__ is negative, or not an int. - Infirmary
12

What kind of speedup did you see? I can't imagine it was noticeable, was it?

From http://mail.python.org/pipermail/python-list/2002-May/147079.html

in certain situations there is no difference, but using len() is preferred for a couple reasons.

first, it's not recommended to go calling the __methods__ yourself, they are meant to be used by other parts of python.

len() will work on any type of sequence object (lists, tuples, and all). __len__ will only work on class instances with a __len__ method.

len() will return a more appropriate exception on objects without length.
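If you want to see how much the two spellings differ on your own machine, you can measure it. This is only a rough sketch using the standard-library timeit module; the sample list and the call count are arbitrary choices, and the numbers will vary with the interpreter version and hardware.

```python
import timeit

data = list(range(100))  # arbitrary sample object, chosen for illustration

# Time one million calls each way; absolute numbers vary by machine.
t_builtin = timeit.timeit("len(data)", globals={"data": data}, number=1_000_000)
t_dunder = timeit.timeit("data.__len__()", globals={"data": data}, number=1_000_000)

print(f"len(data):       {t_builtin:.3f} s")
print(f"data.__len__():  {t_dunder:.3f} s")
```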

Hayward answered 30/1, 2009 at 16:7 Comment(6)
It was about half a second on a program that ran for one minute. It's probably because I called len 2,443,519 times. As I was writing the question I realized that I should probably reduce the number of times I'm calling len. - Holdup
@David: Yeah you missed mentioning the 2,443,519 part. Holy hell ;) - Hayward
I personally wouldn't consider the extra 1/120th speedup to make it worth the code ugliness, but that's your call. - Ell
@Eli, Normally I would agree with you. In this case I'm trying to benchmark the same problem in multiple languages. - Holdup
FYI: I was able to remove 2,363,276 of those calls to len and that sped things up by another second and a half. - Holdup
@DavidLocke: I would think a benchmark based on idiomatic code for each language would be much more useful than one based on bent and mangled code. - Zook
3

I think the answer is that it will always work -- according to the Python docs:

__len__(self):

Called to implement the built-in function len(). Should return the length of the object, an integer >= 0. Also, an object that doesn't define a __nonzero__() method and whose __len__() method returns zero is considered to be false in a Boolean context.
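The truthiness part of that passage can be sketched as follows. Box is a hypothetical class made up for illustration. Note that the quoted text is from the Python 2 docs; in Python 3 the hook is named __bool__ rather than __nonzero__, but the fallback to __len__ works the same way.

```python
class Box:
    """Hypothetical container that defines __len__ but no __bool__."""
    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

empty = Box([])
full = Box([1, 2, 3])

empty_truth = bool(empty)   # False, because __len__ returned 0
full_truth = bool(full)     # True, because __len__ returned a nonzero value
```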

Deferral answered 30/1, 2009 at 16:3 Comment(0)
3

There are cases where len(someObj) is not the same as someObj.__len__() since len() validates __len__()'s return value. Here are the possible errors in Python 3.6.9:

  • Too low, i.e. less than 0

    ValueError: __len__() should return >= 0
    
  • Too high, i.e. greater than sys.maxsize (CPython-specific, per the docs)

    OverflowError: cannot fit 'int' into an index-sized integer
    
  • An invalid type, e.g. float

    TypeError: 'float' object cannot be interpreted as an integer
    
  • Missing, e.g. len(object)

    TypeError: object of type 'type' has no len()
    

    I mention this because object.__len__() raises a different exception, AttributeError.

It's also worth noting that range(sys.maxsize+1) is valid, but its __len__() raises an exception:

OverflowError: Python int too large to convert to C ssize_t
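
A minimal sketch that puts the negative, missing, and range cases side by side. The class Negative and the helper outcome are made up for illustration, and the exception names in the comments are what I'd expect from CPython 3:

```python
import sys

class Negative:
    """Hypothetical class whose __len__ returns a negative number."""
    def __len__(self):
        return -1

def outcome(func):
    """Call func() and return its result, or the name of the exception raised."""
    try:
        return func()
    except Exception as exc:
        return type(exc).__name__

neg_direct = outcome(lambda: Negative().__len__())   # -1, no validation
neg_builtin = outcome(lambda: len(Negative()))       # ValueError

missing_builtin = outcome(lambda: len(object))       # TypeError
missing_direct = outcome(lambda: object.__len__())   # AttributeError

big_range = range(sys.maxsize + 1)                   # constructing it is fine
range_direct = outcome(lambda: big_range.__len__())  # OverflowError on CPython
range_builtin = outcome(lambda: len(big_range))      # OverflowError on CPython
```
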
Catchup answered 5/1, 2020 at 20:59 Comment(0)
-4

According to Mark Pilgrim, it looks like the answer is no: len(someObj) is the same as someObj.__len__().

Cheers!

Rushing answered 30/1, 2009 at 16:1 Comment(0)
