How is the 'is' keyword implemented in Python?

11

70

... the is keyword that can be used for equality in strings.

>>> s = 'str'
>>> s is 'str'
True
>>> s is 'st'
False

I tried both __is__() and __eq__() but they didn't work.

>>> class MyString:
...   def __init__(self):
...     self.s = 'string'
...   def __is__(self, s):
...     return self.s == s
...
>>>
>>>
>>> m = MyString()
>>> m is 'ss'
False
>>> m is 'string' # <--- Expected to work
False
>>>
>>> class MyString:
...   def __init__(self):
...     self.s = 'string'
...   def __eq__(self, s):
...     return self.s == s
...
>>>
>>> m = MyString()
>>> m is 'ss'
False
>>> m is 'string' # <--- Expected to work, but again failed
False
>>>
Hostile answered 7/6, 2010 at 8:17 Comment(0)
K
143

Testing strings with is only works when the strings are interned. Unless you really know what you're doing and explicitly interned the strings you should never use is on strings.

is tests for identity, not equality. That means Python simply compares the memory address an object resides in. is basically answers the question "Do I have two names for the same object?" - overloading that would make no sense.

For example, ("a" * 100) is ("a" * 100) is False. Usually Python writes each string into a different memory location; interning mostly happens for string literals.
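A minimal sketch of the difference between equal strings and interned strings, assuming Python 3 (where the function lives at sys.intern; in Python 2 it was the builtin intern()). The variable names are illustrative only. The strings are built at run time so the compiler cannot fold them into one constant.

```python
import sys

# Build two equal strings at run time.
parts = ["a"] * 100
s1 = "".join(parts)
s2 = "".join(parts)

print(s1 == s2)   # True: same characters
# Whether `s1 is s2` holds is implementation dependent, so it is not relied on here.

# sys.intern returns one canonical object per distinct string value.
i1 = sys.intern(s1)
i2 = sys.intern(s2)
print(i1 is i2)   # True: both names refer to the interned object
```

Only the explicitly interned pair is guaranteed to be identical; the equality check is the one that is safe for ordinary strings.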

Kellene answered 7/6, 2010 at 8:21 Comment(4)
I've observed in the past that string interning may happen for run-time computed and input values if they're sufficiently short. 'a' * 100 is not 'a' * 100; but 'a' * 20 is "a" * 20. Meanwhile 'a'.upper() is not 'a'.upper(). Jython, IronPython, PyPy and others may intern more aggressively. In short it's implementation dependent. Calling the 'intern()' function on strings will "force" a string to have the same object identity as any equivalent and previously intern()'d string, as you say. However, I don't know of a valid use case for testing string identity. (Possible performance aside).Hyperacidity
("a" * 100) is ("a" * 100) might be False in 2010, but today it's True.Tabescent
@goteguru, not for me, in 2019, with CPython 3.5.6. I think the comment by Jim from 2010 is the real winner: it's implementation dependent. Assume nothing.Poised
@Poised of course it's implementation specific; we shouldn't use 'is' for string comparison. Maybe your CPython optimiser didn't intern the string for some reason. Try "a"*20 which is smaller.Tabescent
W
26

The is operator is equivalent to comparing id(x) values. For example:

>>> s1 = 'str'
>>> s2 = 'str'
>>> s1 is s2
True
>>> id(s1)
4564468760
>>> id(s2)
4564468760
>>> id(s1) == id(s2)  # equivalent to `s1 is s2`
True

In CPython, id currently returns the object's memory address, so the comparison is a pointer comparison. So you can't overload is itself, and AFAIK you can't overload id either.

So, you can't. Unusual in python, but there it is.
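As a small illustration of that equivalence, here is a sketch with lists instead of strings so that interning plays no part. It assumes all objects stay alive while their ids are compared; the id documentation notes that objects with non-overlapping lifetimes may end up with the same id value, so comparing ids of temporaries is not reliable.

```python
a = [1, 2]
b = a          # second name for the same list
c = list(a)    # equal contents, but a separate object

print(a is b, id(a) == id(b))  # True True
print(a is c, id(a) == id(c))  # False False
print(a == c)                  # True: the values are equal
```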

Wandawander answered 7/6, 2010 at 8:27 Comment(3)
You can overload id, but not in the sense you probably meant. Just do id = <function>.Doubleripper
No, it is not. Try print(id(a.T) is id(a.T)) in python and you'll see.Incorrupt
@Incorrupt I believe he means comparing the ids with ==, not with is. So print(id(a.T) == id(a.T)) should be equivalent to print(a is a).Pumping
H
16

The Python is keyword tests object identity. You should NOT use it to test for string equality. It may seem to work frequently because Python implementations, like those of many very high level languages, perform "interning" of strings. That is to say that string literals and values are internally kept in a hashed list and those which are identical are rendered as references to the same object. (This is possible because Python strings are immutable).

However, as with any implementation detail, you should not rely on this. If you want to test for equality use the == operator. If you truly want to test for object identity then use is --- and I'd be hard-pressed to come up with a case where you should care about string object identity. Unfortunately you can't count on whether two strings are somehow "intentionally" identical object references because of the aforementioned interning.

Hyperacidity answered 7/6, 2010 at 8:27 Comment(2)
The only place in Python where you want to do identity comparison is when comparing to singletons (e.g. None) and sentinel values that need to be unique. Other than that, there is probably almost no reason for is.Willis
@Lie Ryan: I tend to agree. I only ever use it for None and for special sentinels that I've created (usually as calls to the base 'object()'). However, I don't feel comfortable asserting that there are no other valid uses for the 'is' operator; just none that I can think of. (Possibly a testament to my own ignorance).Hyperacidity
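The sentinel use mentioned in these comments could look roughly like the following sketch. The names _MISSING and lookup are hypothetical and chosen only for illustration; the point is that a fresh object() is identical to nothing else, so is distinguishes "no value supplied" from a legitimate None.

```python
_MISSING = object()  # hypothetical sentinel; identical only to itself


def lookup(mapping, key, default=_MISSING):
    """Return mapping[key]; fall back to default if given, else raise KeyError."""
    value = mapping.get(key, _MISSING)
    if value is _MISSING:          # identity test: the key really is absent
        if default is _MISSING:    # the caller supplied no default
            raise KeyError(key)
        return default
    return value


settings = {"timeout": None}
print(lookup(settings, "timeout"))     # None: a stored None is a real value
print(lookup(settings, "retries", 3))  # 3
```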
S
10

The is keyword compares objects (or, rather, compares if two references are to the same object).

Which is, I think, why there's no mechanism to provide your own implementation.

It happens to work sometimes on strings because Python stores strings 'cleverly', such that when you create two identical strings they are stored in one object.

>>> a = "string"
>>> b = "string"
>>> a is b
True
>>> c = "str"+"ing"
>>> a is c
True

You can hopefully see the reference vs data comparison in a simple 'copy' example:

>>> a = {"a":1}
>>> b = a
>>> c = a.copy()
>>> a is b
True
>>> a is c
False
Sheldon answered 7/6, 2010 at 8:26 Comment(0)
B
5

If you are not afraid of messing with bytecode, you can intercept and patch COMPARE_OP with argument 8 ("is") to call your hook function on the objects being compared. (In CPython 3.9 and later, is compiles to a separate IS_OP instruction instead.) Look at the dis module documentation for a starting point.

And don't forget to intercept __builtin__.id() too, in case someone does id(a) == id(b) instead of a is b.
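To see which instruction your interpreter emits for is before attempting any patching, a sketch along these lines may help. It only inspects bytecode and does not hook anything; same_object is a hypothetical helper, and the opcode names you see depend on the CPython version.

```python
import dis


def same_object(a, b):
    return a is b


# Collect the opcode names generated for the function body.
ops = [ins.opname for ins in dis.get_instructions(same_object)]
print(ops)  # contains IS_OP on newer CPython, COMPARE_OP on older versions
```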

Bradbradan answered 12/10, 2010 at 3:3 Comment(2)
Interesting to know, that's a whole world of possibilities to mess with python's function that I'd never thought about. But why would this ever be a good idea?Kissee
At my company, we have an in-house testing library containing a context decorator which freezes time by replacing datetime.datetime with an implementation that always returns a specific time from utcnow(). If you run datetime.datetime.utcnow() and attempt to pickle the returned value, it will fail because its class is inconsistent (it's pretending to be another class). In this case, overriding the way is works might be a solution.Courtund
G
3

'is' compares object identity whereas == compares values.

Example:

a=[1,2]
b=[1,2]
#a==b returns True
#a is b returns False

p=q=[1,2]
#p==q returns True
#p is q returns True
Gilgai answered 9/3, 2017 at 13:35 Comment(0)
G
2

is fails when comparing a string variable to a string value, or two string variables to each other, when the string starts with '-'. My Python version is 2.6.6

>>> s = '-hi'
>>> s is '-hi'
False 
>>> s = '-hi'
>>> k = '-hi'
>>> s is k 
False
>>> '-hi' is '-hi'
True
Gaven answered 7/4, 2011 at 15:53 Comment(0)
D
1

You can't overload the is operator. What you want to overload is the == operator. This can be done by defining an __eq__ method in the class.
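A sketch of that approach, adapted from the MyString class in the question and assuming Python 3. The default argument, the isinstance checks, and the __hash__ method are additions for illustration and are not part of the original class.

```python
class MyString:
    def __init__(self, s="string"):
        self.s = s

    def __eq__(self, other):
        # Compare by value against another MyString or a plain str.
        if isinstance(other, MyString):
            return self.s == other.s
        if isinstance(other, str):
            return self.s == other
        return NotImplemented

    def __hash__(self):
        # Defining __eq__ disables the default hash; restore a consistent one.
        return hash(self.s)


m = MyString()
text = "string"
print(m == text)  # True: __eq__ is called
print(m == "ss")  # False
print(m is text)  # False: m and text are different objects
```

The == operator goes through __eq__, while is ignores it and compares identity.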

Diminished answered 15/6, 2011 at 4:24 Comment(0)
H
1

You are using identity comparison. == is probably what you want. The exception to this is when you want to check whether one item and another are the EXACT same object, in the same memory position. In your examples, the items aren't the same, since one is of a different type (MyString) than the other (str). Also, there's no such thing as someclass.__is__ in Python (unless, of course, you put it there yourself). If there were, is could no longer be relied on to simply compare memory locations.

When I first encountered the is keyword, it confused me as well. I would have thought that is and == were no different. They produced the same output from the interpreter on many objects. This type of assumption is actually EXACTLY what is... is for. It's the Python equivalent of "Hey, don't mistake these two objects. They're different.", which is essentially what [whoever it was that straightened me out] said. Worded much differently, but one point == the other point.

For some helpful examples and some text to help with the sometimes confusing differences, visit a document from python.org's mail host written by "Danny Yoo"

or, if that's offline, use the unlisted pastebin I made of its body.

in case they, in some 20 or so blue moons (blue moons are a real event), are both down, I'll quote the code examples

###
>>> my_name = "danny"
>>> your_name = "ian"
>>> my_name == your_name
0                #or False
###

###
>>> my_name[1:3] == your_name[1:3]
1    #or True
###

###
>>> my_name[1:3] is your_name[1:3]
0
###
Heterodox answered 4/1, 2013 at 18:1 Comment(0)
I
0

Assertion errors can easily arise when the is keyword is used to compare objects. For example, objects a and b might hold the same value without sharing the same memory address. Therefore, doing an

>>> a == b

is going to evaluate to

True

But if

>>> a is b

evaluates to

False

you should probably check

>>> type(a)

and

>>> type(b)

These might be different and a reason for failure.

Iconography answered 7/2, 2017 at 11:17 Comment(0)
L
0

Because of string interning, this could look strange:

a = 'hello'
'hello' is a  #True

b= 'hel-lo'
'hel-lo' is b #False
Laster answered 1/8, 2020 at 14:41 Comment(0)