satwikkansal/wtfpython

Why? a is b

anhuaxiang opened this issue · 14 comments

python: 3.7.1

Interesting

Let me dig deeper to check what changed in Python 3.7 that's causing this.

Python 3.7.1 (default, Dec  7 2018, 14:57:25) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-23)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a,b = 257,257
>>> a is b
False
>>> a = 257
>>> b = 257
>>> a is b
False
>>> a = 257 ; b = 257
>>> a is b
True

But the result changes when int() is used for the conversion:

Python 3.7.1 (default, Nov 24 2018, 22:14:32) [MSC v.1912 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> a, b = 257, 257
>>> a is b
False
>>> a, b = int(257), int(257)
>>> a is b
True

@satwikkansal

Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:59:51) [MSC v.1914 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> a, b = "wtf!", "wtf!"
>>> a is b
False

Let me give a summary for the first example "Strings can be tricky sometimes *".

>>> a, b = "wtf!", "wtf!"
>>> a is b
True # in Python < 3.7
False # in Python >= 3.7

>>> 'a' * 20 is 'aaaaaaaaaaaaaaaaaaaa'
True  # in all Python versions
>>> 'a' * 21 is   'aaaaaaaaaaaaaaaaaaaaa'
False # in Python < 3.7
True  # in Python >= 3.7

Python 3.7.3 (default, Mar 27 2019, 09:23:39)
[Clang 10.0.0 (clang-1000.11.45.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a, b = 257, 257
>>> a is b
False
>>> a = 257; b = 257;
>>> a is b
True
>>> a, b = int(257), int(257)
>>> a is b
True
>>> a = 257
>>> b = 257
>>> a is b
False

Okay, to summarize

>>> a, b = 257, 257
>>> a is b
True # Python 3.6 or less
False # Python 3.7

>>> a = 257; b =257;
>>> a is b
False # Python 3.6 or less
True # Python 3.7

I tried disassembling the code, but couldn't find the reason for this difference

import dis
def some_func():
	a, b = 257, 257
	a is b

dis.dis(some_func)

# Python 3.6 

  2           0 LOAD_CONST               2 ((257, 257))
              2 UNPACK_SEQUENCE          2
              4 STORE_FAST               0 (a)
              6 STORE_FAST               1 (b)

  3           8 LOAD_FAST                0 (a)
             10 LOAD_FAST                1 (b)
             12 COMPARE_OP               8 (is)
             14 POP_TOP
             16 LOAD_CONST               0 (None)
             18 RETURN_VALUE

# Python 3.7

  2           0 LOAD_CONST               1 ((257, 257))
              2 UNPACK_SEQUENCE          2
              4 STORE_FAST               0 (a)
              6 STORE_FAST               1 (b)

  3           8 LOAD_FAST                0 (a)
             10 LOAD_FAST                1 (b)
             12 COMPARE_OP               8 (is)
             14 POP_TOP
             16 LOAD_CONST               0 (None)
             18 RETURN_VALUE

LOAD_CONST appears only once in both cases. Any help in figuring out the exact change that led to this difference is appreciated 😅
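One possible reason the two disassemblies look identical: dis shows that a single (257, 257) constant is loaded, but not whether the two 257 objects inside it are the same object. Here is a minimal sketch for checking that directly, assuming CPython. The helper name int_constant_ids is hypothetical, and the output depends on the interpreter version, so no expected output is shown.

```python
def some_func():
    a, b = 257, 257
    return a is b


def int_constant_ids(code, value):
    """Return id() of every int constant equal to `value` in `code`.

    Checks top-level constants and members of tuple constants (one level
    deep), since the compiler may fold `257, 257` into a tuple constant.
    """
    found = []
    for const in code.co_consts:
        items = const if isinstance(const, tuple) else (const,)
        for item in items:
            # `type(...) is int` skips bools, which are an int subclass.
            if type(item) is int and item == value:
                found.append(id(item))
    return found


ids = int_constant_ids(some_func.__code__, 257)
print("occurrences of 257 in co_consts:", len(ids))
print("distinct objects:", len(set(ids)))
print("a is b inside the function:", some_func())
```

If "distinct objects" is greater than 1, the constants were not merged for that interpreter, which would be consistent with a is b being False there.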

Similar results appear when the value 257 is replaced by "wtf!", which makes another example outdated (#120).

Update: Just realized that even Python 3.8 behaves like Python 3.6; only 3.7 behaves this way.

>>> a, b = 257, 257
>>> a is b
True # Python 3.6 or less or 3.8+
False # Python 3.7

>>> a = 257; b =257;
>>> a is b
False # Python 3.6 or less or 3.8+
True # Python 3.7

Diving into 3.8 changelog to figure out what changed...

bdrum commented

Hi! Thanks for the nice repo!

I've noticed one thing:

$ python3.8
Python 3.8.0 (default, Nov  3 2019, 18:06:07)
[GCC 7.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a = "str1"
>>> b = "str1"
>>> a is b
True
>>> a = "str!"
>>> b = "str!"
>>> a is b
False
>>> a, b = "str!", "str!"
>>> a is b
True
>>> exit()
$ cat script.py
a = "str1"
b = "str1"
print(a is b)

a = "str!"
b = "str!"
print (a is b)

a, b = "str!", "str!"

print(a is b)
$ python3.8 script.py
True
True
True

Just curious: does this mean that the default optimizations for the interactive prompt and for a script are different?

Hey, yes. I added a small note regarding this (it will be published in the next revision), but here's the gist:

The compile unit in an interactive environment like IPython is a single statement, whereas for a module it is the entire module. a, b = "str!", "str!" is a single statement, whereas a = "str!"; b = "str!" are two statements on a single line. This explains why the identities differ in a = "str!"; b = "str!", and also why they are the same when the code is run from some_file.py.
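The effect of the compile unit can be explored without an interactive prompt by passing the statements to compile() together and then one at a time. This is a minimal sketch, assuming CPython. The helper names are hypothetical, and the results are implementation details that may differ between versions and builds.

```python
def same_object_one_unit(source):
    """Compile all statements as a single unit, then check `a is b`."""
    namespace = {}
    exec(compile(source, "<one-unit>", "exec"), namespace)
    return namespace["a"] is namespace["b"]


def same_object_separate_units(statements):
    """Compile each statement as its own unit, roughly what an
    interactive prompt does for each input, then check `a is b`."""
    namespace = {}
    for statement in statements:
        exec(compile(statement, "<separate-unit>", "exec"), namespace)
    return namespace["a"] is namespace["b"]


# Same two assignments, compiled as one module-like unit ...
print(same_object_one_unit('a = "str!"\nb = "str!"\n'))
# ... and compiled as two independent units.
print(same_object_separate_units(['a = "str!"', 'b = "str!"']))
```

On the CPython builds discussed in this thread, the first call should mirror the script result and the second the line-by-line prompt result, but neither is guaranteed by the language.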

a, b = "wtf!", "wtf!"

In Python 3.7.5, a and b are different objects, so a is b evaluates to False.

Did I miss something?

Out of curiosity, I bisected this and found that this commit caused the regression in CPython 3.7:

python/cpython@7ea143a

This relates to this ticket on bpo: https://bugs.python.org/issue29469

Edit: The original behaviour was restored with this commit:

python/cpython@c2e1607

This was triggered by the ticket https://bugs.python.org/issue34100.