I have learnt from PEP 3131 that non-ASCII identifiers are supported in Python, though they are not considered best practice.
However, I get this strange behaviour, where my 𝜏 identifier (U+1D70F) seems to be automatically converted to τ (U+03C4).
class Base(object):
    def __init__(self):
        self.𝜏 = 5  # defined with U+1D70F

a = Base()
print(a.𝜏)      # 5               (U+1D70F)
print(a.τ)      # 5 as well       (U+03C4) ? another way to access it?
d = a.__dict__  # {'τ': 5}        (U+03C4) ? seems converted
print(d['τ'])   # 5               (U+03C4) ? consistent with the conversion
print(d['𝜏'])   # KeyError: '𝜏'   (U+1D70F) ?! unexpected!
Is that expected behaviour? Why does this silent conversion occur? Does it have anything to do with NFKC normalization? I thought that was only about putting Unicode character sequences into a canonical order, not about changing the characters themselves...
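If it helps, here is a minimal check of that hypothesis with the standard unicodedata module (writing U+1D70F as an escape so there is no ambiguity about which character is meant):

import unicodedata

tau_math = '\U0001D70F'
print(unicodedata.name(tau_math))               # MATHEMATICAL ITALIC SMALL TAU
print(unicodedata.normalize('NFKC', tau_math))  # τ
print(unicodedata.normalize('NFKC', tau_math) == '\u03C4')  # True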
If you print(dir(a)) after a has been assigned, you can see there is no trace of the U+1D70F character in the class. Your second print statement would then work for the same reason (the attribute name gets normalised), while your dictionary access fails: dictionaries can take any hashable object as a key, and there would be no reason to normalise a plain string key. – Limbic
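(A small sketch of that asymmetry, assuming the normalisation happens only while the source is parsed: identifiers written in code get folded, strings built at runtime do not.)

class Base(object):
    def __init__(self):
        self.𝜏 = 5  # identifier NFKC-normalised to 'τ' (U+03C4) at parse time

a = Base()
print([n for n in dir(a) if not n.startswith('_')])  # ['τ'], only U+03C4 survives
print(getattr(a, '\u03C4'))                 # 5, the normalised string matches
print(getattr(a, '\U0001D70F', 'missing'))  # missing, runtime strings are not normalised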
Adding # -*- coding: utf-8 -*- at the top of the file makes no difference. Maybe NFKC is responsible... but I thought canonicalisation was just about reordering, not changing the actual character. 8) – Ortego
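(To make the distinction concrete: the canonical forms NFC/NFD only compose, decompose, and reorder combining marks, while the compatibility forms NFKC/NFKD may also replace characters outright. A sketch:)

import unicodedata

tau_math = '\U0001D70F'  # MATHEMATICAL ITALIC SMALL TAU

# Canonical normalisation reorders and (de)composes, but never swaps
# a character for its compatibility equivalent:
print(unicodedata.normalize('NFC', tau_math) == tau_math)  # True, unchanged

# Compatibility normalisation is what folds it to the plain Greek tau:
print(unicodedata.normalize('NFKC', tau_math))  # τ (U+03C4)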
__dict__, don't you find? – Ortego