This is an old question by now, but I'd like to add an answer on how to do this in Python 3 as I've made an implementation.
The allowed characters are documented here: https://docs.python.org/3/reference/lexical_analysis.html#identifiers . They include quite a lot of special characters, including punctuation, underscore, and a whole slew of foreign characters. Luckily the unicodedata
module can help. Here's my implementation implementing directly what the Python documentation says:
import unicodedata
def is_valid_name(name):
if not _is_id_start(name[0]):
return False
for character in name[1:]:
if not _is_id_continue(character):
return False
return True #All characters are allowed.
_allowed_id_continue_categories = {"Ll", "Lm", "Lo", "Lt", "Lu", "Mc", "Mn", "Nd", "Nl", "Pc"}
_allowed_id_continue_characters = {"_", "\u00B7", "\u0387", "\u1369", "\u136A", "\u136B", "\u136C", "\u136D", "\u136E", "\u136F", "\u1370", "\u1371", "\u19DA", "\u2118", "\u212E", "\u309B", "\u309C"}
_allowed_id_start_categories = {"Ll", "Lm", "Lo", "Lt", "Lu", "Nl"}
_allowed_id_start_characters = {"_", "\u2118", "\u212E", "\u309B", "\u309C"}
def _is_id_start(character):
return unicodedata.category(character) in _allowed_id_start_categories or character in _allowed_id_start_categories or unicodedata.category(unicodedata.normalize("NFKC", character)) in _allowed_id_start_categories or unicodedata.normalize("NFKC", character) in _allowed_id_start_characters
def _is_id_continue(character):
return unicodedata.category(character) in _allowed_id_continue_categories or character in _allowed_id_continue_characters or unicodedata.category(unicodedata.normalize("NFKC", character)) in _allowed_id_continue_categories or unicodedata.normalize("NFKC", character) in _allowed_id_continue_characters
This code is adapted from here under CC0: https://github.com/Ghostkeeper/Luna/blob/d69624cd0dd5648aec2139054fae4d45b634da7e/plugins/data/enumerated/enumerated_type.py#L91 . It has been well tested.
None
or__debug__
do? According to the following docs, I would expect it to raise aSyntaxError
: docs.python.org/2/library/constants.html – Putative