Is it possible to override __new__ in an enum to parse strings to an instance?
D

5

16

I want to parse strings into Python enums. Normally one would implement a parse method to do so. A few days ago I spotted the __new__ method, which is capable of returning different instances based on a given parameter.

Here is my code, which does not work:

import enum
class Types(enum.Enum):
  Unknown = 0
  Source = 1
  NetList = 2

  def __new__(cls, value):
    if (value == "src"):  return Types.Source
#    elif (value == "nl"): return Types.NetList
#    else:                 raise Exception()

  def __str__(self):
    if (self == Types.Unknown):     return "??"
    elif (self == Types.Source):    return "src"
    elif (self == Types.NetList):   return "nl"

When I execute my Python script, I get this message:

[...]
  class Types(enum.Enum):
File "C:\Program Files\Python\Python 3.4.0\lib\enum.py", line 154, in __new__
  enum_member._value_ = member_type(*args)
TypeError: object() takes no parameters

How can I return a proper instance of an enum value?

Edit 1:

This Enum is used in URI parsing, in particular for parsing the schema. So my URI would look like this:

nl:PoC.common.config
<schema>:<namespace>[.<subnamespace>*].entity

So after a simple string.split operation I would pass the first part of the URI to the enum creation.

type = Types(splitList[0])

type should now contain a member of the enum Types, which has 3 possible values (Unknown, Source, NetList).
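
The split step described above can be sketched as follows. This is only an illustration of the intended input handling: the variable names are assumptions, and the enum lookup itself is left out because it is the subject of the question.

```python
uri = "nl:PoC.common.config"

# Everything before the first colon is the schema; the rest is the dotted path.
schema, _, path = uri.partition(":")

# The last dotted component is the entity, the ones before it are namespaces.
*namespaces, entity = path.split(".")

print(schema)      # nl
print(namespaces)  # ['PoC', 'common']
print(entity)      # config
```

The schema string is the part that would then be handed to the enum.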

If I would allow aliases in the enum's member list, it won't be possible to iterate the enum's values alias free.

Disario answered 8/6, 2014 at 10:36 Comment(8)
Definitely possible. This answer has some good examples.Improvisation
@JohnC: except the __new__ method is not used for Types(0) or Types('nl'). It is used to create the Types.Source and Types.Unknown value objects instead.Phidias
To clarify some points: a) the enum members should be unique (I removed @enum.unique in front of the class definition to simplify the code). b) so adding all possible matching values as an enum member (like NetList, nl, list, ...) is not an option. c) yes, writing a parse(value) method would work fine (e.g. .NET uses this pattern) but I'm looking for a nicer solution in a Python way :)Disario
You don't want aliases, yet you are adding aliases? That doesn't make sense.Butane
Aliases in the value which is to be parsed are allowed, but aliases in the enum itself are not allowed. As mentioned some comments down: the value to parse can be a string and, if so, there is no guaranteed 1-to-1 mapping of names.Disario
While it is possible to customize EnumMeta (and Martijn has a very good example of doing so), the simpler, and intended, way to add this functionality would be by using a classmethod (as my updated answer shows).Butane
Yes, it is possible to iterate over an Enum class alias free. In fact, if you want to iterate over the aliases as well as the main names, you have to iterate over __members__. And any time you have aliases, whether the values are strs, ints, or whatever, there will not be a 1-to-1 mapping of names to values. I suggest you read all the Enum docs.Butane
If I would allow aliases in the enum's member list, it won't be possible to iterate the enum's values alias free.: That is not true. list(Types) only lists the actual enumeration objects, not their aliases.Phidias
D
12

Yes, you can override the __new__() method of an enum subclass to implement a parse method if you're careful, but in order to avoid specifying the integer encoding in two places, you'll need to define the method separately, after the class, so you can reference the symbolic names defined by the enumeration.

Here's what I mean:

import enum

class Types(enum.Enum):
    Unknown = 0
    Source = 1
    NetList = 2

    def __str__(self):
        if (self == Types.Unknown):     return "??"
        elif (self == Types.Source):    return "src"
        elif (self == Types.NetList):   return "nl"
        else:                           raise TypeError(self)

def _Types_parser(cls, value):
    if not isinstance(value, str):
        # forward call to Types' superclass (enum.Enum)
        return super(Types, cls).__new__(cls, value)
    else:
        # map strings to enum values, default to Unknown
        return { 'nl': Types.NetList,
                'ntl': Types.NetList,  # alias
                'src': Types.Source,}.get(value, Types.Unknown)

setattr(Types, '__new__', _Types_parser)


if __name__ == '__main__':

    print("Types('nl') ->",  Types('nl'))   # Types('nl') -> nl
    print("Types('ntl') ->", Types('ntl'))  # Types('ntl') -> nl
    print("Types('wtf') ->", Types('wtf'))  # Types('wtf') -> ??
    print("Types(1) ->",     Types(1))      # Types(1) -> src

Update

Here's a more table-driven version that eliminates some of the repetitious coding that would otherwise be involved:

from collections import OrderedDict
import enum

class Types(enum.Enum):
    Unknown = 0
    Source = 1
    NetList = 2
    __str__ = lambda self: Types._value_to_str.get(self)

# Define after Types class.
Types.__new__ = lambda cls, value: (cls._str_to_value.get(value, Types.Unknown)
                                        if isinstance(value, str) else
                                    super(Types, cls).__new__(cls, value))

# Define look-up table and its inverse.
Types._str_to_value = OrderedDict((( '??', Types.Unknown),
                                   ('src', Types.Source),
                                   ('ntl', Types.NetList),  # alias
                                   ( 'nl', Types.NetList),))
Types._value_to_str = {val: key for key, val in Types._str_to_value.items()}


if __name__ == '__main__':

    print("Types('nl')  ->", Types('nl'))   # Types('nl')  -> nl
    print("Types('ntl') ->", Types('ntl'))  # Types('ntl') -> nl
    print("Types('wtf') ->", Types('wtf'))  # Types('wtf') -> ??
    print("Types(1)     ->", Types(1))      # Types(1)     -> src

    print(list(Types))  # -> [<Types.Unknown: 0>, <Types.Source: 1>, <Types.NetList: 2>]

    import pickle  # Demonstrate picklability
    print(pickle.loads(pickle.dumps(Types.NetList)) == Types.NetList)  # -> True

Note that in Python 3.7+ regular dictionaries are ordered, so the use of OrderedDict in the code above would not be needed and it could be simplified to just:

# Define look-up table and its inverse.
Types._str_to_value = {'??': Types.Unknown,
                       'src': Types.Source,
                       'ntl': Types.NetList,  # alias
                       'nl': Types.NetList}
Types._value_to_str = {val: key for key, val in Types._str_to_value.items()}
Denishadenison answered 8/6, 2014 at 19:37 Comment(4)
@Ethan: It's fairly easy to make basic lookup work. However I don't have time right now to see if pickling could also be supported -- but suspect that it, too, could be.Denishadenison
@Ethan: Thanks for pointing out the lookup issue and following-up with pickling tidbit -- which I've verified to indeed be the case.Denishadenison
@Ethan: Hmmm, this code seems to contradict what it says in the notes at the end of the Interesting Examples section of the pypi's enum34 module's documentation -- namely that "there is no way to customize Enum's __new__".Denishadenison
Yeah, I'll have to fix that. :/ At the time I was thinking of writing a normal __new__ inside the class, and completely forgot about just replacing __new__ after the fact.Butane
P
20

The __new__ method on your enum.Enum type is used for creating new instances of the enum values, that is, the Types.Unknown, Types.Source, etc. singleton instances. The enum call (e.g. Types('nl')) is handled by EnumMeta.__call__, which you could subclass.
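
To make that distinction concrete, here is a minimal sketch. The Color enum and the created list are hypothetical names used only for illustration. It records every call to a custom __new__ and shows that the calls happen while the class is being built, not when a member is later looked up by value.

```python
import enum

created = []  # records every value passed to the custom __new__


class Color(enum.Enum):  # hypothetical enum, for illustration only
    def __new__(cls, value):
        created.append(value)
        member = object.__new__(cls)
        member._value_ = value
        return member

    RED = 1
    GREEN = 2


# __new__ ran once per member while the class was being created.
print(created)                # [1, 2]

# A by-value lookup returns the existing member; the custom __new__
# is not invoked again.
print(Color(1) is Color.RED)  # True
print(created)                # [1, 2]
```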

Using name aliases fits your use case

Overriding __call__ is perhaps overkill for this situation. Instead, you can easily use name aliases:

class Types(enum.Enum):
    Unknown = 0

    Source = 1
    src = 1

    NetList = 2
    nl = 2

Here Types.nl is an alias and will return the same object as Types.NetList. You then access members by names (using Types[..] index access); so Types['nl'] works and returns Types.NetList.

Your assertion that it won't be possible to iterate the enum's values alias free is incorrect. Iteration explicitly doesn't include aliases:

Iterating over the members of an enum does not provide the aliases

Aliases are part of the Enum.__members__ ordered dictionary, if you still need access to these.

A demo:

>>> import enum
>>> class Types(enum.Enum):
...     Unknown = 0
...     Source = 1
...     src = 1
...     NetList = 2
...     nl = 2
...     def __str__(self):
...         if self is Types.Unknown: return '??'
...         if self is Types.Source:  return 'src'
...         if self is Types.NetList: return 'nl'
... 
>>> list(Types)
[<Types.Unknown: 0>, <Types.Source: 1>, <Types.NetList: 2>]
>>> list(Types.__members__)
['Unknown', 'Source', 'src', 'NetList', 'nl']
>>> Types.Source
<Types.Source: 1>
>>> str(Types.Source)
'src'
>>> Types.src
<Types.Source: 1>
>>> str(Types.src)
'src'
>>> Types['src']
<Types.Source: 1>
>>> Types.Source is Types.src
True

The only thing missing here is translating unknown schemas to Types.Unknown; I'd use exception handling for that:

try:
    scheme = Types[scheme]
except KeyError:
    scheme = Types.Unknown

Overriding __call__

If you want to treat your strings as values, and use calling instead of item access, this is how you override the __call__ method of the metaclass:

class TypesEnumMeta(enum.EnumMeta):
    def __call__(cls, value, *args, **kw):
        if isinstance(value, str):
            # map strings to enum values, defaults to Unknown
            value = {'nl': 2, 'src': 1}.get(value, 0)
        return super().__call__(value, *args, **kw)

class Types(enum.Enum, metaclass=TypesEnumMeta):
    Unknown = 0
    Source = 1
    NetList = 2

Demo:

>>> class TypesEnumMeta(enum.EnumMeta):
...     def __call__(cls, value, *args, **kw):
...         if isinstance(value, str):
...             value = {'nl': 2, 'src': 1}.get(value, 0)
...         return super().__call__(value, *args, **kw)
... 
>>> class Types(enum.Enum, metaclass=TypesEnumMeta):
...     Unknown = 0
...     Source = 1
...     NetList = 2
... 
>>> Types('nl')
<Types.NetList: 2>
>>> Types('?????')
<Types.Unknown: 0>

Note that we translate the string value to integers here and leave the rest to the original Enum logic.

Fully supporting value aliases

So, enum.Enum supports name aliases, but you appear to want value aliases. Overriding __call__ can offer a facsimile, but we can do better than that still by putting the definition of the value aliases into the enum class itself. What if specifying duplicate names gave you value aliases, for example?

You'll have to provide a subclass of enum._EnumDict too, as it is that class that prevents names from being re-used. We'll assume that the first enum value is a default:

class ValueAliasEnumDict(enum._EnumDict):
    def __init__(self):
        super().__init__()
        self._value_aliases = {}

    def __setitem__(self, key, value):
        if key in self:
            # register a value alias
            self._value_aliases[value] = self[key]
        else:
            super().__setitem__(key, value)

class ValueAliasEnumMeta(enum.EnumMeta):
    @classmethod
    def __prepare__(metacls, cls, bases):
        return ValueAliasEnumDict()

    def __new__(metacls, cls, bases, classdict):
        enum_class = super().__new__(metacls, cls, bases, classdict)
        enum_class._value_aliases_ = classdict._value_aliases
        return enum_class

    def __call__(cls, value, *args, **kw):
        if value not in cls._value2member_map_:
            value = cls._value_aliases_.get(value, next(iter(cls)).value)
        return super().__call__(value, *args, **kw)

This then lets you define aliases and a default in the enum class:

class Types(enum.Enum, metaclass=ValueAliasEnumMeta):
    Unknown = 0

    Source = 1
    Source = 'src'

    NetList = 2
    NetList = 'nl'

Demo:

>>> class Types(enum.Enum, metaclass=ValueAliasEnumMeta):
...     Unknown = 0
...     Source = 1
...     Source = 'src'
...     NetList = 2
...     NetList = 'nl'
... 
>>> Types.Source
<Types.Source: 1>
>>> Types('src')
<Types.Source: 1>
>>> Types('?????')
<Types.Unknown: 0>
Phidias answered 8/6, 2014 at 10:44 Comment(6)
Interesting example of using and extending EnumMeta.Denishadenison
@MartijnPieters thanks for your two solutions. Solution 1: The first one would not serve all my requirements, because the enum members aren't unique - see my clarification comment on the original post for detailsDisario
Solution 2: This looks more like what I'm looking for. But in this solution the integer encoding is located in two classes / code places. Is it possible to combine them?Disario
@Paebbels: You could extend the __new__ method of the metaclass to build that map. I can try and extend that tomorrow.Phidias
@Paebbels: Given your question update I am not convinced you need to use value aliases at all, but I've expanded my answer to support defining those fully as part of the enum class now. But do read up on the first solution again, iteration over an enum does not include aliases.Phidias
You might want to check out the last edit I made to my answer (towards the bottom) as it shows how to do value aliases without the headache of metaclass hackery.Butane
B
7

Is it possible to override __new__ in a python enum to parse strings to an instance?

In a word, yes. As martineau illustrates, you can replace the __new__ method after the class has been created (his original code):

class Types(enum.Enum):
    Unknown = 0
    Source = 1
    NetList = 2
    def __str__(self):
        if (self == Types.Unknown):     return "??"
        elif (self == Types.Source):    return "src"
        elif (self == Types.NetList):   return "nl"
        else:                           raise TypeError(self) # completely unnecessary

def _Types_parser(cls, value):
    if not isinstance(value, str):
        raise TypeError(value)
    else:
        # map strings to enum values, default to Unknown
        return { 'nl': Types.NetList,
                'ntl': Types.NetList,  # alias
                'src': Types.Source,}.get(value, Types.Unknown)

setattr(Types, '__new__', _Types_parser)

and also as his demo code illustrates, if you are not extremely careful you will break other things such as pickling, and even basic member-by-value lookup:

--> print("Types(1) ->", Types(1))  # doesn't work
Traceback (most recent call last):
  ...
TypeError: 1
--> import pickle
--> pickle.loads(pickle.dumps(Types.NetList))
Traceback (most recent call last):
  ...
TypeError: 2

Martijn showed a clever way of enhancing EnumMeta to get what we want:

class TypesEnumMeta(enum.EnumMeta):
    def __call__(cls, value, *args, **kw):
        if isinstance(value, str):
            # map strings to enum values, defaults to Unknown
            value = {'nl': 2, 'src': 1}.get(value, 0)
        return super().__call__(value, *args, **kw)

class Types(enum.Enum, metaclass=TypesEnumMeta):
    ...

but this leaves us with duplicated code, and working against the Enum type.

The only thing lacking in basic Enum support for your use-case is the ability to have one member be the default, but even that can be handled gracefully in a normal Enum subclass by creating a new class method.

The class that you want is:

class Types(enum.Enum):
    Unknown = 0
    Source = 1
    src = 1
    NetList = 2
    nl = 2
    def __str__(self):
        if self is Types.Unknown:
            return "??"
        elif self is Types.Source:
            return "src"
        elif self is Types.NetList:
            return "nl"
    @classmethod
    def get(cls, name):
        try:
            return cls[name]
        except KeyError:
            return cls.Unknown

and in action:

--> for obj in Types:
...   print(obj)
... 
??
src
nl

--> Types.get('PoC')
<Types.Unknown: 0>

If you really need value aliases, even that can be handled without resorting to metaclass hacking:

from enum import Enum

class Types(Enum):
    Unknown = 0,
    Source  = 1, 'src'
    NetList = 2, 'nl'
    def __new__(cls, int_value, *value_aliases):
        obj = object.__new__(cls)
        obj._value_ = int_value
        for alias in value_aliases:
            cls._value2member_map_[alias] = obj
        return obj

print(list(Types))
print(Types(1))
print(Types('src'))

which gives us:

[<Types.Unknown: 0>, <Types.Source: 1>, <Types.NetList: 2>]
Types.Source
Types.Source
Butane answered 8/6, 2014 at 15:43 Comment(1)
The code you have at the very end of your answer is certainly the cleanest. However, it's based on modifying the undocumented (and private) _value2member_map_ attribute of enum_class that's created by class EnumMeta—which is certainly less than ideal (and by definition might possibly be changed in the future).Denishadenison
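
One way to address the concern about the private attribute, not mentioned in the thread itself, is the documented _missing_ hook, which Enum calls when a by-value lookup finds no member. The sketch below assumes Python 3.6 or later (where the hook is available); the alias table is illustrative.

```python
import enum


class Types(enum.Enum):
    Unknown = 0
    Source = 1
    NetList = 2

    @classmethod
    def _missing_(cls, value):
        # Called only when `value` matches no member value.
        # Illustrative alias table; anything unrecognised maps to Unknown.
        aliases = {"src": cls.Source, "nl": cls.NetList, "ntl": cls.NetList}
        return aliases.get(value, cls.Unknown)


print(Types("src") is Types.Source)     # True
print(Types("bogus") is Types.Unknown)  # True
print(Types(2) is Types.NetList)        # True
print(len(list(Types)))                 # 3
```

Because the members themselves keep their integer values and no extra names are defined, iteration still yields only the three real members. Note that in this sketch any unmatched value, string or not, falls back to Unknown.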
P
3

I don't have enough rep to comment on the accepted answer, but in Python 2.7 with the enum34 package the following error occurs at run-time:

"unbound method <lambda>() must be called with instance MyEnum as first argument (got EnumMeta instance instead)"

I was able to correct this by changing:

# define after Types class
Types.__new__ = lambda cls, value: (cls._str_to_value.get(value, Types.Unknown)
                                    if isinstance(value, str) else
                                    super(Types, cls).__new__(cls, value))

to the following, wrapping the lambda in staticmethod():

# define after Types class
Types.__new__ = staticmethod(
    lambda cls, value: (cls._str_to_value.get(value, Types.Unknown)
                        if isinstance(value, str) else
                        super(Types, cls).__new__(cls, value)))

This code tested correctly in both Python 2.7 and 3.6.

Prepared answered 1/5, 2017 at 16:38 Comment(0)
S
2

I think by far the easiest solution to your problem is to use the functional API of the Enum class, which gives more freedom when it comes to choosing names since we specify them as strings:

from enum import Enum

Types = Enum(
    value='Types',
    names=[
        ('??', 0),
        ('Unknown', 0),
        ('src', 1),
        ('Source', 1),
        ('nl', 2),
        ('NetList', 2),
    ]
)

This creates an enum with name aliases. Mind the order of the entries in the names list. For each value, the first entry becomes the canonical member (and is what name returns); later entries are treated as aliases, but both can be used:

>>> Types.src
<Types.src: 1>
>>> Types.Source
<Types.src: 1>

To use the name property as a return value for str(Types.src) we replace the default version from Enum:

>>> Types.__str__ = lambda self: self.name
>>> Types.__format__ = lambda self, _: self.name
>>> str(Types.Unknown)
'??'
>>> '{}'.format(Types.Source)
'src'
>>> Types['src']
<Types.src: 1>

Note that we also replace the __format__ method which is called by str.format().
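
The functional-API enum above does not by itself map unrecognised strings to a default. A small sketch of one way to add that follows; parse_type is a hypothetical helper, not part of the enum module, and it relies only on the documented __members__ mapping, which includes aliases.

```python
from enum import Enum

Types = Enum(
    value='Types',
    names=[
        ('??', 0),
        ('Unknown', 0),
        ('src', 1),
        ('Source', 1),
        ('nl', 2),
        ('NetList', 2),
    ]
)


def parse_type(text):
    """Return the member named (or aliased as) `text`, else the '??' member."""
    # __members__ maps every name, aliases included, to its member.
    return Types.__members__.get(text, Types['??'])


print(parse_type('nl') is Types['nl'])       # True
print(parse_type('Source') is Types['src'])  # True
print(parse_type('wtf') is Types['??'])      # True
```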

Seale answered 8/4, 2017 at 22:18 Comment(0)
