Type hint for an exhaustive dictionary with Enum/Literal keys

I'm working on code bases with extensive type hints, checked by mypy. There are several instances where we have a mapping from an enum.Enum or other small finite set of statically known values (typing.Literal) to fixed values, and thus using a dictionary is convenient:

# GOOD
from enum import Enum, auto

class Foo(Enum):
   X = auto()
   Y = auto()

lookup: dict[Foo, str] = {Foo.X: "cool", Foo.Y: "whatever"}

print(lookup[Foo.X])

However, this dictionary doesn't have to be exhaustive (aka total): mypy is perfectly happy with it having missing keys, and indexing with a missing key will fail at runtime. This can easily happen in practice with a large enum (forgetting a member when defining the dict), or when adding a member to an existing enum (especially if the lookup is in a completely different file).

For instance, this passes mypy --strict just fine, but fails at runtime because we 'forgot' to update lookup itself:

# BAD
from enum import Enum, auto

class Foo(Enum):
   X = auto()
   Y = auto()
   Z = auto() # NEW

lookup: dict[Foo, str] = {Foo.X: "cool", Foo.Y: "whatever"}
 
print(lookup[Foo.Z]) # CHANGED

I'd love to be able to mark specific dictionaries/mappings as total/exhaustive, meaning, for instance, mypy will give an error about the definition of lookup in the BAD example above.

  1. Can this be annotated with Python's current type hints as a generic type, for any Enum or Literal[...] key type? (For instance, best-case syntax we'd hope for: lookup: ExhaustiveDict[Foo, str] = {...} or lookup: ExhaustiveDict[Literal[1, 2], str] = {1: "a", 2: "b"}.)
  2. If not, can it be done for a specific pair of key/value types? (For instance, reasonable syntax might be lookup: ExhaustiveDictFooTo[str] = {...} and/or lookup: ExhaustiveDictFooToStr = {...}, as long as the definition of those types is reasonable.)

I'm happy to change the exact syntax with which we build the dictionary, but the closer it is to {Foo.X: "cool", Foo.Y: "whatever"} the better.


Additional notes for background/to be clear about what we understand:

  • We're currently using a workaround of exhaustive if statements, but it's annoying to go from a compact dict to a whole function:
    from typing import NoReturn
    def exhaustive(val: NoReturn) -> NoReturn:
        raise NotImplementedError(val)
    
    def lookup(val: Foo) -> str:
        if val is Foo.X:
            return "cool"
        elif val is Foo.Y:
            return "whatever"
        else:
            exhaustive(val)
    
    If Foo is later changed to include Z, we get an error on the last line like Argument 1 to "exhaustive" has incompatible type "Literal[Foo.Z]"; expected "NoReturn", meaning we haven't handled that case earlier in the if (the message is not immediately obvious, but one ends up just pattern-matching what it means, and it is far better than nothing). (Presumably this could also use match/case, but we're still on Python 3.9, not 3.10.)
  • This applies equally well to using Literal[1, 2, 3] or Literal["foo", "bar", "baz"] as a key type, in addition to enum.Enum.
  • This has some overlap with typing.TypedDict and its total=True default, but AFAICT, that's limited to string keys written literally into the TypedDict definition (so we'd need to convert enums to strings and have additional functionality that verifies the TypedDict definition actually matches the enum).
  • I'm basically asking how to write a Python equivalent to TypeScript's Record type, something like Record[Foo, str] or Record[Literal["foo", "bar"], str] (equivalent to Record<"foo" | "bar", string> in TypeScript).
Sinister answered 27/4, 2022 at 1:49 Comment(10)
If you need each element of an enum to map to something, just add it to the enum directly as an attribute. – Scharff
why does typing.Literal not work for you? – Sadye
@MadPhysicist Thanks. If I understand what you're suggesting correctly, that works in limited cases, but doesn't scale well (many different mappings may lead to a huge enum definition), work for external enums (can't change the definition), or lead to good architecture (putting downstream concerns into the enum definition itself). NB. I'm assuming you mean something like X = (0, "cool", 1.23, ...), Y = (1, "whatever", 4.56, ...) (plus the appropriate __new__/__init__ overrides to get the .value and the custom attributes set correctly). – Sinister
@juanpa.arrivillaga, I'm intrigued but not sure how Literal acts like an exhaustive dict. Could you be a little more specific about how I should change my code examples to use Literal and achieve my goal? – Sinister
lookup: dict[Literal[Foo.X, Foo.Y], str] = {Foo.X: "cool", Foo.Y: "whatever"} – Sadye
@juanpa.arrivillaga, thanks, unfortunately lookup: dict[Literal[Foo.X, Foo.Y, Foo.Z], str] = {Foo.X: "cool", Foo.Y: "whatever"} passes mypy just fine, plus it's rather verbose for a large enum (I have one with 29 values), so doesn't seem like the full story... however, that's slightly better in some ways because lookup's type will need to be updated if Foo's members are changed and it's indexed by values of type Foo, and one might remember to update the dict itself at the same time. If you add it as an answer and discuss the downsides, I'll upvote :) – Sinister
@huon. Can you replace Literal[Foo.X, Foo.Y, Foo.Z] with Literal[Foo._values_] or so? – Scharff
This might get you started: https://mcmap.net/q/750816/-specify-keys-for-mypy-in-python-dictionary/2988730 – Scharff
@MadPhysicist thanks again. I think writing Literal with all of the enum members and the enum Foo itself are pretty much equivalent and so that still doesn't address the main concern (making sure the dict keys are exhaustive). Also, that linked answer doesn't seem to address any of my concerns about TypedDict in my question? – Sinister
In a nutshell no, the aim you describe can't be done with the type hints Python currently provides. You'll have to maintain both the exhaustiveness and the totality in full by hand, and you'll run into problems further down the road when narrowing. – Placard
P
10

(...) mark specific dictionaries/mappings as total/exhaustive, meaning, (...) mypy will give an error about the definition of lookup in the BAD example

IOW, what the bold sentence says is: Create a type hint dependency of totality/exhaustiveness from lookup to Foo. Roughly in UML:

[Image: UML diagram of the dependency from lookup to Foo]

Dependency meaning lookup depends on changes to Foo, but with the conflated quadruple requirement that:

A. The dependency be implemented using only type hints (hence a static type checker dependency, not a run-time dependency).

B. The dependency automatically reflect changes in Foo to TypedDict without a need to rewrite type hints upon changing Foo. (This one is completely over the top.)

C. The dependency cause mypy to issue a warning when it's not satisfied.

D. The lookup dictionary keep a Totality relation to the Foo members.

Simple answer is: NO. Python doesn't have one native type hint to establish such a dependency; and it can't be achieved by combining static type hints without requiring rewrites that keep Foo and the TypedDict in sync. So the only choice is resorting to a run-time implementation, or rewriting the TypedDict definition to reflect changes to Foo. (I.e.: It's not possible to satisfy requirement A together with B.)

(The hard part is demonstrating "why not", so the following points try to build up an incremental demonstration addressing the several possibilities the question mentions.)

1. Declaration

1.1. Literal and TypedDict have to be written in full at declaration; their syntax rules don't allow writing a dynamic declaration. So writing the dependency between lookup: dict and Foo: Enum into the type hints at declaration can't be done. (It's not possible to satisfy requirements B and A together.)

See the PEP quotes below: It's not possible to declare Literal[*Foo] by unpacking or other run-time means, and the same goes for TypedDict because it doesn't have a constructor (other than explicit class syntax and the alternative syntax) that would allow the declaration to be populated as a function of Enum Foo or dict lookup type hint to capture the dependency without writing it explicitly in full.

PEP 586 - Illegal parameters for Literal at type check time

The following parameters are intentionally disallowed by design:

Arbitrary expressions like Literal[3 + 4] or Literal["foo".replace("o", "b")].

(...)

Any other types: for example, Literal[Path], or Literal[some_object_instance] are illegal. This includes typevars: if T is a typevar, Literal[T] is not allowed. Typevars can vary over only types, never over values.

And specific to the TypedDict:

PEP 589 – TypedDict: Type Hints for Dictionaries with a Fixed Set of Keys

Abstract

This PEP proposes a type constructor typing.TypedDict to support the use case where a dictionary object has a specific set of string keys, each with a value of a specific type.

Class-based Syntax

String literal forward references are valid in the value types

1.2. Using forward references wouldn't change the fact that references to the Enum members can only be written into the values not the keys (using class syntax) (requirement A is not met).

class Movie1(TypedDict):
    cool: "Literal[Foo.X]"
    whatever: "Literal[Foo.Y]"


class Movie2(TypedDict):
    cool: Literal[Foo.X]
    whatever: Literal[Foo.Y]

1.3. The keys in the TypedDict have to be strings but the strings can't have dots (it conflicts with dotted syntax) and can't be written as string literals. So the following three examples won't work (requirement A is not met):

class Movie3(TypedDict):  # Illegal  syntax
    "Foo.X": str
    "Foo.Y": str

class Movie4(TypedDict):  # Illegal  syntax
    Foo.X: str
    Foo.Y: str

# using a dotted syntax that has no corresponding variable also doesn't work
class Movie5(TypedDict):  # Illegal syntax
    a.x: str
    b.y: str

1.4. The previous point also means you could use an alias for the Enum members in order to write them into the TypedDict; the following code would work:

some_alias1 = Foo.X
some_alias2 = Foo.Y

class Movie6(TypedDict):
    some_alias1: str
    some_alias2: str

lookup_forward_ref3: Movie6 = {'some_alias1': "cool", 'some_alias2': "whatever"}

However, this would again defeat the question's main purpose of not having to write out and maintain a second group of declarations that need to be updated to reflect changes to the Enum. (Requirement B is again not met.)

1.5 TypedDict's Alternative Syntax

Using the Alternative Syntax of TypedDict (as opposed to class syntax) makes it possible to circumvent the dotted syntax problem mentioned earlier (in 1.3.); the following passes with mypy 0.931

class Foo(Enum):
   X = auto()
   Y = auto()
   Z = auto() # NEW

Movie = TypedDict('Movie',
                  {
                     'Foo.X': str,
                     'Foo.Y': str,
                     'Foo.Z': Literal[Foo.X]},   # just as a Literal example
                  total=False)

lookup2: Movie = {'Foo.X': "cool", 'Foo.Y': "whatever", 'Foo.Z': Foo.X}

This is one step closer to the possible alternatives you were asking for:

I'm happy to change the exact syntax with which we build the dictionary, but the closer it is to {Foo.X: "cool", Foo.Y: "whatever"} the better.

However, you'll still have to maintain the TypedDict declaration in sync with changes to the Enum Foo (so it doesn't satisfy requirement B, but requirements A, C and D are pretty close). If for example you tried populating the key-values with something more dynamic:

part_declaration = {
                     'Foo.X': str,
                     'Foo.Y': str,
                     'Foo.Z': Literal[Foo.X]}

Movie = TypedDict('Movie',
                  part_declaration,
                  total=False)

Mypy would remind you that:

your_module.py:27: error: TypedDict() expects a dictionary literal as the second argument
your_module.py:31: error: Extra keys ("Foo.X", "Foo.Y", "Foo.Z") for TypedDict "TypedDict"
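
To make the static/run-time split in this point concrete, here is a minimal sketch (the name fields is made up for illustration). It assumes Python 3.9+ for __required_keys__. At run time the alternative syntax accepts a computed dictionary without complaint; it is only the static checker that needs the keys spelled out as a dict literal, which is why mypy prints the errors above.

```python
from enum import Enum, auto
from typing import TypedDict


class Foo(Enum):
    X = auto()
    Y = auto()
    Z = auto()


# Computed field dict: accepted at run time, but opaque to a static type
# checker, which needs the keys written out as a dict literal.
fields = {f"Foo.{member.name}": str for member in Foo}
Movie = TypedDict("Movie", fields)  # a type checker is expected to reject this line

# The run-time class still records every key as required (total=True default).
print(sorted(Movie.__required_keys__))
```

So the keys can follow Foo automatically at run time, but nothing a type checker can see is gained from doing so.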

1.6 Use of Final Values and Literal Types

It should be emphasized that using Literals as keys to the TypedDict is only legal for string literals, not Enum literals (notice the bolds in the PEP quote). So, the TypedDict has to be declared in full; looking at Enum Literals for a solution won't change that fact. (Requirement B again not met.)

PEP 589 – TypedDict: Type Hints for Dictionaries with a Fixed Set of Keys

Use of Final Values and Literal Types

Type checkers should allow final names (PEP 591) with string values to be used instead of string literals

Type checkers are only expected to support actual string literals, not final names or literal types,

Mypy also considers Enum Literals as final (see Extra Enum checks), but that doesn't supersede the above-mentioned string literal limitation.

2 Relation between Literal[YourEnum.member] and YourEnum

In most cases there's no difference between typing a variable as the_var: Foo or the_var: Literal[Foo.X, Foo.Y, Foo.Z] if the Literal has all the Enum members, because it would accept the exact same values. The question mentions using Literals over just Foo (each Literal[Foo.X] type is a subtype of Foo, so the usual subtyping rules apply). But for the purpose of the question using Literals won't solve the problem of creating a type hint dependency between lookup and Foo that reflects changes to the latter without requiring rewrites (again requirement B not satisfied). The following two declarations are equivalent:

class Foo(Enum):
   X = auto()
   Y = auto()
   Z = auto()

var1: Foo
var2: Literal[Foo.X, Foo.Y, Foo.Z]

var1 = Foo.X
var1 = Foo.Y
var1 = Foo.Z

var2 = Foo.X
var2 = Foo.Y
var2 = Foo.Z

Let's now look at the 2 properties mentioned in the question:

3. Totality

A property of the TypedDict. As said before, the TypedDict definition has to be written in full at declaration - there's no way for the TypedDict keys to reflect changes to Enum Foo without writing those changes explicitly in the declaration. (Requirement B again not met.)

TypedDict is the definition of a dependency on the type of its values and the string values of its keys. What totality aims to capture is a dependency between instances of TypedDict and the type itself. So trying to express a relationship of totality dependence to another type can only be done by explicitly coding that dependency. (Without satisfying requirement B it's possible to satisfy requirements A and C, but you'll have to manually keep those dependencies up to date.)

4. Exhaustiveness

This property of Enums (see PEP 586 - Interactions with enums and exhaustiveness checks and mypy - Exhaustiveness checking) is mentioned in the question but it's orthogonal to requirements A and B.

Exhaustiveness is logic (the if/else branch) related to data (the Enum). It allows a run-time implementation that the static type checker verifies; it is not a type hint! (So it's not even in the league of requirement A - because it's not a type hint; it again doesn't satisfy requirement B; but it can satisfy requirement C as you've implemented it; it circumvents requirement D by implementing an explicit run-time mapping to the string constants instead of using a type hint TypedDict to maintain the dependency between lookup strings and Enum members.)

5. Conclusion

Notice that requirement B is never satisfied using static type hint checks (you have to write the type hints and maintain them). Most developers would go straight for a unit test or run-time check (or just let the KeyError be thrown because it's easier to...):

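from enum import Enum, auto
from typing import Any

class Foo(Enum):
    X = auto()
    Y = auto()
    Z = auto() # NEW

    @classmethod
    def totality(cls, lookup: dict[str, Any]) -> None:
        # Run-time check: every member needs a 'Foo.<NAME>' key in lookup.
        for member in cls:
            key = '{}.{}'.format(cls.__qualname__, member.name)
            if key not in lookup:
                raise KeyError(key)  # lookup isn't total to Enum.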
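
A possible way to call such a check is sketched below, assuming the 'Foo.<NAME>' string keys used in 1.5. This is only an illustration: it uses cls.__name__ rather than __qualname__ so the key prefix stays 'Foo' even if the class is nested, and it puts the missing key in the exception, which is an addition not in the snippet above.

```python
from enum import Enum, auto
from typing import Any, Dict


class Foo(Enum):
    X = auto()
    Y = auto()
    Z = auto()

    @classmethod
    def totality(cls, lookup: Dict[str, Any]) -> None:
        # Raise KeyError naming the first member without a 'Foo.<NAME>' entry.
        for member in cls:
            key = f"{cls.__name__}.{member.name}"
            if key not in lookup:
                raise KeyError(key)


lookup = {"Foo.X": "cool", "Foo.Y": "whatever"}  # 'Foo.Z' forgotten

try:
    Foo.totality(lookup)  # run once, right after the dict is defined
except KeyError as err:
    missing_key = err.args[0]
    print(f"lookup is not total: missing {missing_key!r}")
```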

The main use of type hints is hinting to the developer what types are acceptable. Your use departs from that by trying to establish a mapping between two sets of permissible values and turning those values together with the mapping into a type. The point of such use isn't giving you a warning if you forget to maintain something in your code, but to remind you what types (in this case mapping between values) are acceptable.

6. Addressing the questions:

I'd love to be able to mark specific dictionaries/mappings as total/exhaustive

Can be done with TypedDict. Declare the type and maintain it current to Enum Foo.

meaning, for instance, mypy will give an error about the definition of lookup in the BAD example above.

Orthogonal to the previous statement! That has nothing to do with totality; the TypedDict keeps totality in relation to its type definition. Keep its totality in relation to Foo's definition up to date and problem solved.

  1. Can this be annotated with Python's current type hints as a generic type, for any Enum or Literal input? (For instance, lookup: ExhaustiveDict[Foo, str] = {...}.)

This question doesn't make sense. The type hint you give as example works for the Enum members as shown in (2.) and you don't specify what any Literal means...? I don't see how Generic would help here.

  2. If not, can it be done for a specific pair of keys/values? (For instance, lookup: ExhaustiveDictFooTo[str] = {...} and/or lookup: ExhaustiveDictFooToStr = {...}.)

Depends on the possible lookup dictionaries, you only give a 1:1 mapping between Enum members and string Literals so nothing could be simpler, it would look like this:

combo = tuple[tuple[Literal[Foo.X], Literal['cool']], tuple[Literal[Foo.Y], Literal['whatever']]]

Problem being it's not possible to express a 1:1 key-value relationship in a dictionary using type hints. So this is what turning values into types looks like in the extreme...

but it's annoying to go from a compact dict to a whole function

The straightforward solution is writing a TypedDict mapping to the Enum members' names as keys (as mentioned in 1.5) together with a lookup instance. The type hint itself could be written as

class Foo(Enum):
   X = auto()
   Y = auto()
   Z = auto() # NEW

Movie = TypedDict('Movie',
                  {
                     'Foo.X': str,
                     'Foo.Y': Literal['whatever'], # just as a Literal example

                     'Foo.Z': Literal[Foo.X]},  # just as a Literal example
                  total=False)

lookup2: Movie = {'Foo.X': "cool", 'Foo.Y': 'whatever', 'Foo.Z': Foo.X}

If you want mypy to give you a warning you can also use the Exhaustiveness check (in 4.), but that warning is meant to remind you of oversights in writing your logic, not in the data, and not of type hints being out of date.

Placard answered 2/5, 2022 at 6:41 Comment(8)
Thanks for taking the time to try to answer this. However, there are quite a few parts that I don't understand: 1. You don't seem to explain why my desire here is a "maintainability nightmare". Could you expand? 2. The question doesn't propose using Literal to list enum members (it's redundant with Foo, as you point out), but rather making sure a solution also works with something like Literal["foo", "bar", ..] (note str members), not just enums. I'd be able to understand your answer more easily if it were cut down by keeping that focus in mind :) – Sinister
3. You imply that one option would be "maintaining a Literal ... by hand", but this doesn't seem to give any sort of totality checking, so you'll need to expand on how that option could work? 4. Relatedly, using TypeGuard doesn't seem right: the 'narrowed' type is the same as input type (mypy-play.net/…) (as you point out elsewhere), and there's no totality checking anyway. If using runtime checks, a plain assert or test is better: it fails, rather than taking an unexpected code path. (Thanks again for taking the time.) – Sinister
@Sinister your initial post was quite a handful with several issues needing to be addressed so I posted "as is" since it was also likely you wouldn't reply at all. I'm not satisfied with how I handle some aspects in my answer so I was intending on revising it. I'll try to take some time tomorrow. But yes, each of the reservations you're raising now will require one or more paragraphs likely with a revised code snippet for each. – Placard
My apologies for being misleading, the 'notes' in my post are background to indicate things that I understand (to try to short-circuit potential suggestions/ideas that don't work, and provide more context), rather than things needing to be addressed/answered. :) – Sinister
@Sinister Sorry for not getting this earlier, writing a post like this just takes time and there's no way around it. The answer pretty much tackles every aspect you've mentioned. The hard part was untangling the conflated requirements, but in so far as it's possible I addressed them separately. I am expecting you to award the bounty. – Placard
Wow, thank you for writing so much. I feel that you haven't asked for enough clarification on the questions, or given concrete enough advice for how to handle this in future. The question you're saying doesn't make sense is a little misleading (sorry!): "any literal" is my desire to have a dict[Literal[1, 2], str] that is total (must have values for both 1 and 2 as keys), similar to how the overall question is asking for dict[SomeEnum, str] that must specify values for all enum members. – Sinister
@Sinister it's not my job to ask for clarification as much as it's your job to not change requirements invalidating any answers (that's actually a hard SO rule). The one thing that emerged from the question is a lot of misunderstandings on your part about basic Python type hinting rules. Now, I'm always happy to work with OP's towards solving problems, you don't have to thank me for that. But in keeping with site etiquette and basic justice the problem's been solved as formulated so I will request you award the bounty and upvote after which I'll be more than happy to expand on the answer. – Placard
As you might infer from my profile, I'm familiar with how SO works. I apologise that I had wording that wasn't clear to you, however the requirements haven't changed (I've now explicitly tweaked the question wording, but I'm not willing to fight your unnecessary attacks like 'over the top'/'a lot of misunderstanding'/... any more). Record in TypeScript proves that this is perfectly reasonable to do with static types. I suspect there's nothing about Python's 'basic hinting rules' that stops it other than a PEP and implementation (e.g. probably could be a MyPy plugin with reasonable syntax). – Sinister
T
2

The current type hint system doesn't support this well. To get something similar done, I have had to reach for meta-programming.

In your case, I would construct the lookup dictionary based on the enum. Doing it this way, you can move the maintenance entirely into one location in code and it forces the mapping to become exhaustive (since it's constructed from the original set of values).

from enum import Enum, auto

class Foo(Enum):
   X = auto()
   Y = auto()

lookup: dict[Foo, str] = {val:key for key,val in Foo._member_map_.items()}

print(lookup[Foo.X])

However, if you are dynamically constructing the mappings or editing them dynamically all bets are off.


Another option is to shove the look ups into the Enum by giving it a reverse_lookup implementation.

from enum import Enum

class ApplicationSpecificError(Exception): ...

class Foo(Enum):
  x = 'cool'
  y = 'whatever'

  @classmethod
  def reverse_lookup(cls, key):
    try:
      return cls._value2member_map_[key] 
    except KeyError as err:
      raise ApplicationSpecificError(f'Foo.{key} is not tracked by Foo') from None

print(Foo.reverse_lookup('nope')) # ApplicationSpecificError

I see that you think match-case would give you exhaustive typing, but it doesn't. You can forget about a case and it'll be happy to ignore it entirely. Mypy and other tools might implement an option to test for match exhaustiveness/totality, but matching by itself doesn't give you either.

Toothbrush answered 4/5, 2022 at 20:8 Comment(1)
Thanks for taking the time to look at this. Both options here seem to be doing something very different to my question: the first lookup seems to be the same as the name attribute (Foo.X.name), and the second seems to be the same as Foo("nope") (plus some exception wrapping). That is, neither are mapping enum members to some arbitrary other value. For the match-case, it's just a slightly neater alternative to the if/else: case _: exhaustive(val) works just as well for exhaustiveness checking as else: exhaustive(val). – Sinister
V
1

It looks like the problem is not of static typing, but of DRY principle (Don't Repeat Yourself), also known as a "Single Source of Truth" - the data in OP example is split between 2 places, and therefore will always carry the risk of inconsistency.

Here's how I solved this with minimal extra code, combining all the data in one place, which ensures no missing mappings and automatic consistency. Type checks will still work. Rename and use as needed; __init__() and the data may take more than 2 args for any use case.

from enum import Enum
class MyError(Enum):
    def __init__(self, value, string):
        self.id = value
        self.string = string

    # fmt: off
    # @formatter: off
    ERR_OK               =  0, "Success"
    ERR_FAIL             =  1, "General failure"
    ERR_TIMEOUT          =  2, "Timed out"
    ERR_NOT_IMPLEMENTED  =  3, "Not implemented"
    # ... more items here
    # @formatter: on
    # fmt: on

    def __str__(self):
        return self.string

    def __repr__(self):
        return self.name

Use:

v = MyError.ERR_OK
v
# ERR_OK
v.name
# 'ERR_OK'
v.id
# 0
v.string
# 'Success'

For OP code, from BAD:

from enum import Enum, auto

class Foo(Enum):
   X = auto()
   Y = auto()
   Z = auto() # NEW

lookup: dict[Foo, str] = {Foo.X: "cool", Foo.Y: "whatever"}
 
print(lookup[Foo.Z]) # CHANGED

Convert to good:

from enum import Enum, auto
class Foo(Enum):
   X = auto(), "cool"
   Y = auto(), "whatever"
   Z = auto(), "won't break" # NEW
   def __init__(self, value, string):
      self.id = value
      self.string = string

print(Foo.Z.string)
# won't break

If someone forgets to add the string, e.g.

   Z = auto()  # Forgot the string!

Python will throw when loading Foo:

TypeError: __init__() missing 1 required positional argument: 'string'
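
A small self-contained sketch of that failure mode follows. It uses explicit integers instead of auto() so it does not depend on how a given Python version treats auto() inside a tuple, and it wraps the class definition in try/except purely so the error can be shown; in real code the exception would simply surface at import time.

```python
from enum import Enum

try:
    class Foo(Enum):
        def __init__(self, value, string):
            self.id = value
            self.string = string

        X = 1, "cool"
        Y = 2, "whatever"
        Z = 3  # forgot the string
except TypeError as err:
    # Raised while the class body is being turned into enum members.
    definition_error = err
    print(f"class definition failed: {err}")
else:
    definition_error = None
```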
Viviparous answered 23/3, 2024 at 3:54 Comment(5)
Thanks for taking the time to look! This is a nice approach when the look-up is a "core" part of the enum, but I don't think it scales well along two dimensions: 1. splitting things across files (e.g. for your example, maybe the look-up is a particular operation's SHOULD_RETRY: dict[MyError, bool] = { ... }; putting that look-up with the operation itself is better), 2. having a lot of look-ups, e.g. if there were 10 values to match for each enum variant, one might end up with ERR_OK = 0, "Success", 2, 3, 4, 5, 6, 7, 8, 9 and that's not so nice. – Sinister
@Sinister #1. Splitting things across files is guaranteed to create inconsistency. Asking static typing to catch those is using the wrong tool. #2. If you have to match across too many values, then you're doing something else wrong. And regardless, you can use formatted layout for your code - align commas into columns, and tell formatters/linters to not touch it, like my example code does. But that's all outside the scope of the original question. – Viviparous
Thanks for the response. I'm not sure why people keep asserting that static typing is the wrong tool for this: TypeScript's Record (mentioned in the issue description) proves that static typing works perfectly well for this sort of task. Similarly, I'm not sure why it's unreasonable to want an enum to map to several different values in different contexts. – Sinister
Python typing maturity is behind that of TypeScript. It sure is inspired by TS, but it's not a match yet, and that brings you the same answers. If you want to wait a few years, then it might do what you want. – Viviparous
In your problem statement you state the concern of inconsistent / not matching sets, that many designs have, and the best answer is the data-driven pattern, which guarantees completeness as all important info is concentrated in 1 file and not spread in N files. Enum only holds the definitions info - implementation files can be many that act on that info, so your implementations can be spread anywhere you want. What you'll realize is that the implementations get much more compact. – Viviparous
O
1

I think the simplest and least intrusive approach to solving the original problem – namely that people can forget to add items to lookup – is an assert at module level.

Starting with the original code ("BAD" version):

from enum import Enum, auto

class Foo(Enum):
   X = auto()
   Y = auto()
   Z = auto() # NEW

lookup: dict[Foo, str] = {Foo.X: "cool", Foo.Y: "whatever"}

We add the following two lines after the definition of lookup:

for val in Foo:
    assert val in lookup, f"`lookup` is missing a value for {val}"

This fails (as is) with the following message:

AssertionError: `lookup` is missing a value for Foo.Z

This is possible in Python because any executable code is allowed at module level. This is a powerful feature of Python that is easily missed, and enables all kinds of self-checks to be done before "run time".

Some people would put this in a unit test, but I think it is much better at module level immediately after the definition of lookup, because:

  • you can see by looking at the definition of lookup that it's exhaustive, so you don't have to be guarded about looking things up in it.

  • it is enforced at program startup, which is much quicker feedback than waiting for a test suite run.

Compared to static typing approaches, I think this solution is often superior:

  • You can provide your own error message that is very clear, rather than live with an often cryptic message from a type checker.

  • You can change the way it works very easily without restructuring types - for example, if you wanted to say "all items in the enum except this one", or "all items in the enum that satisfy some property" must be present.
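
As an illustration of the "all items except this one" variation, here is a minimal sketch. EXEMPT is a hypothetical name for the set of members that are deliberately left out of lookup.

```python
from enum import Enum, auto


class Foo(Enum):
    X = auto()
    Y = auto()
    Z = auto()


lookup = {Foo.X: "cool", Foo.Y: "whatever"}

# Hypothetical exemption list: members that intentionally have no entry.
EXEMPT = {Foo.Z}

# Every member that is neither exempt nor present in lookup is an error.
missing = set(Foo) - EXEMPT - set(lookup)
assert not missing, f"`lookup` is missing a value for {missing}"
```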

Compared to trying to put everything into the definition of Foo, we have much more flexibility - we can split up code into its natural locations, rather than requiring everything in one place, but still without the possibility of forgetting anything.

If you have many similar instances, and the pain of 2 similar lines of code for each instance of this pattern is too much (which I really don't think it should be), you could always use this helper, although you lose a decent error message and you're going to have to import it as well:

def exhaustive(lookup_dict):
    """
    Ensures a dict based on an enum has all keys defined.
    """
    keys = list(lookup_dict.keys())
    assert keys, "lookup_dict must have some entries defined to use this helper"

    enum_type = type(keys[0])
    assert all(
        type(k) is enum_type for k in keys
    ), "All keys must have the same type to use this helper"

    expected_keys = set(enum_type)
    missing_keys = expected_keys - set(keys)
    if missing_keys:
        missing_keys_str = ', '.join(str(k) for k in missing_keys)
        raise AssertionError(
            f"Dictionary is missing keys for: {missing_keys_str}"
        )
    return lookup_dict
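
Since the question also asks about Literal[...] keys, below is a sketch of a variant that takes the key type explicitly instead of inferring it from the first key. check_total is a made-up name, and the approach assumes typing.get_origin and typing.get_args (Python 3.8+) are acceptable for inspecting the Literal at run time. It is still a run-time check and nothing a static checker can verify.

```python
from enum import Enum, auto
from typing import Any, Literal, Mapping, get_args, get_origin


def check_total(key_type: Any, mapping: Mapping[Any, Any]) -> None:
    """Raise AssertionError if `mapping` lacks a key required by `key_type`."""
    if get_origin(key_type) is Literal:
        expected = set(get_args(key_type))   # the literal values themselves
    elif isinstance(key_type, type) and issubclass(key_type, Enum):
        expected = set(key_type)             # every (non-alias) member
    else:
        raise TypeError(f"unsupported key type: {key_type!r}")
    missing = expected - set(mapping)
    assert not missing, f"mapping is missing keys: {sorted(map(repr, missing))}"


class Foo(Enum):
    X = auto()
    Y = auto()


check_total(Foo, {Foo.X: "cool", Foo.Y: "whatever"})  # passes silently
check_total(Literal[1, 2], {1: "a", 2: "b"})          # passes silently
```

Passing the key type explicitly also means an empty dict is reported as incomplete rather than rejected by the helper.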
Octet answered 29/6, 2024 at 12:40 Comment(0)
