Is it possible to use a dataclass rather than a TypedDict to define dict typing and make mypy happy?
I have a dataclass, let's say:

from dataclasses import dataclass

@dataclass
class Foo:
    bar: int
    baz: int

I have a function that is called from an API; it receives JSON that has been loaded as a dict:

def handler(foo) -> Foo:
    return Foo(**foo)

Is there a way to type foo without having to actually create a TypedDict mirror of the dataclass?

Such as:

from typing_extensions import TypedDict


class SerializedFoo(TypedDict):
    bar: int
    baz: int

I find it weird to have to define both.

Bemoan answered 12/10, 2022 at 12:50 Comment(6)
You can at least abstract away the construction of the typed dict in a single function: def make_typed_dict(name, cls): return TypedDict(name, {f.name: f.type for f in dataclasses.fields(cls)}); SerializedFoo = make_typed_dict('SerializedFoo', Foo).Kinematics
This illustrates a crucial point: just because the dict's keys and the class's instance attributes have the same name, they are distinct types with distinct uses; at most dataclasses could provide make_typed_dict as a class method instead of making you define it yourself.Kinematics
@Kinematics it obviously won't work for static type checkers without plugins, and TypedDict is almost useless without it...Pentatomic
The answer is no. From a typing perspective, the dict foo has nothing whatsoever to do with the dataclass Foo. The former is a generic container type, the latter is a specific concrete type. As chepner said, the fact that dict keys and attributes match has no bearing on this. And I agree with SUTerliakov that no type checker will interpret this any other way.Halflight
Frame challenge: Would typing foo actually improve things? From a typing perspective, calling handler is equivalent to calling Foo – why not use the latter directly? At some point you must tell the type checker that the input is fine and it is much simpler for that point to be Foo than some boilerplate.Swaziland
Thank you for your help on this, I used a dict and once it is loaded in the dataclass, I use the dataclass. No need for a TypedDict if the dataclass is going to validate it nonetheless.Bemoan
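The two ideas from the comments above can be sketched roughly as follows, assuming Python 3.9+. The make_typed_dict helper is the one suggested in the first comment; it works at runtime, but as pointed out, static type checkers do not follow a TypedDict built dynamically like this. The Mapping[str, Any] annotation on handler is only my guess at what "I used a dict" looks like in practice, not the asker's actual code.

```python
import dataclasses
from dataclasses import dataclass
from typing import Any, Mapping, TypedDict, get_type_hints


@dataclass
class Foo:
    bar: int
    baz: int


def make_typed_dict(name: str, cls: type) -> type:
    # Build a TypedDict at runtime from the dataclass fields.
    # Static checkers cannot follow this dynamic construction.
    fields = {f.name: f.type for f in dataclasses.fields(cls)}
    return TypedDict(name, fields)


SerializedFoo = make_typed_dict("SerializedFoo", Foo)
print(get_type_hints(SerializedFoo))  # the keys mirror Foo's fields


def handler(foo: Mapping[str, Any]) -> Foo:
    # Foo(**foo) raises TypeError on missing or unexpected keys,
    # but it does not check that the values are really ints.
    return Foo(**foo)


result = handler({"bar": 1, "baz": 2})
print(result)
```

Note that the dataclass constructor only checks key names here. If the values need validating too, that has to happen separately.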

The solution below approaches the problem from the other direction, going from a TypedDict to a dataclass. I found that many features of TypedDict are hardcoded into the type checker, so although converting a dataclass to a TypedDict would be more appealing, it seems impossible. However, if we start with a TypedDict, extract its annotations, and use them to construct the dataclass, then everything behaves as it should. With the help of dataclass_transform, the type checker is also happy:

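# Requires Python 3.12+ (PEP 695 generic syntax; dataclass_transform is in typing since 3.11).
from dataclasses import make_dataclass
from typing import TypedDict, cast, dataclass_transform

class Test(TypedDict):
    bar: int

@dataclass_transform()
def create_dataclass[T](name: str, cls: type[T]) -> type[T]:
    # Build a dataclass whose fields mirror the TypedDict's annotations.
    cls1 = make_dataclass(name, list(cls.__annotations__.items()))
    # The cast claims the result is type[T]; at runtime it is a separate dataclass.
    return cast(type[T], cls1)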

test: Test = {'bar': 1}

DataT = create_dataclass('DataT', Test)
print(DataT(
    bar=1
))

Note: I am using pyright rather than mypy here, but I think it will amount to the same result.

Corundum answered 30/1, 2024 at 16:22 Comment(3)
Can you please clarify what this is supposed to achieve? The function create_dataclass is all over the place: it claims to return a type[T] (e.g. the TypedDict Test) but actually returns a separate dataclass type, and due to @dataclass_transform it pretends to be an @dataclass-like class-to-dataclass decorator while being just a regular function. [...]Swaziland
[...] Consequently, type-checking of this code fails when doing anything non-trivial - the type-checker is convinced that DataT(bar=1)["bar"] is fine (as DataT is a Test TypedDict) which is completely wrong at runtime and should read DataT(bar=1).bar (because it is actually a dataclass).Swaziland
@MisterMiyagi Sure, so the make_dataclass function does make a dataclass from the TypedDict, that at runtime behaves as expected. I was aware that the return type isn't strictly correct, however you're quite right, since the way attributes are accessed is different it means the behaviour of the type doesn't match runtime. I think there might be a way this can be done by using the same class that each is derived from - back to the drawing board!Corundum
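To make the mismatch described in these comments concrete, here is a small sketch, assuming Python 3.9+. The helper name dataclass_from_typed_dict and the type[Any] return annotation are my own choices for illustration, not part of the answer. Returning type[Any] gives up static checking of the fields, but at least it does not claim that the result is a TypedDict.

```python
from dataclasses import is_dataclass, make_dataclass
from typing import Any, TypedDict, get_type_hints


class Test(TypedDict):
    bar: int


def dataclass_from_typed_dict(name: str, td: type) -> type[Any]:
    # Hypothetical helper: build a dataclass whose fields mirror the TypedDict keys.
    return make_dataclass(name, list(get_type_hints(td).items()))


DataT = dataclass_from_typed_dict("DataT", Test)
obj = DataT(bar=1)

print(is_dataclass(obj))  # the object is a dataclass instance, not a dict
print(obj.bar)            # attribute access works at runtime

try:
    obj["bar"]            # subscript access is what a TypedDict type would promise
except TypeError as exc:
    print("subscripting fails:", exc)
```

This is why annotating the result as type[T] misleads the checker: it accepts the subscript form, which fails at runtime, and the attribute form is the one that actually works.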
