I'm in the process of upgrading a Pydantic v1 codebase to Pydantic v2.
Pydantic has a variety of methods to create custom serialization logic for arbitrary Python objects (that is, instances of classes that don't inherit from Pydantic base classes such as BaseModel).
However, the deprecation of the v1 Config.json_encoders pattern introduces some challenges.
Namely, an arbitrary Python class Animal could be used in any number of ways by a client BaseModel class, which would only need to declare a JSON encoder. Example:
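from typing import Dict

from pydantic import BaseModel  # Pydantic v1


class Animal:
    def __init__(self, name):
        self.name = name


class Zoo(BaseModel):
    first_animal: Animal
    kennel: Dict[str, Animal]

    class Config:
        # Allow the plain Animal class to be used as a field type
        arbitrary_types_allowed = True
        json_encoders = {
            Animal: lambda v: "standard critter"
        }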
This had the advantage of ensuring that any Animal-typed attribute in the model would be serialized using the provided function.
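To make the behavior I'm after concrete, here is a rough stdlib-only analogy, not Pydantic's actual implementation: a type-keyed encoder registry passed to json.dumps via its default hook, so the encoder is found wherever an Animal appears, at any nesting depth. The ENCODERS mapping and the encode_unknown helper are hypothetical names made up for this sketch.

```python
import json


class Animal:
    def __init__(self, name):
        self.name = name


# Hypothetical registry, shaped like v1's Config.json_encoders.
ENCODERS = {Animal: lambda v: "standard critter"}


def encode_unknown(obj):
    """Called by json.dumps for any object it cannot serialize natively."""
    for cls, encoder in ENCODERS.items():
        if isinstance(obj, cls):
            return encoder(obj)
    raise TypeError(f"Unable to serialize unknown type: {type(obj)!r}")


# Animal instances at several nesting depths, including inside a list.
zoo = {
    "first_animal": Animal("rex"),
    "kennel": {"a": Animal("bo"), "nested": {"deep": [Animal("kit")]}},
}

encoded = json.dumps(zoo, default=encode_unknown)
print(encoded)
```

The point is only that the encoder is declared once and discovered at serialization time, without every containing type having to be annotated.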
The following example lets me use custom logic to JSON-serialize instances of Rock-typed members (the RockBase class already has Pydantic-like attributes which are used by the current codebase):
from typing import Annotated, Any, Dict

from pydantic.functional_serializers import PlainSerializer
from pydantic import BaseModel, ConfigDict
from pydantic import GetCoreSchemaHandler, GetJsonSchemaHandler
from pydantic_core import CoreSchema, core_schema


class RockBase:
    def __init__(self, value):
        self.value = value

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source_type: Any, handler: GetCoreSchemaHandler
    ) -> CoreSchema:
        return core_schema.general_plain_validator_function(cls.validate)

    @classmethod
    def __get_pydantic_json_schema__(
        cls, _core_schema: CoreSchema, handler: GetJsonSchemaHandler
    ) -> Dict[str, Any]:
        extra_json_base = {"type": "string"}
        return extra_json_base

    @classmethod
    def validate(cls, v=None, *args, **kwargs):
        # validation logic
        return v


# Attach the serializer to the type through Annotated metadata
Rock = Annotated[
    RockBase,
    PlainSerializer(lambda x: "a_string", return_type=str, when_used="always")
]


class SimpleAsteroid(BaseModel):
    contains: Rock
    model_config = ConfigDict(arbitrary_types_allowed=True)


mm1 = SimpleAsteroid(contains=Rock("gravel"))
print(mm1.model_dump_json())
However, the codebase I'm working with contains several Pydantic models that can hold arbitrarily nested structures.
For example, this call should result in a serializable object too:
SimpleAsteroid(contains={"extra":{"nested":{"value": Rock("it's deep")}}})
Instantiation would be possible by declaring the following model:
class SimpleAsteroid(BaseModel):
    contains: Any
    model_config = ConfigDict(arbitrary_types_allowed=True)
but serialization would be impossible, as the call to model_dump_json() fails with:
pydantic_core._pydantic_core.PydanticSerializationError: Unable to serialize unknown type: <class '__main__.RockBase'>
Is there a way to make the serialization logic available in the base RockBase class itself, and let Pydantic discover it as needed?