Is it possible to use Pydantic instead of dataclasses in Structured Configs in the hydra-core Python package?

Recently I have started to use Hydra to manage the configs in my application. I use Structured Configs to create a schema for .yaml config files. Structured Configs in Hydra use dataclasses for type checking. However, I also want to use some kind of validators for some of the parameters I specify in my Structured Configs (something like this).

Do you know if it is somehow possible to use Pydantic for this purpose? When I try to use Pydantic, OmegaConf complains about it:

omegaconf.errors.ValidationError: Input class 'SomeClass' is not a structured config. did you forget to decorate it as a dataclass?
Brookhouse answered 9/1, 2022 at 8:22 Comment(1)
In addition to the answers below, it might be possible to use hydra_zen.hydrated_dataclass in combination with hydra_zen.validates_with_pydantic, though I haven't tried it myself. - Northrup
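
A note on the error message in the question: as far as I know, OmegaConf accepts a class as a Structured Config only if it is a dataclass (or an attrs class), and a class deriving from pydantic.BaseModel is neither. The stdlib-only sketch below illustrates the dataclass half of that check. PlainConfig, DataclassConfig, and looks_like_structured_config are hypothetical names, and the helper only approximates what OmegaConf does internally.

```python
import dataclasses


class PlainConfig:
    """Hypothetical stand-in for a class that is NOT a dataclass."""

    some_var: float = 0.0


@dataclasses.dataclass
class DataclassConfig:
    """Hypothetical schema declared as a standard dataclass."""

    some_var: float = 0.0


def looks_like_structured_config(cls: type) -> bool:
    # Assumption: this mirrors only the dataclass part of OmegaConf's check;
    # the real check also accepts attrs classes.
    return dataclasses.is_dataclass(cls)


print(looks_like_structured_config(PlainConfig))      # False
print(looks_like_structured_config(DataclassConfig))  # True
```

A class decorated with pydantic.dataclasses.dataclass is still a dataclass in this sense, which is why the answers below rely on it rather than on BaseModel.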
B
12

For those of you wondering how this works exactly, here is an example of it:

import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import OmegaConf
from pydantic.dataclasses import dataclass
from pydantic import validator


@dataclass
class MyConfigSchema:
    some_var: float

    @validator("some_var")
    def validate_some_var(cls, some_var: float) -> float:
        if some_var < 0:
            raise ValueError(f"'some_var' can't be less than 0, got: {some_var}")
        return some_var


cs = ConfigStore.instance()
cs.store(name="config_schema", node=MyConfigSchema)


@hydra.main(config_path="/path/to/configs", config_name="config")
def my_app(config: MyConfigSchema) -> None:
    # The 'validator' methods will be called when you run the line below
    OmegaConf.to_object(config)


if __name__ == "__main__":
    my_app()

And config.yaml:

defaults:
  - config_schema

some_var: -1  # this will raise a ValueError
Brookhouse answered 15/1, 2022 at 6:15 Comment(5)
Kıvanç, have you been able to get this to work with custom or constrained types? I get ValidationErrors. I would prefer to use such types instead of @validators. - Bank
This doesn't seem to work for me. When you call OmegaConf.to_object(), it converts the object to a dict instead of a MyConfigSchema. Because of this, MyConfigSchema is never constructed and the validator never runs. - Amias
@Bank I don't think it is possible to use types other than what Python dataclasses support. - Imperial
@Amias are you sure your .yaml file uses the Structured Config you registered (under the defaults key)? - Imperial
@KıvançYüksel That's what I was missing! I was always curious what the purpose of the name parameter in cs.store() was. Thank you! It might be helpful for other people to include a sample YAML file in the response and include the import of omegaconf so it works as is. - Amias
F
3

I wanted to provide an update as Pydantic and Hydra have changed since this question was first posed.

Here is how to do this in 2024, using a config.yaml file similar to the one above:

defaults:
  - config_schema
  - _self_

some_var: -1  # this will raise a ValueError

and a similar schema:

import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import DictConfig, OmegaConf
from pydantic import field_validator
from pydantic.dataclasses import dataclass


@dataclass
class MyConfigSchema:
    some_var: float

    @field_validator("some_var")
    @classmethod
    def validate_some_var(cls, v: float) -> float:
        if v < 0:
            raise ValueError(f"'some_var' can't be less than 0, got: {v}")
        return v


cs = ConfigStore.instance()
cs.store(name="config_schema", node=MyConfigSchema)


@hydra.main(
    version_base="1.3",
    config_path="conf/",
    config_name="config",
)
def main(cfg: DictConfig) -> None:
    """
    Main function for running experiments.

    :param cfg: Configuration object.
    :type cfg: DictConfig
    :return: None
    """
    validated_config = MyConfigSchema(**OmegaConf.to_container(cfg, resolve=True))
    print(validated_config)


if __name__ == "__main__":
    main()
Festival answered 25/2 at 3:44 Comment(0)
N
0

See pydantic.dataclasses.dataclass, which is a drop-in replacement for the standard-library dataclass decorator with some extra type-checking.

Northrup answered 10/1, 2022 at 5:58 Comment(2)
This doesn't work directly, but it works when used together with OmegaConf.to_object(config). Thanks! - Imperial
Hey @KıvançYüksel can you give an example of how you used this? - Macronucleus
T
0

The config variable in the author's answer is incorrectly annotated. Hydra always passes in a DictConfig object, which is later transformed into the custom class by the OmegaConf.to_object method. In the answer, config is a DictConfig object, not a MyConfigSchema object. Here is an edited answer that also integrates the updates from the current versions of the Pydantic and Hydra libraries.

from typing import cast

import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import DictConfig, OmegaConf
from pydantic import field_validator
from pydantic.dataclasses import dataclass


@dataclass
class MyConfigSchema:
    some_var: float

    @field_validator("some_var")
    @classmethod
    def validate_some_var(cls, v: float) -> float:
        if v < 0:
            raise ValueError(f"'some_var' can't be less than 0, got: {v}")
        return v


cs = ConfigStore.instance()
cs.store(name="config_schema", node=MyConfigSchema)


@hydra.main(version_base=None, config_path="/path/to/configs", config_name="config")
def main(dict_config: DictConfig) -> None:
    config = cast(MyConfigSchema, OmegaConf.to_object(dict_config))


if __name__ == "__main__":
    main()

This works in combination with the following config.yaml file.

defaults:
  - config_schema
  - _self_

some_var: -1  # this will raise a ValueError
Tillett answered 24/7 at 11:33 Comment(2)
You are right that it is actually a DictConfig object. However, to infer types and attributes, you duck-type with the schema that you are using. Aside from type enforcement, this is one of the main ideas behind the schema classes. Theoretically the main function can work with an instance of the dataclass as well, which makes the dataclass a more appropriate type hint. - Undo
I see that this was one of the ideas of the schema class, but it can easily break your code. For example, say that your dataclass has a property that rounds up some_var. If you use duck-typing instead of converting it to the actual class, the code will break as soon as you call the property, and there will be no warnings from the type checker. Also, note that you perform the conversion anyway in the author's answer. - Tillett
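
To make the point about properties concrete, here is a minimal stdlib-only sketch. It uses types.SimpleNamespace as a stand-in for the DictConfig that Hydra passes in (an assumption made so the snippet runs without Hydra installed), and rounded_up and describe are hypothetical names. The stand-in exposes the field as an attribute, like a duck-typed config would, but knows nothing about properties defined on the schema class.

```python
import math
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any


@dataclass
class MyConfigSchema:
    some_var: float

    @property
    def rounded_up(self) -> int:
        # Hypothetical property that exists only on the schema class.
        return math.ceil(self.some_var)


def describe(config: MyConfigSchema) -> int:
    # The annotation promises a MyConfigSchema, so a type checker
    # happily accepts the property access here.
    return config.rounded_up


# Stand-in for the DictConfig: it carries the field values but not the property.
# Typed as Any to mimic how the mismatch goes unnoticed by the type checker.
dict_like_config: Any = SimpleNamespace(some_var=1.2)

# Converting to the real class first makes the property available.
real_config = MyConfigSchema(**vars(dict_like_config))
print(describe(real_config))  # 2

# Duck-typing with the unconverted object fails only at runtime.
try:
    describe(dict_like_config)
except AttributeError as exc:
    print(f"duck-typed access failed: {exc}")
```

With an actual DictConfig I would expect a similar attribute error at the same spot, which is the argument for converting with OmegaConf.to_object before using anything beyond plain fields.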