How to elegantly use a Python dataclass singleton for config?

For some reason, I use a TOML config instead of a .py config.

Then the singleton nightmare begins: I keep running into either a "not initialized" error or a RecursionError.

Is there an elegant way to use a Python dataclass for project config?

@dataclasses.dataclass(frozen=True)
class Config:
    _instance: typing.Optional[typing.Self] = dataclasses.field(init=False, repr=False)

    Targets: list[Target]

    FFmpeg: typing.Optional[FFmpeg]

    Whisper: typing.Optional[Whisper]

    Translate: typing.Optional[Translate]

    Srt: typing.Optional[Srt]

    Log: typing.Optional[Log]

    @classmethod
    def init_config(cls) -> typing.Self:
        if cls._instance is None:
            config = pathlib.Path().absolute().joinpath("config.toml")
            with open(config, "rb") as f:
                data = tomllib.load(f)
            log_config = Log(level=logging.DEBUG, count=0, size=0)
            targets: list[Target] = list()
            srt_config = Srt(overwrite=data['srt']['overwrite'], bilingual=data['srt']['bilingual'])
            cls._instance = Config(Targets=targets,
                                   FFmpeg=ffmpeg_config,
                                   Whisper=whisper_config,
                                   Translate=translate_config,
                                   Srt=srt_config,
                                   Log=log_config)
            return cls._instance


CONFIG: typing.Optional[Config] = Config.init_config()

Or I get other errors, like the one below:

    if cls._instance is None:
       ^^^^^^^^^^^^^
AttributeError: type object 'Config' has no attribute '_instance'
Locular answered 30/6, 2023 at 2:30 Comment(1)
Does it have to be a self-written dataclass? There are third-party libraries out there that have already solved this problem, even with similar-looking patterns. (Bolinger)

You can use functools.cache.

@dataclasses.dataclass(frozen=True)
class Config:
    Targets: list[Target]
    FFmpeg: typing.Optional[FFmpeg]
    Whisper: typing.Optional[Whisper]
    Translate: typing.Optional[Translate]
    Srt: typing.Optional[Srt]
    Log: typing.Optional[Log]

@functools.cache
def config():
    config_filename = pathlib.Path().absolute().joinpath("config.toml")
    with open(config_filename, "rb") as f:
        data = tomllib.load(f)
    log_config = Log(level=logging.DEBUG, count=0, size=0)
    targets: list[Target] = list()
    srt_config = Srt(overwrite=data['srt']['overwrite'], bilingual=data['srt']['bilingual'])
    config_ = Config(Targets=targets,
                     FFmpeg=ffmpeg_config,
                     Whisper=whisper_config,
                     Translate=translate_config,
                     Srt=srt_config,
                     Log=log_config)
    return config_  # return the instance, not the config function itself

A bonus I like is the ability to clear the cache with config.cache_clear() so that the config is generated again on the next call. If that is a useful use case for you, consider never storing the returned config anywhere; just call config().FFmpeg, for example. The function call overhead is negligible.

Fabozzi answered 3/7, 2023 at 20:26 Comment(0)

Actually, there is no need to cache your singleton instance in an _instance attribute. Just create the instance, assign it to a top-level name, and have your code import that name instead of the class:

@dataclasses.dataclass
class _Config:    # "_" prefix indicating this should not be used by normal code.
    # Add relevant, publicly visible fields:
    Targets: list[Target]
    ...

def _init_config():  # <- no need for a method: this runs once, not every time someone needs the "Config"
    # It is only going to be used once, so there is no need
    # to check whether anything already exists.

    # If any code that _knows_ what it is doing calls this again,
    # just regenerate the config.
    config_path = pathlib.Path(__file__).parent.absolute() / "config.toml"
    with config_path.open("rb") as f:
        data = tomllib.load(f)
    ...
    # No need for an esoteric, hard-to-reach name:
    # just use a plain name that says what it is, like "Config".
    Config = _Config(Targets=targets, ...)
    return Config

# Public name: the configuration instance, already loaded. Other code can import and use `Config` alone:
Config = _init_config()

Sometimes, simpler is better. Other times it is way better.

Also, I simplified the code that sets the path and reads the config file. Your call to pathlib.Path() depended on the directory your program was launched from, which is not good. I anchored the path to the module's own location via __file__ so that it becomes deterministic. Depending on the file layout, which you don't show, you may need to add another .parent to reach the directory where the TOML file actually is.

Indurate answered 3/7, 2023 at 13:51 Comment(0)

You can create a module named config and import it. Because Python caches imported modules, you get the guarantee that it is a singleton.

config_filename = pathlib.Path().absolute().joinpath("config.toml")
with open(config_filename, "rb") as f:
    data = tomllib.load(f)

Targets: list[Target] = list()
FFmpeg: typing.Optional[FFmpeg] = ffmpeg_config
Whisper: typing.Optional[Whisper] = whisper_config
Translate: typing.Optional[Translate] = translate_config
Srt: typing.Optional[Srt] = Srt(overwrite=data['srt']['overwrite'], bilingual=data['srt']['bilingual'])
Log: typing.Optional[Log] = Log(level=logging.DEBUG, count=0, size=0)

del config_filename, data, f

If you want to discourage overwriting attributes, you can wrap each annotation in typing.Final[...]. This won't prevent users from changing values at runtime, but a type checker will warn them that it isn't something they should do.
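A small sketch of what `typing.Final` does and does not do; `Overwrite` and `Bilingual` are hypothetical module-level settings used only for illustration:

```python
import typing

# Hypothetical module-level settings, as they might appear in config.py.
Overwrite: typing.Final[bool] = True
Bilingual: typing.Final[bool] = False

# A static type checker reports this reassignment of a Final name,
# but the interpreter executes it without complaint.
Overwrite = False
```

The protection is purely static: it only helps if the project actually runs a type checker.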

Fabozzi answered 3/7, 2023 at 20:47 Comment(0)
