What are the advantages of using Hydra to manage my configuration files, versus loading a .yaml configuration file directly (using import yaml)?
TL;DR
If you're working on a project that has many configurable parameters, then using Hydra does make sense.
If not, it will do more harm than good: it's an extra dependency to ship with your project, other developers have to learn how to use it, and instantiating the configuration files is sometimes a headache. For smaller projects, using .py, "pure" .yaml, or even .ini files often makes more sense.
Hydra Main Features
Aside from the points mentioned in Jasha's Answer, there are two additional features that I personally use a lot from Hydra.
Object Instantiation
The first feature is the ability to instantiate objects, like classes and functions, by specifying the import path to the object as a key named _target_, alongside the values for the parameters that the object requires. For example, consider the following .yaml configuration file:
# conf/config.yaml
defaults:
  - db:
      - base
      - sqlite
  - /hydra/callbacks:
      - helper_callback
  - override hydra/help: opt_help
  - override hydra/job_logging: custom
  - _self_

# Same as using:
# from dateutil.relativedelta import relativedelta, FR
# relative_date = relativedelta(weeks=3, weekday=FR(1))
relative_date:
  _target_: dateutil.relativedelta.relativedelta
  weeks: 3
  weekday:
    _target_: dateutil.relativedelta.FR
    n: 1
Then you could instantiate relative_date using something like:
from hydra import compose, initialize
from hydra.utils import instantiate
initialize(config_path='./conf')
cfg = compose(config_name="config")
# Same as: relative_date = relativedelta(weeks=3, weekday=FR(1))
relative_date = instantiate(cfg['relative_date'])
Or:
# foo.py
import hydra
from hydra.utils import instantiate

@hydra.main(config_path="./conf", config_name="config", version_base=hydra.__version__)
def main(cfg):
    print(instantiate(cfg['relative_date']))

if __name__ == '__main__':
    main()
And executing:
$ python foo.py
relativedelta(days=+21, weekday=FR(+1))
Note: the first option works in interactive Python environments, like Jupyter, whereas the second one doesn't.
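To make the _target_ idea less magical, here is a minimal sketch of the mechanism using only the standard library. This is not Hydra's actual implementation (the real instantiate handles many more cases), and the helper names locate and build, as well as the example config, are made up for illustration:

```python
import importlib
from datetime import timedelta


def locate(dotted_path):
    """Import 'package.module.name' and return the object called 'name'."""
    module_name, _, attr = dotted_path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def build(node):
    """Recursively replace every dict that has a '_target_' key with an object."""
    if isinstance(node, dict):
        # Build nested values first, so they can be passed as arguments.
        kwargs = {key: build(value) for key, value in node.items() if key != "_target_"}
        if "_target_" in node:
            return locate(node["_target_"])(**kwargs)
        return kwargs
    return node


# Hypothetical config, written as a plain dict instead of a .yaml file.
config = {
    "_target_": "datetime.timezone",
    "offset": {"_target_": "datetime.timedelta", "hours": 2},
}

tz = build(config)  # same as: timezone(offset=timedelta(hours=2))
print(tz.utcoffset(None))  # prints 2:00:00
```

The nested offset entry plays the same role as the nested weekday entry in the .yaml file above: inner targets are built first and then passed to the outer one as keyword arguments.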
Retrieve Environment Variables
Some projects make use of environment variables. These variables are part of the environment in which a process runs (e.g. your computer). They can also be defined at the project level, inside a file named .env (which has to be loaded into the process environment first). Hydra enables you to use such variables, like so:
main:
  source: file
  debug: True
  testing: True
  user: ${oc.env:USER} # <-- Access an environment variable named "USER"
  src_dir: ${oc.env:SRC_DIR}/ # <-- Access an environment variable named "SRC_DIR"
Note: to be fair, this is a feature from OmegaConf, which is the package that Hydra uses under the hood.
Real Project Example
The tree view below shows the configuration directory of a project I've developed with Hydra, which had a huge number of configurable parameters:
conf
├── config.yaml
├── optimization.yaml
├── maintenance.yaml
├── sentry_config.yaml
├── alignment_conf
│   ├── extras.yaml
│   └── alignment.yaml
├── constraints
│   ├── air_capacity.yaml
│   ├── delivery.yaml
│   └── handling.yaml
├── db
│   ├── base.yaml
│   ├── hana_dev.yaml
│   ├── hana_prod.yaml
│   └── sqlite.yaml
├── hydra
│   ├── callbacks
│   │   └── helper_callback.yaml
│   ├── help
│   │   └── opt_help.yaml
│   └── job_logging
│       └── custom.yaml
└── solvers
    ├── cbc_cmd.yaml
    ├── choco_cmd.yaml
    ├── cplex.yaml
    ├── glpk_cmd.yaml
    ├── gurobi.yaml
    ├── mosek.yaml
    └── scip.yaml
Hydra provides a framework for config composition and instantiation.
The "config composition" part means that the data from yaml files can be combined and modified in a flexible way. You can use directives and "defaults lists" in your yaml files to include yaml files into each other, and you can use Hydra's command-line grammar to modify how your yaml data are composed when you invoke the app from your terminal. This allows for e.g. changing hyperparameter settings or swapping out different implementations of a class from the command line in a way that is more flexible and fluent than traditional solutions such as python's argparse. I recommend following Hydra's "Your first Hydra app" tutorial to get a feel for config composition.
The "instantiation" part means that you can turn a composed config into instances of your application's classes. The creation of objects that would traditionally be done in a program's "main" routine can instead be represented as yaml and later animated using Hydra's instantiate API. This extra layer of abstraction on top of your "main" routine opens up new possibilities for flexible object creation and composition.
There are several built-in convenience features such as logging support, command-line tab completion that makes it easy to discover how to modify your app's configuration at the command line, and automatic saving of a snapshot of the app's configuration in the logging directory.
Hydra has a plugin framework. There are several "sweeper" plugins that provide support for hyperparameter optimization, as well as "launcher" plugins that provide support for e.g. launching jobs remotely.
The fact that Hydra uses OmegaConf as a backend comes with several benefits:
- OmegaConf supports variable interpolations, which are like "pointers" in your config object. For example, in a yaml file you could write something like this:
    foo: 123
    bar: ${foo}
  and then later in your python code you could assert cfg.bar == 123.
- OmegaConf's "custom resolver" feature allows you to register python functions that can be invoked inline in your yaml file, essentially allowing users to define a domain-specific language for manipulating configuration data. For example, you could register a python function add_one that adds 1 to a given number, and then use this function in a yaml file like so:
    baz: ${add_one: 123}
    qux: ${add_one: ${foo}} # nested interpolations work too
  This would result in cfg.baz == 124 and cfg.qux == 124.
- OmegaConf's "structured config" support means you can create a schema that will be used to perform runtime type validation of your yaml data. See the Hydra tutorial on structured configs and the OmegaConf docs on structured configs.