How to override hydra working dir from within a script?

I know that I can change the working dir by setting hydra.run.dir=XXX from the command line. But how do I do it properly from within the script, without CLI arguments, so that even the logs are saved in the dir I set?

The code below won't work because:

  1. hydra and its loggers are already initialized by the time I try to change the dir, and
  2. there is no such attribute as cfg.hydra.

UPD: I got a pointer in the comments. I could change the hydra parameters in the if __name__ == '__main__': block before hydra is called. But how do I access and modify hydra.run.dir from the script?

    @hydra.main(config_path="conf", config_name="config")
    def main(cfg):
        cfg.hydra.run.dir = "./c_out/cached_loss"  # no such attribute
        logger.info('I log something')

My hydra config looks like this:

defaults:
  - hydra/job_logging: custom_logging

# hydra/custom_logging.yaml
# python logging configuration for tasks
version: 1
formatters:
  simple:
    format: '[%(asctime)s][%(name)s][%(levelname)s] - %(message)s'
handlers:
  console:
    class: logging.StreamHandler
    formatter: simple
    stream: ext://sys.stdout
  file:
    class: logging.FileHandler
    formatter: simple
    # relative to the job log directory
    filename: ${hydra.job.name}.log
root:
  level: INFO
  handlers: [console, file]

disable_existing_loggers: false
Bedwarmer answered 1/11, 2020 at 15:5 Comment(2)
I don't know Hydra. Is the main Python entrypoint in your code or in the Hydra library? If it's in your code, then you must be able to make change before Hydra initializes. If not, then you're talking about having Hydra RE-initialize itself over again based on a new location, right? That seems like a long shot. I'm curious why you need to do this dynamically. If you really do, how about a wrapper launch script that uses the command line parameter, but hides that from you and sets the directory per whatever method you were going to use in the main program.Hummingbird
Thanks, @Steve. That makes sense. My entry point is the if __name__ == '__main__': block, which I assume runs before hydra. I could try to change the parameters there, but there is no obvious way, since hydra hides its internal config from the user after the script is launched.Bedwarmer
9

The @hydra.main decorator reads the command line arguments from sys.argv and creates the output directory and sets up logging based on the arguments, before the decorated function is executed. You don't have the configuration before entering the function, but you could add the hydra.run.dir=XXX command line argument before calling the function with this kind of hack:

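import logging
import sys
import hydra

logger = logging.getLogger(__name__)

@hydra.main(config_path="conf", config_name="config")
def main(cfg):
    logger.info('I log something')

if __name__ == '__main__':
    # Inject the override before hydra parses sys.argv
    sys.argv.append('hydra.run.dir=c_out/cached_loss')
    main()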
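
One caveat with appending unconditionally: if hydra.run.dir is also passed on the command line, the same key ends up in sys.argv twice. A minimal sketch of a guard, assuming overrides arrive in the plain key=value form. The helper name add_default_override is hypothetical and not part of hydra:

```python
def add_default_override(argv, key, value):
    """Return a copy of argv with `key=value` appended, unless argv already sets key.

    Hypothetical helper: it only recognises the plain `key=value` form,
    not prefixed forms such as `+key=value`.
    """
    prefix = key + "="
    # argv[0] is the script name, so only the remaining items can be overrides.
    if any(arg.startswith(prefix) for arg in argv[1:]):
        return list(argv)
    return list(argv) + [prefix + value]


# An explicit command line value is left alone; otherwise the default is added.
print(add_default_override(["train.py"], "hydra.run.dir", "c_out/cached_loss"))
print(add_default_override(["train.py", "hydra.run.dir=elsewhere"],
                           "hydra.run.dir", "c_out/cached_loss"))
```

In the entry point this would be sys.argv = add_default_override(sys.argv, 'hydra.run.dir', 'c_out/cached_loss') followed by main().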
Flawed answered 27/5, 2021 at 10:38 Comment(3)
well said! (this is exactly right for the OP's question, but for anyone applying the idea to --multirun, note that the relevant parameters are different: hydra.sweep.dir and hydra.sweep.subdir.)Recant
Can we change the dir inside main()?Roderickroderigo
@MetaFan TL;DR - No.Flawed
5

You can change it BEFORE the script starts by overriding that parameter.

python foo.py hydra.run.dir=something

You can also change it in your config: config.yaml

hydra:
  run:
    dir: whatever

The config can also read the directory from an environment variable through the OmegaConf env resolver.

hydra:
  run:
    dir: ${env:HYDRA_OUTPUT_DIR,default_output_dir}

If you just want to change the working directory at runtime, you can do it with os.chdir().
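
Note that os.chdir() alone does not move a log file that is already open, which is why it does not cover the logging part of the question. A small standard-library sketch of that behaviour, with no hydra involved (the function name and the job.log filename are made up for illustration): logging.FileHandler resolves a relative filename to an absolute path when it is constructed, so a later chdir has no effect on it.

```python
import logging
import os
import tempfile


def log_path_after_chdir():
    """Create a FileHandler, chdir elsewhere, log, and report where the file is."""
    start_dir = tempfile.mkdtemp()
    new_dir = tempfile.mkdtemp()
    original_cwd = os.getcwd()
    os.chdir(start_dir)
    try:
        # The relative name is turned into an absolute path right here.
        handler = logging.FileHandler("job.log")
        os.chdir(new_dir)  # too late: the handler keeps its original path

        logger = logging.getLogger("chdir_demo")
        logger.propagate = False
        logger.setLevel(logging.INFO)
        logger.addHandler(handler)
        logger.info("I log something")
        logger.removeHandler(handler)
        handler.close()

        return (os.path.realpath(handler.baseFilename),
                os.path.realpath(start_dir),
                os.path.realpath(new_dir))
    finally:
        os.chdir(original_cwd)


log_file, start_dir, new_dir = log_path_after_chdir()
# The log file stays in the directory that was current when the handler was built.
print(os.path.dirname(log_file) == start_dir)
print(os.path.exists(os.path.join(new_dir, "job.log")))
```
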

Eileen answered 1/11, 2020 at 18:31 Comment(4)
But the question is how to do it w/o using the command line arguments...Bedwarmer
Changing the working dir will not change the log file path, which is configured by hydra during initialization (it would create log handlers before I change the working dir).Bedwarmer
I'm sorry about that. I have clarified the question one more time. I will change my vote when you edit your answer.Bedwarmer
In the new version of OmegaConf you should use oc.env:**** instead of env:****Lightfoot
5

This can be achieved via OmegaConf interpolation.
For example, my use case is creating a directory named with a uuid.
First, register a resolver with the function you need in place of the lambda:

from omegaconf import OmegaConf

OmegaConf.register_resolver("uuid", lambda : "fdjsfas-3213-kjfdsf")

Then, in the hydra config:

hydra:
  run:
    dir: ./outputs/training/${uuid:}

This is still not really accessing it from the script, but it allows Python code to generate config variables. I don't really think there is a normal way to alter the hydra config after it has been initialized.

P.S. I use structured configs and had to change the code, so it may not actually work, but I hope you get the idea.

Receptive answered 10/11, 2020 at 8:33 Comment(0)
