What is the difference between 'torch.backends.cudnn.deterministic=True' and 'torch.set_deterministic(True)'?

My network includes 'torch.nn.MaxPool3d', which, according to the PyTorch docs (version 1.7 - https://pytorch.org/docs/stable/generated/torch.set_deterministic.html#torch.set_deterministic), throws a RuntimeError when the cudnn deterministic flag is on. However, when I inserted 'torch.backends.cudnn.deterministic=True' at the beginning of my code, there was no RuntimeError. Why doesn't that code throw a RuntimeError? I wonder whether that code guarantees deterministic computation in my training process.

Rooster answered 10/2, 2021 at 3:32 Comment(0)

torch.backends.cudnn.deterministic=True only applies to CUDA convolution operations, and nothing else. Therefore, no, it will not guarantee that your training process is deterministic, since you're also using torch.nn.MaxPool3d, whose backward function is nondeterministic for CUDA.
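
To see that the two settings are independent, here is a minimal sketch, assuming PyTorch 1.8 or later (where torch.are_deterministic_algorithms_enabled() is available). The variable names are only illustrative. It sets the cuDNN flag and then reads back both settings:

```python
import torch

# Ask cuDNN to pick only deterministic convolution algorithms.
torch.backends.cudnn.deterministic = True

# Read back both settings. The cuDNN flag is now on, but the global
# deterministic-algorithms mode is a separate switch that is untouched.
cudnn_flag = torch.backends.cudnn.deterministic
global_mode = torch.are_deterministic_algorithms_enabled()

print("cudnn.deterministic:", cudnn_flag)
print("deterministic algorithms enabled:", global_mode)
```

Because the global mode stays off, nothing checks MaxPool3d's backward pass, which is why no RuntimeError appears.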

torch.set_deterministic(), on the other hand, affects all the normally-nondeterministic operations listed here (note that set_deterministic has been renamed to use_deterministic_algorithms in 1.8): https://pytorch.org/docs/stable/generated/torch.use_deterministic_algorithms.html?highlight=use_deterministic#torch.use_deterministic_algorithms
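
If your code has to run on both sides of that rename, one option is a small wrapper. This is only a sketch: enable_deterministic_mode is a hypothetical helper name, not part of PyTorch, and it simply calls whichever of the two functions the installed version provides.

```python
import torch


def enable_deterministic_mode(flag=True):
    """Hypothetical helper: turn global deterministic mode on or off.

    Returns the name of the PyTorch function that was used.
    """
    if hasattr(torch, "use_deterministic_algorithms"):
        # PyTorch 1.8 and later.
        torch.use_deterministic_algorithms(flag)
        return "use_deterministic_algorithms"
    if hasattr(torch, "set_deterministic"):
        # PyTorch 1.7 name, before the rename.
        torch.set_deterministic(flag)
        return "set_deterministic"
    raise RuntimeError("This PyTorch version has no global deterministic mode")


used_api = enable_deterministic_mode(True)
print("enabled via torch." + used_api)
```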

As the documentation states, some of the listed operations don't have a deterministic implementation. So if torch.use_deterministic_algorithms(True) is set, they will throw an error.
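
A rough way to check this on your own setup is sketched below. It assumes PyTorch 1.8 or later, and it only exercises the GPU path when a CUDA device is present. maxpool3d_backward_status is a hypothetical helper name, and whether the error is raised depends on your PyTorch version, since deterministic implementations may be added over time.

```python
import torch


def maxpool3d_backward_status():
    """Hypothetical helper: try MaxPool3d backward in deterministic mode.

    Returns a short string describing what happened.
    """
    if not torch.cuda.is_available():
        return "skipped (no CUDA device)"

    torch.use_deterministic_algorithms(True)
    try:
        # A tiny 5D input (N, C, D, H, W) on the GPU that requires grad.
        x = torch.randn(1, 1, 4, 4, 4, device="cuda", requires_grad=True)
        out = torch.nn.MaxPool3d(kernel_size=2)(x)
        out.sum().backward()
        return "ran without error"
    except RuntimeError as exc:
        return "RuntimeError: " + str(exc)
    finally:
        # Restore the default so later code is not affected.
        torch.use_deterministic_algorithms(False)


status = maxpool3d_backward_status()
print(status)
```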

If you need to use nondeterministic operations like torch.nn.MaxPool3d, then, at the moment, there is no way for your training process to be deterministic, unless you write a custom deterministic implementation yourself. Alternatively, you could open a GitHub issue requesting a deterministic implementation: https://github.com/pytorch/pytorch/issues

In addition, you might want to check out this page: https://pytorch.org/docs/stable/notes/randomness.html

Jann answered 16/3, 2021 at 0:10 Comment(2)
This is gold: "As the documentation states, some of the listed operations don't have a deterministic implementation. So if torch.use_deterministic_algorithms(True) is set, they will throw an error." Thanks, I was confused about the errors I saw at one point. - Mainsail
If you're looking for the first line of the answer in the docs: pytorch.org/docs/stable/… "A bool that, if True, causes cuDNN to only use deterministic convolution algorithms." - Benzol
