When using torch.autocast, how do I force individual layers to float32

I'm trying to train a model in mixed precision. However, I want a few of the layers to be in full precision for stability reasons. How do I force an individual layer to be float32 when using torch.autocast? In particular, I'd like this to be ONNX-compilable.

Is it something like:

with torch.autocast(device_type='cuda', enabled=False, dtype=torch.float16):
    out = my_unstable_layer(inputs.float())

Edit:

Looks like this is indeed the official method. See the torch docs.
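
For reference, a minimal sketch of how that pattern fits into a full forward pass (the module and layer names here are made up for illustration):

import torch

class MyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.stable_layer = torch.nn.Linear(16, 16)
        self.my_unstable_layer = torch.nn.Linear(16, 16)

    def forward(self, x):
        # Runs in float16 under the surrounding autocast context.
        x = self.stable_layer(x)
        # Locally disable autocast and cast up to float32 so this
        # layer runs in full precision.
        with torch.autocast(device_type='cuda', enabled=False):
            x = self.my_unstable_layer(x.float())
        return x

model = MyModel().cuda()
inputs = torch.randn(4, 16, device='cuda')
with torch.autocast(device_type='cuda', dtype=torch.float16):
    out = model(inputs)
print(out.dtype)  # torch.float32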

Sulphide answered 22/8, 2022 at 18:03

I think the motivation of torch.autocast is to automate the reduction of precision (not the increase).

If you have functions that need a particular dtype, you should consider using custom_fwd:

import torch

# custom_fwd casts floating-point CUDA tensor inputs to the given dtype,
# but only when autocast is enabled.
@torch.cuda.amp.custom_fwd(cast_inputs=torch.complex128)
def get_custom(x):
    print('  Decorated function received', x.dtype)

def regular_func(x):
    print('  Regular function received', x.dtype)
    get_custom(x)

x = torch.tensor(0.0, dtype=torch.half, device='cuda')
with torch.cuda.amp.autocast(False):
    print('autocast disabled')
    regular_func(x)
with torch.cuda.amp.autocast(True):
    print('autocast enabled')
    regular_func(x)

Output:

autocast disabled
  Regular function received torch.float16
  Decorated function received torch.float16
autocast enabled
  Regular function received torch.float16
  Decorated function received torch.complex128
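
For the float32 case in the question, the same decorator can go on the forward of the layer you want to keep in full precision, with cast_inputs=torch.float32 (a sketch; the Linear layer is just a stand-in for the unstable layer):

import torch

class StableLayer(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(16, 16)

    # When autocast is enabled, inputs are cast to float32 and the
    # decorated forward runs with autocast disabled.
    @torch.cuda.amp.custom_fwd(cast_inputs=torch.float32)
    def forward(self, x):
        return self.linear(x)

layer = StableLayer().cuda()
x = torch.randn(4, 16, device='cuda', dtype=torch.half)
with torch.cuda.amp.autocast(True):
    print(layer(x).dtype)  # torch.float32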

Edit: Using torchscript

I am not sure how much you can rely on this, due to a comment in the documentation; however, that comment is apparently outdated.

Here is an example where I trace the model with autocast enabled, freeze it, and then use it; the value is indeed cast to the specified type:

class Cast(torch.nn.Module):
    @torch.cuda.amp.custom_fwd(cast_inputs=torch.float64)
    def forward(self, x):
        return x

# The input must exist before tracing, since trace runs the model on it.
x = torch.tensor(0.0, dtype=torch.half, device='cuda')

with torch.cuda.amp.autocast(True):
    model = torch.jit.trace(Cast().eval(), x)
model = torch.jit.freeze(model)

print(model(x).dtype)

Output:

torch.float64

But I suggest you validate this approach before using it in a serious application.
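
For the ONNX requirement in the question, a similarly unvalidated sketch: export the traced-and-frozen model from above and check that the cast actually shows up in the exported graph (the file name is arbitrary, and whether the frozen module exports cleanly should be verified):

import onnx
import torch

# Reuses `model` and `x` from the snippet above; the float64 cast was
# recorded during tracing, so it should be baked into the traced graph.
torch.onnx.export(model, x, 'cast.onnx')

# Inspect the exported ops; a Cast node is expected if the cast survived.
print([node.op_type for node in onnx.load('cast.onnx').graph.node])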

Ruprecht answered 29/8, 2022 at 12:53 Comment(4)
Do you know if this will work with torchscript? – Sulphide
Not sure; maybe this means that it doesn't. – Ruprecht
When I use the approach I listed in my question above, it does appear to work in torch. It's just in torchscript that it fails. So I don't think the decorator is needed. – Sulphide
Check the example I appended to the answer. Does it help? – Ruprecht
