Can't find the in-place operation: "one of the variables needed for gradient computation has been modified by an inplace operation"
I am trying to compute a loss on the Jacobian of the network (i.e. to perform double backprop), and I get the following error: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation

I can't find the inplace operation in my code, so I don't know which line to fix.

The error occurs in the last line:

loss3.backward()

from torch.autograd import Variable
from torch.autograd.gradcheck import zero_gradients

inputs_reg = Variable(data, requires_grad=True)
output_reg = self.model.forward(inputs_reg)

num_classes = output_reg.size()[1]
jacobian_list = []
grad_output = torch.zeros(*output_reg.size())

if inputs_reg.is_cuda:
    grad_output = grad_output.cuda()

for i in range(10):
    zero_gradients(inputs_reg)
    grad_output.zero_()
    grad_output[:, i] = 1
    jacobian_list.append(torch.autograd.grad(outputs=output_reg,
                                      inputs=inputs_reg,
                                      grad_outputs=grad_output,
                                      only_inputs=True,
                                      retain_graph=True,
                                      create_graph=True)[0])


jacobian = torch.stack(jacobian_list, dim=0)
loss3 = jacobian.norm()
loss3.backward()
Guttery answered 9/12, 2018 at 9:57 Comment(4)
grad_output.zero_() seems like an in-place operation. You might also have in-place operations in self.model. – Hanging
grad_output.zero_() is the in-place operation. In PyTorch, in-place operations end with an underscore. I think you wanted to write `grad_output.zero_grad()`. – Rizika
I need to zero grad_output before I set the new column (the one corresponding to the output that I want the gradient calculated for) to ones. So I changed grad_output.zero_() to grad_output[:, i-1] = 0 and it did not help. – Guttery
Actually, what I described above is replacing one in-place operation with another. – Guttery
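
Both forms bump the tensor's internal _version counter, which is what autograd checks when it raises this error. A minimal sketch of that behaviour (_version is an internal attribute, shown here only for illustration):

import torch

t = torch.zeros(2, 3)
print(t._version)  # 0
t.zero_()          # trailing underscore: in-place method
print(t._version)  # 1
t[:, 1] = 0        # slice assignment is also in-place
print(t._version)  # 2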
grad_output.zero_() is in-place, and so is grad_output[:, i-1] = 0. In-place means "modify a tensor instead of returning a new one with the modifications applied". A solution that is not in-place is torch.where. Here is an example that zeroes out the column at index 1:

import torch

t = torch.randn(3, 3)
ixs = torch.arange(3, dtype=torch.int64)
# Build a new tensor with column 1 zeroed; t itself is left untouched.
zeroed = torch.where(ixs[None, :] == 1, torch.tensor(0.), t)

print(zeroed)
# tensor([[-0.6616,  0.0000,  0.7329],
#         [ 0.8961,  0.0000, -0.1978],
#         [ 0.0798,  0.0000, -1.2041]])

print(t)
# tensor([[-0.6616, -1.6422,  0.7329],
#         [ 0.8961, -0.9623, -0.1978],
#         [ 0.0798, -0.7733, -1.2041]])

Notice how t retains the values it had before and zeroed has the values you want.
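
The same idea applies to the question's loop: build each grad_outputs tensor out of place instead of zeroing a shared buffer and writing ones into it. A sketch with placeholder shapes (batch_size and num_classes stand in for output_reg's dimensions):

import torch

batch_size, num_classes = 8, 10
cols = torch.arange(num_classes)

for i in range(num_classes):
    # torch.where returns a fresh tensor every iteration, so nothing
    # recorded by autograd is ever mutated.
    grad_output = torch.where(cols[None, :] == i,
                              torch.tensor(1.),
                              torch.tensor(0.)).expand(batch_size, num_classes)
    # ... pass grad_output as grad_outputs= to torch.autograd.grad ...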

Pedicel answered 9/12, 2018 at 13:29 Comment(0)
You can use the set_detect_anomaly function available in the autograd package to find exactly which line is responsible for the error.

Here is the link, which describes the same problem and a solution using the above-mentioned function.
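
A minimal sketch of turning it on (the commented lines stand in for the question's own model and loss code):

import torch

# With anomaly detection enabled, the error raised by backward() includes
# a traceback pointing at the forward operation whose result was later
# modified in place.
torch.autograd.set_detect_anomaly(True)

# ... build the model and compute loss3 as in the question ...
# loss3.backward()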

Madel answered 18/2, 2019 at 21:41 Comment(0)
Thanks! I replaced the problematic in-place operation on grad_output with:

inputs_reg = Variable(data, requires_grad=True)
output_reg = self.model.forward(inputs_reg)
num_classes = output_reg.size()[1]

jacobian_list = []
grad_output = torch.zeros(*output_reg.size())

if inputs_reg.is_cuda:
    grad_output = grad_output.cuda()

for i in range(5):
    zero_gradients(inputs_reg)

    # Clone so the buffer used by a previous autograd call is never modified.
    grad_output_curr = grad_output.clone()
    grad_output_curr[:, i] = 1
    jacobian_list.append(torch.autograd.grad(outputs=output_reg,
                                             inputs=inputs_reg,
                                             grad_outputs=grad_output_curr,
                                             only_inputs=True,
                                             retain_graph=True,
                                             create_graph=True)[0])

jacobian = torch.stack(jacobian_list, dim=0)
loss3 = jacobian.norm()
loss3.backward()
Guttery answered 9/12, 2018 at 13:31 Comment(1)
Please note that the grad_output_curr[:, i] = 1 line is still an in-place operation and may (or may not) cause trouble further down the line. – Pedicel
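
If you want to avoid that assignment too, one out-of-place alternative (available in newer PyTorch versions, and not part of the original answer) is torch.nn.functional.one_hot, sketched here with placeholder shapes:

import torch
import torch.nn.functional as F

batch_size, num_classes = 8, 10   # placeholders for output_reg's shape

for i in range(num_classes):
    # one_hot allocates a fresh tensor, so nothing autograd has recorded is mutated.
    grad_output_curr = F.one_hot(torch.tensor(i), num_classes).float()
    grad_output_curr = grad_output_curr.expand(batch_size, num_classes)
    # ... pass grad_output_curr as grad_outputs= to torch.autograd.grad ...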
I hope your problem got solved. I had this problem, and solutions like using clone() did not work for me, but installing PyTorch version 1.4 solved it.
I think this problem is a kind of bug in the step() function. The strange thing is that the bug appears when you use PyTorch version 1.5 but not in v1.4.
You can see all released versions of PyTorch in this link.

Demur answered 23/8, 2022 at 11:6 Comment(0)
I ran into this error while implementing PPO (Proximal Policy Optimization). I solved it by defining a target network and a main network. At the start, the target network has the same parameter values as the main network. During training, the target network's parameters are assigned to the main network every fixed number of time steps. The details can be found in this code: https://github.com/nikhilbarhate99/PPO-PyTorch/blob/master/PPO_colab.ipynb
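
A minimal sketch of that two-network pattern (policy_net and target_net are hypothetical modules; see the linked notebook for the exact setup used there):

import copy
import torch.nn as nn

# Main network; the target starts as an identical copy.
policy_net = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
target_net = copy.deepcopy(policy_net)

# ... train one network while the other provides fixed parameters ...

# Every fixed number of time steps, copy the parameters across.
target_net.load_state_dict(policy_net.state_dict())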

Symptomatology answered 25/1, 2023 at 1:52 Comment(0)
