PyTorch - How to deactivate dropout in evaluation mode

This is the model I defined. It is a simple LSTM with two fully connected layers.

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class mylstm(nn.Module):
    def __init__(self,input_dim, output_dim, hidden_dim,linear_dim):
        super(mylstm, self).__init__()
        self.hidden_dim=hidden_dim
        self.lstm=nn.LSTMCell(input_dim,self.hidden_dim)
        self.linear1=nn.Linear(hidden_dim,linear_dim)
        self.linear2=nn.Linear(linear_dim,output_dim)
    def forward(self, input):
        out,_=self.lstm(input)
        out=nn.Dropout(p=0.3)(out)
        out=self.linear1(out)
        out=nn.Dropout(p=0.3)(out)
        out=self.linear2(out)
        return out

x_train and x_val are float DataFrames with shape (4478, 30), while y_train and y_val are float DataFrames with shape (4478, 10).

x_train.head()
Out[271]: 
       0       1       2       3    ...        26      27      28      29
0  1.6110  1.6100  1.6293  1.6370   ...    1.6870  1.6925  1.6950  1.6905
1  1.6100  1.6293  1.6370  1.6530   ...    1.6925  1.6950  1.6905  1.6960
2  1.6293  1.6370  1.6530  1.6537   ...    1.6950  1.6905  1.6960  1.6930
3  1.6370  1.6530  1.6537  1.6620   ...    1.6905  1.6960  1.6930  1.6955
4  1.6530  1.6537  1.6620  1.6568   ...    1.6960  1.6930  1.6955  1.7040

[5 rows x 30 columns]

x_train.shape
Out[272]: (4478, 30)

After defining the variables and running one backpropagation step, I get a validation loss of 1.4941:

model=mylstm(30,10,200,100).double()
import numpy as np
from torch import optim
optimizer=optim.RMSprop(model.parameters(), lr=0.001, alpha=0.9)
criterion=nn.L1Loss()
input_=torch.autograd.Variable(torch.from_numpy(np.array(x_train)))
target=torch.autograd.Variable(torch.from_numpy(np.array(y_train)))
input2_=torch.autograd.Variable(torch.from_numpy(np.array(x_val)))
target2=torch.autograd.Variable(torch.from_numpy(np.array(y_val)))
optimizer.zero_grad()
output=model(input_)
loss=criterion(output,target)
loss.backward()
optimizer.step()
moniter=criterion(model(input2_),target2)

moniter
Out[274]: tensor(1.4941, dtype=torch.float64, grad_fn=<L1LossBackward>)

But when I call the forward function again, I get a different number because of the randomness of dropout:

moniter=criterion(model(input2_),target2)
moniter
Out[275]: tensor(1.4943, dtype=torch.float64, grad_fn=<L1LossBackward>)

What should I do to disable all dropout in the prediction phase?

I tried eval():

moniter=criterion(model.eval()(input2_),target2)
moniter
Out[282]: tensor(1.4942, dtype=torch.float64, grad_fn=<L1LossBackward>)

moniter=criterion(model.eval()(input2_),target2)
moniter
Out[283]: tensor(1.4945, dtype=torch.float64, grad_fn=<L1LossBackward>)

I also tried passing an additional parameter p to control dropout:

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
class mylstm(nn.Module):
    def __init__(self,input_dim, output_dim, hidden_dim,linear_dim,p):
        super(mylstm, self).__init__()
        self.hidden_dim=hidden_dim
        self.lstm=nn.LSTMCell(input_dim,self.hidden_dim)
        self.linear1=nn.Linear(hidden_dim,linear_dim)
        self.linear2=nn.Linear(linear_dim,output_dim)
    def forward(self, input,p):
        out,_=self.lstm(input)
        out=nn.Dropout(p=p)(out)
        out=self.linear1(out)
        out=nn.Dropout(p=p)(out)
        out=self.linear2(out)
        return out

model=mylstm(30,10,200,100,0.3).double()

output=model(input_)
loss=criterion(output,target)
loss.backward()
optimizer.step()
moniter=criterion(model(input2_,0),target2)
Traceback (most recent call last):

  File "<ipython-input-286-e49b6fac918b>", line 1, in <module>
    output=model(input_)

  File "D:\Users\shan xu\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)

TypeError: forward() missing 1 required positional argument: 'p'

But neither of them worked.

Plummer answered 21/12, 2018 at 5:41 Comment(2)
model.eval() should work. Are you sure you haven't introduced a bug or changed the value of your input tensors? - Fordone
Yes. I tried removing the dropout layers, and the result stayed constant no matter how many times I called the model. So I think the different results are simply caused by dropout being applied. - Plummer

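You have to define your nn.Dropout layer in __init__ and assign it as an attribute of your model so that it responds to eval(). A Dropout module created inside forward() is a brand-new module on every call; it starts in training mode and is not registered as a submodule, so model.eval() never reaches it.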

So changing your model like this should work for you:

class mylstm(nn.Module):
    def __init__(self,input_dim, output_dim, hidden_dim,linear_dim,p):
        super(mylstm, self).__init__()
        self.hidden_dim=hidden_dim
        self.lstm=nn.LSTMCell(input_dim,self.hidden_dim)
        self.linear1=nn.Linear(hidden_dim,linear_dim)
        self.linear2=nn.Linear(linear_dim,output_dim)

        # define dropout layer in __init__
        self.drop_layer = nn.Dropout(p=p)
    def forward(self, input):
        out,_= self.lstm(input)

        # apply model dropout, responsive to eval()
        out= self.drop_layer(out)
        out= self.linear1(out)

        # apply model dropout, responsive to eval()
        out= self.drop_layer(out)
        out= self.linear2(out)
        return out

If you change it like this, dropout will be inactive as soon as you call eval() on the model.

NOTE: If you want to continue training afterwards you need to call train() on your model to leave evaluation mode.
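
As a quick way to check this behaviour, here is a minimal sketch. TinyNet is a hypothetical stand-in for the model above (not the original LSTM), kept small so it runs on its own. It compares two forward passes in evaluation mode and two in training mode:

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical stand-in for the model above, for illustration only."""
    def __init__(self, p=0.3):
        super().__init__()
        self.linear = nn.Linear(8, 4)
        self.drop_layer = nn.Dropout(p=p)  # registered as a submodule

    def forward(self, x):
        return self.linear(self.drop_layer(x))

torch.manual_seed(0)
model = TinyNet()
x = torch.ones(16, 8)

model.eval()                      # dropout now acts as the identity
with torch.no_grad():
    eval_a = model(x)
    eval_b = model(x)
eval_is_deterministic = torch.equal(eval_a, eval_b)

model.train()                     # dropout is active again
train_a = model(x)
train_b = model(x)
train_is_stochastic = not torch.equal(train_a, train_b)

print(eval_is_deterministic, train_is_stochastic)
```

In evaluation mode the two outputs match exactly, while in training mode each call draws a fresh dropout mask, so the outputs differ.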


You can also find a small working example for dropout with eval() for evaluation mode here: nn.Dropout vs. F.dropout pyTorch

Williawilliam answered 21/12, 2018 at 9:4 Comment(6)
Is it OK to use the same dropout layer multiple times in a model? - Discotheque
It appears that in PyTorch you have to define all the layers as attributes of the class if you want things to work well. Am I right? When I once stored the layers in a list (because I wanted things to be dynamic), they were not included in .state_dict(), so I could not save the network. I solved it by also calling setattr(self, layer_name, layer) within the net's __init__ function. It appears that PyTorch will not recursively look for additional components within non-PyTorch containers, such as lists or other data structures. - Danialdaniala
@Danialdaniala Not sure if I got you right, but you might want to take a look at torch.nn.ModuleList. - Williawilliam
Thank you @blue-phoenox, this was very helpful. So ModuleList is a list designed to hold components that will be recursively updated when calling methods such as model.eval() and model.train(), if I got it right. - Danialdaniala
@Danialdaniala Yes, using nn.ModuleList will make sure that all the parameters/modules in it get registered properly, so they will be visible to all Module methods such as train(). - Williawilliam
@Discotheque I seem to have missed your comment, sorry for that. Sure, it is no problem to use the same layer multiple times, since the dropout layer has no learned parameters. It just performs the dropout operation with the given drop rate, and it does this just as well when you use it multiple times. - Williawilliam
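
To illustrate the point about lists from the comments above, here is a small sketch. The class names PlainListNet and ModuleListNet are made up for this example. It compares layers kept in a plain Python list with layers kept in nn.ModuleList:

```python
import torch.nn as nn

class PlainListNet(nn.Module):
    def __init__(self):
        super().__init__()
        # A plain Python list: these layers are NOT registered as submodules.
        self.layers = [nn.Linear(4, 4), nn.Dropout(p=0.3)]

class ModuleListNet(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList registers every layer it holds.
        self.layers = nn.ModuleList([nn.Linear(4, 4), nn.Dropout(p=0.3)])

plain = PlainListNet().eval()
registered = ModuleListNet().eval()

# eval() only reaches registered submodules.
plain_dropout_training = plain.layers[1].training
registered_dropout_training = registered.layers[1].training

# Only registered parameters show up in the state dict.
plain_keys = list(plain.state_dict().keys())
registered_keys = list(registered.state_dict().keys())

print(plain_dropout_training, registered_dropout_training)
print(plain_keys, registered_keys)
```

The dropout layer in the plain list stays in training mode after eval(), and its sibling Linear layer is missing from state_dict(), which matches the saving problem described above.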

I'm adding this answer because I am currently facing the same issue while trying to reproduce Deep Bayesian active learning through dropout disagreement. If you need to keep dropout active (for example, to bootstrap a set of different predictions for the same test instances), you just need to leave the model in training mode; there is no need to define your own dropout layer.

Since in PyTorch you need to define your own prediction function, you can just add a parameter to it like this:

def predict_class(model, test_instance, active_dropout=False):
    if active_dropout:
        model.train()
    else:
        model.eval()
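
The snippet above only switches the mode. A fuller sketch of the idea might look like the following. The helper name predict_with_dropout, the n_samples parameter, and the toy model are assumptions made for this example, not part of the answer:

```python
import torch
import torch.nn as nn

def predict_with_dropout(model, inputs, n_samples=20):
    """Hypothetical helper: summarize several stochastic forward passes."""
    was_training = model.training
    model.train()                       # keep dropout active
    with torch.no_grad():               # no gradients needed for prediction
        samples = torch.stack([model(inputs) for _ in range(n_samples)])
    model.train(was_training)           # restore the previous mode
    return samples.mean(dim=0), samples.std(dim=0)

# Toy model for illustration only.
torch.manual_seed(0)
toy = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Dropout(p=0.3), nn.Linear(16, 3))
toy.eval()
mean, std = predict_with_dropout(toy, torch.randn(5, 8))
print(mean.shape, std.shape)
```

Note that train() also switches other mode-dependent layers (for example batch normalization) into training behaviour, so this simple version is best suited to models where dropout is the only such layer.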
Phototonus answered 13/6, 2019 at 17:35 Comment(0)

As the other answers said, the dropout layer should be defined in your model's __init__ method so that the model can keep track of every registered layer. When the model's mode changes, it passes the change on to all registered layers. For instance, calling model.eval() deactivates the dropout layers, so all activations pass through unchanged. In general, if you want to be able to deactivate dropout, define the dropout layers in __init__ using the nn.Dropout module.
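
For what it's worth, the mode is stored in each module's training attribute, which train() and eval() set recursively. The sketch below uses a made-up class, FunctionalDropoutNet, to show that flag. It also shows the functional alternative, F.dropout, which only respects the mode if you pass training=self.training yourself:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FunctionalDropoutNet(nn.Module):
    """Hypothetical example model using the functional dropout API."""
    def __init__(self, p=0.3):
        super().__init__()
        self.p = p
        self.linear = nn.Linear(8, 4)

    def forward(self, x):
        # self.training is toggled by train()/eval(); pass it on explicitly.
        x = F.dropout(x, p=self.p, training=self.training)
        return self.linear(x)

net = FunctionalDropoutNet()
net.eval()

# One flag per module: the model itself and its Linear layer.
flags_after_eval = [m.training for m in net.modules()]

x = torch.ones(2, 8)
same_in_eval = torch.equal(net(x), net(x))
print(flags_after_eval, same_in_eval)
```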

Seminal answered 17/1, 2019 at 8:42 Comment(0)
