Implement SeparableConv2D in PyTorch

Main objective

PyTorch equivalent for SeparableConv2D with padding = 'same':

from tensorflow.keras.layers import SeparableConv2D
x = SeparableConv2D(64, (1, 16), use_bias = False, padding = 'same')(x)

What is the PyTorch equivalent for SeparableConv2D?

This source says:

If groups = nInputPlane, kernel=(K, 1), (and before is a Conv2d layer with groups=1 and kernel=(1, K)), then it is separable.

While this source says:

Its core idea is to break down a complete convolution operation into a two-step calculation: Depthwise Convolution and Pointwise Convolution.
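
To make sure I'm reading the first quote correctly, here is a small sketch of what I think it describes: a (1, K) conv with groups=1 followed by a (K, 1) conv with groups equal to the number of planes. K and planes are placeholder values I made up, and the padding is only there to keep the spatial size. The second quote's depthwise-plus-pointwise decomposition is what my attempt below tries to implement.

import torch.nn as nn

# My reading of the first quote: a (1, K) conv with groups=1,
# followed by a (K, 1) conv with groups equal to the number of planes.
# K and planes are placeholder values.
K, planes = 3, 16
spatially_separable = nn.Sequential(
    nn.Conv2d(planes, planes, kernel_size=(1, K), padding=(0, K // 2), groups=1),
    nn.Conv2d(planes, planes, kernel_size=(K, 1), padding=(K // 2, 0), groups=planes),
)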

This is my attempt:

class SeparableConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, depth, kernel_size, bias=False):
        super(SeparableConv2d, self).__init__()
        self.depthwise = nn.Conv2d(in_channels, out_channels*depth, kernel_size=kernel_size, groups=in_channels, bias=bias)
        self.pointwise = nn.Conv2d(out_channels*depth, out_channels, kernel_size=1, bias=bias)

    def forward(self, x):
        out = self.depthwise(x)
        out = self.pointwise(out)
        return out

Is this correct? Is this equivalent to tensorflow.keras.layers.SeparableConv2D?

What about padding = 'same'?

How can I ensure that my input and output sizes are the same while doing this?

My attempt:

x = F.pad(x, (8, 7, 0, 0))

Because the kernel size is (1, 16), I added left and right padding of 8 and 7 respectively. Is this the right (and best) way to achieve padding = 'same'? How can I place this inside my SeparableConv2d class and calculate it on the fly from the input dimensions?
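
Something like the helper below is what I have in mind, but I'm not sure it matches TF's convention (pad_same is a name I made up, and I assume stride 1; I believe TF's 'same' puts the extra pixel on the right/bottom when the total padding is odd):

import torch.nn.functional as F

def pad_same(x, kernel_size):
    # Total padding per spatial dim for stride 1 is kernel - 1; when that is odd,
    # the extra pixel goes on the right/bottom (which I believe matches TF 'same').
    kh, kw = kernel_size
    pad_h, pad_w = kh - 1, kw - 1
    top, bottom = pad_h // 2, pad_h - pad_h // 2
    left, right = pad_w // 2, pad_w - pad_w // 2
    return F.pad(x, (left, right, top, bottom))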

All together

import torch
import torch.nn as nn
import torch.nn.functional as F


class SeparableConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, depth, kernel_size, bias=False):
        super(SeparableConv2d, self).__init__()
        self.depthwise = nn.Conv2d(in_channels, out_channels*depth, kernel_size=kernel_size, groups=in_channels, bias=bias)
        self.pointwise = nn.Conv2d(out_channels*depth, out_channels, kernel_size=1, bias=bias)

    def forward(self, x):
        out = self.depthwise(x)
        out = self.pointwise(out)
        return out


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.separable_conv = SeparableConv2d(
            in_channels=32, 
            out_channels=64, 
            depth=1, 
            kernel_size=(1,16)
        )
        
    def forward(self, x):
        x = F.pad(x, (8, 7, 0, 0))
        x = self.separable_conv(x)
        return x
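
A quick shape check I would run to verify the 'same' behavior (the input size here is made up):

import torch

net = Net()
x = torch.randn(1, 32, 8, 128)   # (batch, channels, height, width) - made-up sizes
out = net(x)
print(out.shape)                 # expecting torch.Size([1, 64, 8, 128]) if the padding is right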

Is there any problem with this code?

Dacoit asked 5/12, 2020 at 5:47

The linked definitions generally agree; the clearest one is in the article.

  • "Depthwise" (not a very intuitive name since depth is not involved) - is a series of regular 2d convolutions, just applied to layers of the data separately. - "Pointwise" is same as Conv2d with 1x1 kernel.

I suggest a few corrections to your SeparableConv2d class:

  • There is no need for the depth parameter - it is the same as out_channels.
  • I set padding to 1 to ensure the same output size with kernel=(3,3). If the kernel size is different, adjust the padding accordingly, using the same principles as with a regular Conv2d. Your example class Net() is no longer needed - the padding is done inside SeparableConv2d.

This is the updated code, which should be similar to the tf.keras.layers.SeparableConv2D implementation:

class SeparableConv2d(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, bias=False):
        super(SeparableConv2d, self).__init__()
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=kernel_size,
                                   groups=in_channels, bias=bias, padding=1)
        self.pointwise = nn.Conv2d(in_channels, out_channels,
                                   kernel_size=1, bias=bias)

    def forward(self, x):
        out = self.depthwise(x)
        out = self.pointwise(out)
        return out
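
For example, a quick sanity check with the (3, 3) kernel assumed above (the input size is arbitrary):

import torch

conv = SeparableConv2d(in_channels=32, out_channels=64, kernel_size=3)
x = torch.randn(1, 32, 28, 28)   # arbitrary input size
print(conv(x).shape)             # torch.Size([1, 64, 28, 28]) - spatial size preserved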
Violetvioleta answered 5/12, 2020 at 8:27 Comment(4)
Thanks for your answer; I have questions about the self.depthwise layer. Why is it nn.Conv2d(1, 1) and not nn.Conv2d(in_channels, in_channels)? And why is the groups parameter not used, instead doing an x.reshape first followed by a Conv2d with 1 filter? – Dacoit
My example was based on the assumption that all layers are convolved depthwise with the same filter. After reviewing the tf.keras.layers.SeparableConv2D implementation, I see that TF uses a different filter for each layer. You are correctly suggesting that it should be nn.Conv2d(in_channels=N, out_channels=N, groups=N) # N is in_channels. I will update the example in my solution with both options. – Violetvioleta
What is torch_s2d? (I'm getting name 'torch_s2d' is not defined here.) And what is self.depthwise.weight[1:] = torch_s2d.depthwise.weight[0] doing - setting initial weights? – Dacoit
torch_s2d is a leftover from testing. I removed it from the answer and also removed the second case - I just left the one that should match the TF implementation, as you asked. – Violetvioleta
