Intuition behind U-net vs FCN for semantic segmentation

I don't quite understand the following:

In the FCN for semantic segmentation proposed by Shelhamer et al., the network makes pixel-to-pixel predictions to construct masks (exact locations) of objects in an image.

In the slightly modified version of the FCN for biomedical image segmentation, the U-net, the main difference seems to be "a concatenation with the correspondingly cropped feature map from the contracting path."

Now, why does this feature make a difference for biomedical segmentation in particular? The main differences I can point out between biomedical images and other data sets are that objects in biomedical images are not defined by as rich a set of features as common everyday objects, and that the data sets are limited in size. Is this extra feature inspired by these two facts, or by something else?

Smelser answered 8/5, 2018 at 18:9 Comment(0)

FCN vs U-Net:

FCN

  1. The base model (FCN-32s) upsamples only once, i.e. it has only one layer in the decoder.
  2. The original implementation (GitHub repo) uses bilinear interpolation to upsample the convolved feature map, so there is no learnable filter at that step.
  3. Variants of FCN (FCN-16s and FCN-8s) add skip connections from lower layers to make the output robust to scale changes.

U-Net

  1. It has multiple upsampling layers.
  2. It uses skip connections and concatenates the feature maps instead of adding them up.
  3. It uses learnable weight filters instead of a fixed interpolation technique.
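
The difference between adding and concatenating skip connections can be seen in a small shape-level sketch. This is only an illustration, not code from either paper: feature maps are plain nested lists, the helper names (`shape`, `upsample_nearest`, `fuse_add`, `fuse_concat`, `constant_map`) are hypothetical, and nearest-neighbour upsampling stands in for the bilinear or learned upsampling that the real networks use.

```python
# Feature maps are nested lists indexed as [channel][row][col].
# All helper names here are hypothetical, chosen for illustration only.

def shape(fmap):
    """Return (channels, height, width) of a nested-list feature map."""
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))

def upsample_nearest(fmap, factor=2):
    """Nearest-neighbour upsampling (a stand-in for bilinear or learned upsampling)."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(factor)]  # repeat each column
            rows.extend(list(wide) for _ in range(factor))  # repeat each row
        out.append(rows)
    return out

def fuse_add(a, b):
    """FCN-16s/8s style fusion: element-wise sum, so shapes must match exactly."""
    assert shape(a) == shape(b)
    return [[[x + y for x, y in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]

def fuse_concat(a, b):
    """U-Net style fusion: stack along the channel axis; only H and W must match."""
    assert shape(a)[1:] == shape(b)[1:]
    return a + b

def constant_map(channels, height, width, value):
    """Build a feature map filled with a single value."""
    return [[[value] * width for _ in range(height)] for _ in range(channels)]

coarse = constant_map(3, 2, 2, 1.0)  # low-resolution decoder input
skip = constant_map(3, 4, 4, 0.5)    # higher-resolution map from the encoder

up = upsample_nearest(coarse)        # (3, 4, 4)
added = fuse_add(up, skip)           # channel count unchanged
stacked = fuse_concat(up, skip)      # channel count doubles

print(shape(added))    # (3, 4, 4)
print(shape(stacked))  # (6, 4, 4)
```

With addition, the two inputs are merged into a single map of the same depth. With concatenation, both are kept side by side, and the convolutions that follow can learn how to combine them.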
Vinnievinnitsa answered 18/6, 2018 at 14:23 Comment(2)
For whatever reason, VGG16-FCN-8s (see my keras conversion github.com/dmitryako/keras_fcn_8s) worked much better for me, i.e. I could not get better results with U-Net. - Credit
Hello. The results depend on the task and on the dataset being used. U-Net was specifically shown to work well with less data [with data augmentation techniques]. In my experience U-Net usually gives better performance because it has multiple upsampling layers along with more skip connections, which theoretically make it more robust to scale variations than FCN. BTW, what task were you doing and what dataset did you use? Also, can you post a link to your research papers here? - Vinnievinnitsa

U-Net is built upon J. Long's FCN paper. One difference is that the original FCN paper uses the decoder half to upsample the classification output (i.e. the entire second half of the net has depth C, the number of classes).

U-Net treats the second half as operating in feature space and does the final classification at the end.
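
A rough way to picture this is to track only the channel depth at each decoder stage. The sketch below is an assumption-laden illustration: the function names are hypothetical, the class counts (21 and 2) and the 1024-channel bottleneck are example values, and halving the channel count at every stage is just one common U-Net layout.

```python
def fcn_decoder_channels(num_classes, num_upsamples):
    """FCN style: features are scored per class first, so every decoder
    stage carries num_classes channels."""
    return [num_classes] * (num_upsamples + 1)

def unet_decoder_channels(bottleneck_channels, num_upsamples, num_classes):
    """U-Net style: the decoder stays in feature space (channels halve at
    each stage here) and a final layer maps the features to classes."""
    widths = [bottleneck_channels]
    for _ in range(num_upsamples):
        widths.append(widths[-1] // 2)
    return widths + [num_classes]

# Example values only, not taken from this thread.
print(fcn_decoder_channels(21, 3))        # [21, 21, 21, 21]
print(unet_decoder_channels(1024, 4, 2))  # [1024, 512, 256, 128, 64, 2]
```

In the FCN-style list, the class decision is made before any upsampling happens. In the U-Net-style list, the class count appears only in the last entry.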

Nothing about it is specific to biomedical imaging, IMO.

Boykin answered 18/5, 2018 at 19:11 Comment(1)
You are right, U-Net is not specifically biomedical. It just fits biomedical applications well, where accuracy (especially in shape) is critical, and U-Net's skip connections help a lot with that. - Trajectory