What does backbone mean in a neural network?

Asked 22/1, 2020 at 20:57 Answered 23/6, 2021 at 10:6

Solved deep-learning neural-network deeplab

I am getting confused with the meaning of "backbone" in neural networks, especially in the DeepLabv3+ paper. I did some research and found out that backbone could mean

the feature extraction part of a network

DeepLabv3+ uses Xception and ResNet-101 as its backbones. However, I am not familiar with the overall structure of DeepLabv3+. Which part does the backbone refer to, and which parts stay the same?

A generalized description or definition of backbone would also be appreciated.

Isleen answered 22/1, 2020 at 20:57 Comment(1)

I think that it is just a concept used in the paper arxiv.org/pdf/1703.06870.pdf. It is nothing special, just the first block of their image. – Jerusalem 22/1, 2020 at 21:57

In my understanding, the "backbone" refers to the feature extracting network which is used within the DeepLab architecture. This feature extractor is used to encode the network's input into a certain feature representation. The DeepLab framework "wraps" functionalities around this feature extractor. By doing so, the feature extractor can be exchanged and a model can be chosen to fit the task at hand in terms of accuracy, efficiency, etc.

In the case of DeepLab, the term backbone might refer to models like ResNet, Xception, MobileNet, etc.
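
To make the "exchangeable feature extractor" idea concrete, here is a minimal PyTorch sketch. It is not the DeepLab implementation: `small_backbone`, `wide_backbone`, and `SegmentationWrapper` are hypothetical names, and the two toy backbones merely stand in for real ones such as ResNet or Xception.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def small_backbone():
    """Hypothetical toy feature extractor; returns (module, output channels)."""
    layers = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    )
    return layers, 32


def wide_backbone():
    """A second hypothetical extractor with more channels per layer."""
    layers = nn.Sequential(
        nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    )
    return layers, 64


class SegmentationWrapper(nn.Module):
    """Wraps any backbone with a per-pixel classifier and upsampling."""

    def __init__(self, backbone, feature_channels, num_classes):
        super().__init__()
        self.backbone = backbone  # the exchangeable part
        self.head = nn.Conv2d(feature_channels, num_classes, kernel_size=1)

    def forward(self, images):
        features = self.backbone(images)  # encode the input
        logits = self.head(features)      # class scores per feature cell
        # Resize the scores back to the input resolution.
        return F.interpolate(
            logits, size=images.shape[-2:], mode="bilinear", align_corners=False
        )


images = torch.randn(2, 3, 64, 64)
for backbone, channels in [small_backbone(), wide_backbone()]:
    model = SegmentationWrapper(backbone, channels, num_classes=5)
    print(model(images).shape)  # torch.Size([2, 5, 64, 64]) for both backbones
```

Only the backbone and its channel count change between the two models; the wrapper code around it stays the same, which is what makes the backbone swappable.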

Kathlyn answered 25/2, 2020 at 17:40 Comment(0)

TL;DR Backbone is not a universal technical term in deep learning.

(Disclaimer: yes, there may be a specific kind of method, layer, tool etc. that is called "backbone", but there is no "backbone of a neural network" in general.)

When authors use the word "backbone" while describing a neural network architecture, they usually mean one of two things:

feature extraction (the part of the network that "sees" the input). This interpretation is not universal in the field: in my opinion, computer vision researchers would use the term to mean feature extraction, whereas natural language processing researchers would not.
informally, that the part in question is crucial to the overall method.

Baiss answered 28/1, 2020 at 9:57 Comment(3)

journals.sagepub.com/doi/full/10.1177/00368504211011343, arxiv.org/pdf/1904.01169.pdf – Muscadel 9/12, 2021 at 3:8

@Muscadel Why did you add references to these papers? – Macule 9/12, 2021 at 8:47

I just misunderstood your post, and I added only the references without a comment because of my poor English. Sorry, just forget it ^^ – Muscadel 17/12, 2021 at 9:9

Backbone is a term used in the DeepLab models/papers to refer to the feature extractor network. These feature extractor networks compute features from the input image, and a simple decoder module in the DeepLab models then upsamples these features to generate segmentation masks. The DeepLab authors report performance with different feature extractors (backbones) such as MobileNet, ResNet, and Xception.
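
The encode-then-upsample flow described above can be traced shape by shape. This is a rough sketch, assuming PyTorch; the three-layer `toy_backbone` is a hypothetical stand-in for MobileNet, ResNet, or Xception, and the real DeepLab models put more machinery (such as atrous convolutions and the ASPP module) between the backbone and the decoder.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

num_classes = 3

# Hypothetical stand-in backbone: three stride-2 convolutions,
# so the feature map is 8x smaller than the input.
toy_backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
)
# 1x1 convolution that turns features into per-class scores.
classifier = nn.Conv2d(32, num_classes, kernel_size=1)

image = torch.randn(1, 3, 128, 128)
with torch.no_grad():
    features = toy_backbone(image)        # (1, 32, 16, 16)
    coarse_logits = classifier(features)  # (1, 3, 16, 16)
    # "Decoder" reduced to its simplest form: bilinear upsampling
    # back to the input resolution.
    logits = F.interpolate(
        coarse_logits, size=image.shape[-2:], mode="bilinear", align_corners=False
    )                                     # (1, 3, 128, 128)
    mask = logits.argmax(dim=1)           # (1, 128, 128), one class id per pixel

print(features.shape, mask.shape)
```

The weights here are random, so the mask is meaningless; the point is only where the backbone ends and where the upsampling step takes over.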

Sauger answered 31/3, 2020 at 10:31 Comment(0)

CNNs are used for extracting features. Several CNNs are available as backbones, for instance AlexNet, VGGNet, and ResNet. These networks are mainly used for object classification tasks and have been evaluated on widely used benchmarks and datasets such as ImageNet. In image classification or image recognition, the classifier classifies a single object in the image, outputs a single category per image, and gives the probability of matching a class. In object detection, by contrast, the model must recognize several objects in a single image and provide the coordinates that locate each object. This is why object detection can be more difficult than image classification.

source and more info: https://link.springer.com/chapter/10.1007/978-3-030-51935-3_30
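
The difference in outputs between classification and detection can be illustrated with one shared backbone and two heads. This is my own simplified PyTorch sketch, not taken from the linked source: the toy `backbone` and both heads are hypothetical, and a real detector would add anchors, box decoding, and non-maximum suppression.

```python
import torch
import torch.nn as nn

num_classes = 4

# Hypothetical toy feature extractor shared by both tasks.
backbone = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
)

# Classification head: pool the whole feature map, predict one label per image.
classification_head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes)
)

# Simplified dense detection head: every feature-map cell proposes
# one box (4 coordinates) plus class scores.
detection_head = nn.Conv2d(32, 4 + num_classes, kernel_size=1)

images = torch.randn(2, 3, 64, 64)
with torch.no_grad():
    features = backbone(images)                                  # (2, 32, 16, 16)
    class_probs = classification_head(features).softmax(dim=1)   # (2, 4)
    detections = detection_head(features)                        # (2, 8, 16, 16)
    boxes = detections[:, :4].flatten(2).transpose(1, 2)         # (2, 256, 4)
    box_scores = detections[:, 4:].flatten(2).transpose(1, 2)    # (2, 256, 4)

print(class_probs.shape)  # one probability vector per image
print(boxes.shape)        # many candidate boxes per image
```

Classification yields a single probability vector per image, while the detection head yields many candidate boxes, each with its own coordinates and scores, on top of the same backbone features.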

Fauch answered 23/6, 2021 at 10:6 Comment(0)
