Extract features from last hidden layer Pytorch Resnet18

Asked 10/3, 2019 at 1:29 Answered 2/6, 2023 at 18:45

Solved python conv-neural-network pytorch

I am implementing an image classifier using the Oxford Pet dataset with the pre-trained Resnet18 CNN. The dataset consists of 37 categories with ~200 images in each of them.

Rather than using the final fc layer of the CNN as output to make predictions, I want to use the CNN as a feature extractor to classify the pets.

For each image, I'd like to grab the features from the last hidden layer (the one just before the 1000-dimensional output layer). My model uses ReLU activations, so I should grab the output just after the ReLU (so all values will be non-negative).

Here is code (following the transfer learning tutorial on Pytorch):

loading data

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                std=[0.229, 0.224, 0.225])


image_datasets = {"train": datasets.ImageFolder('images_new/train', transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        normalize
    ])), "test": datasets.ImageFolder('images_new/test', transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        normalize
    ]))
               }

dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=4,
                                             shuffle=True, num_workers=4, pin_memory=True)
              for x in ['train', 'test']}

dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'test']}

train_class_names = image_datasets['train'].classes

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

train function

def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'test']:
            if phase == 'train':
                scheduler.step()
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    
                    
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'test' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:.4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model

Compute SGD cross-entropy loss

model_ft = models.resnet18(pretrained=True)
num_ftrs = model_ft.fc.in_features

print("number of features: ", num_ftrs)

model_ft.fc = nn.Linear(num_ftrs, len(train_class_names))

model_ft = model_ft.to(device)
criterion = nn.CrossEntropyLoss()

# Observe that all parameters are being optimized
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)
model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler,
                       num_epochs=24)

Now, how do I get a feature vector from the last hidden layer for each of my images? I know I have to freeze the previous layers so that gradients aren't computed for them, but I'm having trouble extracting the feature vectors.

My ultimate goal is to use those feature vectors to train a linear classifier such as Ridge or something like that.

Thanks!

Nimesh asked 10/3, 2019 at 1:29 Comment(0)

This is probably not the best idea, but you can do something like this:

#assuming model_ft is trained now
model_ft.fc_backup = model_ft.fc
model_ft.fc = nn.Sequential() #empty sequential layer does nothing (pass-through)
# or model_ft.fc = nn.Identity()
# now you use your network as a feature extractor

I also checked that fc is the right attribute to change; look at the model's forward method.

Headlock answered 10/3, 2019 at 20:27 Comment(4)

from model_ft.fc how would I now extract the final hidden layer? – Nimesh 10/3, 2019 at 20:40

you just run an image through your network and it will output the final hidden features – Headlock 10/3, 2019 at 20:44

check out the source code's forward function, if you replace the fc with a dummy function, it will output hidden features (features = model_ft(inputs)) – Headlock 10/3, 2019 at 20:46

As of 2022, you can use nn.Identity() instead of the empty sequential layer. – Intrinsic 30/3, 2022 at 12:10

You can try the approach below. This will work for any layer with only a change of offset.

model_ft = models.resnet18(pretrained=True)
### strip the last layer
feature_extractor = torch.nn.Sequential(*list(model_ft.children())[:-1])
### check this works
x = torch.randn([1,3,224,224])
output = feature_extractor(x) # output now has the features corresponding to input x
print(output.shape)

torch.Size([1, 512, 1, 1])

Sweat answered 11/3, 2019 at 8:59 Comment(7)

How do I get the actual tensor (features)? – Nimesh 12/3, 2019 at 4:2

variable 'output' has the features. I just printed the shape of the output tensor as it would be too verbose to print the output itself. – Sweat 12/3, 2019 at 4:5

@ManojMohan Why does the output have 512 features? Shouldn't it be 1000? – Pulmonic 18/12, 2020 at 7:50

@Pulmonic 512 - since we're grabbing the features from the penultimate layer. – Sweat 21/12, 2020 at 4:53

Right, got it. Thanks :) – Pulmonic 22/12, 2020 at 5:8

I get this error for output: RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor – Cercaria 17/2, 2022 at 20:58

As the error states, the input is on the CPU while the model weights are on the GPU. Both need to be on the same device. – Sweat 18/2, 2022 at 13:24

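If you know the name of your layer (e.g., layer4 in ResNet), you can use hooks:

def get_hidden_features(model, x, layer):
    """Return the detached output of the submodule named `layer`."""
    activation = {}

    def hook(module, inputs, output):
        activation[layer] = output.detach()

    # Register the hook on the named submodule, not on the whole model
    target = dict(model.named_modules())[layer]
    handle = target.register_forward_hook(hook)
    try:
        _ = model(x)
    finally:
        handle.remove()  # avoid stacking hooks across repeated calls
    return activation[layer]


features = get_hidden_features(model, inputs, "layer4")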

Example: https://discuss.pytorch.org/t/how-can-i-extract-intermediate-layer-output-from-loaded-cnn-model/77301/3

Presser answered 3/3, 2022 at 15:34 Comment(0)

You can use create_feature_extractor from torchvision.models.feature_extraction to extract the required layer's features from the model.

The node name of the last hidden layer in ResNet18 is flatten.

from torchvision.io import read_image
from torchvision.models import resnet18, ResNet18_Weights
from torchvision.models.feature_extraction import create_feature_extractor

# Step 1: Initialize model with the best available weights
weights = ResNet18_Weights.DEFAULT
model = resnet18(weights=weights)
model.eval()

# Step 2: Initialize the inference transforms
preprocess = weights.transforms()

# Step 3: Create the feature extractor with required nodes
return_nodes = {'flatten': 'flatten'}
feature_extractor = create_feature_extractor(model, return_nodes=return_nodes)

# Step 4: Load the image(s) and apply inference preprocessing transforms
image = "?"
image = read_image(image).unsqueeze(0)
model_input = preprocess(image)

# Step 5: Extract the features
features = feature_extractor(model_input)
flatten_fts = features["flatten"].squeeze()
print(flatten_fts.shape)

One can get all the node names in the model by

from torchvision.models import resnet18
from torchvision.models.feature_extraction import get_graph_node_names

model = resnet18()
train_nodes, eval_nodes = get_graph_node_names(model)
print(train_nodes)
print(eval_nodes)

Morita answered 2/6, 2023 at 18:45 Comment(0)
