Different output from Libtorch C++ and pytorch

I'm using the same traced model in PyTorch and LibTorch, but I'm getting different outputs.

Python Code:

import cv2
import numpy as np 
import torch
import torchvision
from torchvision import transforms as trans


# device for pytorch
device = torch.device('cuda:0')

torch.set_default_tensor_type('torch.cuda.FloatTensor')

model = torch.jit.load("traced_facelearner_model_new.pt")
model.eval()

# read the example image used for tracing
image=cv2.imread("videos/example.jpg")

test_transform = trans.Compose([
        trans.ToTensor(),
        trans.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
    ])       

resized_image = cv2.resize(image, (112, 112))

tens = test_transform(resized_image).to(device).unsqueeze(0)
output = model(tens)
print(output)

C++ Code:

#include <iostream>
#include <algorithm> 
#include <opencv2/opencv.hpp>
#include <torch/script.h>


int main()
{
    try
    {
        torch::jit::script::Module model = torch::jit::load("traced_facelearner_model_new.pt");
        model.to(torch::kCUDA);
        model.eval();

        cv::Mat visibleFrame = cv::imread("example.jpg");

        cv::resize(visibleFrame, visibleFrame, cv::Size(112, 112));
        at::Tensor tensor_image = torch::from_blob(visibleFrame.data, { 1, visibleFrame.rows, 
                                                    visibleFrame.cols, 3 }, at::kByte);
        tensor_image = tensor_image.permute({ 0, 3, 1, 2 });
        tensor_image = tensor_image.to(at::kFloat);

        tensor_image[0][0] = tensor_image[0][0].sub(0.5).div(0.5);
        tensor_image[0][1] = tensor_image[0][1].sub(0.5).div(0.5);
        tensor_image[0][2] = tensor_image[0][2].sub(0.5).div(0.5);

        tensor_image = tensor_image.to(torch::kCUDA);
        std::vector<torch::jit::IValue> input;
        input.emplace_back(tensor_image);
        // Execute the model and turn its output into a tensor.
        auto output = model.forward(input).toTensor();
        output = output.to(torch::kCPU);
        std::cout << "Embds: " << output << std::endl;

        std::cout << "Done!\n";
    }
    catch (std::exception e)
    {
        std::cout << "exception" << e.what() << std::endl;
    }
}

The model produces a 1x512 output tensor, as shown below.

Python output

tensor([[-1.6270e+00, -7.8417e-02, -3.4403e-01, -1.5171e+00, -1.3259e+00,
         -1.1877e+00, -2.0234e-01, -1.0677e+00,  8.8365e-01,  7.2514e-01,
          2.3642e+00, -1.4473e+00, -1.6696e+00, -1.2191e+00,  6.7770e-01,
         ...
         -7.1650e-01,  1.7661e-01]], device='cuda:0',
       grad_fn=)

C++ output

Embds: Columns 1 to 8 -84.6285 -14.7203 17.7419 47.0915 31.8170 57.6813 3.6089 -38.0543


Columns 9 to 16 3.3444 -95.5730 90.3788 -10.8355 2.8831 -14.3861 0.8706 -60.7844

...

Columns 505 to 512 36.8830 -31.1061 51.6818 8.2866 1.7214 -2.9263 -37.4330 48.5854

[ CPUFloatType{1,512} ]

Using

  • Pytorch 1.6.0
  • Libtorch 1.6.0
  • Visual studio 2019
  • Windows 10
  • Cuda 10.1
Approver answered 20/8, 2020 at 9:46 Comment(4)
You understand your (quite long) code much better than we do. If you want us to help you, it would be best to provide your own thoughts on the problem. Why do you think this code is not providing the correct output? And what is this code actually supposed to do? - Tedric
Both the C++ and Python code are essentially doing the same thing: a CNN model is loaded and an input is given to it. The output, as mentioned, is a 1x512 tensor. The problem is that the values in this output tensor are different in C++ and Python. I'm not sure why this is happening even though the input image, preprocessing steps, and model are the same in both. - Approver
You just need to scale once: tensor_image.sub_(0.5).div_(0.5); Also try to unsqueeze your tensor after you create it (remove the 1 from torch::from_blob and simply use the respective rows and cols). You also don't need IValue; simply use model.forward({tensor_image}). - Slightly
By the way, prior to that you need to rescale your tensor by dividing by 255, and then do the normalization. - Slightly

Before the final normalization, you need to scale your input to the range 0-1 and then carry on with the normalization you are already doing. Converting to float and then dividing by 255 should get you there. Here is the snippet I wrote; there might be some syntax errors, but they should be easy to spot.
Try this:

#include <iostream>
#include <algorithm> 
#include <opencv2/opencv.hpp>
#include <torch/script.h>


int main()
{
    try
    {
        torch::jit::script::Module model = torch::jit::load("traced_facelearner_model_new.pt");
        model.to(torch::kCUDA);
        
        cv::Mat visibleFrame = cv::imread("example.jpg");

        cv::resize(visibleFrame, visibleFrame, cv::Size(112, 112));
        at::Tensor tensor_image = torch::from_blob(visibleFrame.data, {  visibleFrame.rows, 
                                                    visibleFrame.cols, 3 }, at::kByte);
        
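        // Match ToTensor(): convert to float and scale from 0-255 to 0-1.
        tensor_image = tensor_image.to(at::kFloat).div(255).unsqueeze(0);
        // Reorder NHWC -> NCHW, the layout the model expects.
        tensor_image = tensor_image.permute({ 0, 3, 1, 2 });
        // Match Normalize(0.5, 0.5): (x - 0.5) / 0.5 on every channel.
        tensor_image.sub_(0.5).div_(0.5);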

        tensor_image = tensor_image.to(torch::kCUDA);
        // Execute the model and turn its output into a tensor.
        auto output = model.forward({tensor_image}).toTensor();
        output = output.cpu();
        std::cout << "Embds: " << output << std::endl;

        std::cout << "Done!\n";
    }
    catch (const std::exception& e)
    {
        std::cout << "exception" << e.what() << std::endl;
    }
}

I don't have access to a system to run this, so if you run into any problems, comment below.

Slightly answered 20/8, 2020 at 11:29 Comment(3)
Thanks a ton! This worked. But can you tell me why the tensor has to be rescaled by 255 before normalising it? - Approver
@Arki99, this is the default for PyTorch's ToTensor. When you apply ToTensor() in a PyTorch transform, it rescales the input image to the range 0-1, so to get the same behavior you need to do the same in LibTorch. - Slightly
Got it. Thanks a lot. - Approver
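
To make the effect of the missing division concrete, here is a minimal pure-Python sketch of the per-pixel arithmetic. It is only an illustration: the function names are hypothetical, and it assumes the 0.5 mean and 0.5 std from the question's Normalize call and 8-bit pixel values.

```python
# Per-pixel arithmetic only; no torch or OpenCV is needed for this sketch.
MEAN, STD = 0.5, 0.5  # the values passed to trans.Normalize in the question


def normalize_like_python(pixel):
    """ToTensor() scales a uint8 value to [0, 1]; Normalize then applies (x - mean) / std."""
    return (pixel / 255.0 - MEAN) / STD


def normalize_like_original_cpp(pixel):
    """The question's C++ code normalized the raw 0-255 value without dividing by 255."""
    return (pixel - MEAN) / STD


# Compare both pipelines on a dark, a mid-gray, and a bright pixel.
for pixel in (0, 128, 255):
    print(pixel, normalize_like_python(pixel), normalize_like_original_cpp(pixel))
```

With the division, inputs land in the range -1 to 1; without it they range from -1 to 509, so the network sees values far outside what it was trained on. That is consistent with the much larger embedding values in the C++ output above.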
