How to determine accuracy with triplet loss in a convolutional neural network

Asked 22/7, 2017 at 13:20 Answered 8/4, 2021 at 11:27

Solved neural-network conv-neural-network triplet

A Triplet network (inspired by "Siamese network") is comprised of 3 instances of the same feed-forward network (with shared parameters). When fed with 3 samples, the network outputs 2 intermediate values - the L2 (Euclidean) distances between the embedded representation of two of its inputs from the representation of the third.

I'm feeding the network triplets of images: x is the anchor image, x+ is the positive image (an image of the same class as x), and x- is the negative image (an image of a different class than x).

I'm using the triplet loss cost function described here.
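For readers unfamiliar with it, the following is a minimal sketch of a hinge-style triplet loss on the two distances described above. It is not necessarily the exact loss linked in the question: the function name `triplet_loss` and the margin value of 0.2 are assumptions made for illustration, and plain Python lists stand in for embedding vectors.

```python
import math

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss for a single triplet of embedding vectors.

    margin=0.2 is an illustrative value, not one taken from the question.
    """
    d_pos = math.dist(anchor, positive)  # Euclidean distance anchor -> positive
    d_neg = math.dist(anchor, negative)  # Euclidean distance anchor -> negative
    # The loss is zero once the negative is farther away than the positive by at least `margin`.
    return max(d_pos - d_neg + margin, 0.0)
```

A triplet contributes to training only while its loss is positive, that is, while the negative is not yet at least `margin` farther from the anchor than the positive.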

How do I determine the network's accuracy?

Quadroon answered 22/7, 2017 at 13:20 Comment(0)

I am assuming that you are working on image retrieval or a similar task.

You should first generate some triplets, either randomly or using a hard (or semi-hard) negative mining method. Then split the triplets into a training set and a validation set.
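As a rough illustration of the random variant, here is a sketch that samples index triplets from a list of class labels and then splits them. The helper names `sample_random_triplets` and `split_triplets`, the 20% validation fraction, and the fixed seed are all assumptions made for this example; hard or semi-hard mining would replace the random choice of the negative.

```python
import random
from collections import defaultdict

def sample_random_triplets(labels, num_triplets, seed=0):
    """Return (anchor, positive, negative) index triplets drawn at random."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    # Only classes with at least two samples can supply an anchor/positive pair.
    anchor_classes = [c for c, idxs in by_class.items() if len(idxs) >= 2]
    if not anchor_classes or len(by_class) < 2:
        raise ValueError("need at least two classes, one of them with two or more samples")
    triplets = []
    for _ in range(num_triplets):
        pos_class = rng.choice(anchor_classes)
        neg_class = rng.choice([c for c in by_class if c != pos_class])
        anchor, positive = rng.sample(by_class[pos_class], 2)  # two distinct samples
        negative = rng.choice(by_class[neg_class])
        triplets.append((anchor, positive, negative))
    return triplets

def split_triplets(triplets, val_fraction=0.2, seed=0):
    """Shuffle the triplets and split them into (train, validation) lists."""
    rng = random.Random(seed)
    shuffled = list(triplets)
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]
```

Note that splitting at the triplet level lets the same image appear on both sides of the split; splitting by image or by class before sampling is a stricter alternative.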

If you do it this way, you can define validation accuracy as the proportion of validation triplets in which the feature distance between anchor and positive is smaller than the distance between anchor and negative. You can see an example here, which is written in PyTorch.
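The definition above can be sketched in a few lines. This is a framework-agnostic illustration, not the linked PyTorch example: the function name `triplet_accuracy` is made up here, and the embeddings are assumed to be already computed and passed in as plain sequences of floats.

```python
import math

def triplet_accuracy(anchors, positives, negatives):
    """Fraction of triplets with d(anchor, positive) < d(anchor, negative)."""
    if not anchors:
        raise ValueError("at least one triplet is required")
    correct = sum(
        math.dist(a, p) < math.dist(a, n)  # True counts as 1
        for a, p, n in zip(anchors, positives, negatives)
    )
    return correct / len(anchors)
```

For example, `triplet_accuracy([[0, 0], [0, 0]], [[1, 0], [5, 0]], [[2, 0], [1, 0]])` gives 0.5: the first triplet is ordered correctly and the second is not.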

Alternatively, you can measure performance directly with your final testing metric. For image retrieval, for example, model performance on the test set is typically measured with mean average precision. If you use this metric, you should first define some queries on your validation set together with their corresponding ground-truth images.

Either of the two metrics above is fine. Choose whichever fits your case.
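A sketch of how mean average precision could be computed from embeddings is shown below. It assumes that the gallery is ranked by Euclidean distance to each query and that the ground truth is given as a set of relevant gallery indices per query; the function names and this data layout are illustrative choices, not part of any particular library.

```python
import math

def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked result list against a set of relevant ids."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    return precision_sum / len(relevant)

def mean_average_precision(query_embs, gallery_embs, ground_truth):
    """ground_truth[i] holds the gallery indices that are relevant to query i."""
    aps = []
    for query, relevant in zip(query_embs, ground_truth):
        # Rank gallery items from nearest to farthest in embedding space.
        ranked = sorted(range(len(gallery_embs)),
                        key=lambda j: math.dist(query, gallery_embs[j]))
        aps.append(average_precision(ranked, relevant))
    return sum(aps) / len(aps)
```

If the query images are themselves part of the gallery, each query should be excluded from its own ranking before computing the score.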

Evening answered 4/12, 2017 at 2:33 Comment(11)

something really weird happens: the accuracy goes up to 99% (defined as in your answer), but when I use the embeddings generated by the model to classify people, only 20% of the classifications are correct (the network transforms images into 124 float numbers). Can you please help me? – Quadroon 18/12, 2017 at 11:44

The feature embedding is tailored for retrieval or clustering tasks. I am not sure how effective the features are for a classification task. Also, is the 99% accuracy on the validation set? – Evening 18/12, 2017 at 12:3

Yes, I get these unexpectedly good results on the validation set. I'm actually following this article arxiv.org/pdf/1503.03832.pdf where they use triplet loss for face verification. – Quadroon 18/12, 2017 at 12:7

I'm using CASIA as training set (I'm generating triplets that violate the loss) and 6400 random photos from LFW as validation set. I am sure these two data sets do not overlap (so there's no overfitting). – Quadroon 18/12, 2017 at 12:9

There are many factors that could have an impact. Are your features L2-normalized or not? How do you generate triplets for your validation set? Do you use any regularization (weight decay, dropout, etc.)? What is your testing set? And again, why do you want to do a classification task using features trained for a retrieval task? – Evening 18/12, 2017 at 12:28

The L2 normalization is another issue. I don't know if I'm doing it correctly because it lowers my accuracy a lot. Here's a question I posted some time ago which also contains my model: #47240337 – Quadroon 18/12, 2017 at 12:30

But I'd better update my question with all the details I cannot fit in the comments. And thank you so much for your help! I'll update the question in a few minutes. – Quadroon 18/12, 2017 at 12:31

Ok, if you post a new question, I will try my best to give you a good answer. – Evening 18/12, 2017 at 12:40

Let us continue this discussion in chat. – Evening 18/12, 2017 at 12:49

@Evening In one of the previous comments, you mentioned that you doubt these embeddings would work for a classification task. Can you please elaborate on why you think so? Is there any reference where I can read more about it? – Achaean 5/12, 2018 at 5:33

@shaifaliGupta No, you can run some experiments to verify. I think it also depends on the dataset you choose. – Evening 5/12, 2018 at 6:15

I am working on a similar task: using triplet loss for classification. Here is how I combined the triplet loss with a classifier. First, train your model with the standard triplet loss function for N epochs. Once you are satisfied that the model (we shall refer to it as the embedding generator) is trained, save its weights, since we shall reuse them later. Let's say that your embedding generator is defined as:

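import torch
import torch.nn as nn

class EmbeddingNetwork(nn.Module):
    def __init__(self):
        super(EmbeddingNetwork, self).__init__()
        # Block 1: strided 7x7 convolution followed by max-pooling
        self.conv1 = nn.Sequential(
            nn.Conv2d(1, 64, (7,7), stride=(2,2), padding=(3,3)),
            nn.BatchNorm2d(64),
            nn.LeakyReLU(0.001),
            nn.MaxPool2d((3, 3), 2, padding=(1,1))
        )
        # Block 2: 1x1 and 3x3 convolutions followed by max-pooling
        self.conv2 = nn.Sequential(
            nn.Conv2d(64,64,(1,1), stride=(1,1)),
            nn.BatchNorm2d(64),
            nn.LeakyReLU(0.001),
            nn.Conv2d(64,192, (3,3), stride=(1,1), padding=(1,1)),
            nn.BatchNorm2d(192),
            nn.LeakyReLU(0.001),
            nn.MaxPool2d((3,3),2, padding=(1,1))
        )
        # conv2 emits 192 channels; the 7x7 feature map assumes a 1x56x56 input.
        # Adjust in_features if your input size or convolutional stack differs.
        self.fullyConnected = nn.Sequential(
            nn.Linear(7*7*192,32*128),
            nn.BatchNorm1d(32*128),
            nn.LeakyReLU(0.001),
            nn.Linear(32*128,128)
        )

    def forward(self,x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = torch.flatten(x, 1)  # (N, C, H, W) -> (N, C*H*W) for the linear layers
        x = self.fullyConnected(x)
        # L2-normalize so that every embedding has unit length
        return torch.nn.functional.normalize(x, p=2, dim=-1)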

Now we shall use this embedding generator to build a classifier: load the weights we saved before into this part of the network, then freeze it so that training the classifier does not interfere with the triplet model. The classifier can be defined as:

import torch.nn.functional as F

class classifierNet(nn.Module):
    def __init__(self, EmbeddingNet):
        super(classifierNet, self).__init__()
        self.embeddingLayer = EmbeddingNet         # pretrained embedding generator
        self.classifierLayer = nn.Linear(128,62)   # 128-d embedding -> 62 classes
        self.dropout = nn.Dropout(0.5)

    def forward(self, x):
        x = self.dropout(self.embeddingLayer(x))
        x = self.classifierLayer(x)
        return F.log_softmax(x, dim=1)             # log-probabilities per class

Now we shall load the weights we saved before and freeze them using:

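device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
embeddingNetwork = EmbeddingNetwork().to(device)
embeddingNetwork.load_state_dict(torch.load('embeddingNetwork.pt', map_location=device))
embeddingNetwork.requires_grad_(False)  # freeze the embedding generator's weights
classifierNetwork = classifierNet(embeddingNetwork).to(device)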

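Now train this classifier network with a standard classification loss. Because forward already returns log-probabilities (log_softmax), nn.NLLLoss is the matching criterion; together they are equivalent to applying nn.CrossEntropyLoss to the raw logits.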
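A minimal training sketch for this last step might look as follows. To keep it self-contained, a tiny linear layer stands in for the trained embedding generator, and random tensors stand in for a real batch; the names `embedding_net` and `head`, the 28x28 input size, the learning rate, and the number of steps are all assumptions for illustration. It uses `nn.NLLLoss` on log-softmax outputs and hands only the classifier layer's parameters to the optimizer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-in for the pretrained embedding generator (kept tiny on purpose).
embedding_net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128))
embedding_net.requires_grad_(False)  # frozen: no gradients are computed for it

head = nn.Linear(128, 62)  # trainable classifier layer (62 classes, as above)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)  # optimizes the head only
criterion = nn.NLLLoss()   # expects log-probabilities, i.e. log_softmax outputs

# Random data standing in for one batch of images and class labels.
images = torch.randn(16, 1, 28, 28)
targets = torch.randint(0, 62, (16,))

frozen_before = embedding_net[1].weight.clone()
losses = []
for step in range(5):
    optimizer.zero_grad()
    embeddings = F.normalize(embedding_net(images), p=2, dim=-1)
    log_probs = F.log_softmax(head(embeddings), dim=1)
    loss = criterion(log_probs, targets)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

With the classes from this answer, the same loop would call `classifierNetwork(images)` and pass `classifierNetwork.classifierLayer.parameters()` to the optimizer. One caveat: setting `requires_grad` to False does not stop BatchNorm layers from updating their running statistics in training mode, so calling `.eval()` on the embedding sub-module after `classifierNetwork.train()` is worth considering.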

Invasion answered 8/4, 2021 at 11:27 Comment(0)
