You can use face embeddings, an idea proposed, for example, in the highly cited FaceNet paper and implemented in OpenFace (which also ships with pre-trained models).
The general idea: take a preprocessed face (frontal, cropped, ...) and embed it into a lower-dimensional space with the property that faces which are similar in the input end up with a low Euclidean distance in the output.
So in your case: use the embedding CNN to map your faces to the reduced space (usually a vector of size 128) and compute the Euclidean distance between the resulting vectors. You could of course also cluster the faces afterwards, but that's not your task.
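To make the comparison step concrete, here is a minimal sketch in Python with NumPy. It does not call OpenFace: the 128-dimensional vectors are random placeholders standing in for whatever the embedding network returns, and the names `l2_normalize`, `face_distance`, `same_person` and the threshold value are all hypothetical, chosen only for illustration. It assumes the embeddings are scaled to unit length.

```python
import numpy as np

def l2_normalize(v):
    """Scale a vector to unit length (assumed convention for the embeddings)."""
    return v / np.linalg.norm(v)

def face_distance(a, b):
    """Euclidean distance between two embedding vectors."""
    return float(np.linalg.norm(a - b))

# Hypothetical cut-off; a real value has to be tuned on labelled face pairs.
THRESHOLD = 1.0

def same_person(a, b, threshold=THRESHOLD):
    """Treat two faces as the same person if their embeddings are close."""
    return face_distance(a, b) < threshold

# Placeholder embeddings: in practice these come from the embedding CNN.
rng = np.random.default_rng(0)
emb_a = l2_normalize(rng.normal(size=128))
# A slightly perturbed copy of emb_a, standing in for another photo of the same face.
emb_b = l2_normalize(emb_a + 0.02 * rng.normal(size=128))
# An unrelated random vector, standing in for a different face.
emb_c = l2_normalize(rng.normal(size=128))

print("a vs b:", face_distance(emb_a, emb_b), same_person(emb_a, emb_b))
print("a vs c:", face_distance(emb_a, emb_c), same_person(emb_a, emb_c))
```

For unit-length vectors the distance always lies between 0 and 2, so a single threshold in that range is enough to turn the distance into a same/different decision. Where exactly to put it depends on your data.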
The good thing here, besides the general idea: OpenFace is a nice, ready-to-use implementation, and its homepage also explains the idea:
Use a deep neural network to represent (or embed) the face on a 128-dimensional unit hypersphere.
The embedding is a generic representation for anybody's face. Unlike other face representations, this embedding has the nice property that a larger distance between two face embeddings means that the faces are likely not of the same person.
This property makes clustering, similarity detection, and classification tasks easier than other face recognition techniques where the Euclidean distance between features is not meaningful.
They even have a comparison demo here.