Image clustering by similarity in Python

I have a collection of photos and I'd like to distinguish clusters of the similar photos. Which features of an image and which algorithm should I use to solve my task?

Skean answered 24/8, 2016 at 12:31 Comment(0)

This is too broad a question.

Generally speaking, you can use any clustering mechanism, e.g. the popular k-means. To prepare your data for clustering you need to convert your collection into an array X, where every row is one example (image) and every column is a feature.

The main question is what your features should be. It is difficult to answer without knowing what you are trying to accomplish. If your images are small and all of the same size, you can simply use every pixel as a feature. If you have metadata and would like to sort by it, you can use every tag in the metadata as a feature.

Now if you really need to find patterns between images, you will have to apply an additional layer of processing, such as a convolutional neural network, which essentially lets you extract features from different parts of your image. You can think of it as a filter that converts every image into, say, an 8x8 matrix, which then becomes a row with 64 features in your array X for clustering.
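
A minimal sketch of the simpler pixels-as-features variant (assuming same-size images in a hypothetical images/ folder and an arbitrary choice of 3 clusters):

import glob
import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

# Build the array X: one row per image, one column per (grayscale) pixel
paths = sorted(glob.glob('images/*.jpg'))
X = np.array([np.asarray(Image.open(p).convert('L').resize((32, 32)), dtype=np.float64).ravel()
              for p in paths])

# Cluster the rows of X
kmeans = KMeans(n_clusters=3, random_state=0).fit(X)
for path, label in zip(paths, kmeans.labels_):
    print(label, path)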

Godoy answered 24/8, 2016 at 13:9 Comment(4)
I want each cluster to consist of photos of the same object/scene, i.e. I took three photos of my room, but from slightly different positions, or with less light in some of them. So the photos are not completely equal, but very similar.Skean
In this case I'd suggest trying a simple approach first before delving into the depths of deep learning (convnets). I'd just extract the map of pixels from every image, especially since they are all the same size. Converting your pictures to black and white will help reduce the complexity. Then try to apply k-means from scikit-learn. Vary the number of clusters as well to see the impact. But most likely you will have to apply some kind of convnet to get a meaningful result. I am just suggesting moving in phases.Godoy
In fact I'd suggest taking a look at the mahotas library, which could be well suited to your case. Link - mahotas.readthedocs.io/en/latest/features.htmlGodoy
I've tried to obtain Haralick features, but I don't understand why I get a 13x13 matrix for each image. Shouldn't it be a 13x1 vector, with one number for each of the 13 Haralick features?Skean
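
A note on the Haralick comment above: mahotas returns one row of 13 features per direction, so a 2-D grayscale image gives a 4x13 matrix, while a 3-D array (e.g. a colour image that was not converted to grayscale) gives 13x13. Averaging over the directions yields a single 13-element vector per image. A minimal sketch, assuming a hypothetical images/ folder:

import glob
import numpy as np
import mahotas
from PIL import Image
from sklearn.cluster import KMeans

# One 13-element Haralick descriptor per image: load as 2-D grayscale,
# compute the per-direction features (4x13 for a 2-D image),
# then average over the four directions
paths = sorted(glob.glob('images/*.jpg'))
featurelist = [mahotas.features.haralick(np.asarray(Image.open(p).convert('L'))).mean(axis=0)
               for p in paths]

kmeans = KMeans(n_clusters=3, random_state=0).fit(np.array(featurelist))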

I had the same problem and I came up with this solution:

  1. Import a pretrained model using Keras (here VGG16)
  2. Extract features per image
  3. Run k-means on the extracted features
  4. Export the images by copying each one, renamed with its cluster label

Here is my code, partly motivated by this post.

from keras.preprocessing import image
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
import numpy as np
from sklearn.cluster import KMeans
import os, shutil, glob, os.path
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True  # allow PIL to load truncated image files
model = VGG16(weights='imagenet', include_top=False)

# Variables
imdir = 'C:/indir/'
targetdir = "C:/outdir/"
number_clusters = 3

# Loop over files and get features
filelist = glob.glob(os.path.join(imdir, '*.jpg'))
filelist.sort()
featurelist = []
for i, imagepath in enumerate(filelist):
    print("    Status: %s / %s" %(i, len(filelist)), end="\r")
    img = image.load_img(imagepath, target_size=(224, 224))
    img_data = image.img_to_array(img)
    img_data = np.expand_dims(img_data, axis=0)
    img_data = preprocess_input(img_data)
    features = np.array(model.predict(img_data))
    featurelist.append(features.flatten())

# Clustering
kmeans = KMeans(n_clusters=number_clusters, random_state=0).fit(np.array(featurelist))

# Copy images renamed by cluster 
# Check if target dir exists
try:
    os.makedirs(targetdir)
except OSError:
    pass
# Copy with cluster name
print("\n")
for i, m in enumerate(kmeans.labels_):
    print("    Copy: %s / %s" %(i, len(kmeans.labels_)), end="\r")
    shutil.copy(filelist[i], targetdir + str(m) + "_" + str(i) + ".jpg")

Update 02/2022:

In some cases (e.g. an unknown number of clusters), using Affinity Propagation may be a much better choice than k-means. In that case, replace the k-means step with:

from sklearn.cluster import AffinityPropagation
affprop = AffinityPropagation(affinity="euclidean", damping=0.5).fit(np.array(featurelist))

and loop over affprop.labels_ to access the results.
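
For example, reusing filelist and targetdir from the script above, the copy loop becomes (a sketch under the same assumptions as before):

# The number of clusters is determined by the algorithm itself
print("Found %d clusters" % len(np.unique(affprop.labels_)))
for i, m in enumerate(affprop.labels_):
    shutil.copy(filelist[i], targetdir + str(m) + "_" + str(i) + ".jpg")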

Bidwell answered 11/8, 2019 at 15:8 Comment(2)
What's the expected output here? I'm not sure if I understand the naming.Beale
Output is a copy of each image, renamed as cluster id + "_" + image index, e.g. the first image would be copied as "0_0.jpg" if it belongs to cluster 0.Bidwell