Classifying a set of images into classes

I have been given a set of pictures and need to classify them.

The thing is, I do not really have any prior knowledge of these images. So I plan on computing as many descriptors as I can find and then running a PCA on those to identify the descriptors that are actually useful to me.

I can do supervised learning on a lot of data points, if that helps. However, there is a chance that the pictures are connected to each other, meaning there could be a development from image X to image X+1, although I somewhat hope this gets sorted out by the information in each image.

My questions are:

  1. How do I best do this in Python? (I want to build a proof of concept first, where speed is a non-issue.) Which libraries should I use?
  2. Are there existing examples of an image classification of this kind, i.e. of using a bunch of descriptors and cooking them down via PCA? This part is somewhat scary for me, to be honest, although I think Python should already provide something like this for me.

Edit: I have found a neat kit that I am currently trying out for this: http://scikit-image.org/ There seem to be some descriptors in there. Is there a way to do automatic feature extraction and rank the features according to their descriptive power towards the target classification? PCA should be able to do the ranking automatically.
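
For example, a minimal sketch of what I have in mind. Strictly speaking, PCA ranks directions by variance rather than by relevance to the target, so this sketch scores features against the labels with sklearn's univariate feature selection instead; rank_descriptors, the 64x64 image size, and the variable names are placeholder assumptions of mine:

import numpy as np
from skimage.feature import hog
from sklearn.feature_selection import SelectKBest, f_classif

def rank_descriptors(images, labels, k=10):
    # One HOG descriptor vector per image (assumed grayscale, e.g. 64x64),
    # stacked row-wise into a (samples, features) matrix.
    X = np.vstack([hog(img) for img in images])
    # Score every feature dimension against the class labels (ANOVA F-test).
    selector = SelectKBest(f_classif, k=k).fit(X, labels)
    # Indices of the k highest-scoring feature dimensions.
    return np.argsort(selector.scores_)[::-1][:k]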

Edit 2: I have refined my framework for storing the data a bit. I will be using the file system as a database: there will be one folder for each combination of classes, so if an image belongs to classes 1 and 2, it goes into a folder img12 that contains those images. This way I can better control the amount of data I have for each class. A sketch of how I read the labels back out of this layout follows below.
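
A minimal sketch of reading the labels back from that layout (it assumes single-digit class ids encoded in folder names like img12; labels_from_folders is just an illustrative name):

import os

def labels_from_folders(root):
    # Map each image path to the set of classes encoded in its folder name,
    # e.g. root/img12/foo.png -> {1, 2}.
    labeled = {}
    for folder in os.listdir(root):
        subdir = os.path.join(root, folder)
        if not os.path.isdir(subdir):
            continue
        classes = {int(c) for c in folder[len("img"):]}
        for fname in os.listdir(subdir):
            labeled[os.path.join(subdir, fname)] = classes
    return labeled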

Edit 3: I found an example for a Python library (sklearn) that does some of what I want to do; it is about recognizing hand-written digits. I am trying to convert my dataset into something that I can use with this (see the loading sketch after the example below).

Here is the example I found using sklearn:

import matplotlib.pyplot as plt

# Import datasets, classifiers and performance metrics
from sklearn import datasets, svm, metrics

# The digits dataset
digits = datasets.load_digits()

# The data that we are interested in is made of 8x8 images of digits.
# Let's have a look at the first 4 images, stored in the `images`
# attribute of the dataset. If we were working from image files, we
# could load them using matplotlib.pyplot.imread. For these images we
# know which digit they represent: it is given in the 'target' of the
# dataset.
for index, (image, label) in enumerate(list(zip(digits.images, digits.target))[:4]):
    plt.subplot(2, 4, index + 1)
    plt.axis('off')
    plt.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    plt.title('Training: %i' % label)

# To apply a classifier to this data, we need to flatten each image,
# turning the data into a (samples, features) matrix:
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))

# Create a classifier: a support vector classifier
classifier = svm.SVC(gamma=0.001)

# We learn the digits on the first half of the dataset
# (integer division // keeps the indices integers in Python 3)
classifier.fit(data[:n_samples // 2], digits.target[:n_samples // 2])

# Now predict the value of the digit on the second half:
expected = digits.target[n_samples // 2:]
predicted = classifier.predict(data[n_samples // 2:])

print("Classification report for classifier %s:\n%s\n"
      % (classifier, metrics.classification_report(expected, predicted)))
print("Confusion matrix:\n%s" % metrics.confusion_matrix(expected, predicted))

for index, (image, prediction) in enumerate(
        list(zip(digits.images[n_samples // 2:], predicted))[:4]):
    plt.subplot(2, 4, index + 5)
    plt.axis('off')
    plt.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    plt.title('Prediction: %i' % prediction)

plt.show()
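
For reference, a minimal sketch of how I plan to get my own images into that (samples, features) shape, using scikit-image for loading and resizing (the glob pattern, folder name, and the 8x8 target size are placeholder assumptions):

import glob
import numpy as np
from skimage.io import imread
from skimage.transform import resize

def load_as_matrix(pattern, shape=(8, 8)):
    # Read every matching image as grayscale, resize it to a common shape,
    # and flatten it, yielding the (n_samples, n_features) matrix that the
    # classifier expects.
    rows = []
    for path in sorted(glob.glob(pattern)):
        img = imread(path, as_gray=True)
        rows.append(resize(img, shape).ravel())
    return np.array(rows)

# e.g. data = load_as_matrix('img12/*.png')
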
Ambrosane answered 30/4, 2013 at 14:16 Comment(2)
And what did you try so far? Show some effort, mate. – Edythedythe
I will edit in the things that I accomplish as I go. – Ambrosane

You can convert a picture to a vector of pixels and perform PCA on that vector. This might be easier than trying to manually find descriptors. You can use NumPy and SciPy in Python. For example:

import scipy.io
import numpy as np
from numpy import linalg as LA

def learn(testX, trainX, testY, trainY, n, k=3):
    pcmat = PCA(trainX, n)                    # (d, n) projection matrix
    lowdimtrain = np.asarray(trainX) @ pcmat  # lower the dimension of trainX
    lowdimtest = np.asarray(testX) @ pcmat    # lower the dimension of testX
    # Run some learning algorithm on the low-dimensional matrices; here,
    # k-nearest neighbors (KNN is defined in the next snippet and must be
    # in scope before learn is called):
    knnres = KNN(lowdimtrain, trainY, lowdimtest, k)
    numloss = 0
    for i in range(len(knnres)):
        if knnres[i] != testY[i]:
            numloss += 1
    return numloss

def PCA(Xparam, n):
    # Form X^T X and return its top-n eigenvectors as a projection matrix.
    X = np.asarray(Xparam)
    A = X.T @ X
    return eigs(A, n)

def eigs(M, k):
    # Eigenvectors are the *columns* of vecs; sort them by descending
    # eigenvalue and keep the first k columns.
    vals, vecs = LA.eig(M)
    order = np.argsort(vals.real)[::-1]
    return np.real(vecs[:, order[:k]])

# Every row in the *.mat file is 256*256 numbers representing grayscale
# values for each pixel in an image, i.e. if Xtrain.mat has 1000 rows then
# each row is made up of 256*256 numbers and there are 1000 images in the
# file. The following loads the images into a matrix where each row is a
# vector of length 256*256 representing an image. This code will need to
# be swapped out if you have a different method of storing images.
Xtrain = scipy.io.loadmat('Xtrain.mat')["Xtrain"]
Ytrain = scipy.io.loadmat('Ytrain.mat')["Ytrain"]
Xtest = scipy.io.loadmat('Xtest.mat')["Xtest"]
Ytest = scipy.io.loadmat('Ytest.mat')["Ytest"]
learn(Xtest, Xtrain, Ytest, Ytrain, 5)  # lowers the dimension from 256*256 to 5

In order to classify your image you can use k-nearest neighbors, i.e. you find the k nearest training images and label your image by majority vote over those k images. For example:

import numpy as np
import scipy.spatial.distance as scidist

def KNN(trainset, Ytrainvec, testset, k):
    # Squared Euclidean distance from every test image to every training image.
    eucdist = scidist.cdist(testset, trainset, 'sqeuclidean')
    res = []
    for dists in eucdist:
        # Indices of the k nearest training images.
        nearest = np.argsort(dists)[:k]
        # Majority vote, assuming binary labels (1 versus anything else).
        sumLabel = 0
        for i in nearest:
            if Ytrainvec[i] == 1:
                sumLabel += 1
            else:
                sumLabel -= 1
        res.append(1 if sumLabel > 0 else 0)
    return res
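
For a quick sanity check, the function can be run on tiny made-up data (the arrays below are placeholders: two training images per class, flattened to three "pixels" each):

import numpy as np

trainset = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0],
                     [1.0, 1.0, 1.0], [0.9, 1.0, 1.0]])
Ytrainvec = [0, 0, 1, 1]
testset = np.array([[0.05, 0.0, 0.0], [1.0, 0.95, 1.0]])
print(KNN(trainset, Ytrainvec, testset, k=3))  # expected output: [0, 1]
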
Aswarm answered 9/5, 2013 at 8:12 Comment(2)
Can you add an example of how to do this? – Ambrosane
I added an example, I hope it helps. – Aswarm

I know I'm not answering your question directly, but images vary greatly: remote sensing, objects, scenes, fMRI, biomedical, faces, etc. It would help if you narrowed your categorization a bit and let us know.

What descriptors are you computing? Most of the code I use (as does the computer vision community) is in MATLAB, not in Python, but I'm sure similar code is available (the pycv module and http://www.pythonware.com/products/pil/). Try out this descriptor benchmark, which has precompiled state-of-the-art code from the people at MIT: http://people.csail.mit.edu/jxiao/SUN/ Try looking at GIST, HOG, and SIFT; those are pretty standard for analyzing scenes, objects, and points respectively. A rough sketch of combining such descriptors follows below.
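
As a very rough Python sketch (my assumptions: grayscale images large enough for HOG's default cell size; combined_descriptor is just an illustrative name), several descriptors can be concatenated into one feature vector:

import numpy as np
from skimage.feature import hog, local_binary_pattern

def combined_descriptor(image):
    # Compute two standard descriptors, L2-normalize each so neither
    # dominates by scale alone, and concatenate them into one vector.
    h = hog(image)
    lbp = np.histogram(local_binary_pattern(image, P=8, R=1), bins=32)[0].astype(float)
    parts = [h / (np.linalg.norm(h) + 1e-9), lbp / (np.linalg.norm(lbp) + 1e-9)]
    return np.concatenate(parts)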

Troudeloup answered 8/5, 2013 at 17:19 Comment(3)
Is there a way to use all of those descriptors simultaneously? A PCA should then be able to weed out those that do not contribute. Can you make an example in Python code? – Ambrosane
The problem with your approach is that you are trying to solve it with a purely "programmer" approach, instead of relying on the computer vision literature, which can give you a shortcut. I believe you can mix some descriptors into one huge feature vector and normalize each vector, but your approach looks very "brute force". You haven't even defined which type of images you are planning to use, out of the categories I mentioned before. – Troudeloup
That is on purpose. I am trying to get by without resorting to computer vision techniques. I want the algorithm to figure out on its own what is important in the images; this should depend entirely on the data. – Ambrosane

First, import the libraries and extract the pictures:

# %matplotlib inline  (Jupyter/IPython notebooks only; omit in a plain script)
from sklearn import datasets
import sklearn as sk
import numpy as np
import matplotlib.pyplot as plt

digits = datasets.load_digits()
X_digits = digits.data    # flattened 8x8 images, one row per sample
y_digits = digits.target  # the digit each image represents
ind4 = np.where(y_digits == 4)  # indices of all images labeled 4
ind5 = np.where(y_digits == 5)  # indices of all images labeled 5
plt.imshow(X_digits[1778].reshape((8, 8)), cmap=plt.cm.gray_r)
Sapor answered 21/12, 2016 at 10:26 Comment(0)

Then use this feature (the sum of the pixel values in the top two rows of each 8x8 image):

xx = np.arange(64)  # example of a flat 64-element input (unused below)

def feature_11(xx):
    # Reshape the flat 64-vector into an 8x8 image and sum the top two rows.
    yy = xx.reshape(8, 8)
    feature_1 = sum(yy[0:2, :])   # column-wise sums of the first two rows
    feature11 = sum(feature_1)    # total intensity of the top two rows
    print(feature11)
    return feature11

feature_11(X_digits[1778])

Then use LDA (linear discriminant analysis):

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

clf = LinearDiscriminantAnalysis()

# Shuffle the sample indices and split them 80/20 into training and test sets.
ind_all = np.arange(0, len(y_digits))
np.random.shuffle(ind_all)
ind_training = ind_all[0:int(0.8 * len(ind_all))]
ind_test = ind_all[int(0.8 * len(ind_all)):]

clf.fit(X_digits[ind_training], y_digits[ind_training])
y_predicted = clf.predict(X_digits[ind_test])

# Plot the predictions, then the true labels and the prediction errors.
plt.subplot(211)
plt.stem(y_predicted)
plt.subplot(212)
plt.stem(y_digits[ind_test], 'r')
plt.stem(y_digits[ind_test] - y_predicted, 'r')

# Fraction of correct predictions (accuracy).
sum(y_predicted == y_digits[ind_test]) / len(y_predicted)

Sapor answered 21/12, 2016 at 10:33 Comment(1)
Please add some explanation to your answer. Only showing code can be confusing. – Aiglet
