How to handle memory error while fitting GaussianMixture in sklearn python?

I am trying to fit a GaussianMixture with sklearn to a set of cat and dog pictures. I feed it a numpy array of shape (50, 30000), where 50 is the number of data points (25 cat and 25 dog pictures) and 30000 is the number of features after I resize each picture to (100, 100, 3) and flatten it into a numpy array. It throws a MemoryError. I have 4 GB of RAM, and 70% of it is already in use before running this code. Can anyone suggest how to find out how much memory the GaussianMixture fit method uses in sklearn? Or can anyone provide some code to fit it in batches?

Here is the code:

print(img_coll_cat_dog.shape)
print(img_coll_cat_dog.nbytes)
print(img_coll_cat_dog.itemsize)

Result:

(50, 30000)
12000000 bytes
8 

gmix = mixture.GaussianMixture(n_components=2, covariance_type='full')
gmix.fit(img_coll_cat_dog)

This is the error I am getting:

MemoryError                               Traceback (most recent call last)
<ipython-input-32-c0370476a619> in <module>()
      1 gmix = mixture.GaussianMixture(n_components=2, covariance_type='full')
----> 2 gmix.fit(img_coll_cat_dog)

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/base.py in fit(self, X, y)
    205 
    206             if do_init:
--> 207                 self._initialize_parameters(X, random_state)
    208                 self.lower_bound_ = -np.infty
    209 

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/base.py in _initialize_parameters(self, X, random_state)
    155                              % self.init_params)
    156 
--> 157         self._initialize(X, resp)
    158 
    159     @abstractmethod

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/gaussian_mixture.py in _initialize(self, X, resp)
    629 
    630         weights, means, covariances = _estimate_gaussian_parameters(
--> 631             X, resp, self.reg_covar, self.covariance_type)
    632         weights /= n_samples
    633 

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/gaussian_mixture.py in _estimate_gaussian_parameters(X, resp, reg_covar, covariance_type)
    283                    "diag": _estimate_gaussian_covariances_diag,
    284                    "spherical": _estimate_gaussian_covariances_spherical
--> 285                    }[covariance_type](resp, X, nk, means, reg_covar)
    286     return nk, means, covariances
    287 

~/dl/dl3/lib/python3.5/site-packages/sklearn/mixture/gaussian_mixture.py in _estimate_gaussian_covariances_full(resp, X, nk, means, reg_covar)
    162     """
    163     n_components, n_features = means.shape
--> 164     covariances = np.empty((n_components, n_features, n_features))
    165     for k in range(n_components):
    166         diff = X - means[k]

MemoryError: 

Any help is much appreciated.

Explant answered 25/9, 2017 at 6:29 Comment(0)

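Try setting covariance_type='diag'. With covariance_type='full', the traceback shows the fit allocating an array of shape (n_components, n_features, n_features), here (2, 30000, 30000). In float64 that is about 14.4 GB, far more than your 4 GB of RAM. With 'diag' the covariance array has shape (2, 30000), which is about 480 KB.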
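
As a rough illustration of the suggestion above, the sketch below estimates the size of the covariance array for each covariance_type and then fits with 'diag'. The helper covariance_bytes is a hypothetical name, not part of sklearn, and the random array X is only a stand-in for img_coll_cat_dog, assumed to have the same shape and dtype. The estimate covers the covariance array alone, so actual peak memory during fit will be somewhat higher.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def covariance_bytes(n_components, n_features, covariance_type, itemsize=8):
    """Approximate bytes needed for the covariance array alone.

    Hypothetical helper (not an sklearn function). The shapes follow the
    documented shapes of GaussianMixture.covariances_.
    """
    shapes = {
        "full": (n_components, n_features, n_features),
        "tied": (n_features, n_features),
        "diag": (n_components, n_features),
        "spherical": (n_components,),
    }
    return int(np.prod(shapes[covariance_type], dtype=np.int64)) * itemsize


# Compare the covariance storage for the shapes in the question.
for cov_type in ("full", "tied", "diag", "spherical"):
    size_gb = covariance_bytes(2, 30000, cov_type) / 1e9
    print("{:>9}: {:.6f} GB".format(cov_type, size_gb))

# Stand-in data with the same shape as the question's array (assumption:
# the real input is a float64 array of shape (50, 30000)).
rng = np.random.RandomState(0)
X = rng.random_sample((50, 30000))

# 'diag' keeps one variance per feature per component instead of a full
# n_features x n_features matrix per component.
gmix = GaussianMixture(n_components=2, covariance_type="diag", random_state=0)
gmix.fit(X)
print(gmix.covariances_.shape)
```

Note that 'diag' assumes the features are uncorrelated within each component, so it is a different model from 'full', not just a cheaper way to fit the same one.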

Godesberg answered 23/2, 2018 at 8:27 Comment(0)
