Calculating just a specific property in regionprops python

Asked 23/3, 2015 at 17:10 Answered 15/4, 2019 at 18:57

Solved python image-processing scikit-image connected-components

I am using the measure.regionprops function in scikit-image to measure the properties of connected components. It computes a whole set of properties (Python-regionprops), but I only need the area of each connected component. Is there a way to compute just a single property and save computation?

Casque answered 23/3, 2015 at 17:10 Comment(0)

There seems to be a more direct way to do the same thing using regionprops with cache=False. I generated labels using skimage.segmentation.slic with n_segments=10000. Then:

rps = regionprops(labels, cache=False)
[r.area for r in rps]

My understanding of the regionprops documentation is that setting cache=False means that the attributes won't be calculated until they're called. According to %%time in a Jupyter notebook, running the code above took 166 ms with cache=False vs 247 ms with cache=True, so it seems to work.

I tried an equivalent of the other answer and found it much slower.

%%time
ard = np.empty(10000, dtype=int)
for i in range(10000):
    ard[i] = np.size(np.where(labels == i)[1])

That took 34.3 seconds.

Here's a full working example comparing the two methods using the skimage astronaut sample image and labels generated by slic segmentation:

import numpy as np
import skimage.measure
from skimage.segmentation import slic
from skimage.data import astronaut

img = astronaut()
# `+ 1` is added to avoid a region with the label of `0`
# zero is considered unlabeled so isn't counted by regionprops
# but would be counted by the other method.
segments = slic(img, n_segments=1000, compactness=10) + 1

# This is just to make it more like the original poster's question.
labels, num = skimage.measure.label(segments, return_num=True)

Calculate areas using the OP's suggested method, with index values adjusted to avoid having a zero label:

%%time
area = {}
for i in range(1,num + 1):
    area[i] = np.size(np.where(labels == i)[1])

CPU times: user 512 ms, sys: 0 ns, total: 512 ms Wall time: 506 ms

Same calculation using regionprops:

%%time
rps = skimage.measure.regionprops(labels, cache=False)
area2 = [r.area for r in rps]

CPU times: user 16.6 ms, sys: 0 ns, total: 16.6 ms Wall time: 16.2 ms

Verify that the results are all equal element-wise:

np.equal(list(area.values()), area2).all()

True

So, as long as zero labels and the difference in indexing are accounted for, both methods give the same result, but regionprops without caching is faster.
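If the goal is to name only the properties you want, scikit-image also provides measure.regionprops_table, which takes a properties argument and returns a dict of arrays. Here is a minimal sketch, assuming a scikit-image release recent enough to include regionprops_table. The tiny image is made-up test data, not the data used for the timings above:

```python
import numpy as np
from skimage.measure import label, regionprops, regionprops_table

# Small binary image with two separate blobs (hypothetical test data).
image = np.array([[1, 1, 0, 0],
                  [1, 0, 0, 1],
                  [0, 0, 0, 1]])
labels = label(image)

# Ask only for the label and the area of each region.
table = regionprops_table(labels, properties=("label", "area"))
areas_table = table["area"]

# Cross-check against per-region attribute access.
areas_props = [r.area for r in regionprops(labels)]

print(table["label"], areas_table, areas_props)
```

The result is indexed by position, so table["label"][k] tells you which label table["area"][k] belongs to.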

Staircase answered 31/3, 2016 at 3:46 Comment(2)

As long as you account for zero labels, the results are the same. I've added an expanded example with results comparison. Cheers. – Staircase 9/3, 2017 at 22:47

This answer is incorrect; caching simply determines whether, once a property is asked for, it is retained in memory. No properties are pre-computed, irrespective of the value of the flag. – Mano 23/7, 2019 at 23:1

I found a way to avoid regionprops, and so avoid computing all the properties, when all we need is the area of each connected component. When the connected components are labelled with the label function, the size of each component is simply the number of pixels carrying that label. So, basically:

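import numpy as np
from skimage.measure import label
labels, num = label(image, return_num=True)
area = {}
for i in range(1, num + 1):  # label 0 is the background
    area[i] = np.size(np.where(labels == i)[1])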

will compute the number of pixels in each connected component.
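The same per-label pixel count can probably be done without a Python loop. A minimal sketch using np.bincount, assuming the labels are non-negative integers with 0 as background. The label array below is written by hand for illustration and is not the output of label on any image from this thread:

```python
import numpy as np

# Hand-written label image: 0 is background, 1..3 are components
# (hypothetical data for illustration).
labels = np.array([[1, 1, 0, 2],
                   [1, 0, 0, 2],
                   [0, 0, 3, 0]])

# counts[k] is the number of pixels whose label equals k.
counts = np.bincount(labels.ravel())

# Drop the background count so areas[k - 1] is the area of label k.
areas = counts[1:]
print(areas)
```

This makes a single pass over the array, whereas the loop scans the whole image once per label.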

Casque answered 24/3, 2015 at 17:14 Comment(1)

please see my answer – Dulsea 15/4, 2019 at 18:55

@optimist

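Your non-regionprops method gave wrong results for me when looping over range(num), as in the code below. It picked up some unwanted noise, because label 0 (the background) is counted as if it were a region, and it got the shapes wrong, because label numbers the components from 1 to num, so the last one is never counted.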

import numpy as np
from skimage.measure import label, regionprops
import matplotlib.pyplot as plt

arr = np.array([[1,0,1,0,0,0,1],
                [1,1,1,0,0,0,1],
                [0,1,1,0,0,0,1],
                [0,1,1,0,0,1,1],
                [0,0,0,0,1,1,1],
                [0,0,0,1,1,1,1],
                [1,0,0,1,1,1,1],
                [1,0,0,1,1,1,1],
                [1,0,0,1,1,1,1]])

area = {}
labels, num = label(arr, return_num=True)
for i in range(num):
    print(i)
    area[i]=np.size(np.where(labels==i)[1])
    print(area[i])

plt.imshow(labels)
plt.show();

rps = regionprops(labels, cache=False)
[r.area for r in rps]

Out: [9, 24, 3]
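For comparison, here is a sketch of the counting loop with the range shifted to 1..num, run on the same array. It assumes label's default connectivity and uses np.count_nonzero in place of np.size(np.where(...)), which is a substitution of mine and not part of the original method:

```python
import numpy as np
from skimage.measure import label, regionprops

arr = np.array([[1, 0, 1, 0, 0, 0, 1],
                [1, 1, 1, 0, 0, 0, 1],
                [0, 1, 1, 0, 0, 0, 1],
                [0, 1, 1, 0, 0, 1, 1],
                [0, 0, 0, 0, 1, 1, 1],
                [0, 0, 0, 1, 1, 1, 1],
                [1, 0, 0, 1, 1, 1, 1],
                [1, 0, 0, 1, 1, 1, 1],
                [1, 0, 0, 1, 1, 1, 1]])

labels, num = label(arr, return_num=True)

# Count pixels for labels 1..num; label 0 is the background.
area = {i: int(np.count_nonzero(labels == i)) for i in range(1, num + 1)}

# The background count is what range(num) reports as its first "region".
background = int(np.count_nonzero(labels == 0))

# Areas from regionprops, cast to int for comparison.
rp_areas = [int(r.area) for r in regionprops(labels)]

print(area, background, rp_areas)
```

With the shifted range the loop agrees with regionprops, and the background count is kept apart from the component areas.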

Dulsea answered 15/4, 2019 at 18:57 Comment(0)
