3d sliding window operation in Theano? - McMap



Asked 29/2, 2016 at 4:23 Answered 1/1, 2018 at 21:5

python numpy cython theano conv-neural-network


TL;DR: Is there a 3-dimensional-friendly implementation of theano.tensor.nnet.neighbours.images2neibs?

I would like to perform voxel-wise classification of a volume (NxNxN) using a neural network that takes an nxnxn image as input, where N > n. To classify each voxel in the volume, I have to iterate through every voxel. At each iteration, I extract the voxel's neighborhood and pass it as input to the neural network. This is simply a sliding window operation in which the operation applied to each window is the neural network.

While my neural network is implemented in Theano, the sliding window implementation is in Python/NumPy. Since this is not a pure Theano operation, classifying all voxels in one volume takes forever (> 3 hours). For 2D sliding window operations, Theano has a helper method, theano.tensor.nnet.neighbours.images2neibs. Is there a similar implementation for 3-dimensional images?

Edit: There are existing NumPy solutions (1 and 2) for n-d sliding windows; both use np.lib.stride_tricks.as_strided to provide "views of the sliding window", thus avoiding memory issues. In my implementation, the sliding window arrays are passed from NumPy (Cython) to Python and then to Theano. To boost performance, I will likely have to bypass Python.
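For readers unfamiliar with the stride trick mentioned in the edit, here is a minimal sketch of a 3D sliding window view built with np.lib.stride_tricks.as_strided. The function name sliding_windows_3d and the toy 5x5x5 volume are assumptions made for illustration; this is not the code from either linked solution, and it only covers windows that fit entirely inside the volume (no padding at the borders).

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def sliding_windows_3d(volume, n):
    """Return a read-only view of every n x n x n window in a 3-D array.

    The result has shape (D-n+1, H-n+1, W-n+1, n, n, n) and shares memory
    with `volume`, so no voxel data is copied. (Hypothetical helper.)
    """
    volume = np.asarray(volume)
    if volume.ndim != 3:
        raise ValueError("expected a 3-D volume")
    if n < 1 or any(n > s for s in volume.shape):
        raise ValueError("window size must fit inside the volume")
    out_shape = tuple(s - n + 1 for s in volume.shape) + (n, n, n)
    # Moving to the next window and moving inside a window both advance
    # by one voxel along an axis, so the three strides are simply repeated.
    out_strides = volume.strides * 2
    return as_strided(volume, shape=out_shape, strides=out_strides,
                      writeable=False)


# Toy data: a 5x5x5 volume and 3x3x3 windows.
volume = np.arange(5 * 5 * 5, dtype=np.float32).reshape(5, 5, 5)
windows = sliding_windows_3d(volume, 3)
```

Indexing windows[x, y, z] yields the n x n x n block whose corner is at (x, y, z). Reshaping the view to a flat batch of patches does force a copy, which is where the memory cost reappears.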

Archaic answered 29/2, 2016 at 4:23 Comment(9)

related discussion. github.com/Theano/Theano/issues/2166 – Archaic 29/2, 2016 at 4:27

Alternatively, maybe you want to check out sklearn.feature_extraction.image.extract_patches. This can give you a view onto the desired nxnxn cubes without making a copy of the data. Combine it with an np.einsum which also doesn't copy and you may get something that runs in acceptable time (no guarantee, never tried) – Shellacking 29/2, 2016 at 12:3

Thanks Eickenberg. I'll need to take a look at np.einsum! – Archaic 29/2, 2016 at 17:36

fyi sklearn.feature_extraction.image.extract_patches also uses stride tricks to do its work. It is just a few lines of code and calculation to get the right shape of the views. – Shellacking 1/3, 2016 at 8:31
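A rough sketch of the idea in the comments above: take a strided view of all nxnxn cubes and contract it against a weight tensor with np.einsum. The view is built directly with as_strided rather than with the scikit-learn helper, and the sizes and the weights array (standing in for a first network layer) are assumptions chosen for illustration only.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

rng = np.random.RandomState(0)
N, n, hidden = 6, 3, 4                   # illustrative sizes (assumed)
volume = rng.rand(N, N, N)
weights = rng.rand(n, n, n, hidden)      # hypothetical first-layer weights

m = N - n + 1                            # number of windows along each axis
# View of all n x n x n cubes; shares memory with `volume`.
windows = as_strided(volume, shape=(m, m, m, n, n, n),
                     strides=volume.strides * 2, writeable=False)

# Contract the three window axes (i, j, k) against the weights.
# The result has one `hidden`-sized vector per window: (m, m, m, hidden).
activations = np.einsum('xyzijk,ijkh->xyzh', windows, weights)
```

This only covers a single linear layer; nonlinearities and later layers would still have to be applied to activations afterwards.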

I think OverfeatTransformer from sklearn_theano.feature_extraction.overfeat is what I'm looking for. Guess who the author is. :) – Archaic 2/3, 2016 at 7:0

Hmm, that works, but only for color images. How are you thinking of extending it to 3D volumes? – Shellacking 2/3, 2016 at 10:11

OverfeatTransformer would be a good template for me to work with. Thanks for sharing sklearn-theano! – Archaic 3/3, 2016 at 5:35

For those that ended up here, I recommend using fully convolutional neural network structure as opposed to voxel-wise classification for classifying pixels/voxels. see J Long, Fully Convolutional Networks for Semantic Segmentation; B Hariharan, Hypercolumns for object segmentation and fine-grained localization. – Archaic 6/1, 2017 at 0:38

hmm are you sure? Please also see this comment about the naming of these things as though they didn't exist before. (Just adding this for completeness - I haven't looked at the paper you refer to and in the end it is the results that count) – Shellacking 6/1, 2017 at 14:14
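To illustrate why a fully convolutional structure can replace voxel-by-voxel evaluation, the sketch below shows that applying one dense layer to every nxnxn window gives the same numbers as a single "valid" 3D cross-correlation over the whole volume. It is plain NumPy written for clarity, not the method from the cited papers; the function name dense_as_convolution, the weights, and the sizes are assumptions.

```python
import numpy as np


def dense_as_convolution(volume, weights):
    """Apply an (n, n, n, hidden) dense layer at every valid window position.

    Computed as a 'valid' 3-D cross-correlation by accumulating shifted
    copies of the volume, one per kernel offset. (Hypothetical helper.)
    """
    n = weights.shape[0]
    m = tuple(s - n + 1 for s in volume.shape)   # output size per axis
    out = np.zeros(m + (weights.shape[-1],))
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # All voxels that sit at offset (i, j, k) inside some window.
                shifted = volume[i:i + m[0], j:j + m[1], k:k + m[2]]
                out += shifted[..., None] * weights[i, j, k]
    return out


rng = np.random.RandomState(0)
n = 3
volume = rng.rand(6, 6, 6)
weights = rng.rand(n, n, n, 4)           # hypothetical layer weights
feature_map = dense_as_convolution(volume, weights)
```

The loop runs over the n^3 kernel offsets instead of the roughly N^3 voxel positions, and a framework's 3D convolution operator would compute the same feature map in one call.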


Eickenberg and Kastner's OverfeatTransformer utility in sklearn_theano.feature_extraction.overfeat would be a good match for this operation, as mentioned by OP.

Intoxicative answered 1/1, 2018 at 21:5 Comment(0)
