Can I prevent MATLAB from dynamically resizing a pre-allocated array?

Consider this deliberately simple/stupid example:

n = 3;
x = zeros(n, 1);
for ix=1:4
    x(ix) = ix;
end

The array is preallocated, but it is dynamically resized in the loop because the loop runs to 4 while n is 3. Is there a setting in MATLAB that will throw an error when dynamic resizing like this occurs? In this example I could trivially rewrite it:

n = 3;
x = zeros(n, 1);
for ix=1:4
    if ix > n
        error('Size:Dynamic', 'Dynamic resizing will occur.')
    end
    x(ix) = ix;
end

But I'm hoping to use this as a check to make sure I've preallocated my matrices properly.

Shantae answered 26/9, 2013 at 20:35 Comment(6)
Why don't you loop for ix=1:n? That way, if you happen to set n to the wrong value, you only have to fix one line of code. - Humbertohumble
@EMS you are complaining about ugly workarounds suggested by others, but that is what workarounds are: ugly. If they were clear, they would be a solution, not a workaround. I haven't heard a real solution from you yet; you just gave a link to Python. Don't blame us for these ugly suggestions, blame MathWorks ... - Enquire
@EMS An answer that no good solutions exist (as far as I know) is also a valid answer, even if that is not the one you were hoping for. - Enquire
@larsmans This isn't my actual production code. It's just a simple/stupid example that I'm using to illustrate the point. - Shantae
If MATLAB actually issues a run-time warning for this, then you could catch the warning and throw your own error. - Keys
@Keys MATLAB doesn't issue a runtime warning for this, as far as I know. - Shantae
V
9

You can create a subclass of double and restrict assignment in its subsasgn method:

classdef dbl < double
    methods
        function obj = dbl(d)
            obj = obj@double(d);
        end

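        function obj = subsasgn(obj,s,val)
            % Check bounds for paren-indexed assignment, then defer to double.
            if strcmp(s(1).type, '()')
                subs = s(1).subs;
                % Largest index per subscript. A ':' subscript can never exceed
                % the bounds, so the mask zeroes its entry (max(':') is 58).
                mx = cellfun(@max, subs).*~strcmp(subs, ':');
                sz = size(obj);
                nx = numel(mx);
                if nx < numel(sz)
                    % Fewer subscripts than dimensions: the last subscript
                    % spans all remaining dimensions.
                    sz = [sz(1:nx-1) prod(sz(nx:end))];
                end
                assert(all( mx <= sz), ...
                    'Index exceeds matrix dimensions.');
            end
            obj = subsasgn@double(obj, s, val);
        end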

    end
end

Now, when you preallocate, use dbl:

>> z = dbl(zeros(3))
z = 
  dbl

  double data:
     0     0     0
     0     0     0
     0     0     0
  Methods, Superclasses

All methods for double are now inherited by dbl, and you can use z as usual. The bounds check only comes into play when you assign something to z:

>> z(1:2,2:3) = 6
z = 
  dbl

  double data:
     0     6     6
     0     6     6
     0     0     0
  Methods, Superclasses

>> z(1:2,2:5) = 6
Error using dbl/subsasgn (line 9)
Index exceeds matrix dimensions.

I haven't benchmarked it but I expect this to have insignificant performance impact.
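
One way to put a number on that expectation is a small timing harness. The sketch below is not part of the original answer: fillLoop is a hypothetical helper name, and the script times only a plain preallocated double so that it runs without dbl.m on the path. It assumes R2016b or later, where a script file may end with local functions.

```matlab
% Timing sketch (illustrative only; fillLoop is a hypothetical helper).
n = 1e5;

% timeit runs the handle several times and returns a typical time in seconds.
tPlain = timeit(@() fillLoop(zeros(n, 1)));
fprintf('plain double: %.4f s\n', tPlain);

% With dbl.m on the path, the subclass could be timed the same way:
% tDbl = timeit(@() fillLoop(dbl(zeros(n, 1))));

function x = fillLoop(x)
    % Assign every element in turn, as in the loop from the question.
    for ix = 1:numel(x)
        x(ix) = ix;
    end
end
```

Comparing tDbl against tPlain for a few values of n would show whether the overhead is a constant factor per assignment. A comment on this answer reports roughly a 2x to 3x slowdown in cursory tests.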

If you want the values to be displayed like a normal array, you can overload the display method as well:

function display(obj)
    display(double(obj));
end

Then

>> z = dbl(zeros(3))
ans =
     0     0     0
     0     0     0
     0     0     0
>> z(1:2,2:3) = 6
ans =
     0     6     6
     0     6     6
     0     0     0
>> z(1:2,2:5) = 6
Error using dbl/subsasgn (line 9)
Index exceeds matrix dimensions.
>> class(z)
ans =
dbl
Venture answered 27/9, 2013 at 9:58 Comment(7)
Very nice, I think a class is the only proper way to achieve this. Note that this solution still allows you to increase the size of your variable, just not by indexing into it with a number outside its size. Good to know that things like x = dbl(magic(5)); x=x(:); x=[x x]; x= dbl(rand(100)); will work. - Spannew
Good solution. Can you provide any benchmarks on the way it affects performance as the array grows? - Justiciary
Watch out for the colon operator - your code turns it into its ASCII value. Simple fix: mx(cellfun(@(x) strcmp(x,':'),s.subs)) = 1; after assignment. In a few very quick and cursory tests, this is about 2x slower than letting MATLAB dynamically allocate the arrays, or about 3x slower than its pre-allocated double performance. - Erinerina
@Matt B can you provide any stats on this? Does it fundamentally change the complexity? That is, for an O(n) reassignment, does this just make it a slower constant in front of the n, or does it make it O(n^2), etc.? What about other array operations that involve accessing? What about mixed operations, like appending one of these dbl objects onto some other array that is still dynamically resizable? - Justiciary
You don't need to subclass double; instead you can just overload subsasgn by creating an @double directory and adding your subsasgn function to it. It will still give you the same performance hit. - Panegyric
@DanielE.Shub No, your solution doesn't work; this is expressly mentioned as an example in the third paragraph of the Description section of the documentation for subsasgn. - Venture
“I haven't benchmarked it but […]” Famous last words!!! - Dressy
E
5

The simplest, most straightforward and robust way I can think of to do this is just by accessing the index before assigning to it. Unfortunately, you cannot overload subsasgn for fundamental types (and it'd be a major headache to do correctly in any case).

for ix=1:4
    x(ix); x(ix) = ix;
end
% Error: 'Attempted to access x(4); index out of bounds because numel(x)=3.'
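
This works because of an asymmetry in MATLAB indexing: reading past the end of an array is an error, while writing past the end grows the array. A minimal sketch of that behaviour (readFailed is just an illustrative variable name):

```matlab
x = zeros(3, 1);

% Reading element 4 of a 3-element array raises an error.
readFailed = false;
try
    x(4);
catch
    readFailed = true;
end

% Writing element 4 raises no error; x silently grows to 4-by-1.
x(4) = 4;

assert(readFailed);
assert(numel(x) == 4);
```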

Alternatively, you could try to be clever and do something with the end keyword... but no matter what you do you'll end up with some sort of nonsensical error message (which the above nicely provides).

for ix=1:4
    x(ix*(ix<=end)) = ix;
end
% Error: 'Attempted to access x(0); index must be a positive integer or logical.'

Or you could do that check in a function, which gains you your nice error message but is still terribly verbose and obfuscated:

for ix=1:4
    x(idxchk(ix,end)) = ix;
end
function idx = idxchk(idx,e)
    assert(idx <= e, 'Size:Dynamic', 'Dynamic resizing will occur.')
end
Erinerina answered 26/9, 2013 at 21:31 Comment(3)
@EMS My statement that "you cannot overload subsasgn for fundamental types" is a very hard and real limitation. See the documentation on subsasgn. The alternatives then are to define your own class with custom subsasgn behavior (but custom MATLAB OOP is slow) or to use some sort of workaround. I'm just brainstorming a few. And no, that function is reusable for any data structure because it checks against the end keyword. - Erinerina
@EMS, the original question simply asks for "a check to make sure I've pre-allocated my matrices properly." I read this as looking for a quick way to double check the indexing without necessarily knowing the length of each data structure, not some persistent solution. My first suggestion is to simply grab a copy of the LHS: quick, easy, and simple. I'm not advocating for using this sort of solution in production code. But as a quick check? Sure. - Erinerina
@EMS Statically typed languages will do this, however, so why isn't it reasonable? (I know these are different paradigms, obviously.) For example, Fortran will throw errors if you attempt to assign or access outside the bounds of an array. MATLAB will only throw an error if you attempt to access outside these bounds. - Shantae
S
3

This is not a fully worked example (see disclaimer after the code!) but it shows one idea...

You could (at least while debugging your code) use the following class in place of zeros to allocate your original variable.

Subsequent use of the data outside of the bounds of the originally allocated size would result in an 'Index exceeds matrix dimensions.' error.

For example:

>> n = 3;
>> x = zeros_debug(n, 1)

x = 

     0
     0
     0

>> x(2) = 32

x = 

     0
    32
     0

>> x(5) = 3
Error using zeros_debug/subsasgn (line 42)
Index exceeds matrix dimensions.

>> 

The class code:

classdef zeros_debug < handle    
    properties (Hidden)
       Data
    end

    methods       
      function obj = zeros_debug(M,N)
          if nargin < 2
              N = M;
          end
          obj.Data = zeros(M,N);
      end

        function sref = subsref(obj,s)
           switch s(1).type
              case '()'
                 if length(s)<2
                 % Note that obj.Data is passed to subsref
                    sref = builtin('subsref',obj.Data,s);
                    return
                 else
                    sref = builtin('subsref',obj,s);
                 end              
               otherwise,
                 error('zeros_debug:subsref',...
                   'Not a supported subscripted reference')
           end 
        end        
        function obj = subsasgn(obj,s,val)
           if isempty(s) && strcmp(class(val),'zeros_debug')
              obj = zeros_debug(val.Data);
           end
           switch s(1).type
               case '.'
                    obj = builtin('subsasgn',obj,s,val);
              case '()'
                    if isa(val,'double')
                        subs = s(1).subs;
                        % Branch on the number of subscripts, not on the
                        % length of the first subscript.
                        switch numel(subs)
                            case 1  % linear index, e.g. x(5)
                               if any(subs{1} > numel(obj.Data))
                                   error('zeros_debug:subsasgn','Index exceeds matrix dimensions.');
                               end
                            case 2  % row and column subscripts, e.g. x(2,1)
                               if any(subs{1} > size(obj.Data,1)) || ...
                                       any(subs{2} > size(obj.Data,2))
                                   error('zeros_debug:subsasgn','Index exceeds matrix dimensions.');
                               end
                        end
                        % Forward the checked assignment to the Data property.
                        snew = substruct('.','Data','()',subs(:));
                        obj = subsasgn(obj,snew,val);
                    end
               otherwise,
                 error('zeros_debug:subsasgn',...
                    'Not a supported subscripted assignment')
           end     
        end        
        function disp( obj )
            disp(obj.Data);
        end        
    end   
end

There would be considerable performance implications (and problems stemming from using a class inheriting from handle) but it seemed like an interesting solution to the original problem.

Scansorial answered 27/9, 2013 at 2:45 Comment(1)
Inheriting from handle is so wrong here. This means that the class doesn’t behave like a numeric array. Copies all share data! - Dressy
E
2

Allowing assignment to indices outside of an array's bounds and filling the gaps with zeros is indeed one of MATLAB's ugly parts. I am not aware of any simple tricks without an explicit check to avoid that, other than implementing your own storage class. I would stick to adding a simple assert(i <= n) to your loop and forget about it. I have never been bitten by hard-to-find bugs due to assigning something out of bounds.

In case of a forgotten or too small preallocation, in the 'ideal' case your code gets really slow due to quadratic behavior, after which you find the bug and fix it. But these days, MATLAB's JIT is sometimes smart enough to not cause any slowdowns (maybe it dynamically grows arrays in some cases, like Python's list), so it might not even be an issue anymore. So it actually allows for some sloppier coding...
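
If the goal is only to find out whether a preallocation was too small, a check after the loop is another option. The sketch below illustrates that idea and is not a MATLAB setting; szBefore and grew are hypothetical names:

```matlab
n = 3;
x = zeros(n, 1);
szBefore = size(x);          % remember the preallocated size

for ix = 1:4
    x(ix) = ix;              % the fourth assignment grows x
end

% Any change in size means the array was resized inside the loop.
grew = ~isequal(size(x), szBefore);
if grew
    warning('Size:Dynamic', 'x was resized from %s to %s.', ...
        mat2str(szBefore), mat2str(size(x)));
end
```

This only reports the resize after the fact, so it suits a debugging pass rather than guarding each assignment.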

Enquire answered 26/9, 2013 at 21:35 Comment(1)
What if you want to perform this check in many different for-loops? You'd have to copy/paste the assert code all over. If you used a function instead, you'd have to manually remember to use the function for index checking, but only for this particular piece of data. Either way, the checking is not done implicitly by the data structure, but manually, relying on the programmer's exogenous knowledge. These are violations of long-standing software design principles. - Justiciary
