How does the colon operator work in MATLAB?
Asked Answered
H

1

14

As noted in this answer by Sam Roberts and this other answer by gnovice, MATLAB's colon operator (start:step:stop) creates a vector of values in a different way that linspace does. In particular, Sam Roberts states:

The colon operator adds increments to the starting point, and subtracts decrements from the end point to reach a middle point. In this way, it ensures that the output vector is as symmetric as possible.

However, offical documentation about this from The MathWorks has been deleted from their site.

If Sam's description is correct, wouldn't the errors in the step sizes be symmetric?

>> step = 1/3;
>> C = 0:step:5;
>> diff(C) - step
ans =
   1.0e-15 *
  Columns 1 through 10
         0         0    0.0555   -0.0555   -0.0555    0.1665   -0.2776    0.6106   -0.2776    0.1665
  Columns 11 through 15
    0.1665   -0.2776   -0.2776    0.6106   -0.2776

Interesting things to note about the colon operator:

  • Its values depend on its length:

    >> step = 1/3;
    >> C = 0:step:5;
    >> X = 0:step:3;
    >> C(1:10) - X
    ans =
       1.0e-15 *
             0         0         0         0         0   -0.2220         0   -0.4441    0.4441         0
    
  • It can generate repeated values if they are rounded:

    >> E = 1-eps : eps/4 : 1+eps;
    >> E-1
    ans =
       1.0e-15 *
       -0.2220   -0.2220   -0.1110         0         0         0         0    0.2220    0.2220
    
  • There is a tolerance for the last value, if the step size creates a value just above the end, this end value is still used:

    >> A = 0 : step : 5-2*eps(5)
    A =
      Columns 1 through 10
             0    0.3333    0.6667    1.0000    1.3333    1.6667    2.0000    2.3333    2.6667    3.0000
      Columns 11 through 16
        3.3333    3.6667    4.0000    4.3333    4.6667    5.0000
    >> A(end) == 5 - 2*eps(5)
    ans =
      logical
       1
    >> step*15 - 5
    ans =
         0
    
Hyperpyrexia answered 20/3, 2018 at 5:59 Comment(0)
H
14

The deleted page referred to by Sam's answer is still archived by the Way Back Machine. Luckily, even the attached M-file colonop is there too. And it seems that this function still matches what MATLAB does (I'm on R2017a):

>> all(0:step:5 == colonop(0,step,5))
ans =
  logical
   1
>> all(-pi:pi/21:pi == colonop(-pi,pi/21,pi))
ans =
  logical
   1

I'll replicate here what the function does for the general case (there are some shortcuts for generating integer vectors and handling special cases). I'm replacing the function's variable names with more meaningful ones. The inputs are start, step and stop.

First it computes how many steps there are in between start and stop. If the last step exceeds stop by more than a tolerance, it is not taken:

n = round((stop-start)/step);
tol = 2.0*eps*max(abs(start),abs(stop));
sig = sign(step);
if sig*(start+n*step - stop) > tol
  n = n - 1;
end

This explains the last observation mentioned in the question.

Next, it computes the value of the last element, and makes sure that it does not exceed the stop value, even if it allowed to go past it in the previous computation.

last = start + n*step;
if sig*(last-stop) > -tol
   last = stop;
end

This is why the lasat value in the vector A in the question actually has the stop value as the last value.

Next, it computes the output array in two parts, as advertised: the left and right halves of the array are filled independently:

out = zeros(1,n+1);
k = 0:floor(n/2);
out(1+k) = start + k*step;
out(n+1-k) = last - k*step;

Note that they are not filled by incrementing, but by computing an integer array and multiplying it by the step size, just like linspace does. This exaplains the observation about array E in the question. The difference is that the right half of the array is filled by subtracting those values from the last value.

As a final step, for odd-sized arrays, the middle value is computed separately to ensure it lies exactly half-way the two end points:

if mod(n,2) == 0
   out(n/2+1) = (start+last)/2;
end

The full function colonop is copied at the bottom.


Note that filling the left and right side of the array separately does not mean that the errors in step sizes should be perfectly symmetric. These errors are given by roundoff errors. But it does make a difference where the stop point is not reached exactly by the step size, as in the case of array A in the question. In this case, the slightly shorter step size is taken in the middle of the array, rather than at the end:

>> step=1/3;
>> A = 0 : step : 5-2*eps(5);
>> A/step-(0:15)
ans =
   1.0e-14 *
  Columns 1 through 10
         0         0         0         0         0         0         0   -0.0888   -0.4441   -0.5329
  Columns 11 through 16
   -0.3553   -0.3553   -0.5329   -0.5329   -0.3553   -0.5329

But even in the case where the stop point is reached exactly, some additional error accumulates in the middle. Take for example the array C in the question. This error accumulation does not happen with linspace:

C = 0:1/3:5;
lims = eps(C);
subplot(2,1,1)
plot(diff(C)-1/3,'o-')
hold on
plot(lims,'k:')
plot(-lims,'k:')
plot([1,15],[0,0],'k:')
ylabel('error')
title('0:1/3:5')
L = linspace(0,5,16);
subplot(2,1,2)
plot(diff(L)-1/3,'x-')
hold on
plot(lims,'k:')
plot(-lims,'k:')
plot([1,15],[0,0],'k:')
title('linspace(0,5,16)')
ylabel('error')

output of code above


colonop:

function out = colonop(start,step,stop)
% COLONOP  Demonstrate how the built-in a:d:b is constructed.
%
%   v = colonop(a,b) constructs v = a:1:b.
%   v = colonop(a,d,b) constructs v = a:d:b.
%
%   v = a:d:b is not constructed using repeated addition.  If the
%   textual representation of d in the source code cannot be
%   exactly represented in binary floating point, then repeated
%   addition will appear to have accumlated roundoff error.  In
%   some cases, d may be so small that the floating point number
%   nearest a+d is actually a.  Here are two imporant examples.
%
%   v = 1-eps : eps/4 : 1+eps is the nine floating point numbers
%   closest to v = 1 + (-4:1:4)*eps/4.  Since the spacing of the
%   floating point numbers between 1-eps and 1 is eps/2 and the
%   spacing between 1 and 1+eps is eps,
%   v = [1-eps 1-eps 1-eps/2 1 1 1 1 1+eps 1+eps].
%
%   Even though 0.01 is not exactly represented in binary,
%   v = -1 : 0.01 : 1 consists of 201 floating points numbers
%   centered symmetrically about zero.
%
%   Ideally, in exact arithmetic, for b > a and d > 0,
%   v = a:d:b should be the vector of length n+1 generated by
%   v = a + (0:n)*d where n = floor((b-a)/d).
%   In floating point arithmetic, the delicate computatations
%   are the value of n, the value of the right hand end point,
%   c = a+n*d, and symmetry about the mid-point.

if nargin < 3
    stop = step;
    step = 1;
end

tol = 2.0*eps*max(abs(start),abs(stop));
sig = sign(step);

% Exceptional cases.

if ~isfinite(start) || ~isfinite(step) || ~isfinite(stop)
   out = NaN;
   return
elseif step == 0 || start < stop && step < 0 || stop < start && step > 0
   % Result is empty.
   out = zeros(1,0);
   return
end

% n = number of intervals = length(v) - 1.

if start == floor(start) && step == 1
   % Consecutive integers.
   n = floor(stop) - start;
elseif start == floor(start) && step == floor(step)
   % Integers with spacing > 1.
   q = floor(start/step);
   r = start - q*step;
   n = floor((stop-r)/step) - q;
else
   % General case.
   n = round((stop-start)/step);
   if sig*(start+n*step - stop) > tol
      n = n - 1;
   end
end

% last = right hand end point.

last = start + n*step;
if sig*(last-stop) > -tol
   last = stop;
end

% out should be symmetric about the mid-point.

out = zeros(1,n+1);
k = 0:floor(n/2);
out(1+k) = start + k*step;
out(n+1-k) = last - k*step;
if mod(n,2) == 0
   out(n/2+1) = (start+last)/2;
end
Hyperpyrexia answered 20/3, 2018 at 5:59 Comment(3)
This is a fantastic QA! I suggest you add the entire colonop.m here, for completeness.Chrissa
I like how the implementation of : uses the : operator in k = 0:floor(n/2);. How do they avoid a stackoverflow from endless recursive calls to colonop()?Roseola
@JohnAlexiou: This colonop function emulates the internal working of :. It is interesting only for non-integer inputs, and it uses : only with integer inputs. MATLAB's colon operator is of course a built-in function written in C or C++ or Fortran or whatever, not a MATLAB M-file script.Hyperpyrexia

© 2022 - 2024 — McMap. All rights reserved.