Is there an Octave equivalent of Matlab's `contains` function?
L

4

5

Is there an equivalent of MATLAB's contains function in Octave? Or, is there a simpler solution than writing my own function in Octave to replicate this functionality? I am in the process of switching to Octave from MATLAB and I use contains throughout my MATLAB scripts.

Lacey answered 15/1, 2020 at 0:14 Comment(0)
M
10

Let's stick to the example from the documentation on contains. Octave has no string type like the (double-quoted) strings introduced in MATLAB R2017a, so we need to switch to plain, old (single-quoted) char arrays. The "See Also" section of that documentation links to strfind. We'll use this function, which is also implemented in Octave, to create an anonymous function mimicking the behaviour of contains. We will also need cellfun, which is available in Octave, too. Please see the following code snippet:

% Example adapted from https://www.mathworks.com/help/matlab/ref/contains.html

% Names; char arrays ("strings") in cell array
str = {'Mary Ann Jones', 'Paul Jay Burns', 'John Paul Smith'}

% Search pattern; char array ("string")
pattern = 'Paul';

% Anonymous function mimicking contains
contains = @(str, pattern) ~cellfun('isempty', strfind(str, pattern));
% contains = @(str, pattern) ~cellfun(@isempty, strfind(str, pattern));

TF = contains(str, pattern)

The output is as follows:

str =
{
  [1,1] = Mary Ann Jones
  [1,2] = Paul Jay Burns
  [1,3] = John Paul Smith
}

TF =
  0  1  1

That should resemble the output of MATLAB's contains.

So, in the end - yes, you need to replicate the functionality by yourself, since strfind is no exact replacement.

Hope that helps!


EDIT: Use 'isempty' instead of @isempty in the cellfun call to get a faster in-built implementation (see carandraug's comment below).
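
For the 'IgnoreCase' option asked about in the comments, one possible sketch is to lower-case both inputs before calling strfind. The name contains_ci is made up for this illustration, and wrapping the input in cellstr (so that a single char array is accepted as well) is an assumption about the desired behaviour, not something MATLAB's documentation prescribes:

```octave
% Hypothetical helper: contains_ci is an illustrative name, not an Octave built-in.
% Case-insensitive variant of the anonymous function above.
% cellstr() turns a plain char array into a one-element cell array,
% so both char arrays and cell arrays of char arrays are accepted.
contains_ci = @(str, pattern) ~cellfun('isempty', strfind(lower(cellstr(str)), lower(pattern)));

names = {'Mary Ann Jones', 'Paul Jay Burns', 'John Paul Smith'};

% Cell array input: one logical value per element
TF_ci = contains_ci(names, 'paul')

% Single char array input: one logical value
TF_single = contains_ci('Paul Jay Burns', 'PAUL')
```

With the names above, TF_ci should come out as 0 1 1 and TF_single as 1.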

Maragaret answered 15/1, 2020 at 5:40 Comment(7)
Perhaps worth noting that cellfun is basically a loop, so this won't be very performant for large cell arrays. I imagine if the OP gave specific examples where this matters (if any) then there might be tailored solutions which are faster...Johnsonian
@Johnsonian That's true! Since I currently have no MATLAB available: How does MATLAB's contains perform on large string arrays? Is there some smart algorithm working under the hood; or is there some optimization? From a naive point of view: There still must be n (sub)string findings. So, shouldn't any optimization with respect to that also be applicable to the strfind approach?Maragaret
@Johnsonian cellfun is not basically a loop for specific cases. If, instead of the function handle @isempty, you pass the function name as a string, 'isempty', it will use a faster implementation.Infielder
@Infielder Ah, I forgot about that! Will add this to my answer. (Someone with access to MATLAB may provide timings for both versions.)Maragaret
@Infielder Right, but as it was used when I commented it was basically a loop, I think my comment about specific use-case optimisations still stands with the 'isempty' update. That is a nice trick to remember though.Johnsonian
@Maragaret Hello, I would like to know please how to add the 'ignorecase' by default defined as an input in the MATLAB contains function, in the one you just redefined over here for Octave?Duodenum
@Maragaret Nevermind, found the strcmpi function :DDuodenum
F
2

I'm not too familiar with MuPad functions, but it looks like this is reinventing the ismember function (which exists in both Matlab and Octave).

E.g.

ismember( {'jim', 'stan'}, {'greta', 'george', 'jim', 'jenny'} )
% ans = 1  0

i.e. 'jim' is a member of {'greta', 'george', 'jim', 'jenny'}, whereas 'stan' is not.

Furthermore, ismember also supports finding the index of the matched element:

[BoolVal, Idx] = ismember( {'jim', 'stan'}, {'greta', 'george', 'jim', 'jenny'} )
% BoolVal = 1  0
% Idx     = 3  0
Fancie answered 15/1, 2020 at 10:23 Comment(2)
ismember and contains are not the same - hence the introduction of the contains function. To extend your example, ismember( {'george', 'jim'}, 'eor' ) would return false for both input elements, but contains( {'george', 'jim'}, 'eor' ) returns true for 'george'.Johnsonian
@Johnsonian thanks. It wasn't clear to me how the two differed. I'll leave this up, if only for this comment.Fancie
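
The difference raised in the comments can be checked with a short sketch. It uses only ismember, strfind and cellfun; the variable names whole and partial are chosen here purely for illustration:

```octave
names = {'george', 'jim'};

% ismember compares whole elements, so the fragment 'eor' matches nothing
whole = ismember(names, {'eor'})

% strfind searches inside each element, which is the behaviour contains provides
partial = ~cellfun('isempty', strfind(names, 'eor'))
```

Here whole should be 0 0, while partial should be 1 0, because only 'george' contains 'eor'.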
P
0

Personally I use my own implementation, which returns 1 if a string str contains entire substring sub:

function res = containsStr(str, sub)
res = 0;

strCharsCount = length(str);
subCharsCount = length(sub);

startCharSub = sub(1);

% loop over characters of the main string
for ic = 1:strCharsCount
    currentChar = str(ic);
    % if a substring starts from current character
    if (currentChar == startCharSub)
        %fprintf('Match! %s = %s\n', currentChar, startCharSub);
        
        matchedCharsCount = 1;
        % loop over characters of substring
        for ics = 2:subCharsCount
            nextCharIndex = ic + (ics - 1);
            % if there's enough chars in the main string
            if (nextCharIndex <= strCharsCount)
                nextChar = str(nextCharIndex);
                nextCharSub = sub(ics);
                if (nextChar == nextCharSub)
                    matchedCharsCount = matchedCharsCount + 1;      
                end
            end
        end
        
        %fprintf('Matched chars = %d / %d\n', matchedCharsCount, subCharsCount);
        
        % the substring is inside the main one
        if (matchedCharsCount == subCharsCount)
            res = 1;  
        end
    end
end

end

Peanuts answered 16/8, 2022 at 16:31 Comment(0)
B
0

It is very simple to implement a similar function in Octave. The following code, using strfind, does the same job:

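function tf = contains(str, substr)
% Octave stand-in for MATLAB's contains function.
% str may be a char array or a cell array of char arrays.
% A char array is wrapped in a cell so that strfind always returns a cell;
% otherwise ~isempty would be true for every cell array input.
if ischar(str), str = {str}; end
tf = ~cellfun('isempty', strfind(str, substr));
end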
Bartlett answered 26/12, 2023 at 10:50 Comment(0)
