Check if one string is a prefix of another
Asked Answered
H

14

64

I have two strings which I'd like to compare: String and String:. Is there a library function that would return true when passed these two strings, but false for say String and OtherString?

To be precise, I want to know whether one string is a prefix of another.

Heliopolis answered 27/10, 2011 at 9:9 Comment(10)
what about using good old string.compare()?Parlormaid
you mean comparing first N characters?Septal
@Septal That would be ok, would be nice if it did it for me so I didn't need to go through the faff of working out n.Heliopolis
Well, strictly speaking one function which satisfies your requirements is the == operator. ;-)Hammel
@FrerichRaabe: no, it doesn't, he does not want to check whether they are the same, but rather whether they share a prefixPurifoy
@DavidRodríguez-dribeas: As it turns out he doesn't want to check whether they share a prefix, but whether one is a prefix of the other (I got two downvotes for not guessing that).Okeechobee
@BjörnPollex Sorry for the downvotes... some people are just too trigger happy with the downvote, the original question was not clear, and the third comment here seemed to indicate that N was an unknown (while if you are looking for a prefix, it is quite well-known!)Purifoy
@David Rodríguez - dribeas: My tongue-in-cheek response was aimed at the sentence "Is there a library function that would return true when passed these two strings (but false for say String and OtherString)?" in the question. That's why I wrote the smiley at the end. ;-)Hammel
The question was quite clear from the examples before the edit that made it explicit. == does not return true for 'String' and 'String:', "strictly" speaking or otherwise.Chladek
Isn't this a dup of (part of) #1878501 ? The answer to that one is better than any of the ones here so far.Acetyl
N
66

Use std::mismatch. Pass in the shorter string as the first iterator range and the longer as the second iterator range. The return is a pair of iterators, the first is the iterator in the first range and the second, in the second rage. If the first is end of the first range, then you know the the short string is the prefix of the longer string e.g.

std::string foo("foo");
std::string foobar("foobar");

auto res = std::mismatch(foo.begin(), foo.end(), foobar.begin());

if (res.first == foo.end())
{
  // foo is a prefix of foobar.
}
Nalda answered 27/10, 2011 at 9:23 Comment(7)
+1, and this can actually be extended to test share a prefix rather than is a prefix by comparing the result against begin() rather than end (and can get the actual length of the common prefix too, by substracting)Purifoy
+1, but This is dangerous if the second string is shorter because you would iterate past its end. it is therefore needed to check that foo.size() <= foobar.size().Dumas
@Benoit, yip; the thing that bemuses me is that they could have so easily accepted an end for the second iterator and save us having to do the check before...Nalda
maybe because it sometimes happen that you compare distinct containers, you know that the first one is shorter than the second one for sure, but computing the size or an end iterator in the second container would be time-consuming.Dumas
This is neat, but James Kanze's solution of using std::equal is simpler.Nasa
@Dumas Note, I think your concern regarding size has been addressed in C++14. See comments on Return Value for mismatch.Fining
No need for mismatch() on iterators, just use compare().Agbogla
K
28

This is both efficient and convenient:

str.compare(0, pre.size(), pre) == 0

compare is fast because it uses the fast traits::compare method and doesn't have to copy any data.

Here, it will compare std::min(str.size(), pre.size()) characters but if the characters in the two ranges are equal it also checks the length of pre and returns a non-zero value if pre is longer than this.

See the documentation at cplusplus.com.

I've written a test program that uses this code to compare prefixes and strings given on the command line.

Kilimanjaro answered 26/2, 2014 at 21:44 Comment(7)
Why you need a.size() >= b.size()? compare() will handle that as well.Rosecan
Because a.compare will stop when it reaches the end of a and won't look at the remaining characters of b. b isn't a prefix of a if it contains extra characters at the end.Kilimanjaro
I've changed the variable names to make this easier to understand.Kilimanjaro
@Rosecan You're right! The size comparison isn't needed. I've just checked the docs at cplusplus.com/reference/string/string/compare, and compare will return 0 only if the two character ranges being compared are the same length. If str is shorter than pre, compare will return a negative value (-1 in my testing). I'll edit my answer, but you should have a share of the credit. However, the best I can do is to vote up your comment.Kilimanjaro
"if str is shorter than pre, compare will return a negative value" Wait what? That's not right. For example, if str is "B" and pre is "AA", then str.compare(0, pre.size(), pre) returns a positive value. Seems to me the initial test is necessary and should be put back.Acetyl
@DonHatch "if str is shorter than pre" AND they're equal up to the end of str. In the case of AA and B they differ on the first char, so compare will return a non-zero result. The wording in the answer itself is clearer.Kilimanjaro
This is the best answer!Bullpup
P
21

If you know which string is shorter, the procedure is simple, just use std::equal with the shorter string first. If you don't, something like the following should work:

bool
unorderIsPrefix( std::string const& lhs, std::string const& rhs )
{
    return std::equal(
        lhs.begin(),
        lhs.begin() + std::min( lhs.size(), rhs.size() ),
        rhs.begin() );
}
Protectionist answered 27/10, 2011 at 10:11 Comment(1)
This will not always give you the correct answer. This will return whether either of the strings is a prefix of another one.Sodamide
M
18

std::string(X).find(Y) is zero if and only if Y is a prefix of X

Morbid answered 27/10, 2011 at 9:14 Comment(4)
It's probably not the most efficient. The compiler would need to inline it, or else it must search for Y at non-zero offsets too.Morbid
This is concise, but potentially inefficient (imagine if X is very long and Y is not the prefix of X).Hammel
@FrerichRaabe: That's why I commented on this myself. A good optimizer will spot the comparison with zero, find that the comparand corresponds to the index variable used in the preceding for loop, and replace the for loop by an if statement.Morbid
Message from the future: Use std::string_view :)Bowles
S
15

After C++20, we can use starts_with to check if a string begins with given prefix.

str.starts_with(prefix)

Also, there is ends_with to check suffix

Staal answered 10/3, 2021 at 2:50 Comment(0)
T
10

With string::compare, you should be able to write something like:

bool match = (0==s1.compare(0, min(s1.length(), s2.length()), s2,0,min(s1.length(),s2.length())));

Alternatively, in case we don't want to use the length() member function:

bool isPrefix(string const& s1, string const&s2)
{
    const char*p = s1.c_str();
    const char*q = s2.c_str();
    while (*p&&*q)
        if (*p++!=*q++)
            return false;
    return true;
}
Thereabouts answered 27/10, 2011 at 9:15 Comment(9)
This is potentially inefficient if string1 is very long - calling length() is O(n) and there's no need to know the exact length of the string. You only care if it's long enough or not.Hammel
I don't think that's the case, not with std::string. From what I can see in basic_string.h, it has a length field somewhere.Thereabouts
.length() is O(n) ? Are you by any chance looking at the character_traits table?Morbid
@Vlad: See this question for a discussion on std::string::length(); seems that it should be O(1) as you say, but it doesn't have to: #256533Hammel
@Frerich: I admit, I didn't know that. But then again, it's probably O(1) on most current compilers. Alternatively, you could just start at the beginning, and compare chars until one of them is \0.Thereabouts
In C++11, length() must take constant time; in C++03, it "should".Ornithischian
@FrerichRaabe: The comment to the answer in the first question is the key. While the general requirement is that size() does not need to be constant time in all containers (for example std::list), but std::string must know the size (i.e. it cannot calculate it on the fly, as all characters are valid in the string, you cannot search for a sentry). Once the requirement for the size to be known (no need to store actual size, can store two pointers), and cannot be calculated from data, the implementation must be O(1)Purifoy
@David Rodríguez - dribeas: std::string doesn't need to know the size of the string, only the end of it. That's why you can always get the size of the string in constant time by subtracting the begin iterator from the end iterator.Hammel
@FrerichRaabe: Rationale 1) string needs to know begin() and end() in constant time, the iterators are random, so they can be substracted in constant time, and the difference is the size of the string, to it must be known in constant time. Rationale 2) unless the string is implemented with ropes (forbidden in C++11, not implemented in any known current standard library implementation), the memory is contiguous, and that means that knowing begin() and end() and knowing size() is equivalent, you need to store two of the three and the other can be calculated in constant time.Purifoy
H
6

If you can reasonably ignore any multi-byte encodings (say, UTF-8) then you can use strncmp for this:

// Yields true if the string 's' starts with the string 't'.
bool startsWith( const std::string &s, const std::string &t )
{
    return strncmp( s.c_str(), t.c_str(), t.size() ) == 0;
}

If you insist on using a fancy C++ version, you can use the std::equal algorithm (with the added benefit that your function also works for other collections, not just strings):

// Yields true if the string 's' starts with the string 't'.
template <class T>
bool startsWith( const T &s, const T &t )
{
    return s.size() >= t.size() &&
           std::equal( t.begin(), t.end(), s.begin() );
}
Hammel answered 27/10, 2011 at 9:18 Comment(2)
With your std::equal solution, what happens when s is shorter than t? It appears that it could read past the end of s.Psychosomatic
@teambob: You're right; I augmented the answer to check the sizes of the two strings.Hammel
I
5

How about simply:

bool prefix(const std::string& a, const std::string& b) {
  if (a.size() > b.size()) {
    return a.substr(0,b.size()) == b;
  }
  else {
    return b.substr(0,a.size()) == a;
  }
}

C++ not C, safe, simple, efficient.

Tested with:

#include <string>
#include <iostream>

bool prefix(const std::string& a, const std::string& b);

int main() {
  const std::string t1 = "test";
  const std::string t2 = "testing";
  const std::string t3 = "hello";
  const std::string t4 = "hello world";
  std::cout << prefix(t1,t2) << "," << prefix(t2,t1) << std::endl;
  std::cout << prefix(t3,t4) << "," << prefix(t4,t3) << std::endl;
  std::cout << prefix(t1,t4) << "," << prefix(t4,t1) << std::endl;
  std::cout << prefix(t1,t3) << "," << prefix(t3,t1) << std::endl;

}

If you have C++17 you can write a better version of this, using std::string_view instead:

#include <string>
#include <string_view>

bool prefix(const std::string& a, const std::string& b) {
  if (a.size() > b.size()) {
    return std::string_view(a.c_str(),b.size()) == b;
  }
  else {
    return std::string_view(b.c_str(),a.size()) == a;
  }
}

With g++ 7 at -O3 this collapses to a single memcmp call, which is a fairly substantial improvement over the older version.

Interbedded answered 27/10, 2011 at 9:53 Comment(6)
Why std::for_each + lambda, instead of the much less noisy ranged for loop?Bimolecular
@R.MartinhoFernandes - removed. I only added that bit to show calling it with a bigger list.Interbedded
This function would report that an empty string contains every other string as its prefix. For a prefix function, it does not make sense to make it symmetric.Martinemartineau
This method is complex and inefficient. It always creates temporary string objects potentially involving heap memory allocation and may throw.Albuquerque
I'd definitely use string_view if I wrote this answer again now.Interbedded
For the updated version, just accept the strings as std::string_view as parameters. The construction inside the function is unnecessary.Perbunan
M
3

Easiest way is to use substr() and compare() member functions:

string str = "Foobar";
string prefix = "Foo";

if(str.substr(0, prefix.size()).compare(prefix) == 0) cout<<"Found!";
Mall answered 3/6, 2013 at 18:2 Comment(2)
The substr operation typically makes a copy of the data, so this isn't as efficient as it could be.Kilimanjaro
if you are going to use substr() you can simply write str.substr(0, prefix.size()) == prefixRosecan
L
1

I think strncmp is the closest to what you're looking for.

Though, if reworded, you may be looking for strstr(s2,s1)==s2, which is not necessarily the most performant way to do that. But you do not want to work out n ;-)

Okay, okay, the c++ version would be !s1.find(s2).

Okay, you can make it even more c++, something like this: std::mismatch(s1.begin(),s1.end(),s2.begin()).first==s1.end().

Lift answered 27/10, 2011 at 9:12 Comment(2)
The question is tagged as C++, not C.Neckband
.c_str() is not that hard to call :)Lift
M
1

str1.find(str2) returns 0 if entire str2 is found at the index 0 of str1:

#include <string>
#include <iostream>

// does str1 have str2 as prefix?
bool StartsWith(const std::string& str1, const std::string& str2)
{   
    return (str1.find(str2)) ? false : true;
}

// is one of the strings prefix of the another?
bool IsOnePrefixOfAnother(const std::string& str1, const std::string& str2)
{   
    return (str1.find(str2) && str2.find(str1)) ? false : true;
}

int main()
{
    std::string str1("String");
    std::string str2("String:");
    std::string str3("OtherString");

    if(StartsWith(str2, str1))
    {
        std::cout << "str2 starts with str1" << std::endl;      
    }
    else
    {
        std::cout << "str2 does not start with str1" << std::endl;
    }

    if(StartsWith(str3, str1))
    {
        std::cout << "str3 starts with str1" << std::endl;      
    }
    else
    {
        std::cout << "str3 does not start with str1" << std::endl;
    }

        if(IsOnePrefixOfAnother(str2, str1))
        {
            std::cout << "one is prefix of another" << std::endl;      
        }
        else
        {
            std::cout << "one is not prefix of another" << std::endl;
        }

        if(IsOnePrefixOfAnother(str3, str1))
        {
            std::cout << "one is prefix of another" << std::endl;      
        }
        else
        {
            std::cout << "one is not prefix of another" << std::endl;
        }

    return 0;
}

Output:

  str2 starts with str1
  str3 does not start with str1
  one is prefix of another
  one is not prefix of another
Molding answered 27/10, 2011 at 9:34 Comment(0)
E
1

What's wrong with the "find" and checking the result for position 0 ?

string a = "String";
string b = "String:";

if(b.find(a) == 0)
{
// Prefix

}
else
{
// No Prefix
}
Ecthyma answered 27/10, 2011 at 10:37 Comment(1)
find searches through the entire string and compare does it better.Infringe
M
1

You can use this:

for c++14 or less

bool has_prefix
    (const std::string& str, const std::string& prefix)  {
    return str.find(prefix, 0) == 0;
}

for c++17

//it's a little faster
auto has_prefix
    (const std::string& str, const std::string_view& prefix) -> decltype(str.find(prefix) == 0) {
    return str.find(prefix, 0) == 0;
}
Mornings answered 24/7, 2018 at 0:45 Comment(2)
Wouldn't this be considerably slower than some other methods if the string is not prefixed and str is longer than prefix? Since the find() method will search for any instance of prefix in str, even if it is not the at offset 0. E.g., checking "bbbbbbba" for prefix "a" would need to search the entire string, find the final "a", then return false because it is not at offset zero, rather than return false after comparing only the first character.Fourflusher
@Fourflusher yes. Using rfind() instead would fix this, as suggested in the accepted answer to the question of which this is a dup: #1878501Acetyl
H
1
bool IsPrefix(const std::string& prefix, const std::string& whole)
{
  return whole.size() >= prefix.size() && whole.compare(0, prefix.size(), prefix) == 0;
}
Hectograph answered 10/9, 2019 at 14:37 Comment(2)
This is a duplicate of a previously-submitted answer and uses a length comparison that was determined to be unnecessary by the comments on that answer.Kilimanjaro
I downvoted in agreement with @NeilMayhew, but on further reflection, I disagree with this downvote (which is now unfortunately locked in). If I'm not mistaken, the initial test is necessary (for performance), and the comments in that answer saying otherwise are wrong. See my reply on that thread.Acetyl

© 2022 - 2024 — McMap. All rights reserved.