How to check if string ends with .txt
N

14

28

I am learning basic C++. I have read a string from the user and want to check whether they typed the entire file name (including .txt) or not. I have the string, but how can I check if it ends with ".txt"?

string fileName;

cout << "Enter filename: \n";
cin >> fileName;

string txt = fileName.Right(4);

The Right(int) method only works with CString, so the above code does not work. I want to use a regular string, if possible. Any ideas?

Nanananak answered 7/12, 2013 at 20:25 Comment(2)
I suspect your question has more to do with filenames than strings, so I would suggest you look for solutions that extract the filename's extension in a portable way. Many libraries provide support for that.Teel
possible duplicate of Find if string endswith another string in C++Footsie
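For readers who just want something like CString's Right(int) on a regular std::string, here is a minimal sketch built on substr. The names right and ends_with_txt are made up for this illustration and are not part of the standard library; the helper returns the whole string when it is shorter than the requested count, which is an assumption about the desired behaviour.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of std::string): returns the last n characters,
// or the whole string when it is shorter than n.
std::string right(const std::string& s, std::size_t n) {
    if (n >= s.size())
        return s;
    return s.substr(s.size() - n);
}

// Hypothetical convenience wrapper for the ".txt" case from the question.
bool ends_with_txt(const std::string& fileName) {
    return right(fileName, 4) == ".txt";
}
```

Checking n against s.size() before subtracting avoids the unsigned wraparound that a bare s.size() - n would cause for short inputs.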
A
47

Unfortunately this useful function is not in the standard library before C++20. It is easy to write.

bool has_suffix(const std::string &str, const std::string &suffix)
{
    return str.size() >= suffix.size() &&
           str.compare(str.size() - suffix.size(), suffix.size(), suffix) == 0;
}
Artois answered 7/12, 2013 at 20:29 Comment(4)
Thanks. But now I wonder why you turned the passed parameter into a constant? bool has_suffix(std::string str, std::string suffix){} works also.Nanananak
@Chronicle: This is more of a general question about "why use const". The const qualifier is a promise, enforced by the compiler, that the code in the function will not modify the string. Since the function does not modify the string, you might as well make that promise -- and then the function can be passed const strings without having to copy them first. If you omit the const, you might have to copy your string before you can pass it to the function.Artois
@Chronicle: A quick search turned up good material, with an answer by Lightness Races in Orbit: #18158023Artois
@Nanananak Always use const& when you don't change the arguments (read-only access). Even if you don't understand why... just do it. It's good to understand why, nevertheless.Torgerson
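To make the const discussion in the comments concrete, here is a small sketch. has_suffix is the function from the answer; has_suffix_mutable and demo are hypothetical names added only to show which calls compile.

```cpp
#include <string>

// Takes both arguments by const reference: no copies, and callers may pass
// const strings, temporaries, or string literals (converted to std::string).
bool has_suffix(const std::string& str, const std::string& suffix) {
    return str.size() >= suffix.size() &&
           str.compare(str.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical variant for comparison: a non-const reference only binds to a
// modifiable std::string lvalue.
bool has_suffix_mutable(std::string& str, std::string& suffix) {
    return has_suffix(str, suffix);
}

// Hypothetical caller showing which calls compile.
bool demo() {
    const std::string name = "notes.txt";
    bool ok = has_suffix(name, ".txt");     // fine: const object and a literal
    // has_suffix_mutable(name, ".txt");    // would not compile: name is const, ".txt" is a temporary
    std::string a = "notes.txt", b = ".txt";
    return ok && has_suffix_mutable(a, b);  // fine: both are modifiable lvalues
}
```

The by-value signature from the first comment also accepts all of these calls, but it copies both strings on every call.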
A
14

Using boost ends_with predicate:

#include <boost/algorithm/string/predicate.hpp>

if (boost::ends_with(fileName, ".txt")) { /* ... */ }
Abc answered 7/12, 2013 at 20:28 Comment(1)
This is my favorite. Can use boost::iends_with(...) for case insensitive comparison.Devil
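If Boost is not available, a rough stand-in for the case-insensitive check can be sketched with the standard library alone. The name iends_with_sketch is hypothetical, and the sketch assumes single-byte text (such as ASCII), since it lowercases one char at a time; it is not a drop-in replacement for boost::iends_with.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical name; compares the suffix against the tail of str, ignoring
// case via std::tolower. Only meaningful for single-byte (e.g. ASCII) text.
bool iends_with_sketch(const std::string& str, const std::string& suffix) {
    if (suffix.size() > str.size())
        return false;
    // Walk both strings from the back; the length check above guarantees
    // that str has at least suffix.size() characters to compare.
    return std::equal(suffix.rbegin(), suffix.rend(), str.rbegin(),
                      [](unsigned char a, unsigned char b) {
                          return std::tolower(a) == std::tolower(b);
                      });
}
```

With this sketch, "NAME.TXT" and "name.txt" are both accepted for the suffix ".txt".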
P
12

You've gotten quite a few answers already, but I decided to add yet another:

bool ends_with(std::string const &a, std::string const &b) {
    auto len = b.length();
    auto pos = a.length() - len;
    if (pos < 0)
        return false;
    auto pos_a = &a[pos];
    auto pos_b = &b[0];
    while (*pos_a)
        if (*pos_a++ != *pos_b++)
            return false;
    return true;
}

Since you have gotten quite a few answers, perhaps a quick test and summary of results would be worthwhile:

#include <iostream>
#include <string>
#include <utility>
#include <vector>
#include <time.h>
#include <iomanip>

bool ends_with(std::string const &a, std::string const &b) {
    auto len = b.length();
    auto pos = a.length() - len;
    if (pos < 0)
        return false;
    auto pos_a = &a[pos];
    auto pos_b = &b[0];
    while (*pos_a)
        if (*pos_a++ != *pos_b++)
            return false;
    return true;
}

bool ends_with_string(std::string const& str, std::string const& what) {
    return what.size() <= str.size()
        && str.find(what, str.size() - what.size()) != str.npos;
}

bool has_suffix(const std::string &str, const std::string &suffix)
{
    return str.size() >= suffix.size() &&
        str.compare(str.size() - suffix.size(), suffix.size(), suffix) == 0;
}

bool has_suffix2(const std::string &str, const std::string &suffix)
{
    bool index = str.find(suffix, str.size() - suffix.size());
    return (index != -1);
}

bool isEndsWith(const std::string& pstr, const std::string& substr)
{
    int tlen = pstr.length();
    int slen = substr.length();

    if (slen > tlen)
        return false;

    const char* tdta = pstr.c_str();
    const char* sdta = substr.c_str();

    while (slen)
    {
        if (tdta[tlen] != sdta[slen])
            return false;

        --slen; --tlen;
    }
    return true;
}

bool ends_with_6502(const std::string& str, const std::string& end) {
    size_t slen = str.size(), elen = end.size();
    if (slen <= elen) return false;
    while (elen) {
        if (str[--slen] != end[--elen]) return false;
    }
    return true;
}

bool ends_with_rajenpandit(std::string const &file, std::string const &suffix) {
    int pos = file.find(suffix);
    return (pos != std::string::npos);
}

template <class F>
bool test(std::string const &label, F f) {
    static const std::vector<std::pair<std::string, bool>> tests{
        { "this is some text", false },
        { "name.txt.other", false },
        { "name.txt", true }
    };
    bool result = true;

    std::cout << "Testing: " << std::left << std::setw(20) << label;
    for (auto const &s : tests)
        result &= (f(s.first, ".txt") == s.second);
    if (!result) {
        std::cout << "Failed\n";
        return false;
    }
    clock_t start = clock();
    for (int i = 0; i < 10000000; i++)
        for (auto const &s : tests)
            result &= (f(s.first, ".txt") == s.second);
    clock_t stop = clock();
    std::cout << double(stop - start) / CLOCKS_PER_SEC << " Seconds\n";
    return result;
}

int main() {
    test("Jerry Coffin", ends_with);
    test("Dietrich Epp", has_suffix);
    test("Dietmar", ends_with_string);
    test("Roman", isEndsWith);
    test("6502", ends_with_6502);
    test("rajenpandit", ends_with_rajenpandit);
}

Results with gcc:

Testing: Jerry Coffin           3.416 Seconds
Testing: Dietrich Epp           3.461 Seconds
Testing: Dietmar                3.695 Seconds
Testing: Roman                  3.333 Seconds
Testing: 6502                   3.304 Seconds
Testing: rajenpandit            Failed

Results with VC++:

Testing: Jerry Coffin           0.718 Seconds
Testing: Dietrich Epp           0.982 Seconds
Testing: Dietmar                1.087 Seconds
Testing: Roman                  0.883 Seconds
Testing: 6502                   0.927 Seconds
Testing: rajenpandit            Failed

Yes, those were run on identical hardware, and yes I ran them a number of times, and tried different optimization options with g++ to see if I could get it to at least come sort of close to matching VC++. I couldn't. I don't have an immediate explanation of why g++ produces so much worse code for this test, but I'm fairly confident that it does.

Polish answered 7/12, 2013 at 22:11 Comment(3)
do me, do me, do me... :P that aside g++ has crappy string implementation last time I read things on the internet ;) aka no SSO herbsutter.com/2013/05/13/gotw-2-solution-temporary-objectsWylma
@NoSenseEtAl: These results would certainly seem to support that.Polish
'pos < 0' will always be false. length() returns an unsigned type. Just explicitly compare the std-string lengths, as they should be O(1) https://mcmap.net/q/102725/-is-std-string-size-a-o-1-operationSuber
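The point in the last comment can be shown in isolation. In this sketch, suffix_offset and long_enough are hypothetical helpers: the first reproduces the subtraction from the answer, the second is the explicit length comparison the comment recommends.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// length() returns std::string::size_type, which is unsigned, so a variable
// deduced from it with auto can never be negative.
static_assert(std::is_unsigned<std::string::size_type>::value,
              "std::string::size_type is an unsigned type");

// Hypothetical helper: reproduces the subtraction from the answer.
std::size_t suffix_offset(const std::string& a, const std::string& b) {
    return a.length() - b.length();  // wraps around when b is longer than a
}

// Hypothetical helper: the explicit length comparison suggested in the comment.
bool long_enough(const std::string& a, const std::string& b) {
    return a.length() >= b.length();
}
```

For a = "txt" and b = ".txt", suffix_offset yields the largest std::size_t value rather than -1, so a pos < 0 test never fires, while long_enough correctly reports false.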
U
4

Use std::string::substr

if (filename.substr(std::max<std::size_t>(4, filename.size()) - 4) == ".txt") {
    // Your code here
}
Unseam answered 7/12, 2013 at 20:30 Comment(20)
This may cause problems if filename.size() < 4.Artois
This will still cause problems if filename.size() < 4. The std::max() function won't work as you expect here.Artois
Hint: std::max(0, filename.size() - 4) == filename.size() - 4 always.Artois
This will still cause problems. How can you ensure (long)filename.size() does not overflow?Lamp
It's really quite amazing how many pitfalls there are in C and C++ with an operation as simple as integer subtraction.Artois
@Lamp because it wouldn't fit into memoryUnseam
size_t is also often as big as longBiparous
On Windows 64-bit, size_t is 64-bit and long is 32-bit. So it can overflow there.Artois
@DietrichEpp: what is amazing is how C++ design mistake of choosing an unsigned type for size is inconvenient and still accepted by zealots as the right thing.Biparous
@Biparous A size cannot be negative so yes, an unsigned type makes sense. Is it that hard to write (filename.size() <= 4) ? 0 : (filename.size() - 4) ? Or even std::max(4, filename.size()) - 4 ?Lamp
@6502: I don't think using an unsigned type is wrong per se, but using unsigned types is dangerous in a language which has tricky conversion rules and no checked arithmetic for unsigned types.Artois
@DietrichEpp: using unsigned for size is a design mistake because of unsigned semantics. Clinging to the name unsigned and giving it meanings it doesn't have is nonsense. An unsigned in C++ is NOT a non-negative integer. If it was then the difference of two unsigned would give a signed integer as result.Biparous
@syam: the only reason sizes are unsigned is historical and dates back to 16-bit systems. Even then it was in my opinion a bad idea (if 32k wasn't going to be enough then quite soon 64k wasn't going to suffice either), but you cannot change history to fix that.Biparous
@DietrichEpp It looks like that Windows64 is the only common 64bit platform, that has 32bit long (source)Unseam
@Biparous The real issue is, if you have a 32 bits system then a 32 bits unsigned integer can represent the whole memory (assuming we are counting chars) while a 32 bits signed integer can only represent half of it and thus might not be able to represent all you need. Since a negative container size doesn't make sense anyway, it's logical to place the burden on the programmer (and it's not that big a burden anyway). IMHO correctness comes before convenience.Lamp
@syam: tell that to Stroustrup. I agree with him that it was a mistake https://mcmap.net/q/75098/-why-is-size_t-unsigned/320726Biparous
@Biparous Let's agree to disagree then. Stroustrup or not, I stand by my assertion that if you voluntarily use a type that cannot represent your whole operational range, that's a design bug. It's pretty much the same as choosing to use a 16 bits integer as a size_t on a 32 bits system, it's just a difference of scale but the principle is the same. YMMV though. Or are you advocating to use 64 bits size_t on 32 bits systems (and similarly 128 bits size_t on 64 bits systems) just for the convenience of signedness?Lamp
@syam: size_t is not needed for representing all of memory, just the size of the largest object you can allocate. As long as malloc() fails for >=2 GiB allocations, signed 32-bit is large enough.Artois
Sutter also agrees it was a mistake... and afaik google discourages unsigned ints in coding standard... You should not use the unsigned integer types such as uint32_t, unless there is a valid reason such as representing a bit pattern rather than a number, or you need defined overflow modulo 2^N. In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this.Wylma
What about something like: auto suffix = string(".txt"); auto min_length = suffix.size(); if (f.size() > min_length && f.substr(f.size() - min_length) == suffix) { ... }Tsimshian
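Pulling the thread together, a substr-based check can sidestep the subtraction pitfalls by comparing lengths first. This is only a sketch, and ends_with_substr is a hypothetical name. It uses >= rather than the > in the last comment, so a name that is exactly ".txt" still counts as ending with ".txt".

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: substr-based suffix test that never subtracts past zero.
bool ends_with_substr(const std::string& f, const std::string& suffix) {
    const std::size_t n = suffix.size();
    // Check the length first; only then is f.size() - n safe to compute.
    return f.size() >= n && f.substr(f.size() - n) == suffix;
}
```

Because && short-circuits, the subtraction is never evaluated for names shorter than the suffix. Note that substr allocates a temporary string, which the compare-based answers avoid.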
Z
3
bool has_suffix(const std::string &str, const std::string &suffix)
{
    std::size_t index = str.find(suffix, str.size() - suffix.size());
    return (index != std::string::npos);
}
Zoroastrian answered 7/12, 2013 at 20:32 Comment(1)
This is broken when str.size() < suffix.size().Koala
P
3

You can just use another string to verify the extension, like this:

string fileName;

cout << "Enter filename: \n";
cin >> fileName;

//string txt = fileName.Right(4);
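string ext="";
// Walk backwards over at most the last four characters. The length is
// converted to int first so the arithmetic stays signed for short names.
int len = (int)fileName.length();
for(int i = len-1; i >= 0 && i > len-5; i--)
{
    ext += fileName[i];
}
cout<<ext;
if(ext != "txt.")
    cout<<"error\n";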

The comparison is against "txt." because i starts at the end of the file name, so ext is filled in reverse order.

Prosy answered 7/12, 2013 at 20:34 Comment(0)
G
2

The easiest approach is probably to verify that the string is long enough to hold ".txt" at all and to see if the string can be found at the position size() - 4, e.g.:

bool ends_with_string(std::string const& str, std::string const& what) {
    return what.size() <= str.size()
        && str.find(what, str.size() - what.size()) != str.npos;
}
Goglet answered 7/12, 2013 at 20:30 Comment(1)
I'm not sure if this is an example one would show in a book pushing C++. Not your fault, of course ...Biparous
B
2

This is something that, unfortunately enough, is not present in the standard library (before C++20) and it's also somewhat annoying to write. This is my attempt:

bool ends_with(const std::string& str, const std::string& end) {
    size_t slen = str.size(), elen = end.size();
    if (slen < elen) return false;
    while (elen) {
        if (str[--slen] != end[--elen]) return false;
    }
    return true;
}
Biparous answered 7/12, 2013 at 21:1 Comment(0)
W
2

Two options I can think of besides the ones already mentioned:
1) regex - probably overkill for this, but simple regexes are nice and readable IMHO
2) rbegin - kind of nice; I could be missing something, but here it is:

bool ends_with(const string& s, const string& ending)
{
    return (s.size()>=ending.size()) && equal(ending.rbegin(), ending.rend(), s.rbegin());
}

http://coliru.stacked-crooked.com/a/4de3eafed3bff6e3

Wylma answered 8/12, 2013 at 1:26 Comment(1)
Cleanest solution so farPenalize
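The regex option (1) is mentioned but not shown, so here is a minimal sketch of what it might look like. The name ends_with_txt_regex is hypothetical, and the sketch assumes the file name contains no embedded newlines.

```cpp
#include <regex>
#include <string>

// Hypothetical helper for option 1: true when name ends with a literal ".txt".
// The backslash escapes the dot and $ anchors the match at the end of the input.
bool ends_with_txt_regex(const std::string& name) {
    // Built once and reused; constructing a std::regex is comparatively costly.
    static const std::regex pattern(R"(\.txt$)");
    return std::regex_search(name, pattern);
}
```

Forgetting to escape the dot would make the pattern match any character before "txt", so "nametxt" would be accepted by mistake.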
W
1

This should do it.

bool ends_with(const std::string & s, const std::string & suffix) {
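     // The length check keeps the unsigned subtraction from wrapping around to npos,
     // which would otherwise match a failed rfind (e.g. s = "txt", suffix = ".txt").
     return s.length() >= suffix.length() &&
            s.rfind(suffix) == s.length() - suffix.length();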
}


Willner answered 27/12, 2019 at 19:54 Comment(0)
B
1

Since C++17 you can utilize the path class from the filesystem library.

#include <filesystem>

bool ends_with_txt(const std::string& fileName) {
    return std::filesystem::path{fileName}.extension() == ".txt";
}
Berard answered 3/8, 2021 at 15:8 Comment(0)
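One thing worth knowing about this approach is that it inspects the extension of the last path component, which is not quite the same as a plain suffix test. The sketch below uses a hypothetical helper, extension_of, to show a few cases; it assumes a C++17 compiler with std::filesystem support.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper: returns the extension as std::filesystem::path sees it.
std::string extension_of(const std::string& fileName) {
    return std::filesystem::path{fileName}.extension().string();
}
```

With this helper, "name.txt" gives ".txt" and "name.txt.other" gives ".other", but a file named just ".txt" gives an empty string, because a leading dot in the filename is not treated as the start of an extension.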
K
1

C++ string and string-view types have member functions called ends_with() for this purpose. They were introduced in C++20.

Example:

bool has_dottxt_suffix(std::string_view name) noexcept
{
    return name.ends_with(".txt");
}
Koala answered 9/6 at 10:18 Comment(0)
A
0

Here is the "fully self-written" solution:

bool isEndsWith(const std::string& pstr, const std::string& substr)
{
    int tlen = pstr.length();
    int slen = substr.length();

    if(slen > tlen)
        return false;

    const char* tdta = pstr.c_str();
    const char* sdta = substr.c_str();

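    while(slen)
    {
        // Decrement before indexing so the last characters are compared first
        // and index 0 of the suffix is not skipped.
        if(tdta[--tlen] != sdta[--slen])
            return false;
    }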
    return true;
}
Abbess answered 7/12, 2013 at 20:45 Comment(5)
+1, but why using the pointers instead of just using pstr[--tlen] != substr[--slen] in the test?Biparous
@Biparous well, I think Dietrich's solution is great, but I wrote this when I was in my first year and still use it in my engine to this day :)Abbess
Everyone has bad code lying around that was written when younger. But why cling to it (unless you're forced to)?Biparous
@6502, well I think my solution's not bad enough to optimize it even right now :)Abbess
propagating bad programming style: 1 - length() is unsigned. 2 - getting const char * is completely unnecessary.Penalize
E
-1

I implemented:

bool ends_with(std::string const &a, std::string const &b) {
    auto len = b.length();
    auto pos = a.length() - len;
    if (pos < 0)
        return false;
    auto pos_a = &a[pos];
    auto pos_b = &b[0];
    while (*pos_a)
        if (*pos_a++ != *pos_b++)
            return false;
    return true;
}

But I ran into an issue when string b was longer than string a. Because auto deduces an unsigned type for len and pos, pos never became negative; it wrapped around to an extremely large positive value instead. After changing auto to int for those two, the same test case worked.

    bool ends_with(std::string const &a, std::string const &b) {
        int len = b.length();
        int pos = a.length() - len;
        if (pos < 0)
            return false;
        auto pos_a = &a[pos];
        auto pos_b = &b[0];
        while (*pos_a)
            if (*pos_a++ != *pos_b++)
                return false;
        return true;
    }
Eslinger answered 6/6 at 20:13 Comment(1)
That's crazy - trying to cram a std::size_t length into int will come unstuck quite easily. Just use their natural types and test a.length() < b.length() (remember that std::string::length() is very cheap, and any decent compiler will unify multiple calls for the same string).Koala
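A sketch of what the comment suggests: keep the natural std::size_t type and compare the lengths before subtracting. The name ends_with_checked is hypothetical, and the index-based loop is one possible way to write the comparison, not the original author's code.

```cpp
#include <cstddef>
#include <string>

// Sketch of the fix suggested in the comment: keep std::size_t and compare
// the lengths before subtracting, instead of converting to int.
bool ends_with_checked(std::string const& a, std::string const& b) {
    if (a.length() < b.length())
        return false;
    std::size_t pos = a.length() - b.length();  // cannot wrap: a is at least as long as b
    for (std::size_t i = 0; i < b.length(); ++i)
        if (a[pos + i] != b[i])
            return false;
    return true;
}
```

Bounding the loop by b.length() instead of looking for a terminating null also means strings with embedded null characters are compared in full.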
