Is there any safe strcmp?
Asked Answered
P

8

16

I made a function like this:

bool IsSameString(char* p1, char* p2) 
{
     return 0 == strcmp(p1, p2);
}

The problem is that sometimes, by mistake, arguments are passed which are not strings (meaning that p1 or p2 is not terminated with a null character). Then, strcmp continues comparing until it reaches non-accessible memory and crashes. Is there a safe version of strcmp? Or can I tell whether p1 (and p2) is a string or not in a safe manner?

Pumice answered 26/10, 2009 at 9:2 Comment(5)
I might have mistakingly retagged your question to C. Are you looking for a C or C++ solution?Differentiation
Just remember that even strncmp does not guarantee safety.Gravitate
@Pavel Shved, there's a bool return type, and C doesn't have the bool type, so I think it's C++Dasyure
If this is C++, then the safe solution is to use the string class. (As for the bool type, doesn't C99 introduce that as well)Autotomize
#include I would hope...Boardinghouse
L
22

No, there's no (standard) way to tell whether a char * actually points to valid memory.

In your situation, it is better to use std::string rather than char *s for all your strings, along with the overloaded == operator. If you do this, the compiler would enforce type safety.

EDIT: As per the comments below if you find yourself in a situation where you sometimes pass char *s that may or may not be valid strings to functions that expect null-terminated strings then something is fundamentally wrong with your approach, so basically @janm's answer below.

Loudmouth answered 26/10, 2009 at 9:6 Comment(7)
..and if you use std::string, you don't need to write a function like the one you've written, saving you time and energy.Tachometer
Using std::string still doesn't help if one of the strings is being fed as input from somewhere. At the end of the day, a std::string has to be instantiated off a char*. And if that char* does not point to a null-terminated string, the std::string constructor will either blow up or have undefined behavior. The only true solution is to sanitize strings in some way against a max length (either when constructing std::string instances if we go that route, or when using strncmp instead of strcmp when going the null-terminated char* route.Iolenta
-1 : propsing to use C++ "string" for someone who is using C is not really an answer at all. "The compiler would enforce type safety"... Only as long as someone does not do old C style casts (for example "std::string xxx = (char *)whatever;" ). What you might propose is to use something more complex than "char *" to represent strings. For example "struct mystr { uint32_t magicnum; ...}" where "magicnum" has to have a specific value; using "std:string" does not help here.Multimillionaire
@IngoBlackman The question is tagged C++. Also, the function returns bool, so a C++ assumption was fair. Also, this answer is more than 3 years old.Loudmouth
@IngoBlackman I didn't mean std::string xxx=(char *)whatever; - the OP should use std::strings to for all his strings. Then he can take advantage of the facilities the compiler provides, like type safety and an overloaded == operator (which would eliminate his need for a safe strcmp function). But you've got a point in that if the OP has situations where his char*s may sometimes point to strings and other times to invalid data then there is something wrong on a more fundamental level.Loudmouth
@Wensey: "The OP should use std::string for all his strings." This is not possible. Since "std::string" is NOT part of the C++ language, but part of the STL. What I mean by that: The only way to create "std::string" objects is either out of another "std::string" object or out of a "const char *" pointer. So at some point in your program you are basically forced to go back to "const char *" to create "std::string" objects; and that's exactly where you might get bugs, because you always can use a old C style cast to force anything to be type-casted to "const char *".Multimillionaire
@IngoBlackman I hear what you're saying, but your comments aren't relevant to the original question and answer any more. I've edited my answer slightly.Loudmouth
J
18

In some cases std::strncmp can solve your problem:

int strncmp ( const char * str1, const char * str2, size_t num ); 

It compares up to num characters of the C string str1 to those of the C string str2.


Also, take a look, what the US DHS National Cyber Security Division recommends on this matter:

Ensure that strings are null terminated before passing into strcmp. This can be enforced by always placing a \0 in the last allocated byte of the buffer.

char str1[] ="something";
char str2[] = "another thing";
/* In this case we know strings are null terminated. Pretend we don't. */
str1[sizeof(str1)-1] = '\0';
str2[sizeof(str2)-1] = '\0';
/* Now the following is safe. */
if (strcmp(str1, str2)) { /* do something */ } else { /* do something else */ }
Jed answered 26/10, 2009 at 9:5 Comment(2)
If the OS is Linux, it might de useful to know that strncmp is included into LSB: dev.linuxfoundation.org/navigator/browse/…Differentiation
it's still not safe if one of the strings is shorter than num.Mccutcheon
P
14

If you are passing strings to strcmp() that are not null terminated you have already lost. The fact that you have a string that is not null terminated (but should be) indicates that you have deeper issues in your code. You cannot change strcmp() to safely deal with this problem.

You should be writing your code so that can never happen. Start by using the string class. At the boundaries where you take data into your code you need to make sure you deal with the exceptional cases; if you get too much data you need to Do The Right Thing. That does not involve running off the end of your buffer. If you must perform I/O into a C style buffer, use functions where you specify the length of the buffer and detect and deal with cases where the buffer is not large enough at that point.

Pinkard answered 26/10, 2009 at 11:19 Comment(1)
Yes, what janm said. When you find a bug, fix the bug. Don't just try to hide it by complicating other code in a futile attempt to make the bug's symptoms less severe. In this case, there is a bug in the code that calls IsSameString().Fold
G
7

There's no cure for this that is portable. The convention states that there's an extra character holding a null character that belongs to the same correctly allocated block of memory as the string itself. Either this convention is followed and everything's fine or undefined behaviour occurs.

If you know the length of the string you compare against you can use strncmp() but his will not help if the string passed to your code is actually shorter than the string you compare against.

Galatia answered 26/10, 2009 at 9:4 Comment(0)
A
3

you can use strncmp, But if possible use std::string to avoid many problems :)

Anywise answered 26/10, 2009 at 9:7 Comment(1)
std::string can still be vulnerable if an instance is constructed off a char* that is not null terminated. At some point, for true or near-true secure code, a max-length has to be enforced according to the context (be it when using strncmp, or during instantiation of std::string.)Iolenta
C
1

You can put an upper limit on the number of characters to be compared using the strncmp function.

Cymograph answered 26/10, 2009 at 9:6 Comment(1)
But it doesn't make sense. A sane strcmp(s1, s2) will not look further in s1 than there are characters in s2, and at the same time, you have to compare the full length of the shortest of them.. so there is no way that strncmp can help.Minesweeper
I
-1

There is no best answer to this as you can't verify the char* is a string. The only solution is to create a type and use it for string for example str::string or create your own if you want something lighter. ie

struct MyString
  {
  MyString() : str(0), len(0) {}
  MyString( char* x ) { len = strlen(x); str = strdup(x); }
  ⁓MyString() { if(str) free(str); }
  char* str;
  size_t len;
  };

bool IsSameString(MyString& p1, MyString& p2) 
  {
  return 0 == strcmp(p1.str, p2.str);
  }


MyString str1("test");
MyString str2("test");

if( IsSameString( str1, str2 ) {}
Itacolumite answered 26/10, 2009 at 11:8 Comment(9)
This just changes the point of failure; if the memory passed in the constructor is not null terminated there will still be a failure.Pinkard
"Want something lighter" seems to be the downfall. std::string will do everything your class does, at no extra cost. Light != less functionality.Hearne
janm - I think that is a problem that you will not be able to solve without only using allowing static strings ie "" to be used and making all other uses of char* compile time errors.Itacolumite
GMan - I am not sure but I would expect that std::string will increase the size of your exec by quite a bit. It all depends on what you mean by lighter etc.Itacolumite
You'll be linking to the standard library anyway, if that's what you mean. I can't see saving a little space worth trying to correctly re-write an existing class. As it stands, your string class is buggy.Hearne
There are problems with your class unrelated to the actual problem being solved (eg. copyable, possible multiple frees on the same pointer, ...), but the issue related to the actual question is that the program will still crash. What happens when a MyString is constructed like "MyString s(broken_pointer)" or "MyString x = 0"? Answer: The process fails as described by the original poster. So: The actual problem is not the question as asked, it is that the code is broken somewhere else. Calling strcmp() on terminated strings is not bad, and an application can ensure strings are terminated.Pinkard
janm & gman - I have no intention of writing a MyString class just to prove I know what the bugs are with the one above. The strcmp should use the len for example. As for MyString s(broken_pointer). This should probably be a const char* with an explicit ie only allow "", but that does not stop someone putting a broken string in. You can't fix that in C or C++. The zero term str is an idiom, you follow the idiom or break it and get what you deserve.Itacolumite
The constructor of this struct is vulnerable to a non-null terminated char*. As vulnerable as strcmp, std::string, and even strncmp (when you pass a sufficiently large max, but one of the input strings is not null terminated.) So that which you call "the only solution" is not a solution at all either. It is equally vulnerable.Iolenta
@Iolenta - You have a solution to a generic char* string not having null terminator? As far as I know there is none. The only solution I know is to move your code completely was way from using bare cstr and only always using a managed type like std::string or your own MyString. Once your static string are in a string class then you can /hope/ that they never happen but even then it is a very large assumption.Itacolumite
I
-2

You dont write, what platform you are using. Windows has the following functions:

IsBadStringPtr might be what you are looking for, if you are using windows.

Invaluable answered 26/10, 2009 at 10:57 Comment(3)
The IsBadXXXPtr functions are generally not a good thing to use -- depending on where the pointer points, they can induce random crashes in other parts of the program, and it's much, much easier to debug a crash near its cause, instead of half an hour and sixteen source files away. See blogs.msdn.com/oldnewthing/archive/2006/09/27/773741.aspx for details.Jacy
Not only that, it masks other problems. The pointer might be valid from the operating system's point of view, but the memory could be something else (like an address on the stack ...)Pinkard
Oh, thanks for these comments. I had some similar experience, but counted that to homebrewn problems.Invaluable

© 2022 - 2024 — McMap. All rights reserved.