Comparing a null-terminated string with a non-null-terminated string in C

3

3

The deserialization library (messagepack) I am using does not provide null-terminated strings. Instead, I get a pointer to the beginning of the string and a length. What is the fastest way to compare this string to a normal null-terminated string?

Policlinic answered 11/3, 2015 at 20:50 Comment(1)
First compare the lengths: if they are unequal, the strings must be unequal. If the lengths are equal you can memcmp() the string bodies.Mishnah
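The approach in this comment could be sketched roughly as follows. The function name equals_len_first and its parameter names are hypothetical, chosen only for illustration: zstr is the null-terminated string, while buf and len describe the counted one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if the null-terminated zstr holds exactly the
 * same bytes as the counted string buf[0..len). */
bool equals_len_first(const char *zstr, const char *buf, size_t len)
{
    /* Different lengths can never be equal; this costs one pass over zstr. */
    if (strlen(zstr) != len)
        return false;

    /* Same length: compare the bodies byte for byte. */
    return memcmp(zstr, buf, len) == 0;
}
```

With const char buf[3] = {'a', 'b', 'c'}, equals_len_first("abc", buf, 3) is true and equals_len_first("abcdef", buf, 3) is false. The strlen() call always walks the whole terminated string, even when the first characters already differ.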
L
4

A fast and simple way is strncmp(), which compares at most length characters and so never reads past the end of the non-terminated string.

 if (strncmp(sa, sb, length)==0)  
    ...

This assumes, however, that length is the maximum length of the two strings. If the null-terminated string could be longer, you also have to compare the lengths.

 if (strncmp(sa, sb, length) == 0 && strlen(sa) <= length) // sa being the null-terminated one
     ...

Note that strlen() is deliberately checked after the comparison, to avoid needlessly iterating through all the characters of the null-terminated string when the first characters don't even match.

A final variant is:

 if (strncmp(sa, sb, length) == 0 && sa[length] == 0) // sa being the null-terminated one
     ...
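Wrapped in a function, the final variant might look like the sketch below. The name equals_strncmp and the parameter names are hypothetical. It assumes the counted string buf contains no embedded null characters; if it did, strncmp() could stop early and the zstr[len] check could read past the end of zstr.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper. zstr is null-terminated; buf holds len characters
 * and no terminator. Assumption: buf has no embedded '\0' characters. */
bool equals_strncmp(const char *zstr, const char *buf, size_t len)
{
    /* strncmp() reads at most len characters, so it stays inside buf.
     * If zstr is shorter than len, its terminator differs from the
     * corresponding character of buf and the comparison fails here. */
    if (strncmp(zstr, buf, len) != 0)
        return false;

    /* The first len characters match, so zstr has at least len characters;
     * it is equal only if it also ends exactly there. */
    return zstr[len] == '\0';
}
```

With const char buf[3] = {'a', 'b', 'c'}, equals_strncmp("abc", buf, 3) is true, while equals_strncmp("abcdef", buf, 3) is false because zstr[3] is 'd' rather than the terminator.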
Lashaunda answered 11/3, 2015 at 20:52 Comment(5)
The problem is that if I have "abc" (no null termination) and "abcdef" (with null termination), then strncmp with n = 3 will report that they are equal. I could do a strlen on "abcdef" first, but that would introduce an additional pass over "abcdef".Policlinic
You don't need an extra pass: if they compare equal with strncmp, just check for a null character in the null-terminated string at the known length. You may still get into trouble if the non-terminated string can contain embedded nulls, though.Curd
@Curd Very clever! Thanks to you and Christophe.Policlinic
It makes too many passes, and is completely wrong in the presence of embedded NULs. BTW: what is length?Mishnah
@wildpasser the string he receives is a real string. It's just not null terminated (certainly the deserialisation library keeps the read data in a buffer and doesn't change it).Lashaunda
R
3

Here is one way:

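#include <stdbool.h>
#include <stddef.h>

/* s1 is null-terminated; s2 has s2_len characters and no terminator. */
bool is_same_string(char const *s1, char const *s2, size_t s2_len)
{
    char const *s2_end = s2 + s2_len;
    for (;;)
    {
        /* Stop when either string ends; equal only if both end together. */
        if ( s1[0] == 0 || s2 == s2_end )
            return s1[0] == 0 && s2 == s2_end;

        if ( *s1++ != *s2++ )
            return false;
    }
}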
Rill answered 11/3, 2015 at 22:29 Comment(0)
M
1
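#include <string.h>

/* Three-way comparison of two counted strings; neither needs a terminator. */
int compare(const char *one, size_t onelen, const char *two, size_t twolen)
{
    int dif;

    /* Compare the common prefix first. */
    dif = memcmp(one, two, onelen < twolen ? onelen : twolen);
    if (dif) return dif;

    /* Same prefix: the shorter string sorts first. */
    if (onelen == twolen) return 0;
    return onelen > twolen ? 1 : -1;
}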

usage:

...
int result;
char einz[4] = "einz"; // not terminated
char *zwei = "einz";   // terminated

result = compare(einz, sizeof einz, zwei, strlen(zwei));

...
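For the mixed case in the question (one counted string, one null-terminated string), the usage above could be wrapped as in this sketch. compare() is repeated here so that the snippet compiles on its own; the wrapper name compare_with_cstr is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Three-way comparison of two counted strings (same logic as above). */
int compare(const char *one, size_t onelen, const char *two, size_t twolen)
{
    int dif = memcmp(one, two, onelen < twolen ? onelen : twolen);
    if (dif) return dif;

    if (onelen == twolen) return 0;
    return onelen > twolen ? 1 : -1;
}

/* Hypothetical wrapper: buf/len is the counted string, zstr is the
 * null-terminated one. strlen() makes one full pass over zstr. */
int compare_with_cstr(const char *buf, size_t len, const char *zstr)
{
    return compare(buf, len, zstr, strlen(zstr));
}
```

With const char einz[4] = {'e', 'i', 'n', 'z'}, compare_with_cstr(einz, sizeof einz, "einz") returns 0, and comparing only the first three characters against "einz" returns a negative value because the shorter string sorts first.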
Mishnah answered 11/3, 2015 at 21:3 Comment(1)
It's a nice alternative. Unfortunately, your usage example returns -1 despite the two strings being equal. Maybe you could change the last statement to return onelen - twolen; . By the way, your usage scenario requires too many passes, as you always have to iterate through the null-terminated string to get its length... ;-)Lashaunda
