Is the result of strncmp(s1, s2, 0) undefined behavior (i.e. when the last argument is zero)?

It is not immediately clear from the standard what strncmp (from string.h)

int strncmp(const char *s1, const char *s2, size_t n);

should return if its 3rd argument n is 0.

According to the C17 standard draft, 7.24.4.4:

The strncmp function compares not more than n characters (characters that follow a null character are not compared) [...].

The strncmp function returns an integer greater than, equal to, or less than zero, accordingly as the [...] array pointed to by s1 is greater than, equal to, or less than the [...] array pointed to by s2.

What should strncmp(s1, s2, 0) return? Or is the standard silent on the case of strncmp's last argument being 0?

My intuition tells me that 0 as the return value would make the most sense:

  • 0 is the most "symmetric" answer (a negative or positive return value implies an asymmetry and is inconsistent with no comparisons having been undertaken).
  • 0 is consistent with a model where 0 is assumed until a difference has been found, n comparisons have been undertaken, or the end of the strings has been reached.

But the above reasoning is philosophical.

It seems that the standard doesn't technically proclaim anything about this case. I think it would be better if it

  • explicitly defined a result (such as 0) or
  • outlawed it.

For what it's worth, glibc gives me 0 (and no warnings or errors) for a bunch of simple test cases such as strncmp("abc", "def", 0) with the following compiler flags:

-Wall -Wextra -std=c90 -pedantic
-Wall -Wextra -std=c17 -pedantic
Peraza answered 3/4, 2023 at 13:36 Comment(0)

From the C11 Standard (7.24.1 String function conventions):

2 Where an argument declared as size_t n specifies the length of the array for a function, n can have the value zero on a call to that function. Unless explicitly stated otherwise in the description of a particular function in this subclause, pointer arguments on such a call shall still have valid values, as described in 7.1.4. On such a call, a function that locates a character finds no occurrence, a function that compares two character sequences returns zero, and a function that copies characters copies zero characters.

The same wording appears in the later C standards, including C23.

It is logically consistent. The parameter n specifies the size of a range, so when n is 0 the range is empty. Neither of two empty ranges can be greater or less than the other; both are empty and hence compare equal.

Delafuente answered 3/4, 2023 at 13:42 Comment(2)
"specifies the length of the array for a function" -- how embarrassing that I didn't locate this wording (though I did look through the document a bit). However, I have to say that I don't like the wording: (1) there is more than one array involved ("the" implies contextual uniqueness), and (2) the wording is vague. - Peraza
@LoverofStructure FYI: there is "7.1.4 Use of library functions", which is useful. - Minervamines

As a side note, glibc's source code implements strncmp() as follows:

int
STRNCMP (const char *s1, const char *s2, size_t n)
{
  unsigned char c1 = '\0';
  unsigned char c2 = '\0';

  if (n >= 4)
    {
      size_t n4 = n >> 2;
      do
        {
          c1 = (unsigned char) *s1++;
          c2 = (unsigned char) *s2++;
          if (c1 == '\0' || c1 != c2)
            return c1 - c2;
          c1 = (unsigned char) *s1++;
          c2 = (unsigned char) *s2++;
          if (c1 == '\0' || c1 != c2)
            return c1 - c2;
          c1 = (unsigned char) *s1++;
          c2 = (unsigned char) *s2++;
          if (c1 == '\0' || c1 != c2)
            return c1 - c2;
          c1 = (unsigned char) *s1++;
          c2 = (unsigned char) *s2++;
          if (c1 == '\0' || c1 != c2)
            return c1 - c2;
        } while (--n4 > 0);
      n &= 3;
    }

  while (n > 0)
    {
      c1 = (unsigned char) *s1++;
      c2 = (unsigned char) *s2++;
      if (c1 == '\0' || c1 != c2)
        return c1 - c2;
      n--;
    }

  return c1 - c2;
}

The above code returns c1 - c2. If the parameter n is 0, neither loop executes, so c1 and c2 keep their initial value '\0' and the function returns '\0' - '\0' = 0.

Jacobina answered 3/4, 2023 at 14:49 Comment(2)
This is a language-lawyer question, meaning the answer should be determined by the language of the standard. What a particular implementation does is not evidence of what the standard requires, except to the extent it may cast light on ambiguity or interpretation. Implementations often extend the behavior required by the C standard, so the fact that an implementation provides certain behavior does not indicate that the standard requires it. - Biddle
@EricPostpischil I totally agree with you. That is why I began my answer with "as a side note": to provide an implementation example. By the way, this confirms the results the OP obtained when testing with glibc (last part of the question). - Jacobina
