Creating my own strcmp () function in C
Asked Answered
S

6

1

I was assigned by my teacher to write my own strcmp() function in C. I did create my own version of said function, and I was hoping to get some feedback.

int CompareTwoStrings ( char *StringOne, char *StringTwo ) {
    // Evaluates if both strings have the same length.
    if  ( strlen ( StringOne ) != strlen ( StringTwo ) ) {
        // Given that the strings have an unequal length, it compares between both
        // lengths.
        if  ( strlen ( StringOne ) < strlen ( StringTwo ) ) {
            return ( StringOneIsLesser );
        }
        if  ( strlen ( StringOne ) > strlen ( StringTwo ) ) {
            return ( StringOneIsGreater );
        }
    }
    int i;
    // Since both strings are equal in length...
    for ( i = 0; i < strlen ( StringOne ); i++ ) {
        // It goes comparing letter per letter.
        if  ( StringOne [ i ] != StringTwo [ i ] ) {
            if  ( StringOne [ i ] < StringTwo [ i ] ) {
                return ( StringOneIsLesser );
            }
            if  ( StringOne [ i ] > StringTwo [ i ] ) {
                return ( StringOneIsGreater );
            }
        }
    }
    // If it ever reaches this part, it means they are equal.
    return ( StringsAreEqual );
}

StringOneIsLesser, StringOneIsGreater, StringsAreEqual are defined as const int with the respective values: -1, +1, 0.

Thing is, I'm not exactly sure if, for example, my StringOne has a lesser length than my StringTwo, that automatically means StringTwo is greater, because I don't know how strcmp() is particularly implemented. I need some of your feedback for that.

Squall answered 9/4, 2014 at 20:52 Comment(11)
So what is the question?Hierocracy
To get some feedback from the code I posted. I thought I made that clear.Squall
This is not the best site for general feedback. CodeReview.stackexchange.com is good for that. Given that, the question I answered was "If StringOne has a lesser length than my StringTwo, [does] that automatically mean StringTwo is greater?"Henotheism
Oh, sorry, I didn't know that. Also, sorry for the grammar problems, english is not my native language (I'm from Chile).Squall
Aren't you overcomplicating the problem quite a bit? A standard strcmp implementation body can easily be written in 3 lines... move the two pointers forward while they point to identical characters (being careful to stop at the string terminators), and then just compare the characters you reached, which will be the first differing.Babbage
As I said, i was trying to avoid the use of pointers to compare because I still don't feel confortable using pointers. Perhaps I should have made that clear since the beginning.Squall
This scans both strings to full length, possibly multiple times, whereas an orthodox strcmp() typically only has to look at the first character to find a difference. That's a big performance hit if the strings are many kilobytes in length.Cube
Just a tip: dont use strlen(string1) inside the for loop. It is very bad for the speed. Use an aux variable such as: int size = strlen(string1) and then use that variable on the loop. (for(i=0;i<size);i++)Mendymene
Excuse me if I misunderstand you @jofra, but what you are implying is that if I use 'strlen (string)' inside the for loop, the code MUST calculate a new value for 'strlen (string)' after every increment of 'i', even though it remains the same value? Because if that's the reason I totally get why it slows down the execution of the code.Squall
Yes. It calculates it every loop and so it is quadratic (O(N^2)) instead of being linear O(N). In other words it runs much slower (really allot when n, the number of characters of the string, is big), if you have it calculating strlen(string) every time it loops.Mendymene
abhilekhblogs.blogspot.com/2021/01/…Penetrate
G
7

So much for such a simple task. I believe something simple as this would do:

int my_strcmp(const char *a, const char *b)
{
    while (*a && *a == *b) { ++a; ++b; }
    return (int)(unsigned char)(*a) - (int)(unsigned char)(*b);
}
Gram answered 9/4, 2014 at 21:10 Comment(10)
I would cast to int before subtracting to avoid overflows/underflows. Also, the arguments should be const.Babbage
If you allow me to be completely honest, I have no idea what that code does. I was trying to avoid to use pointers because I'm still quite green on the use of pointers. My idea of comparing both strings was comparing them letter by letter.Squall
Note that the 'cast to int' suggestion is particularly crucial if char is an unsigned type.Cube
All primitives are signed unless told otherwise, so I don't think its an issue. In addition, strcmp is supposed to compare readable text and don't need to be binary safe, so unexpected behaviour envolving arithmetics with "negative" symbols is something not to worry with considering you are using strcmp to perform the task it was designed for -- compare and sort text.Gram
while (*a && *a == *b) { ++a; ++b; } would do.Koziarz
@Gram UTF8 text is being more applied these days and used with strcmp(), so char is in the range -128 to -1 (or (128 to 255) in more common than it use to be,Koziarz
The functionality of strcmp() does not depend on the sign-ness of char. strcmp() is specified to perform as if "each character shall be interpreted as if it had the type unsigned char". C11 7.23.1. So unsigned char casts are good. This code compiles with strcmp() as long as 1) int much more range than unsigned char (usually this is true) and 2) char uses 2's complement (usually this is true too). But is does fail to match the const part of the strcmp() signature.Koziarz
@Gram could you please explain why int my_strcmp(char *a, char *b){ return (int)(*a) - (int)(*b);} won't work as as simplified function? It seems to work with all the inputs I have tried...Forborne
@user193203821309 Because you're only testing the first byte in the string.Gram
@Gram , "All primitives are signed unless told otherwise," --> not quite. char range matches signed char or unsigned char. It is implementation defined. This answer has the last 2 of 4 problems.Koziarz
H
3

strcmp compares alphabetically: so "aaa" < "b" even though "b" is shorter.

Because of this, you can skip the length check and just do the letter by letter comparison. If you get to a NULL character while both strings are equal so far, then the shorter one is the lesser one.

Also: make StringsAreEqual == 0, not 1 for compatibility with standard sorting functions.

Henotheism answered 9/4, 2014 at 20:56 Comment(0)
K
2

strcmp() is fairly easy to code. The usual mis-codings issues include:

Parameter type

strcmp(s1,s2) uses const char * types, not char *. This allows the function to be called with pointers to const data. It conveys to the user the function's non-altering of data. It can help with optimization.

Sign-less compare

All str...() function perform as if char was unsigned char, even if char is signed. This readily affects the result when strings differ and a character outside the range [1...CHAR_MAX] is found.

Range

On select implementations, the range of unsigned char minus unsigned char is outside the int range. Using 2 compares (a>b) - (a-b) avoids any problem rather than a-b;. Further: many compilers recognized that idiom and emit good code.

int my_strcmp(const char *s1, const char *s2) {
  // All compares done as if `char` was `unsigned char`
  const unsigned char *us1 = (const unsigned char *) s1;
  const unsigned char *us2 = (const unsigned char *) s2;

  // As long as the data is the same and '\0' not found, iterate
  while (*us1 == *us2 && *us1 != '\0') {
    us1++;
    us2++;
  }

  // Use compares to avoid any mathematical overflow 
  // (possible when `unsigned char` and `unsigned` have the same range).
  return (*us1 > *us2) - (*us1 < *us2);
}

Dinosaur computers

Machines that use a signed char and non-2's complement, the following can be wrong or a trap with *s1 != '\0'. Such machines could have a negative 0 - which does not indicate the end of a string, yet quits the loop. Using unsigned char * pointers solves that.

int my_strcmp(const char *s1, const char *s2) {
  while (*s1 == *s2 && *s1 != '\0') { // Error!
    s1++;
    s2++;
  }
Koziarz answered 10/1, 2019 at 20:33 Comment(0)
A
1
    int mystrncmp(const char * str1, const char * str2, unsigned int n)
     {
      while (*str1 == *str2) {
          if (*str1 == '\0' || *str2 == '\0')
             break;

          str1++;
          str2++;
       }


   if (*str1 == '\0' && *str2 == '\0')
      return 0;
   else
      return -1;
}
Avron answered 28/9, 2016 at 16:1 Comment(3)
While this code snippet may solve the question, including an explanation really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion.Booking
What's the purpose of the third function parameter: unsigned int n?Clue
This mystrncmp() never return a positive value , unlike strcmp(), so unexpectedly not useful for sorting. n parameter used - if meant for limiting compare range, should be size_t.Koziarz
H
0

Try this also for your better understanding:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char string1[20], string2[20];
    int i=0,len=0, count=0;
    puts("enter the stirng one to compare");
    fgets(string1, sizeof(string1), stdin);
    len = strlen(string1);
    if(string1[len-1]=='\n')
    string1[len-1]='\0';

    puts("enter the stirng two to compare");
    fgets(string2, sizeof(string2), stdin);
    len = strlen(string2);
    if(string2[len-1]=='\n')
    string2[len-1]='\0';
    if(strlen(string1)==strlen(string2))
    {
    for(i=0;string1[i]!='\0', string2[i]!='\0', i<strlen(string1);i++)
    {
        count=string1[i]-string2[i];
        count+=count;
    }
        if(count==0)
            printf("strings are equal");
        else if(count<0)
            printf("string1 is less than string2");
        else if(count>0)
            printf("string2 is less than string1");
    }

    if(strlen(string1)<strlen(string2))
    {
    for(i=0;string1[i]!='\0', i<strlen(string1);i++)
    {
        count=string1[i]-string2[i];
        count+=count;
    }
        if(count==0)
            printf("strings are equal");
        else if(count<0)
            printf("string1 is less than string2");
        else if(count>0)
            printf("string2 is less than string1");
    }

    if(strlen(string1)>strlen(string2))
    {
    for(i=0;string2[i]!='\0', i<strlen(string2);i++)
    {
        count=string1[i]-string2[i];
        count+=count;
    }
        if(count==0)
            printf("strings are equal");
        else if(count<0)
            printf("string1 is less than string2");
        else if(count>0)
            printf("string2 is less than string1");
    }


    return 0;
}
Humane answered 5/9, 2014 at 19:34 Comment(0)
H
0
bool str_cmp(char* str1,char* str2)
{
    if (str1 == nullptr || str2 == nullptr)
        return false;


    const int size1 = str_len_v(str1);
    const int size2 = str_len_v(str2);

    if (size1 != size2)
        return false;

    for(int i=0;str1[i] !='\0' && str2[i] !='\0';i++)
    {
        if (str1[i] != str2[i])
            return false;
    }

    return true;
}
Hartzke answered 2/9, 2019 at 21:23 Comment(1)
Would be nice if you explain your solutionTherefor

© 2022 - 2024 — McMap. All rights reserved.