Remove spaces from a string in C
M

16

59

What is the easiest and most efficient way to remove spaces from a string in C?

Materiel answered 13/11, 2009 at 0:6 Comment(2)
Easiest and most efficient are not necessarily the sameUnsteel
@JimFell the title of that question is (was) very misleading: it's just about removing spaces in the beginningDefinitely
L
110

Easiest and most efficient don't usually go together…

Here's a possible solution for in-place removal:

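void remove_spaces(char* s) {
    char* d = s;               /* d reads ahead; s is the write position */
    do {
        while (*d == ' ') {    /* skip spaces at the read position */
            ++d;
        }
    } while ((*s++ = *d++));   /* copy one char; stop once '\0' has been copied */
}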
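For readers who find the compact loop hard to follow, the same two-pointer idea can be written out step by step. This is only an illustrative rewrite, not the answer's original code: the name remove_spaces_verbose, the NULL check, and the choice to return the base address are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical expanded form of the in-place loop above.
   src scans every character; dst marks where the next kept
   character goes. dst never gets ahead of src. */
char *remove_spaces_verbose(char *s)
{
    char *src = s;
    char *dst = s;

    if (s == NULL) {
        return NULL;
    }
    while (*src != '\0') {
        if (*src != ' ') {
            *dst = *src;      /* keep this character */
            dst++;
        }
        src++;                /* always advance the reader */
    }
    *dst = '\0';              /* terminate the shortened string explicitly */
    return s;
}
```

As with the compact version, the argument must be a writable array (for example char a[] = " a b ";), never a string literal.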
Lunetta answered 13/11, 2009 at 0:11 Comment(14)
What happens if the input source was initialized from a string literal?Disclaimer
@Suppressingfire: assuming you mean RemoveSpaces("blah");, and not char a[] = "blah"; RemoveSpaces(a);, then undefined behaviour. But that's not the fault of this code. It is not recommended to pass a read-only string to a function which is documented to modify the string passed to it (by, for example, removing spaces) ;-)Knownothing
I think you should do *i = '\0'; in the end.Obbard
*i = 0 and *i = '\0' is the same :)Marnamarne
Could slightly simplify with do { *i = *j; if(*i != ' ') i++; } while(*j++ != 0). Then no need for final *i = 0;Inshore
Could simplify further by using source for j and ditching the j entirely. I see no reason to preserve the inbound source value unless the function is to return the original base address (such as the family of str functions in the standard library, which itself may not be such a bad idea).Ockeghem
How... How is this working? I am new to C and pointers so a walkthrough of what's happening would be much appreciatedAuditory
@Auditory Take two pointers to the same address. While the read pointer (d) points to a space char, just increment d. This in effect moves past one or more space characters, but it doesn't touch the other pointer (s, the one passed into the function). After that, you assign the value d points to (the next non-space character, or the terminator) to the location s points to, which still sits just past the last character written. After this assignment happens, we increment both pointers. Clever. It needs double parens around the assignment though.Celestinacelestine
@JoeMcDonagh Can you tell me on what conditions the while loop is going to break? After each iteration, s and d pointers increments. When d reaches '\0' then it assigns '\0' to where the s is pointing to and after that both s and d increments by one location after that how the while terminates?Hegarty
@Hegarty assignment returns the assigned value normally in languages like C, so when you hit the end of the string and assign the null character, it will be interpreted as falsy and the loop will break. I love this piece of code. It just so dense and beautiful!Celestinacelestine
@JoeMcDonagh Yeah it's a clever piece of code. My doubt is, let's say I have a string " blah" which has a 3 leading whitespaces, and I run the above function on this string after the first iteration d pointer will be ahead of the s, now in each iteration *s = *d first then s++ and d++, then if *s is not falsy then do{} executes again that's how the code works I suppose. So, after each assignment *s = *d and as d is ahead, and while() will break only when *s is falsy which is checked after s++ and d++ so don't you think d will go out-of-bound?Hegarty
@Hegarty first off just want to point out that increment operator takes precedence over assignment. what i would suggest you do to understand this piece of code is add printf statements to each iteration of the loop to show you the values in the pointers on each iteration.Celestinacelestine
The breaking condition for this do-while loop comes from "An assignment expression has the value of the left operand after the assignment."Wroughtup
If you're having trouble figuring out how this snippet works, add printf("%p %c = %p %c\n", (void *)s, *s, (void *)d, *d) before the } while and check it out! Also, for the breaking condition, the null character '\0' gets assigned to *s from *d and becomes the value of the assignment expression. We then have while('\0'), which is false, so the loop breaks.Derk
P
23

As we can see from the answers posted, this is surprisingly not a trivial task. When faced with a task like this, it would seem that many programmers choose to throw common sense out the window, in order to produce the most obscure snippet they possibly can come up with.

Things to consider:

  • You will want to make a copy of the string, with spaces removed. Modifying the passed string is bad practice: it may be a string literal. Also, there are sometimes benefits to treating strings as immutable objects.
  • You cannot assume that the source string is not empty. It may contain nothing but a single null termination character.
  • The destination buffer can contain any uninitialized garbage when the function is called. Checking it for null termination doesn't make any sense.
  • Source code documentation should state that the destination buffer needs to be large enough to contain the trimmed string. The easiest way to ensure this is to make it as large as the untrimmed string, including its null terminator.
  • The destination buffer needs to hold a null terminated string with no spaces when the function is done.
  • Consider if you wish to remove all white space characters or just spaces ' '.
  • C programming isn't a competition over who can squeeze as many operators as possible onto a single line. It is rather the opposite: a good C program contains readable code (always the single most important quality) without sacrificing program efficiency (somewhat important).
  • For this reason, you get no bonus points for hiding the insertion of null termination of the destination string, by letting it be part of the copying code. Instead, make the null termination insertion explicit, to show that you haven't just managed to get it right by accident.

What I would do:

#include <ctype.h>

void remove_spaces (char* restrict str_trimmed, const char* restrict str_untrimmed)
{
  while (*str_untrimmed != '\0')
  {
    /* isspace() requires a value representable as unsigned char (or EOF) */
    if(!isspace((unsigned char)*str_untrimmed))
    {
      *str_trimmed = *str_untrimmed;
      str_trimmed++;
    }
    str_untrimmed++;
  }
  *str_trimmed = '\0';
}

In this code, the source string "str_untrimmed" is left untouched, which is guaranteed by using proper const correctness. It does not crash if the source string contains nothing but a null termination. It always null terminates the destination string.

Memory allocation is left to the caller. The algorithm should only focus on doing its intended work. It removes all white spaces.

There are no subtle tricks in the code. It does not try to squeeze in as many operators as possible on a single line. It will make a very poor candidate for the IOCCC. Yet it will yield pretty much the same machine code as the more obscure one-liner versions.

When copying something, you can however optimize a bit by declaring both pointers as restrict, which is a contract between the programmer and the compiler, where the programmer guarantees that the destination and source do not overlap. This allows more efficient optimization, since the compiler can then copy straight from source to destination without having to assume that a write to the destination might change the source.
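The choice between removing all white space and removing only ' ' can be left to the caller. The following is a minimal sketch under that assumption, not part of the original answer: remove_matching and is_plain_space are hypothetical names, and the predicate parameter is an illustrative design choice.

```c
#include <ctype.h>

/* Hypothetical predicate: matches only the plain space character. */
int is_plain_space(int c)
{
    return c == ' ';
}

/* Copies src to dst, skipping every character for which drop() returns
   nonzero. dst must have room for strlen(src) + 1 bytes and must not
   overlap src (that is what restrict promises). */
void remove_matching(char *restrict dst, const char *restrict src,
                     int (*drop)(int))
{
    while (*src != '\0') {
        if (!drop((unsigned char)*src)) {
            *dst = *src;
            dst++;
        }
        src++;
    }
    *dst = '\0';   /* explicit termination, as argued above */
}
```

Calling remove_matching(out, in, isspace) drops every white-space character, while remove_matching(out, in, is_plain_space) drops only spaces and keeps tabs and newlines.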

Pibgorn answered 21/5, 2015 at 11:40 Comment(6)
Why use the restrict keyword? There is no reason you shouldn't be able to pass the same pointer as source and destination, and your code supports that.Linker
@Linker It could be removed, certainly, at the expense of slower code in the generic use-case. I don't think I have benchmarked this code, but I suspect it shouldn't make that much of a difference.Pibgorn
This is the most sensible answer I've seen. It's clear, concise and well understandable by a beginner! Thank you.Humiliate
I'd replace str_untrimmed by scattered and str_trimmed by condensed.Definitely
@Definitely Good for you. Now kindly stop vandalizing people's posts with minor superfluous edits or to change the coding style to your personal preference. You apparently have too high rep for getting edits reviewed, or you'd have an edit ban incoming.Pibgorn
@Pibgorn Thanks for letting me know that this kind of edits is considered harmful. Nevertheless, trim is the word that indicates (in most languages/libraries) removing leading and trailing spaces from strings.Definitely
C
21

Here's a very compact, but entirely correct version:

do while(isspace((unsigned char)*s)) s++; while((*d++ = *s++));

And here, just for my amusement, are code-golfed versions that aren't entirely correct, and get commenters upset.

If you can risk some undefined behavior, and never have empty strings, you can get rid of the body:

while(*(d+=!isspace(*s++)) = *s);

Heck, if by space you mean just space character:

while(*(d+=*s++!=' ')=*s);

Don't use that in production :)
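If compactness is the goal, it can be had without undefined behavior by keeping the copy and the pointer updates in separate expressions. This is a sketch, not one of the answer's versions; the function name remove_spaces_compact is made up for the example.

```c
#include <ctype.h>

/* Compact but well-defined: each iteration copies one character, then
   advances d only if that character is not white space.
   Works in place (d == s) or into a separate, large enough buffer.
   An empty source string just copies the terminator. */
void remove_spaces_compact(char *d, const char *s)
{
    for (; (*d = *s) != '\0'; s++) {
        d += !isspace((unsigned char)*d);
    }
}
```

Here s is never read and modified within the same expression, which is what made the code-golfed variants above sketchy.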

Chabot answered 13/11, 2009 at 0:30 Comment(6)
Interesting, the first two function on my machine. But I guess all of these are undefined, since using s++ and *s in one statement results in undefined behavior?Ivatts
make sure that you aren't beyond the end of the string when dereferencing it.Pol
@Andomar: First one is completely safe and sound. Last two are sketchy indeed (tested in GCC4.2).Chabot
Calling it "sound" is perhaps a bit too polite. All 3 versions are completely unreadable, for no performance gained. Apple agrees that braces are unnecessary. I mean, what is many million dollars in losses and all the programmers in the world laughing at you, compared to the sheer agony involved in writing braces?Pibgorn
Why is it necessary to risk undefined behaviour, when you could solve that risk using the comma operator and a for loop?Auteur
stumbled upon this tonight looking for the most efficient way to do it as an exercise- i love it! (the first one i mean). parens around the assignment will squash warnings.Celestinacelestine
I
9

In C, you can modify some strings in place, for example a string returned by strdup():

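char *str = strdup(" a b c ");   /* strdup() is POSIX (and C23), not C99/C11 */

char *write = str, *read = str;
do {
   if (*read != ' ')
       *write++ = *read;          /* the last pass copies the '\0' */
} while (*read++);

printf("%s\n", str);
free(str);                        /* strdup() memory must be released */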

Other strings are read-only, for example string literals. You have to copy those to a newly allocated area of memory, filling the copy while skipping the spaces:

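const char *oldstr = " a b c ";

char *newstr = malloc(strlen(oldstr)+1);
if (newstr == NULL)
    return 1;                     /* allocation failed */
char *np = newstr;
const char *op = oldstr;
do {
   if (*op != ' ')
       *np++ = *op;               /* the last pass copies the '\0' */
} while (*op++);

printf("%s\n", newstr);
free(newstr);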

You can see why people invented other languages ;)
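The two snippets above share the same loop, so it can be factored into one function that both the in-place and the allocating variants call. This is a hedged sketch of that idea; the names copy_except_space, remove_space and dup_except_space are illustrative, not from the answer.

```c
#include <stdlib.h>
#include <string.h>

/* Copies src to dst without the ' ' characters. dst may equal src. */
void copy_except_space(char *dst, const char *src)
{
    do {
        if (*src != ' ') {
            *dst++ = *src;      /* the final pass copies the '\0' too */
        }
    } while (*src++ != '\0');
}

/* In-place variant for writable buffers. */
void remove_space(char *s)
{
    copy_except_space(s, s);
}

/* Allocating variant for read-only input such as string literals.
   Returns NULL on allocation failure; the caller must free() the result. */
char *dup_except_space(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy != NULL) {
        copy_except_space(copy, s);
    }
    return copy;
}
```

Keeping allocation in a thin wrapper leaves the core loop free of memory management, and makes ownership of the returned buffer explicit.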

Ivatts answered 13/11, 2009 at 0:18 Comment(7)
Your second example forgets to properly terminate the destination string.Painless
..and your first example doesn't do the right thing at all (eg if the string starts off with two non-space characters).Painless
@caf: The while loop will run for the \0 terminator, because it's while (*(op++)) and not while (*(++op))Ivatts
That's true, which means the it's still buggy, because it skips the first character regardless of whether it's a space or not.Painless
You can common up the loop here: void copyExceptSpace(char*, const char*);, void removeSpace(char *s) { copyExceptSpace(s,s); }, char *dupExceptSpace(const char *s) { char *n = malloc(strlen(s)+1); if (n) copyExceptSpace(n,s); return n; }. Or something like that.Knownothing
Why mix the algorithm "remove spaces" together with memory allocation? There is no reason to do so. Avoid strdup() because it isn't standard. Never cast the result from malloc().Pibgorn
both these code examples leak memory. Fix by calling free() on the memory pointer returned strdup() /malloc(). That leak has been in these examples for nearly nine years.Fernand
A
3
#include <ctype.h>

char * remove_spaces(const char * source, char * target)
{
     char * start = target;
     do
     {
        /* the final pass copies the terminating '\0' as well */
        if (!isspace((unsigned char)*source))
             *target++ = *source;
     } while (*source++);
     return start;
}

Notes:

  • This doesn't handle Unicode.
Alcoran answered 13/11, 2009 at 0:12 Comment(6)
Won't this skip the first character?Lunetta
You should cast the value passed to isspace to unsigned char, since that function is defined to accept a value either in the range of unsigned char, or EOF.Painless
It still removes the first character, and fails if it is called with target contating '\0' in its first element (I don't get what is the purpose of checking its contents). Changing the while(*source++ && *target) {...} to do {...} while(*source++); seems to work fine.Spontoon
Did you mean ctype.h?Dygal
1) Fails to remove an initial space in source. 2) Never appends a terminating null character to target if source == "". 3) Depends on value in target[0].Inshore
4) return target; ??Shadowy
A
2

If you are still interested, this function removes spaces from the beginning of the string only. I just had it working in my code:

void removeSpaces(char *str1)  
{
    char *str2; 
    str2=str1;  
    while (*str2==' ') str2++;  
    if (str2!=str1) memmove(str1,str2,strlen(str2)+1);  
}
Acerbate answered 23/6, 2013 at 8:14 Comment(0)
U
1
#include<stdio.h>
#include<string.h>
int main(void)
{
  size_t n;
  char str[]="        Nar ayan singh              ";
  printf("strlen str:%zu\n",strlen(str));
  /* strip leading spaces; memmove because source and destination overlap */
  while(str[0]==' ')
   {
     memmove (str,str+1,strlen(str));   /* strlen(str) bytes include the '\0' */
   }
  printf("strlen str:%zu\n",strlen(str));
  /* strip trailing spaces without running off the front of the array */
  n=strlen(str);
  while(n>0 && str[n-1]==' ')
    n--;
  str[n]='\0';
  printf("str:%s ",str);
  printf("strlen str:%zu\n",strlen(str));
  return 0;
}
Unconstitutional answered 3/4, 2014 at 6:20 Comment(2)
strlen returns size_t. So use %zu, not %ld. And use int main() along with return 0;Dygal
Additionally, memcpy is unsuitable for copying overlapping regions of memory. Use memmove instead.Auteur
A
1

The easiest and most efficient way to remove spaces from a string is to simply remove the spaces from the string literal. For example, use your editor to 'find and replace' "hello world" with "helloworld", and presto!

Okay, I know that's not what you meant. Not all strings come from string literals, right? Supposing this string you want spaces removed from doesn't come from a string literal, we need to consider the source and destination of your string... We need to consider your entire algorithm, what actual problem you're trying to solve, in order to suggest the simplest and most optimal methods.

Perhaps your string comes from a file (e.g. stdin) and is bound to be written to another file (e.g. stdout). If that's the case, I would question why it ever needs to become a string in the first place. Just treat it as though it's a stream of characters, discarding the spaces as you come across them...

#include <stdio.h>

int main(void) {
    for (;;) {
        int c = getchar();
        if (c == EOF) { break;    }
        if (c == ' ') { continue; }
        putchar(c);
    }
}

By eliminating the need for storage of a string, not only does the entire program become much, much shorter, but theoretically also much more efficient.

Auteur answered 21/5, 2015 at 9:44 Comment(2)
The question does not mention string literals at all. But you have to assume that a string literal can be passed to the function. And what if the input comes from somewhere else, for example you are writing some sort of text parser.Pibgorn
When questioning the efficiency of a program we must consider the entire program, not just a small part of it. That is what I'm trying to get across here, and I think you missed that point, @Lundin.Auteur
S
1
#include <stddef.h>

/* Function to remove all spaces from a given string.
   https://www.geeksforgeeks.org/remove-spaces-from-a-given-string/
*/
void remove_spaces(char *str)
{
    size_t count = 0;
    for (size_t i = 0; str[i]; i++)
        if (str[i] != ' ')
            str[count++] = str[i];
    str[count] = '\0';
}
Shown answered 25/4, 2021 at 16:39 Comment(1)
Change the type of count and i to size_t and you will have a clean and robust solution.Linker
B
0

Code taken from zString library

#include <stddef.h>

/* return 1 if character 's' occurs in 'token', 0 otherwise */
int zstring_search_chr(const char *token,char s){
    if (!token || s=='\0')
        return 0;

    for (;*token; token++)
        if (*token == s)
            return 1;

    return 0;
}

char *zstring_remove_chr(char *str,const char *bad) {
    char *src = str , *dst = str;

    /* validate input */
    if (!(str && bad))
        return NULL;

    while(*src)
        if(zstring_search_chr(bad,*src))
            src++;
        else
            *dst++ = *src++;  /* assign first, then increment */

    *dst='\0';
    return str;
}

Code example

  Example Usage
      char s[]="this is a trial string to test the function.";
      const char *d=" .";
      printf("%s\n",zstring_remove_chr(s,d));

  Example Output
      thisisatrialstringtotestthefunction

Have a look at the zString code, you may find it useful: https://github.com/fnoyanisi/zString
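If using the standard library is acceptable, the hand-written search can be replaced by strchr from <string.h>. The sketch below is an alternative under that assumption and is not part of zString; the name remove_chars is hypothetical.

```c
#include <string.h>

/* Removes from str every character that appears in bad.
   Returns str, or NULL if either argument is NULL. */
char *remove_chars(char *str, const char *bad)
{
    char *src = str;
    char *dst = str;

    if (str == NULL || bad == NULL) {
        return NULL;
    }
    for (; *src != '\0'; src++) {
        /* *src is never '\0' here, so strchr cannot match bad's terminator */
        if (strchr(bad, *src) == NULL) {
            *dst++ = *src;
        }
    }
    *dst = '\0';
    return str;
}
```

With the usage example above, remove_chars(s, " .") would leave s holding thisisatrialstringtotestthefunction.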

Bedazzle answered 18/2, 2016 at 23:46 Comment(5)
Why do you repeatedly check if the passed parameter is NULL, over and over again? The superfluous NULL check makes this the least efficient version of all posted. Why not use standard strpbrk instead of your home-brewed version? And where is the const correctness?Pibgorn
Right, the first if statement could be removed and checkes could well be done within the logical test part of the for loop, thanks for that, I will look into that.....>> Why not use standard strpbrk instead of your home-brewed version? Just wrote this code (the whole zString thing) for fun, and tried not to use standard functions at all. So, no harm to say it is a fun project, but this of course should not stop anybody from contributing the codeBedazzle
unlike what the comment says, zstring_search_chr does not return the index of a chr, its char* argument should be const qualified. The function zstring_remove_chr is quite inefficient.Linker
@chqrlie, updated comments and the code for zstring_remove_chr(). I would love to see a more efficient version of zstring_remove_chr() or some recommendations from you. thanksBedazzle
You could post the code on codereview.stackexchange.com . I shall write a review if you do. There is indeed a few ideas for improvement.Linker
A
0

That's the easiest I could think of (TESTED) and it works!!

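char message[50];
size_t i, j, len;
if (fgets(message, sizeof message, stdin) == NULL)
        message[0] = '\0';          /* no input: treat as empty string */
len = strlen(message);
for( i = 0, j = 0; i < len; i++){
        message[i-j] = message[i];  /* j counts the spaces seen so far */
        if(message[i] == ' ')
            j++;
}
message[i-j] = '\0';                /* terminate after the last kept char */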
Annul answered 26/12, 2016 at 7:30 Comment(0)
F
0

Here is the simplest thing I could think of. Note that this program takes the string to remove spaces from as its first command-line argument (argv[1]).

#include <string.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns a newly allocated copy of str with the spaces removed,
   or NULL if allocation fails. The caller must free() the result. */

char* trim(const char* str)
{
    size_t len = strlen(str);
    /* strlen, not sizeof: sizeof(str) is only the size of the pointer */
    char* res = malloc(len + 1);
    size_t index = 0;

    if (res == NULL)
        return NULL;

    for (size_t i = 0; i <= len; i++) {   /* <= len also copies the '\0' */
        if (str[i] != ' ')
        {
            res[index] = str[i];
            index++;
        }
    }
    return res;
}

int main(int argc, char* argv[])
{
    //trim function test

    if (argc < 2) {
        fprintf(stderr, "usage: %s \"text with spaces\"\n", argv[0]);
        return 1;
    }

    const char* line = argv[1];
    printf("Here is the line: %s\n", line);

    char* res = trim(line);
    if (res == NULL)
        return 1;

    printf("\nAnd here is the formatted line: %s\n", res);

    free(res);
    return 0;
}
Frentz answered 24/4, 2020 at 15:5 Comment(0)
C
0

While this is not as concise as the other answers, it is very straightforward to understand for someone new to C, adapted from the Calculix source code.

char* remove_spaces(char * buff, int len)
{
    int i=-1,k=0;
    while(1){
        i++;
        if((i==len)||(buff[i]=='\0')||(buff[i]=='\n')||(buff[i]=='\r')) break; /* test i==len first so buff[len] is never read */
        if((buff[i]==' ')||(buff[i]=='\t')) continue;
        buff[k]=buff[i];
        k++;
    }
    buff[k]='\0';
    return buff;
}
Calumny answered 6/7, 2021 at 4:46 Comment(0)
I
-1

I assume the C string is in a fixed memory, so if you replace spaces you have to shift all characters.

The easiest approach seems to be to create a new string, iterate over the original one, and copy only the non-space characters.

Infusible answered 13/11, 2009 at 0:14 Comment(0)
S
-1

I came across a variation of this question where you need to reduce each run of multiple spaces to a single space that "represents" the run.

This is my solution:

char str[] = "Put Your string Here.....";

int copyFrom = 0, copyTo = 0;

printf("Start String %s\n", str);

while (str[copyTo] != 0) {
    if (str[copyFrom] == ' ') {
        str[copyTo] = str[copyFrom];
        copyFrom++;
        copyTo++;

        while ((str[copyFrom] == ' ') && (str[copyFrom] !='\0')) {
            copyFrom++;
        }
    }

    str[copyTo] = str[copyFrom];

    if (str[copyTo] != '\0') {
        copyFrom++;
        copyTo++;
    }
}

printf("Final String %s\n", str);

Hope it helps :-)
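The same idea can be packaged as a function with a read pointer and a write pointer. This is a sketch rather than the answer's code, and the name collapse_spaces is made up for the example.

```c
/* Collapses each run of consecutive spaces in str into a single space,
   in place. Leading and trailing runs are reduced to one space too;
   they are not removed. */
void collapse_spaces(char *str)
{
    const char *src = str;
    char *dst = str;

    while (*src != '\0') {
        *dst++ = *src;              /* keep the character (first space included) */
        if (*src == ' ') {
            while (*src == ' ') {   /* skip the rest of the run */
                src++;
            }
        } else {
            src++;
        }
    }
    *dst = '\0';
}
```

Because the write pointer never overtakes the read pointer, no temporary buffer is needed.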

Sydelle answered 8/4, 2017 at 19:44 Comment(0)
E
-1

This is implemented in micro controller and it works, it should avoid all problems and it is not a smart way of doing it, but it will work :)

void REMOVE_SYMBOL(char* string, uint8_t symbol)
{
  uint32_t size = LENGHT(string); // simple string length function, made my own, since original does not work with string of size 1
  uint32_t i = 0;
  uint32_t k = 0;
  uint32_t loop_protection = size*size; // never goes into a loop that is unbreakable
  while(i<size)
  {
    if(string[i]==symbol)
    {
      k = i;
      while(k<size)
      {
        string[k]=string[k+1];
        k++;
      }
    }
    if(string[i]!=symbol)
    {
      i++;
    }
    loop_protection--;
    if(loop_protection==0)
    {
      i = size;
      break;
    }
  }
}
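The shifting loop above moves the whole tail of the string for every match, which costs quadratic time. A single pass with separate read and write indices needs neither a length function nor loop protection. This is a sketch of that alternative, not the author's code; remove_symbol is a hypothetical name, and the symbol parameter is assumed to be a plain char.

```c
#include <stddef.h>

/* Removes every occurrence of symbol from string in one pass. */
void remove_symbol(char *string, char symbol)
{
    size_t i = 0;   /* read index */
    size_t k = 0;   /* write index, always k <= i */

    while (string[i] != '\0') {
        if (string[i] != symbol) {
            string[k] = string[i];
            k++;
        }
        i++;
    }
    string[k] = '\0';   /* terminate after the last kept character */
}
```

The loop always terminates because i advances on every iteration until it reaches the terminator.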
Eos answered 9/6, 2021 at 19:57 Comment(1)
This is implemented in micro controller and it works: I'm afraid not. This solution is very inefficient (quadratic time complexity) and incorrect: instead of setting the null terminator, it duplicates the last byte of the string. The loop_protection kludge was added to attempt to fix the infinite loop on strings containing only spaces. It does not fix the issue and may even backfire on long strings. Study much simpler solutions from the other answers.Linker
