How do I trim leading/trailing whitespace in a standard way?
F

40

210

Is there a clean, preferably standard method of trimming leading and trailing whitespace from a string in C? I'd roll my own, but I would think this is a common problem with an equally common solution.

Furlana answered 23/9, 2008 at 17:57 Comment(0)
G
205

If you can modify the string:

// Note: This function returns a pointer to a substring of the original string.
// If the given string was allocated dynamically, the caller must not overwrite
// that pointer with the returned value, since the original pointer must be
// deallocated using the same allocator with which it was allocated.  The return
// value must NOT be deallocated using free() etc.
char *trimwhitespace(char *str)
{
  char *end;

  // Trim leading space
  while(isspace((unsigned char)*str)) str++;

  if(*str == 0)  // All spaces?
    return str;

  // Trim trailing space
  end = str + strlen(str) - 1;
  while(end > str && isspace((unsigned char)*end)) end--;

  // Write new null terminator character
  end[1] = '\0';

  return str;
}
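The ownership caveat in the comment above can be sketched as follows. This is only an illustration: trimmed_length is a hypothetical helper, not part of the answer, and trimwhitespace is repeated so the block compiles on its own.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Same logic as trimwhitespace() above, repeated so this sketch compiles alone. */
static char *trimwhitespace(char *str)
{
  char *end;

  while (isspace((unsigned char)*str)) str++;
  if (*str == 0)
    return str;

  end = str + strlen(str) - 1;
  while (end > str && isspace((unsigned char)*end)) end--;
  end[1] = '\0';

  return str;
}

/* Hypothetical helper: trims a heap copy of text and returns the trimmed length. */
size_t trimmed_length(const char *text)
{
  size_t size = strlen(text) + 1;
  char *owned = malloc(size);          /* the pointer that must later go to free() */
  assert(owned != NULL);
  memcpy(owned, text, size);

  char *view = trimwhitespace(owned);  /* points into owned; nothing new is allocated */
  size_t len = strlen(view);

  free(owned);                         /* free the original pointer, never view */
  return len;
}
```

The point is that the caller keeps two pointers: owned for deallocation and view for reading the trimmed text.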

If you can't modify the string, then you can use basically the same method:

// Stores the trimmed input string into the given output buffer, which must be
// large enough to store the result.  If it is too small, the output is
// truncated.
size_t trimwhitespace(char *out, size_t len, const char *str)
{
  if(len == 0)
    return 0;

  const char *end;
  size_t out_size;

  // Trim leading space
  while(isspace((unsigned char)*str)) str++;

  if(*str == 0)  // All spaces?
  {
    *out = 0;
    return 0;
  }

  // Trim trailing space
  end = str + strlen(str) - 1;
  while(end > str && isspace((unsigned char)*end)) end--;
  end++;

  // Set output size to minimum of trimmed string length and buffer size minus 1
  out_size = (size_t)(end - str) < len-1 ? (size_t)(end - str) : len-1;

  // Copy trimmed string and add null terminator (memmove allows out and str to overlap)
  memmove(out, str, out_size);
  out[out_size] = 0;

  return out_size;
}
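As a rough sketch of how a buffer-based trim behaves with a small buffer or when trimming in place, here is a compact variant. The name trim_copy and its exact return convention (characters stored, excluding the terminator) are assumptions made for this illustration, not the answer's API.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical variant of the buffer-based function: copies the trimmed text
 * of str into out (capacity size bytes), truncating if needed, and returns
 * the number of characters stored, not counting the terminator. */
size_t trim_copy(char *out, size_t size, const char *str)
{
  assert(out != NULL && str != NULL);
  if (size == 0)
    return 0;

  /* Skip leading whitespace. */
  while (isspace((unsigned char)*str)) str++;

  /* Measure the rest and drop trailing whitespace from the count. */
  size_t n = strlen(str);
  while (n > 0 && isspace((unsigned char)str[n - 1])) n--;

  if (n > size - 1)
    n = size - 1;        /* truncate to fit, leaving room for '\0' */

  memmove(out, str, n);  /* memmove so out may overlap str */
  out[n] = '\0';
  return n;
}
```

With this convention, an all-whitespace or empty input returns 0, and passing the same array as both out and str trims it in place.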
Goltz answered 23/9, 2008 at 18:12 Comment(18)
Sorry, the first answer isn't good at all unless you don't care about memory leaks. You now have two overlapping strings (the original, which has its trailing spaces trimmed, and the new one). Only the original string can be freed, but if you do, the second one points to freed memory.Ez
@nvl: There is no memory being allocated, so there is no memory to free.Goltz
@nvl: No. str is a local variable, and changing it does not change the original pointer being passed in. Function calls in C are always pass-by-value, never pass-by-reference.Goltz
@Adam: yes, actually. sorry for bugging. i didn't realize that you were just changing str not *str. Apologies.Heterogenous
if str len is zero, end is str-1 and the standard does not give any guarantee that it'll be a valid pointer. c-faq.com/aryptr/non0based.html Debra
out[out_size] = 0; should be out[out_size + 1] = 0;Cirrose
@Ernest: No, it shouldn't be. That line is correct as written.Goltz
This first function is completely wrong. Any one who is looking for 'trim' method should avoid this. The first function will return the address that is different from the one that is passed as the argument. this is not right..as that is not a valid address to use in 'free' statement. The developer should trim the leading white space and move the data to beginning of the string.Excitation
@Raj: There's nothing inherently wrong with returning a different address from the one that was passed in. There's no requirement here that the returned value be a valid argument of the free() function. Quite the opposite -- I designed this to avoid the need for memory allocation for efficiency. If the passed in address was allocated dynamically, then the caller is still responsible for freeing that memory, and the caller needs to be sure not to overwrite that value with the value returned here.Goltz
BTW end > str is not needed as the while loop will terminate should it get to isspace(*str), which is known to be false.Burthen
i think the while loop condition should be while(end >= str && ...) end--;, consider the input string is "\n" itself, the output should be "\0", right?Daudet
@vivisidea: If the string is just "\n", then that will get caught by the if(*str == 0) // All spaces? check and not get to the second while loop. As for > vs. >=, it's already known at that point that !isspace(*str), so both would result in correct behavior, but using > avoids calling isspace() an extra time unnecessarily.Goltz
You have to cast the argument for isspace to unsigned char, otherwise you invoke undefined behavior.Sweeper
Doesn't the first example violate aliasing rules? Should a strdup have to occur to avoid it?Emersen
trimwhitespace(char *out, size_t len, const char *str) is UB when out, str overlap. Yet this is fixed with a small change: memcpy(out, str, out_size); --> memmove(out, str, out_size);.Burthen
while(end > str && isspace((unsigned char)*end)) end--; may be simplified to while(isspace((unsigned char)*end)) end--; as isspace((unsigned char)*str) is false.Burthen
if(*str == 0) // All spaces? { *out = 0; return 1; makes results inconsistent. trimwhitespace(out, 10, "x") and trimwhitespace(out, 10, "") both return 1. I'd expect 1,0.Burthen
Minor: size_t len would be more clear as size_t size or n. len looks too close to the string length as from strlen(), where the size of the buffer out needs to be passed in.Burthen
S
46

Here's one that shifts the string into the first position of your buffer. You might want this behavior so that if you dynamically allocated the string, you can still free it on the same pointer that trim() returns:

char *trim(char *str)
{
    size_t len = 0;
    char *frontp = str;
    char *endp = NULL;
    
    if( str == NULL ) { return NULL; }
    if( str[0] == '\0' ) { return str; }
    
    len = strlen(str);
    endp = str + len;
    
    /* Move the front and back pointers to address the first non-whitespace
     * characters from each end.
     */
    while( isspace((unsigned char) *frontp) ) { ++frontp; }
    if( endp != frontp )
    {
        while( isspace((unsigned char) *(--endp)) && endp != frontp ) {}
    }
    
    if(frontp != str && endp == frontp )
    {
        // Empty string
        *(isspace((unsigned char) *endp) ? str : (endp + 1)) = '\0';
    }
    else if( str + len - 1 != endp )
            *(endp + 1) = '\0';
    
    /* Shift the string so that it starts at str so that if it's dynamically
     * allocated, we can still free it on the returned pointer.  Note the reuse
     * of endp to mean the front of the string buffer now.
     */
    endp = str;
    if( frontp != str )
    {
            while( *frontp ) { *endp++ = *frontp++; }
            *endp = '\0';
    }
    
    return str;
}

Test for correctness:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

/* Paste function from above here. */

int main()
{
    /* The test prints the following:
    [nothing to trim] -> [nothing to trim]
    [    trim the front] -> [trim the front]
    [trim the back     ] -> [trim the back]
    [    trim front and back     ] -> [trim front and back]
    [ trim one char front and back ] -> [trim one char front and back]
    [ trim one char front] -> [trim one char front]
    [trim one char back ] -> [trim one char back]
    [                   ] -> []
    [ ] -> []
    [a] -> [a]
    [ a ] -> [a]
    [] -> []
    */

    char *sample_strings[] =
    {
            "nothing to trim",
            "    trim the front",
            "trim the back     ",
            "    trim front and back     ",
            " trim one char front and back ",
            " trim one char front",
            "trim one char back ",
            "                   ",
            " ",
            "a",
            " a ",
            "",
            NULL
    };
    char test_buffer[64];
    char comparison_buffer[64];
    size_t index, compare_pos;

    for( index = 0; sample_strings[index] != NULL; ++index )
    {
        // Fill buffer with known value to verify we do not write past the end of the string.
        memset( test_buffer, 0xCC, sizeof(test_buffer) );
        strcpy( test_buffer, sample_strings[index] );
        memcpy( comparison_buffer, test_buffer, sizeof(comparison_buffer));
        
        printf("[%s] -> [%s]\n", sample_strings[index],
                                 trim(test_buffer));
        
        for( compare_pos = strlen(comparison_buffer);
             compare_pos < sizeof(comparison_buffer);
             ++compare_pos )
        {
            if( test_buffer[compare_pos] != comparison_buffer[compare_pos] )
            {
                printf("Unexpected change to buffer @ index %zu: %02x (expected %02x)\n",
                    compare_pos, (unsigned char) test_buffer[compare_pos], (unsigned char) comparison_buffer[compare_pos]);
            }
        }
    }

    return 0;
}

Source file was trim.c. Compiled with cc -Wall trim.c -o trim.

Shanteshantee answered 23/9, 2008 at 18:48 Comment(6)
You have to cast the argument for isspace to unsigned char, otherwise you invoke undefined behavior.Sweeper
@RolandIllig: Thanks, I never realized that was necessary. Fixed it.Shanteshantee
@Simas: Why do you say that? The function calls isspace() so why would there be a difference between " " and "\n"? I added unit tests for newlines and it looks OK to me... ideone.com/bbVmqo Shanteshantee
@Shanteshantee it will access invalid memory block when manually alloced. Namely this line: *(endp + 1) = '\0';. The example test on the answer uses a buffer of 64 which avoids this problem.Crumhorn
@Crumhorn is correct. I like this answer best, but there is a bug on that line in the special case that the string is entirely composed of spaces. This can be verified by int main(int c, char** v) { char* ptr = strdup(" "); printf("trim is [%s]\n", trim(ptr)); free(ptr); return 0; } gcc -g test.c; valgrind a.out This is because when frontp moves all the way to the end of the string endp never moves back off the final null terminator and so writing to *(endp+1) is beyond the buffer. The fix is to switch the order of if and else if operations around line 22.Atalante
@nolandda: Thanks for the detail. I fixed it and updated the test to detect the buffer overrun since I don't have access to valgrind at the moment.Shanteshantee
B
26

My solution. The string must be modifiable. The advantage over some of the other solutions is that it moves the non-space part to the beginning, so you can keep using the old pointer in case you have to free() it later.

void trim(char * s) {
    char * p = s;
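    size_t l = strlen(p);

    // The l > 0 check avoids reading p[-1] when the string is empty or all whitespace
    while(l > 0 && isspace((unsigned char)p[l - 1])) p[--l] = 0;
    while(* p && isspace((unsigned char)* p)) ++p, --l;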

    memmove(s, p, l + 1);
}   

This version creates a copy of the string with strndup() instead of editing it in place. strndup() requires _GNU_SOURCE, so maybe you need to make your own strndup() with malloc() and strncpy().

char * trim(char * s) {
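    size_t l = strlen(s);

    // The l > 0 checks keep both loops inside the string, even when it is empty or all whitespace
    while(l > 0 && isspace((unsigned char)s[l - 1])) --l;
    while(l > 0 && isspace((unsigned char)* s)) ++s, --l;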

    return strndup(s, l);
}
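If strndup() is not available, a copy-based trim can be put together from malloc() and memcpy(). The following is a minimal sketch under that assumption; trim_dup is a hypothetical name, and the caller is assumed to free() the result.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated, trimmed copy of s,
 * or NULL if allocation fails. The caller must free() the result. */
char *trim_dup(const char *s)
{
    assert(s != NULL);

    /* Skip leading whitespace. */
    while (isspace((unsigned char)*s)) ++s;

    /* Shrink the length past trailing whitespace. */
    size_t l = strlen(s);
    while (l > 0 && isspace((unsigned char)s[l - 1])) --l;

    char *copy = malloc(l + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, s, l);
    copy[l] = '\0';
    return copy;
}
```

Because the input is never modified, this also works on string literals and other read-only strings.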
Bordereau answered 23/9, 2008 at 20:42 Comment(3)
trim() invokes UB if s is "" as the first isspace() call would be isspace(p[-1]) and p[-1] does not necessarily reference a legal location.Burthen
You have to cast the argument for isspace to unsigned char, otherwise you invoke undefined behavior.Sweeper
should add if(l==0)return; to avoid zero-length strEllingson
C
11

Here's my C mini library for trimming left, right, both, all, in place and separate, and trimming a set of specified characters (or white space by default).

contents of strlib.h:

#ifndef STRLIB_H_
#define STRLIB_H_ 1
enum strtrim_mode_t {
    STRLIB_MODE_ALL       = 0, 
    STRLIB_MODE_RIGHT     = 0x01, 
    STRLIB_MODE_LEFT      = 0x02, 
    STRLIB_MODE_BOTH      = 0x03
};

char *strcpytrim(char *d, // destination
                 char *s, // source
                 int mode,
                 char *delim
                 );

char *strtriml(char *d, char *s);
char *strtrimr(char *d, char *s);
char *strtrim(char *d, char *s); 
char *strkill(char *d, char *s);

char *triml(char *s);
char *trimr(char *s);
char *trim(char *s);
char *kill(char *s);
#endif

contents of strlib.c:

#include "strlib.h"

char *strcpytrim(char *d, // destination
                 char *s, // source
                 int mode,
                 char *delim
                 ) {
    char *o = d; // save orig
    char *e = 0; // end space ptr.
    char dtab[256] = {0};
    if (!s || !d) return 0;

    if (!delim) delim = " \t\n\f";
    while (*delim) 
        dtab[(unsigned char)*delim++] = 1; // cast: a negative char must not be used as an index

    while ( (*d = *s++) != 0 ) { 
        if (!dtab[0xFF & (unsigned int)*d]) { // Not a match char
            e = 0;       // Reset end pointer
        } else {
            if (!e) e = d;  // Found first match.

            if ( mode == STRLIB_MODE_ALL || ((mode != STRLIB_MODE_RIGHT) && (d == o)) ) 
                continue;
        }
        d++;
    }
    if (mode != STRLIB_MODE_LEFT && e) { // for everything but trim_left, delete trailing matches.
        *e = 0;
    }
    return o;
}

// perhaps these could be inlined in strlib.h
char *strtriml(char *d, char *s) { return strcpytrim(d, s, STRLIB_MODE_LEFT, 0); }
char *strtrimr(char *d, char *s) { return strcpytrim(d, s, STRLIB_MODE_RIGHT, 0); }
char *strtrim(char *d, char *s) { return strcpytrim(d, s, STRLIB_MODE_BOTH, 0); }
char *strkill(char *d, char *s) { return strcpytrim(d, s, STRLIB_MODE_ALL, 0); }

char *triml(char *s) { return strcpytrim(s, s, STRLIB_MODE_LEFT, 0); }
char *trimr(char *s) { return strcpytrim(s, s, STRLIB_MODE_RIGHT, 0); }
char *trim(char *s) { return strcpytrim(s, s, STRLIB_MODE_BOTH, 0); }
char *kill(char *s) { return strcpytrim(s, s, STRLIB_MODE_ALL, 0); }

The one main routine does it all. It trims in place if src == dst, otherwise, it works like the strcpy routines. It trims a set of characters specified in the string delim, or white space if null. It trims left, right, both, and all (like tr). There is not much to it, and it iterates over the string only once. Some folks might complain that trim right starts on the left, however, no strlen is needed which starts on the left anyway. (One way or another you have to get to the end of the string for right trims, so you might as well do the work as you go.) There may be arguments to be made about pipelining and cache sizes and such -- who knows. Since the solution works from left to right and iterates only once, it can be expanded to work on streams as well. Limitations: it does not work on unicode strings.

Conscientious answered 15/6, 2010 at 4:34 Comment(3)
I upvoted this and I know it's old but I think there's a bug. dtab[*d] does not cast *d to unsigned int before using it as an array index. On a system with signed char this will read up to dtab[-127] which will cause bugs and possibly crash.Mcquillin
Potential undefined behavior on dtab[*delim++] because char index values must be cast to unsigned char. The code assumes 8-bit char. delim should be declared as const char *. dtab[0xFF & (unsigned int)*d] would be clearer as dtab[(unsigned char)*d]. The code works on UTF-8 encoded strings, but will not strip non-ASCII spacing sequences.Kory
@michael-plainer, this looks interesting. Why don't you test it and put it on GitHub?Altruistic
R
10

Here is my attempt at a simple, yet correct in-place trim function.

void trim(char *str)
{
    int i;
    int begin = 0;
    int end = strlen(str) - 1;

    while (isspace((unsigned char) str[begin]))
        begin++;

    while ((end >= begin) && isspace((unsigned char) str[end]))
        end--;

    // Shift all characters back to the start of the string array.
    for (i = begin; i <= end; i++)
        str[i - begin] = str[i];

    str[i - begin] = '\0'; // Null terminate string.
}
Rawdan answered 20/10, 2010 at 7:8 Comment(6)
Suggest changing to while ((end >= begin) && isspace(str[end])) to prevent UB when str is "". Prevents str[-1].Burthen
Btw, I have to change this to str[i - begin + 1] in order to workArm
You have to cast the argument for isspace to unsigned char, otherwise you invoke undefined behavior.Sweeper
@RolandIllig, why would it be undefined behavior? The function is intended to work with chars.Malemute
@Malemute No, it isn't. The functions from <ctype.h> are intended to work with ints, which represent either unsigned char or the special value EOF. See https://mcmap.net/q/128878/-is-it-safe-to-call-the-functions-from-lt-cctype-gt-with-char-arguments/225757.Sweeper
@RolandIllig, with "intended" I meant that the practical use case for isspace() is to test if a character is a space. I can't think of a reason to call isspace() for an arbitrary int value. So I would expect that the implementation would work with chars, performing a cast to unsigned char internally if necessary. But you're right about what the standard says. Sometimes I forget how incredibly conservative and outdated the C standard is :(Malemute
B
9

Late to the trim party

Features:
1. Trim the beginning quickly, as in a number of other answers.
2. After going to the end, trim the right with only 1 test per loop. (Like @jfm3, but works for an all white-space string.)
3. To avoid undefined behavior when char is a signed char, cast *s to unsigned char.

Character handling "In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined." C11 §7.4 1
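To make the quoted rule concrete, here is a small sketch of a wrapper that always performs the cast. Both is_space_char and leading_space_count are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical wrapper: safe to call with any char value, including
 * negative ones on platforms where plain char is signed. */
int is_space_char(char c)
{
  return isspace((unsigned char)c) != 0;
}

/* Hypothetical helper: counts the whitespace characters at the start of s. */
size_t leading_space_count(const char *s)
{
  assert(s != NULL);
  size_t n = 0;
  while (is_space_char(s[n])) n++;
  return n;
}
```

Without the cast, a byte such as 0xE9 stored in a signed char becomes a negative int that is neither EOF nor representable as unsigned char, which is exactly the case the standard leaves undefined.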

#include <ctype.h>

// Return a pointer to the trimmed string
char *string_trim_inplace(char *s) {
  while (isspace((unsigned char) *s)) s++;
  if (*s) {
    char *p = s;
    while (*p) p++;
    while (isspace((unsigned char) *(--p)));
    p[1] = '\0';
  }

  // If desired, shift the trimmed string

  return s;
}

@chqrlie commented that the above does not shift the trimmed string. To do so....

#include <string.h>  // for memmove()

// Return a pointer to the (shifted) trimmed string
char *string_trim_inplace(char *s) {
  char *original = s;
  size_t len = 0;

  while (isspace((unsigned char) *s)) {
    s++;
  } 
  if (*s) {
    char *p = s;
    while (*p) p++;
    while (isspace((unsigned char) *(--p)));
    p[1] = '\0';
    // len = (size_t) (p - s);   // older errant code
    len = (size_t) (p - s + 1);  // Thanks to @theriver
  }

  return (s == original) ? s : memmove(original, s, len + 1);
}
Burthen answered 17/11, 2014 at 23:38 Comment(1)
Yay, finally someone who knows about the ctype undefined behavior.Sweeper
P
4

Here's a solution similar to @adam-rosenfield's in-place modification routine but without needlessly resorting to strlen(). Like @jkramer, the string is left-adjusted within the buffer so you can free the same pointer. Not optimal for large strings since it does not use memmove. Includes the ++/-- operators that @jfm3 mentions. FCTX-based unit tests included.

#include <ctype.h>

void trim(char * const a)
{
    char *p = a, *q = a;
    while (isspace((unsigned char)*q))            ++q;
    while (*q)                                    *p++ = *q++;
    *p = '\0';
    while (p > a && isspace((unsigned char)*--p)) *p = '\0';
}

/* See http://fctx.wildbearsoftware.com/ */
#include "fct.h"

FCT_BGN()
{
    FCT_QTEST_BGN(trim)
    {
        { char s[] = "";      trim(s); fct_chk_eq_str("",    s); } // Trivial
        { char s[] = "   ";   trim(s); fct_chk_eq_str("",    s); } // Trivial
        { char s[] = "\t";    trim(s); fct_chk_eq_str("",    s); } // Trivial
        { char s[] = "a";     trim(s); fct_chk_eq_str("a",   s); } // NOP
        { char s[] = "abc";   trim(s); fct_chk_eq_str("abc", s); } // NOP
        { char s[] = "  a";   trim(s); fct_chk_eq_str("a",   s); } // Leading
        { char s[] = "  a c"; trim(s); fct_chk_eq_str("a c", s); } // Leading
        { char s[] = "a  ";   trim(s); fct_chk_eq_str("a",   s); } // Trailing
        { char s[] = "a c  "; trim(s); fct_chk_eq_str("a c", s); } // Trailing
        { char s[] = " a ";   trim(s); fct_chk_eq_str("a",   s); } // Both
        { char s[] = " a c "; trim(s); fct_chk_eq_str("a c", s); } // Both

        // Villemoes pointed out an edge case that corrupted memory.  Thank you.
        // http://stackoverflow.com/questions/122616/#comment23332594_4505533
        {
          char s[] = "a     ";       // Buffer with whitespace before s + 2
          trim(s + 2);               // Trim "    " containing only whitespace
          fct_chk_eq_str("", s + 2); // Ensure correct result from the trim
          fct_chk_eq_str("a ", s);   // Ensure preceding buffer not mutated
        }

        // doukremt suggested I investigate this test case but
        // did not indicate the specific behavior that was objectionable.
        // http://stackoverflow.com/posts/comments/33571430
        {
          char s[] = "         foobar";  // Shifted across whitespace
          trim(s);                       // Trim
          fct_chk_eq_str("foobar", s);   // Leading string is correct

          // Here is what the algorithm produces:
          char r[16] = { 'f', 'o', 'o', 'b', 'a', 'r', '\0', ' ',                     
                         ' ', 'f', 'o', 'o', 'b', 'a', 'r', '\0'};
          fct_chk_eq_int(0, memcmp(s, r, sizeof(s)));
        }
    }
    FCT_QTEST_END();
}
FCT_END();
Prophet answered 22/12, 2010 at 1:47 Comment(6)
This solution is downright dangerous! If the original string does not contain any non-whitespace characters, the last line of trim happily overwrites whatever precedes a, if those bytes happen to contain 'whitespace' bytes. Compile this without optimizations and see what happens to y: unsigned x = 0x20202020; char s[4] = " "; unsigned y = 0x20202020; printf("&x,&s,&y = %p,%p,%p\n", &x, &s, &y); printf("x, [s], y = %08x, [%s], %08x\n", x, s, y); trim_whitespace(s); printf("x, [s], y = %08x, [%s], %08x\n", x, s, y);Beitris
@Villemoes, thank you for the bug report. I've updated the logic to avoid walking off the left side of the buffer when the string contains only whitespace. Does this new version address your concerns?Prophet
Language lawyers would probably shout at you for the mere thought of speculating about creating a pointer to the char preceding the one 'a' points to (which is what your '--p' will do). In the real world, you're probably ok. But you can also just change '>=' to '>' and move the decrement of p to 'isspace(*--p)'.Beitris
I think the lawyers would be okay as it's just comparing an address without touching it, but I do like your suggestion on the decrement too. I've updated it accordingly. Thanks.Prophet
The algorithm is wrong. Try stripping " foobar" and see what happens.Moussorgsky
doukremt, is your concern that the entire buffer after foobar is not filled with zeros? If so, it'd be quite a bit more helpful if you said so explicitly rather than throwing vague rocks.Prophet
F
4

Another one, with one line doing the real job:

#include <stdio.h>

int main()
{
   const char *target = "   haha   ";
   char buf[256];
   sscanf(target, "%255s", buf); // Trimming on both sides occurs here; the width keeps buf from overflowing
   printf("<%s>\n", buf);
}
F answered 11/6, 2014 at 17:16 Comment(2)
Good idea to use scanf; but this will only work with a single word which may not be what the OP wanted (i.e. trimming " a b c " should probably result in "a b c", while your single scanf just results in "a"). So we need a loop, and a counter for the skipped chars with the %n conversion specifier, and in the end it's just simpler to do it by hand, I'm afraid.Hibernia
Very useful when you want the first word of the string disregarding any initial spaces.Ami
T
4

If you're using glib, then you can use g_strstrip

Taxable answered 8/10, 2016 at 11:0 Comment(0)
H
3

I'm not sure what you consider "painless."

C strings are pretty painful. We can find the first non-whitespace character position trivially:

while (isspace((unsigned char) * p)) p++;

We can find the last non-whitespace character position with two similar trivial moves:

while (* q) q++;
do { q--; } while (isspace((unsigned char) * q));

(I have spared you the pain of using the * and ++ operators at the same time.)

The question now is what do you do with this? The datatype at hand isn't really a big robust abstract String that is easy to think about, but is instead barely more than an array of storage bytes. Lacking a robust data type, it is impossible to write a function that will do the same as PHP/Perl/Ruby's chomp function. What would such a function in C return?
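One possible answer, offered only as a sketch: return a pointer plus a length that describe the trimmed region, and leave the storage untouched. The struct span type and the trim_span function are hypothetical names for this illustration, and the trailing scan is guarded so an all-whitespace string does not walk off the front.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result type: a view into the caller's string, not a copy. */
struct span {
    const char *start;  /* first non-whitespace character (or the terminator) */
    size_t len;         /* number of characters in the trimmed region */
};

struct span trim_span(const char *s)
{
    assert(s != NULL);

    /* First move: find the first non-whitespace character. */
    const char *p = s;
    while (isspace((unsigned char)*p)) p++;

    /* Second move: go to the end, then step back over trailing whitespace. */
    const char *q = p;
    while (*q) q++;
    while (q > p && isspace((unsigned char)q[-1])) q--;

    struct span result = { p, (size_t)(q - p) };
    return result;
}
```

The region is not null-terminated at its end, so a caller would print it with something like printf("%.*s", (int)sp.len, sp.start) or copy it out before using it as a C string.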

Hirai answered 23/9, 2008 at 18:39 Comment(1)
This works well unless the string is made up of all white-spaces. Need a one time check before do { q--; } ... to know *q != 0.Burthen
S
3

I didn't like most of these answers because they did one or more of the following...

  1. Returned a different pointer inside the original pointer's string (kind of a pain to juggle two different pointers to the same thing).
  2. Made gratuitous use of things like strlen() that pre-iterate the entire string.
  3. Used non-portable OS-specific lib functions.
  4. Backscanned.
  5. Used comparison to ' ' instead of isspace() so that TAB / CR / LF are preserved.
  6. Wasted memory with large static buffers.
  7. Wasted cycles with high-cost functions like sscanf/sprintf.

Here is my version:

void fnStrTrimInPlace(char *szWrite) {

    const char *szWriteOrig = szWrite;
    char       *szLastSpace = szWrite, *szRead = szWrite;
    int        bNotSpace;

    // SHIFT STRING, STARTING AT FIRST NON-SPACE CHAR, LEFTMOST
    while( *szRead != '\0' ) {

        bNotSpace = !isspace((unsigned char)(*szRead));

        if( (szWrite != szWriteOrig) || bNotSpace ) {

            *szWrite = *szRead;
            szWrite++;

            // TRACK POINTER TO LAST NON-SPACE
            if( bNotSpace )
                szLastSpace = szWrite;
        }

        szRead++;
    }

    // TERMINATE AFTER LAST NON-SPACE (OR BEGINNING IF THERE WAS NO NON-SPACE)
    *szLastSpace = '\0';
}
Silverweed answered 17/11, 2015 at 12:46 Comment(5)
You have to cast the argument for isspace to unsigned char, otherwise you invoke undefined behavior.Sweeper
As this answer is concerned about "Wasted cycles", note that the code unnecessarily copies the entire string when there is no space. A leading while (isspace((unsigned char) *szWrite)) szWrite++; would prevent that. Code also copies all the trailing white space.Burthen
@chux this implementation mutates in-place with separate read & write pointers (as opposed to returning a new pointer in a different location), so the suggestion for jumping szWrite to the first non-space on line-one would leave the leading space in the original string.Silverweed
@chux, you are correct that it does copy trailing white-space (before adding a null after the last non-space character), but that's the price I chose to pay to avoid pre-scanning the string. For modest amounts of trailing WS, it is cheaper to just copy the bytes rather than to pre-scan the entire string for the last non-WS char. For large amounts of trailing WS, pre-scanning would probably be worth the reduction in writes.Silverweed
@chux, for the "copies when there is no space" situation, only performing *szWrite = *szRead when the pointers are not equal would skip the writes in that case, but then we've added another comparison/branch. With modern CPU/MMU/BP, I have no idea if that check would be a loss or a gain. With simpler processors and memory architectures, it's cheaper to just do the copy and skip the compare.Silverweed
C
2

Use a string library, for instance:

Ustr *s1 = USTR1(\7, " 12345 ");

ustr_sc_trim_cstr(&s1, " ");
assert(ustr_cmp_cstr_eq(s1, "12345"));

...as you say, this is a "common" problem. Yes, you need to add an #include or so, and it's not included in libc, but don't go inventing your own hack job storing random pointers and size_t's; that way only leads to buffer overflows.

Cetacean answered 24/9, 2008 at 4:7 Comment(0)
A
2

A bit late to the game, but I'll throw my routines into the fray. They're probably not the most absolute efficient, but I believe they're correct and they're simple (with rtrim() pushing the complexity envelope):

#include <ctype.h>
#include <string.h>

/*
    Public domain implementations of in-place string trim functions

    Michael Burr
    [email protected]
    2010
*/

char* ltrim(char* s) 
{
    char* newstart = s;

    while (isspace( (unsigned char) *newstart)) {
        ++newstart;
    }

    // newstart points to first non-whitespace char (which might be '\0')
    memmove( s, newstart, strlen( newstart) + 1); // don't forget to move the '\0' terminator

    return s;
}


char* rtrim( char* s)
{
    char* end = s + strlen( s);

    // find the last non-whitespace character
    while ((end != s) && isspace( (unsigned char) *(end-1))) {
            --end;
    }

    // at this point either (end == s) and s is either empty or all whitespace
    //      so it needs to be made empty, or
    //      end points just past the last non-whitespace character (it might point
    //      at the '\0' terminator, in which case there's no problem writing
    //      another there).    
    *end = '\0';

    return s;
}

char*  trim( char* s)
{
    return rtrim( ltrim( s));
}
Armillas answered 16/3, 2010 at 6:8 Comment(1)
you should cast the char argument to isspace() to (unsigned char) to avoid undefined behavior on potentially negative values. Also avoid moving the string in ltrim() if not necessary.Kory
R
2

Very late to the party...

Single-pass forward-scanning solution with no backtracking. Every character in the source string is tested exactly twice. (So it should be faster than most of the other solutions here, especially if the source string has a lot of trailing spaces.)

This includes two solutions, one to copy and trim a source string into another destination string, and the other to trim the source string in place. Both functions use the same code.

The (modifiable) string is moved in-place, so the original pointer to it remains unchanged.

#include <stddef.h>
#include <ctype.h>

char * trim2(char *d, const char *s)
{
    // Sanity checks
    if (s == NULL  ||  d == NULL)
        return NULL;

    // Skip leading spaces        
    const unsigned char * p = (const unsigned char *)s;
    while (isspace(*p))
        p++;

    // Copy the string
    unsigned char * dst = (unsigned char *)d;   // d and s can be the same
    unsigned char * end = dst;
    while (*p != '\0')
    {
        if (!isspace(*dst++ = *p++))
            end = dst;
    }

    // Truncate trailing spaces
    *end = '\0';
    return d;
}

char * trim(char *s)
{
    return trim2(s, s);
}
Racemose answered 6/7, 2018 at 17:20 Comment(4)
Every character in the source string is tested exactly once: not really, most characters in the source string are tested twice: compared to '\0' and then tested with isspace(). It seems wasteful to test all characters with isspace(). Backtracking from the end of the string should be more efficient for non pathological cases.Kory
@Kory - Yes, each character does get tested twice. I would like to see this code actually tested, especially given strings with lots of trailing spaces, as compared to other algorithms here.Racemose
trim() OK. Corner case: trim2(char *d, const char *s) has trouble when d,s overlap and s < d.Burthen
@chux - In that corner case, how should trim() behave? You're asking to trim and copy a string into memory occupied by the string itself. Unlike memmove(), this requires determining the length of the source string before doing the trim itself, which requires scanning the entire string an additional time. Better to write a different rtrim2() function that knows to copy the source to the destination backwards, and probably takes an additional source string length argument.Racemose
O
2

If, and ONLY IF there's only one contiguous block of text between whitespace, you can use a single call to strtok(3), like so:

char *trimmed = strtok(input, "\r\t\n ");

This works for strings like the following:

"   +1.123.456.7890 "
" 01-01-2020\n"
"\t2.523"

This will not work for strings that contain whitespace between blocks of non-whitespace, like " hi there ". It's probably better to avoid this approach, but now it's here in your toolbox if you need it.
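
A minimal sketch of the caveats, assuming a hypothetical wrapper named single_token (it is not part of the answer above): strtok() writes a terminator into its input, so the buffer must be writable rather than a string literal, and it returns NULL when the input holds nothing but delimiters.

```c
#include <string.h>

/* Hypothetical helper: returns the first whitespace-delimited token of
 * input, or "" when input contains only delimiters. Modifies input. */
const char *single_token(char *input)
{
    char *tok = strtok(input, "\r\t\n ");
    return tok ? tok : "";
}
```

Called on a writable array holding " hi there ", this yields only "hi", which is the limitation described above.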

Osterman answered 11/3, 2022 at 18:13 Comment(0)
K
1

Just to keep this growing, one more option with a modifiable string:

void trimString(char *string)
{
    size_t i = 0, j = strlen(string);
    while (j > 0 && isspace((unsigned char)string[j - 1])) string[--j] = '\0';
    while (isspace((unsigned char)string[i])) i++;
    if (i > 0) memmove(string, string + i, j - i + 1);
}
Koppel answered 8/11, 2015 at 21:3 Comment(5)
strlen() returns a size_t that can exceed the range of int. white space is not restricted to the space character. Finally but most important: Undefined behavior on strcpy(string, string + i * sizeof(char)); because source and destination arrays overlap. Use memmove() instead of strcpy().Kory
@Kory you are right, just included your suggestions. I understand that copying when the source and the destination overlap can cause undefined behavior, but just want to point that in this particular case this shouldn't cause any problem since we are always going to copy from a later position of memory to the beginning, thanks for the feedback.Koppel
it does not matter how the source and destination arrays overlap, it is undefined behavior. Do not rely on the assumption that copying may take place one byte at a time along increasing addresses. Also I forgot to mention that while (isspace((int)string[i])) string[i--] = '\0'; may loop beyond the beginning of the string. You should combine this loop with the previous and following lines and write while (i > 0 && isspace((unsigned char)string[--i])) { string[i] = '\0'; } size_t end = i;Kory
@Kory good point, a string of all whitespace would have caused the loop to run past the beginning; I didn't think of that.Koppel
Actually, my suggestion was incorrect as end did not point to the trailing null byte and your end = ++i; still had a problem for strings containing all whitespace characters. I just fixed the code.Kory
I
1

I know there are already many answers, but I am posting mine to see whether my solution is good enough.

// Skips leading whitespace in `str`, then copies at most `n - 1` chars
// into the `out` buffer, stopping early at the first '\0',
// and finally writes '\0' just after the last non-whitespace char copied.
// Returns the length of the trimmed string, not counting the '\0', like strlen().
size_t trim(char *out, size_t n, const char *str)
{
    // do nothing
    if(n == 0) return 0;    

    // ptr stop at the first non-leading space char
    while(isspace((unsigned char)*str)) str++;    

    if(*str == '\0') {
        out[0] = '\0';
        return 0;
    }    

    size_t i = 0;    

    // copy char to out until '\0' or i == n - 1
    for(i = 0; i < n - 1 && *str != '\0'; i++){
        out[i] = *str++;
    }    

    // deal with the trailing space
    while(isspace((unsigned char)out[--i]));    

    out[++i] = '\0';
    return i;
}
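
One way to let the caller detect a too-small buffer, sketched under the assumption that an snprintf-style convention is acceptable: return the full trimmed length, so that a result greater than or equal to n signals truncation. The name trim_report is hypothetical, and out must not overlap str because memcpy() is used.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical variant: copies at most n - 1 trimmed characters into out
 * and returns the full trimmed length; a result >= n means truncation. */
size_t trim_report(char *out, size_t n, const char *str)
{
    const char *end;
    size_t full, copy;

    /* skip leading whitespace */
    while (isspace((unsigned char)*str)) str++;

    /* back up over trailing whitespace */
    end = str + strlen(str);
    while (end > str && isspace((unsigned char)end[-1])) end--;

    full = (size_t)(end - str);
    if (n > 0) {
        copy = full < n - 1 ? full : n - 1;
        memcpy(out, str, copy);
        out[copy] = '\0';
    }
    return full;
}
```

With a 12-byte buffer and the input "delete data not", the buffer receives "delete data" and the return value of 15 tells the caller that the result was cut short.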
Indivertible answered 9/8, 2017 at 5:18 Comment(2)
Note: isspace(*str) UB when *str < 0.Burthen
Use of size_t n is good, yet the interface does not inform the caller in any way when about n being too small for a complete trimmed string. Consider trim(out, 12, "delete data not")Burthen
C
1

The easiest way to skip leading spaces in a string is, imho,

#include <stdio.h>

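int main(void)
{
    const char *foo = "     teststring      ";
    char bar[32];   /* sscanf needs real storage, not an uninitialized pointer */

    /* %31s skips leading whitespace and stops at the next whitespace */
    if (sscanf(foo, "%31s", bar) == 1)
        printf("String is >%s<\n", bar);
    return 0;
}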
Columbia answered 29/4, 2018 at 13:22 Comment(1)
This will not work for strings with spaces in the middle, such as " foo bar ".Racemose
P
1

Ok this is my take on the question. I believe it's the most concise solution that modifies the string in place (free will work) and avoids any UB. For small strings, it's probably faster than a solution involving memmove.

void stripWS_LT(char *str)
{
    char *a = str, *b = str;
    while (isspace((unsigned char)*a)) a++;
    while (*b = *a++)  b++;
    while (b > str && isspace((unsigned char)*--b)) *b = 0;
}
Predestination answered 6/10, 2018 at 1:0 Comment(1)
The b > str test is only needed once. *b = 0; only needed once.Burthen
I
1
#include <ctype.h>
#include <string.h>

char *trim_space(char *in)
{
    char *out = NULL;
    size_t len;
    if (in) {
        len = strlen(in);
        while(len && isspace((unsigned char)in[len - 1])) --len;
        while(len && isspace((unsigned char)*in)) ++in, --len;
        if (len) {
            out = strndup(in, len);
        }
    }
    return out;
}

isspace helps to trim all white spaces.

  • Run a first loop to check from last byte for space character and reduce the length variable
  • Run a second loop to check from first byte for space character and reduce the length variable and increment char pointer.
  • Finally if length variable is more than 0, then use strndup to create new string buffer by excluding spaces.
Irving answered 31/12, 2018 at 15:53 Comment(3)
Just a little nitpick, strndup() is not part of the C standard but only Posix. But as it is quite easy to implement it's not a big deal.Rosariarosario
trim_space("") returns NULL. I'd expect a pointer to "". int len; should be size_t len;. isspace(in[len - 1]) UB when in[len - 1] < 0.Burthen
An initial while (isspace((unsigned char) *in) in++; before len = strlen(in); would be more efficient than the later while(len && *in && isspace(*in)) ++in, --len;Burthen
S
1

This one is short and simple, uses for-loops and doesn't overwrite the string boundaries. You can replace the test with isspace() if needed.

void trim (char *s)         // trim leading and trailing spaces+tabs
{
 int i,j,k, len;

 j=k=0;
 len = strlen(s);
                    // find start of string
 for (i=0; i<len; i++) if ((s[i]!=32) && (s[i]!=9)) { j=i; break; }
                    // find end of string+1
 for (i=len-1; i>=j; i--) if ((s[i]!=32) && (s[i]!=9)) { k=i+1; break;} 

 if (k<=j) {s[0]=0; return;}        // all whitespace (j==k==0)

 len=k-j;
 for (i=0; i<len; i++) s[i] = s[j++];   // shift result to start of string
 s[i]=0;                // end the string

}//_trim
Syllabize answered 19/9, 2019 at 18:24 Comment(0)
F
0

Personally, I'd roll my own. You can use strtok, but you need to take care with doing so (particularly if you're removing leading characters) that you know what memory is what.

Getting rid of trailing spaces is easy, and pretty safe, as you can just put a 0 in over the top of the last space, counting back from the end. Getting rid of leading spaces means moving things around. If you want to do it in place (probably sensible) you can just keep shifting everything back one character until there's no leading space. Or, to be more efficient, you could find the index of the first non-space character, and shift everything back by that number. Or, you could just use a pointer to the first non-space character (but then you need to be careful in the same way as you do with strtok).
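
The "find the index of the first non-space character and shift by that number" approach can be sketched as follows. This is only an illustration of the description above, with a hypothetical function name, assuming isspace() is the desired definition of whitespace.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical name; trims s in place and returns s. */
char *trim_in_place(char *s)
{
    size_t len = strlen(s);
    size_t first = 0;

    /* trailing: count back from the end, then write one terminator */
    while (len > 0 && isspace((unsigned char)s[len - 1])) len--;
    s[len] = '\0';

    /* leading: index of the first non-space character */
    while (isspace((unsigned char)s[first])) first++;

    /* shift everything back by that amount, terminator included */
    if (first > 0)
        memmove(s, s + first, len - first + 1);
    return s;
}
```

Because the result starts at the original address, a dynamically allocated string can still be freed through the same pointer.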

Fromm answered 23/9, 2008 at 18:16 Comment(1)
strtok is generally not a very good tool to use - not least because it is not re-entrant. If you stay inside a single function, it can be used safely, but if there's any possibility of threads or calling other functions which might themselves use strtok, you are in trouble.Governor
P
0

I'm only including code because the code posted so far seems suboptimal (and I don't have the rep to comment yet.)

void inplace_trim(char* s)
{
    size_t start, end = strlen(s);
    for (start = 0; isspace((unsigned char)s[start]); ++start) {}
    if (s[start]) {
        while (end > start && isspace((unsigned char)s[end-1]))
            --end;
        memmove(s, &s[start], end - start);
    }
    s[end - start] = '\0';
}

char* copy_trim(const char* s)
{
    size_t start, end;
    for (start = 0; isspace((unsigned char)s[start]); ++start) {}
    for (end = strlen(s); end > start && isspace((unsigned char)s[end-1]); --end) {}
    return strndup(s + start, end - start);
}

strndup() began as a GNU extension and was later standardized in POSIX.1-2008. If you don't have it or something equivalent, roll your own. For example:

r = strdup(s + start);
r[end-start] = '\0';
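
The two-line fallback copies the whole tail before truncating it and relies on strdup(), which is itself outside older ISO C standards. A slightly more careful stand-in might look like the sketch below; dup_n is a hypothetical name, and only malloc() and memcpy() are assumed to be available.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strndup(): copies at most n characters of s
 * into freshly malloc'd memory; returns NULL if allocation fails. */
char *dup_n(const char *s, size_t n)
{
    size_t len = 0;
    char *r;

    while (len < n && s[len] != '\0') len++;   /* bounded length, like strnlen() */
    r = malloc(len + 1);
    if (r != NULL) {
        memcpy(r, s, len);
        r[len] = '\0';
    }
    return r;
}
```

The caller owns the returned buffer and releases it with free().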
Pharisaic answered 23/9, 2008 at 18:49 Comment(1)
isspace(0) is defined to be false, you can simplify both functions. Also move the memmove() inside the if block.Kory
P
0
#include "stdafx.h"
#include "malloc.h"
#include "string.h"

int main(int argc, char* argv[])
{

  char *ptr = (char*)malloc(sizeof(char)*30);
  strcpy(ptr,"            Hel  lo    wo           rl   d G    eo rocks!!!    by shahil    sucks b i          g       tim           e");

  int i = 0, j = 0;

  while(ptr[j]!='\0')
  {

      if(ptr[j] == ' ' )
      {
          j++;
          ptr[i] = ptr[j];
      }
      else
      {
          i++;
          j++;
          ptr[i] = ptr[j];
      }
  }


  printf("\noutput-%s\n",ptr);
        return 0;
}
Particiaparticipant answered 4/12, 2009 at 5:46 Comment(2)
This made me laugh because I thought dreamlax had edited the test string to include "sucks big time". Nope. The original author is just honest.Karen
Don't use this code. It produces a buffer overflow.Sweeper
L
0

Most of the answers so far do one of the following:

  1. Backtrack at the end of the string (i.e. find the end of the string and then seek backwards until a non-space character is found,) or
  2. Call strlen() first, making a second pass through the whole string.

This version makes only one pass and does not backtrack. Hence it may perform better than the others, though only if it is common to have hundreds of trailing spaces (which is not unusual when dealing with the output of a SQL query.)

static char const WHITESPACE[] = " \t\n\r";

static void get_trim_bounds(char  const *s,
                            char const **firstWord,
                            char const **trailingSpace)
{
    char const *lastWord;
    *firstWord = lastWord = s + strspn(s, WHITESPACE);
    do
    {
        *trailingSpace = lastWord + strcspn(lastWord, WHITESPACE);
        lastWord = *trailingSpace + strspn(*trailingSpace, WHITESPACE);
    }
    while (*lastWord != '\0');
}

char *copy_trim(char const *s)
{
    char const *firstWord, *trailingSpace;
    char *result;
    size_t newLength;

    get_trim_bounds(s, &firstWord, &trailingSpace);
    newLength = trailingSpace - firstWord;

    result = malloc(newLength + 1);
    memcpy(result, firstWord, newLength);
    result[newLength] = '\0';
    return result;
}

void inplace_trim(char *s)
{
    char const *firstWord, *trailingSpace;
    size_t newLength;

    get_trim_bounds(s, &firstWord, &trailingSpace);
    newLength = trailingSpace - firstWord;

    memmove(s, firstWord, newLength);
    s[newLength] = '\0';
}
Longrange answered 6/5, 2011 at 19:34 Comment(1)
If you are concerned with performance, do not use strspn() and strcspn() in a tight loop. This is very inefficient and the overhead will dwarf the unproven advantage of the single forward pass. strlen() is usually expanded inline with very efficient code, not a real concern. Trimming the beginning and end of the string will be much faster than testing every character in the string for whiteness even in the special case of strings with very few or no non-white characters.Kory
S
0

This is the shortest possible implementation I can think of:

static const char *WhiteSpace=" \n\r\t";
char* trim(char *t)
{
    char *e;
    if (t==NULL) return NULL;
    e=t+strlen(t);                                 // e initially points to the terminator
    while (e>t && strchr(WhiteSpace, e[-1])) --e;  // Back up over trailing whitespace
    *e=0;                                          // Null-terminate
    e=t+strspn (t,WhiteSpace);                     // Find first char that is not whitespace
    return e>t?memmove(t,e,strlen(e)+1):t;         // memmove string contents and terminator
}
Subarid answered 20/2, 2013 at 11:33 Comment(1)
How about this: char *trim(char *s) { char *p = s, *e = s + strlen(s); while (e > s && isspace((unsigned char)e[-1])) { *--e = '\0'; } while (isspace((unsigned char)*p)) { p++; } if (p > s) { memmove(s, p, e + 1 - p); } return s; }Kory
S
0

These functions will modify the original buffer, so if dynamically allocated, the original pointer can be freed.

#include <ctype.h>
#include <string.h>

void rstrip(char *string)
{
  int l;
  if (!string)
    return;
  l = (int)strlen(string) - 1;
  while (l >= 0 && isspace((unsigned char)string[l]))
    string[l--] = 0;
}

void lstrip(char *string)
{
  int i, l;
  if (!string)
    return;
  l = strlen(string);
  while (isspace((unsigned char)string[(i = 0)]))
    while(i++ < l)
      string[i-1] = string[i];
}

void strip(char *string)
{
  lstrip(string);
  rstrip(string);
}
Shank answered 22/5, 2013 at 13:49 Comment(1)
rstrip() invokes undefined behavior on the empty string. lstrip() is unnecessarily slow on string with a long initial portion of whitespace characters. isspace() should not be passed a char argument because it invokes undefined behavior on negative values different than EOF.Kory
C
0

What about using the StrTrim function declared in the Windows header Shlwapi.h? It is more straightforward than defining your own.
Details can be found on:
http://msdn.microsoft.com/en-us/library/windows/desktop/bb773454(v=vs.85).aspx

If you have
char ausCaptain[]="GeorgeBailey ";
StrTrim(ausCaptain," ");
This will give ausCaptain as "GeorgeBailey" not "GeorgeBailey ".

Costermonger answered 18/10, 2013 at 3:34 Comment(0)
G
0

To trim my strings from both sides I use this oldie but goodie ;) It trims anything with an ASCII value less than or equal to a space, meaning that control characters are trimmed as well!

char *trimAll(char *strData)
{
  unsigned int L = strlen(strData);
  if(L > 0){ L--; }else{ return strData; }
  size_t S = 0, E = L;
  while((!(strData[S] > ' ') || !(strData[E] > ' ')) && (S >= 0) && (S <= L) && (E >= 0) && (E <= L))
  {
    if(strData[S] <= ' '){ S++; }
    if(strData[E] <= ' '){ E--; }
  }
  if(S == 0 && E == L){ return strData; } // Nothing to be done
  if((S >= 0) && (S <= L) && (E >= 0) && (E <= L)){
    L = E - S + 1;
    memmove(strData,&strData[S],L); strData[L] = '\0';
  }else{ strData[0] = '\0'; }
  return strData;
}
Gene answered 23/2, 2016 at 10:12 Comment(3)
You should use size_t instead of unsigned int. The code has a lot of redundant tests and invokes undefined behavior on strncpy(strData,&strData[S],L) because the source and destination arrays overlap. Use memmove() instead of strncpy().Kory
In this case it is ok as the destination address always has smaller index than the source, but yes memmove will be better indeed.Rommel
no it is not OK. it does not matter how the source and destination arrays overlap, it invokes undefined behavior because you cannot safely make assumptions on the implementation of the library functions beyond their standard specification. Modern compilers tend to take unfair advantage of situations with potential undefined behavior, play it safe and stay away from UB, and do not let newbie make unsafe assumptions.Kory
E
0

Here I use dynamic memory allocation to build the result of the function trimStr. First, we count how many non-space characters exist in the input string. Then, we allocate a character array of that size plus one for the null terminator and copy the non-space characters into it. Note that this removes every space character, not only the leading and trailing ones. The caller (here, main) must free the returned memory.

#include<stdio.h>
#include<stdlib.h>

char *trimStr(char *str){
char *tmp = str;
printf("input string %s\n",str);
int nc = 0;

while(*tmp!='\0'){
  if (*tmp != ' '){
  nc++;
 }
 tmp++;
}
printf("total nonempty characters are %d\n",nc);
char *trim = NULL;

trim = malloc(sizeof(char)*(nc+1));
if (trim == NULL) return NULL;
tmp = str;
int ne = 0;

while(*tmp!='\0'){
  if (*tmp != ' '){
     trim[ne] = *tmp;
   ne++;
 }
 tmp++;
}
trim[nc] = '\0';

printf("trimmed string is %s\n",trim);

return trim; 
 }


int main(void){

char str[] = " s ta ck ove r fl o w  ";

char *trim = trimStr(str);

if (trim != NULL )free(trim);

return 0;
}
Edom answered 1/2, 2018 at 12:41 Comment(0)
C
0

Here is how I do it. It trims the string in place, so no worry about deallocating a returned string or losing the pointer to an allocated string. It may not be the shortest answer possible, but it should be clear to most readers.

#include <ctype.h>
#include <string.h>
void trim_str(char *s)
{
    const size_t s_len = strlen(s);

    size_t i;
    for (i = 0; i < s_len; i++)
    {
        if (!isspace( (unsigned char) s[i] )) break;
    }

    if (i == s_len)
    {
        // s is an empty string or contains only space characters

        s[0] = '\0';
    }
    else
    {
        // s contains non-space characters

        const char *non_space_beginning = s + i;

        char *non_space_ending = s + s_len - 1;
        while ( isspace( (unsigned char) *non_space_ending ) ) non_space_ending--;

        size_t trimmed_s_len = non_space_ending - non_space_beginning + 1;

        if (s != non_space_beginning)
        {
            // Non-space characters exist in the beginning of s

            memmove(s, non_space_beginning, trimmed_s_len);
        }

        s[trimmed_s_len] = '\0';
    }
}
Creamery answered 15/6, 2018 at 18:57 Comment(1)
absolutely clear for readers, but strlen performs another loop.. :)Cinchonism
B
0
char* strtrim(char* const str)
{
    if (str != nullptr)
    {
        char const* begin{ str };
        while (std::isspace(static_cast<unsigned char>(*begin)))
        {
            ++begin;
        }

        auto end{ begin };
        auto scout{ begin };
        while (*scout != '\0')
        {
            if (!std::isspace(static_cast<unsigned char>(*scout++)))
            {
                end = scout;
            }
        }

        auto /* std::ptrdiff_t */ const length{ end - begin };
        if (begin != str)
        {
            std::memmove(str, begin, length);
        }

        str[length] = '\0';
    }

    return str;
}
Boxfish answered 6/8, 2018 at 23:35 Comment(1)
While this code may answer the question, providing additional context regarding how and/or why it solves the problem would improve the answer's long-term value.Claudicant
R
0

As the other answers don't seem to mutate the string pointer directly, but rather rely on the return value, I thought I would provide this method which additionally does not use any libraries and so is appropriate for operating system style programming:

// only used for printf in main
#include <stdio.h>

// note the char ** means we can modify the address
char *trimws(char **strp) {
  char *str = *strp;
  // check if empty string
  if (!*str)
    return str;
  // go to the end of the string (str stops on the null terminator)
  for (; *str; str++)
    ;
  // set trailing ws to null, never stepping before the start of the string
  while (str > *strp && str[-1] == ' ')
    *--str = 0;
  // increment past leading ws
  for (str = *strp; *str == ' '; str++)
    ;
  // set new begin address of string
  *strp = str;
  return str;
}

int main(void) {
  char buf[256] = "   whitespace    ";
  // pointer must be modifiable lvalue so we make bufp
  char *bufp = buf;
  // pass in the address of the pointer
  trimws(&bufp);
  // prints : XXXwhitespaceXXX
  printf("XXX%sXXX\n", bufp);
  return 0;
}
Rida answered 5/12, 2019 at 20:5 Comment(2)
Even without taking into account incompatible pointer types you're using, what if strp points to an address of a zero-length string?Lucienlucienne
Edited to add empty string checkRida
L
0

IMO, it can be done without strlen and isspace.

char *
trim (char * s, char c)
{
    unsigned o = 0;
    char * sb = s;

    for (; *s == c; s++)
        o++;

    for (; *s != '\0'; s++)
        continue;
    for (; s - o > sb && *--s == c;)
        continue;

    if (o > 0)
        memmove(sb, sb + o, s + 1 - o - sb);
    if (*s != '\0')
        *(s + 1 - o) = '\0';

    return sb;
}
Lucienlucienne answered 24/4, 2021 at 5:5 Comment(0)
V
-1

Here is a function to do what you want. It should take care of degenerate cases where the string is all whitespace. You must pass in an output buffer and the length of the buffer, which means that you have to pass in a buffer that you allocate.

void str_trim(char *output, const char *text, int32 max_len)
{
    int32 i, j, length;
    length = strlen(text);

    if (max_len < 0) {
        max_len = length + 1;
    }

    for (i=0; i<length; i++) {
        if ( (text[i] != ' ') && (text[i] != '\t') && (text[i] != '\n') && (text[i] != '\r')) {
            break;
        }
    }

    if (i == length) {
        // handle lines that are all whitespace
        output[0] = 0;
        return;
    }

    for (j=length-1; j>=0; j--) {
        if ( (text[j] != ' ') && (text[j] != '\t') && (text[j] != '\n') && (text[j] != '\r')) {
            break;
        }
    }

    length = j + 1 - i;
    strncpy(output, text + i, length);
    output[length] = 0;
}

The if statements in the loops can probably be replaced with isspace(text[i]) or isspace(text[j]) to make the lines a little easier to read. I think that I had them set this way because there were some characters that I didn't want to test for, but it looks like I'm covering all whitespace now :-)
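
A minimal sketch of what that substitution might look like, using a hypothetical helper named trim_bounds that only locates the trimmed region. Note that isspace() also matches '\v' and '\f', so it is slightly broader than the four characters tested explicitly above, and its argument is cast to unsigned char to stay defined for negative char values.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: sets *start to the index of the first
 * non-whitespace character and *end to one past the last one;
 * *start == *end when text is empty or all whitespace. */
void trim_bounds(const char *text, size_t *start, size_t *end)
{
    size_t i = 0, j = strlen(text);

    while (isspace((unsigned char)text[i])) i++;
    while (j > i && isspace((unsigned char)text[j - 1])) j--;

    *start = i;
    *end = j;
}
```

The caller can then copy end - start characters beginning at text + start and append the terminator.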

Volar answered 23/9, 2008 at 18:24 Comment(3)
The maxlen < 0 test leads to dangerous behaviour.Governor
hmm...good point. I'll have to fix my code. Thanks for noting that.Volar
The code invokes undefined behavior if output and test overlap. Use memmove instead of strncpy. As a rule of thumb, strncpy() is never the right tool for the job.Kory
M
-1

Here is what I found on this question in the Linux kernel code:

/**
 * skip_spaces - Removes leading whitespace from @s.
 * @s: The string to be stripped.
 *
 * Returns a pointer to the first non-whitespace character in @s.
 */
char *skip_spaces(const char *str)
{
    while (isspace(*str))
            ++str;
    return (char *)str;
}

/**
 * strim - Removes leading and trailing whitespace from @s.
 * @s: The string to be stripped.
 *
 * Note that the first trailing whitespace is replaced with a %NUL-terminator
 * in the given string @s. Returns a pointer to the first non-whitespace
 * character in @s.
 */
char *strim(char *s)
{
    size_t size;
    char *end;

    size = strlen(s);

    if (!size)
            return s;

    end = s + size - 1;
    while (end >= s && isspace(*end))
            end--;
    *(end + 1) = '\0';

    return skip_spaces(s);
}

It is supposed to be bug free due to the origin ;-)

My own version is closer to the KISS principle, I guess:

/**
 * trim spaces
 **/
char * trim_inplace(char * s, int len)
{
    // trim leading
    while (len && isspace(s[0]))
    {
        s++; len--;
    }

    // trim trailing
    while (len && isspace(s[len - 1]))
    {
        s[len - 1] = 0; len--;
    }

    return s;
}
Misery answered 26/9, 2014 at 15:44 Comment(2)
You have to cast the argument for isspace to unsigned char, otherwise you invoke undefined behavior.Sweeper
You could improve the code by first trimming the end and then trimming the beginning: you would no longer need to update len for the latter operation.Kory
S
-1

C++ STL style

std::string Trimed(const std::string& s)
{
    std::string::const_iterator begin = std::find_if(s.begin(),
                                                 s.end(),
                                                 [](char ch) { return !std::isspace(ch); });

    std::string::const_iterator   end = std::find_if(s.rbegin(),
                                                 s.rend(),
                                                 [](char ch) { return !std::isspace(ch); }).base();
    return std::string(begin, end);
}

http://ideone.com/VwJaq9

Serviceable answered 13/1, 2016 at 12:14 Comment(3)
You have to cast the argument for isspace to unsigned char, otherwise you invoke undefined behavior.Sweeper
This example only works for ASCII (range up to 127), cast or no cast, so there is no UB. You can improve it by using cplusplus.com/reference/locale/isspaceServiceable
Since the question doesn't mention ASCII and your answer also doesn't, I thought it would apply to all character sets.Sweeper
O
-1
void trim(char* const str)
{
    char* begin = str;
    char* end = str;
    while (isspace(*begin))
    {
        ++begin;
    }
    char* s = begin;
    while (*s != '\0')
    {
        if (!isspace(*s++))
        {
            end = s;
        }
    }
    *end = '\0';
    const int dist = end - begin;
    if (begin > str && dist > 0)
    {
        memmove(str, begin, dist + 1);
    }
}

Modifies string in place, so you can still delete it.

Doesn't use fancy pants library functions (unless you consider memmove fancy).

Handles string overlap.

Trims front and back (not middle, sorry).

Fast if string is large (memmove often written in assembly).

Only moves characters if required (I find this true in most use cases because strings rarely have leading spaces and often don't have trailing spaces)

I would like to test this but I'm running late. Enjoy finding bugs... :-)

Outguess answered 10/2, 2016 at 0:54 Comment(1)
You should cast the arguments to isspace() to (unsigned char). Your function will fail on a string comprising only white space because dist > 0 will be false.Kory
C
-2
#include<stdio.h>
#include<ctype.h>

main()
{
    char sent[10]={' ',' ',' ','s','t','a','r','s',' ',' '};
    int i,j=0;
    char rec[10];

    for(i=0;i<=10;i++)
    {
        if(!isspace(sent[i]))
        {

            rec[j]=sent[i];
            j++;
        }
    }

printf("\n%s\n",rec);

}
Castleberry answered 18/7, 2012 at 10:18 Comment(2)
Doesn't this trim all spaces? I think the OP wants just leading/trailing spaces to be trimmed.Charmainecharmane
This invokes UB with isspace(sent[10]).Burthen
F
-2
void trim(char* string) {
    int lenght = strlen(string);
    int i=0;

    while(string[0] ==' ') {
        for(i=0; i<lenght; i++) {
            string[i] = string[i+1];
        }
        lenght--;
    }


    for(i=lenght-1; i>0; i--) {
        if(string[i] == ' ') {
            string[i] = '\0';
        } else {
            break;
        }
    }
}
Fecteau answered 9/3, 2016 at 18:25 Comment(2)
trim should handle all white space characters, not just space.Kory
This is an appalling algorithm. Suppose you have a string of 20 KiB with 32 leading spaces. It will copy 20 KiB - 1 bytes down one place, then 20 KiB - 2, then 20 KiB - 3, etc, copying around about 640 KiB of data in total. A sane algorithm will copy that data just once, for a total of about 20 KiB copied.Governor
