Word Wrap Program C

At the end of Chapter 1 of The C Programming Language, there are a few exercises to complete. The one I am doing now asks for a program that wraps a long string of text into multiple lines of a given maximum length. The following function works, except that the last line never gets wrapped, regardless of the specified maximum line width.

// wrap: take a long input line and wrap it into multiple lines
void wrap(char s[], const int wrapline)
{
    int i, k, wraploc, lastwrap;

    lastwrap = 0; // saves character index after most recent line wrap
    wraploc = 0; // used to find the location for next word wrap

    for (i = 0; s[i] != '\0'; ++i, ++wraploc) {

        if (wraploc >= wrapline) {
            for (k = i; k > 0; --k) {
                // make sure word wrap doesn't overflow past maximum length
                if (k - lastwrap <= wrapline && s[k] == ' ') {
                    s[k] = '\n';
                    lastwrap = k+1;
                    break;
                }
            }
            wraploc = 0;
        }

    } // end main loop

    for (i = 0; i < wrapline; ++i) printf(" ");
    printf("|\n");
    printf("%s\n", s);
}

I have traced the issue to the variable wraploc, which is incremented until it reaches wrapline (the maximum length of a line). Once it does, a newline is inserted at the appropriate location and wraploc is reset to 0.

The problem is that on the last line, wraploc never reaches wrapline, even when it should. It increments correctly while iterating over the string, until the last line. Take this example:

char s[] = "This is a sample string the last line will surely overflow";
wrap(s, 15);

$ ./a.out
               |
This is a
sample string
the last line
will surely overflow

The | marks the column where the text should be wrapped. In this case, wraploc has the value 14, when there are clearly more characters than that on the last line.

I have no idea why this is happening. Can someone help me out?

(Also I'm a complete beginner to C and I have no experience with pointers so please stay away from those in your answers, thanks).

Cowherd answered 22/3, 2014 at 20:13 Comment(4)
Pegboard: Try for(k = i - 1; k > 0; --k) instead of for(k = i; k > 0; --k)
Cowherd: no, doesn't make a difference @Pegboard
Lallans: I am unable to understand the use of k - lastwrap <= wrapline. Is this condition checking really necessary?
Lallans: Also, the program works fine for if (wraploc == wrapline), instead of if (wraploc >= wrapline).

You increment wraploc with i until it reaches wrapline (15 in the example).
When you wrap, you backtrack from i, back to the last whitespace.
That means that in your next line you already have some characters between the lastwrap location and i, i.e., you can't reset wraploc to 0 there.
Try setting wraploc = i-lastwrap instead.
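
A minimal sketch of that change applied to the question's function, kept pointer-free as the asker requested. The name wrap_fixed is hypothetical and printing is left to the caller. The backward scan stops at lastwrap rather than 0 so that it cannot touch an earlier line; that detail is an assumption of this sketch, not part of the suggestion above. Words longer than wrapline are still not handled.

```c
/* wrap_fixed: hypothetical name for the question's wrap() with the
   suggested change; the caller prints the result */
void wrap_fixed(char s[], const int wrapline)
{
    int i, k, wraploc, lastwrap;

    lastwrap = 0; /* index of the first character of the current line */
    wraploc = 0;  /* number of characters on the current line so far */

    for (i = 0; s[i] != '\0'; ++i, ++wraploc) {
        if (wraploc >= wrapline) {
            /* walk back to the nearest space that keeps the line within wrapline */
            for (k = i; k > lastwrap; --k) {
                if (k - lastwrap <= wrapline && s[k] == ' ') {
                    s[k] = '\n';
                    lastwrap = k + 1;
                    break;
                }
            }
            /* the characters between the new line start and i already
               belong to the next line, so count them instead of resetting to 0 */
            wraploc = i - lastwrap;
        }
    }
}
```

With the question's sample string and a width of 15, this should break the former last line into "will surely" and "overflow".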

Directorial answered 22/3, 2014 at 20:37 Comment(1)
Cowherd: Thank you so much! I can't believe I didn't recognize that fact

This is for anybody who, like me, finds this question after running into a problem with newlines in the source string.

This is my answer:

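/* wordlen: length of the word starting at str, up to a space, newline or end of string */
static inline int wordlen(const char *str)
{
   int tempindex = 0;
   while (str[tempindex] != ' ' && str[tempindex] != '\0' && str[tempindex] != '\n') {
      ++tempindex;
   }
   return tempindex;
}

/* wrap: replace spaces with newlines in place so that no line exceeds wrapline,
   honoring newlines already present in s */
void wrap(char *s, const int wrapline)
{
   int index = 0;
   int curlinelen = 0; /* characters already on the current line */

   while (s[index] != '\0') {
      if (s[index] == '\n') {
         curlinelen = 0; /* existing newline: start counting a fresh line */
      }
      else if (s[index] == ' ' && curlinelen + wordlen(&s[index+1]) >= wrapline) {
         s[index] = '\n'; /* the next word would not fit: break here */
         curlinelen = 0;
      }
      else {
         curlinelen++; /* count only characters that stay on the current line */
      }
      index++;
   }
}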
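
Since the question asked for answers without pointers, here is a sketch of the same look-ahead idea written with array indexing only. The names wordlen_at and wrap_indexed are hypothetical. The sketch counts only characters that stay on the current line, so a character that starts a new line resets the count without being added to it; treat that as a choice of this sketch.

```c
/* wordlen_at: hypothetical helper; length of the word starting at s[start],
   ending at a space, a newline or the end of the string */
int wordlen_at(const char s[], int start)
{
    int len = 0;
    while (s[start + len] != ' ' && s[start + len] != '\n' && s[start + len] != '\0') {
        ++len;
    }
    return len;
}

/* wrap_indexed: hypothetical name; wraps s in place at wrapline columns */
void wrap_indexed(char s[], const int wrapline)
{
    int index;
    int curlinelen = 0; /* characters already on the current line */

    for (index = 0; s[index] != '\0'; ++index) {
        if (s[index] == '\n') {
            curlinelen = 0;      /* existing newline starts a fresh line */
        } else if (s[index] == ' ' &&
                   curlinelen + wordlen_at(s, index + 1) >= wrapline) {
            s[index] = '\n';     /* space plus next word would not fit: break here */
            curlinelen = 0;
        } else {
            ++curlinelen;
        }
    }
}
```

Calling wrap_indexed on "one two\nthree four five six" with a width of 10 keeps the existing newline and adds one more break after "four".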

Richrichara answered 3/2, 2017 at 3:38 Comment(0)

This is an adaptation of user KECG's previous answer.

Adaptations made:

  • Expands the line-breakable character set to include hyphens, tabs, etc.
    • Implements switch() to facilitate easy expansion of that character set.
  • Stores results in an allocated buffer that is returned to the caller.
    • Does not overwrite the breakable chars (hyphen, tab, etc.) but stores them in the buffer.
    • Adds newline chars to the output buffer to signal line breaks.
    • Disallows unwanted chars from entering the buffer, such as carriage return ('\r').
    • Translates tabs and places variably defined SPACE chars into the buffer.
  • Ugly-breaks a long word in an unfortunate scenario. "Unfortunate scenario" is defined as:
    • When NOT ugly-breaking a word would result in spaces consuming more than a user-defined percentage of the current line.
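
The "user-defined percentage" rule can be sketched in isolation as follows. The helper name accept_newline and its parameters are hypothetical; it only illustrates the ratio test described in the list, under the assumption that the tolerance is given as a fraction such as 0.10.

```c
/* accept_newline: hypothetical helper; returns 1 when a requested line break
   should be taken, 0 when the line should keep filling (and may later be
   broken mid-word at the margin) */
int accept_newline(int cur_line_len, int max_line_len, double tolerance)
{
    double percent_remain = 0.0;

    /* fraction of the line that would be left unused by breaking now */
    if (max_line_len > 0)
        percent_remain = (double)(max_line_len - cur_line_len) / max_line_len;

    /* little unused space: breaking here wastes little, so accept the break */
    return percent_remain < tolerance;
}
```

For example, with an 80-column margin and a tolerance of 0.10, a break requested at column 75 is accepted, while one requested at column 60 is declined because a quarter of the line would stay empty.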

Runnable tested code with output is here.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define TRUE 1
#define FALSE 0
#define SPACE (char)('+') /*visible representation of tab replacement for analysis */
#define TOLERANCE 0.10


void ErrorExit(const char *str)
{
    puts(str);
    exit(EXIT_FAILURE); /* signal failure rather than success */
}


/*--------------------------------------------------------------------------
    
    next_break()
    
    Algo: function does a look-ahead for a space, a hyphen... anything that
    constitutes a natural sentence break oppty.   Returns the index of 
    the break oppty to the caller.
*--------------------------------------------------------------------------*/
int next_break(const char * str)
{
   int done = FALSE, tempindex= -1;
   char ch;

   while(!done)
   {
       ch = str[++tempindex];
       switch( ch ) 
       {
            case 0:
            case (char)' ':
            case (char)'\n':
            case (char)'\t':
            case (char)'-':
                done = TRUE;
            break;

            default:
            break;
       }
   }
   return(tempindex);
}

/*-------------------------------------------------------------------------------------
    
    wordwrap()
    
    Algo: parses a long string looking for line break opportunities with 
    every char. If a break oppty is found at current offset, does a quick scan ahead 
    via next_break() to see if a better oppty exists ahead. ('Better' means closer 
    to the margin but NOT past the margin)

    If no better oppty found ahead, inserts a newline into buffer & restarts the line
    count.  Else, postpones the newline until chars are read up to the better oppty.
    
    Inputs: char *src buffer needing word wrap formatting.
            int max_line_len for wrap margin.
            int pointer *ugly_breaks for returning number of middle-of-word breaks. 

    Returns a buffer having the formatted text.
*-------------------------------------------------------------------------------------*/
char *wordwrap(const char *src, const int max_line_len, int *ugly_breaks)
{
    int src_idx=0, dest_idx = 0, cur_line_len = 0, done = FALSE;
    char ch;
    char *dest = malloc(strlen(src)*5 + 1); /* Worst case per source char: a tab becomes 4 fill chars plus 1 newline; +1 for the '\0'. */
    int new_line_needed = FALSE;

    if(!dest)
        ErrorExit("Memory Allocation error in wordwrap");

    while(!done)
    {
        ch = src[src_idx];
        switch(ch)
        {
            case 0:
                done = TRUE;
            break;

            case (char)' ':
            case (char)'-':
                dest[dest_idx++]=ch; /* No matter what happens next, we will include this char... */
                cur_line_len++;   /* ... and so of course we need to say this. */
                /* Would the next break oppty put us past the margin/line limit? */
                if(cur_line_len + next_break(&src[src_idx+1]) >= max_line_len)
                {
                    /* A: Yes.  Take the break oppty here, Now*/
                    new_line_needed = TRUE;
                }
            break;

            case (char)'\n':
                /* NOTE:  you don't have to honor existing line breaks in the text.
                * You can comment out these next 2 lines (and remove the newline ('\n') case in 
                * function next_break()) to completely reformat paragraphs.  It's actually more 
                * aesthetic if you do, but this is an opinion, and so I leave the code here. */
                dest[dest_idx++]=ch;
                cur_line_len=0;
            break;

            case (char)'\r': /* Nope, stripping these */
            break;

            case (char)'\t': /* Tab, replace with space(s)*/    
                    
                    if(cur_line_len+1 + next_break(&src[src_idx+1]) >= max_line_len)
                    {
                        /* We have a tab as the last character of the current line.
                        * You can expect this to be rare and it is.  But if you don't 
                        * care for it, result will be disappointing sooner or later*/
                        new_line_needed = TRUE;
                    }
                    else
                    {
                        /* Replace the 4s here with any tab stop you like. 8 is the standard.*/
                        int to_add = 4-((cur_line_len)%4);

                        while(to_add-- && cur_line_len < max_line_len)
                        {
                            dest[dest_idx++]=SPACE;  /* Adaptable space replacement char */
                            cur_line_len++;
                        }
                    }
            break;

            default:
                dest[dest_idx++]=ch;
                cur_line_len++;
            break;
        }
        
        /* Has one of our cases flagged a need for newline? */
        if(new_line_needed)
        {
            int space_remaining = (max_line_len-cur_line_len);
            double percent_remain = 0.0;

            new_line_needed = FALSE;

            /* We now take the newline request as advisement.  We inspect
            * the length of remaining chars on the current line before we agree.
            * If some long word is next, then we're going to break it up ugly 
            * instead of leaving a lot of unused space in our buffer/application.
            * It's merely trading one kind of ugly (unused space) for another (broken word). 
            * 
            * We want to keep going (no newline) if more than -- say 10% -- of current line 
            * would become white space by newlining right now.
            *
            * Set percent_remain tolerance lower than 10% to get more greedy
            * with space conservation but get more ugly word breaks.
            *
            * 5% (0.05) is pretty nice with an avg of only 2 ugly breaks per 
            * a paragraph with a "reasonable" margin (70 chars or more).
            *
            * Set to 100% (1.0) and you won't get any ugly breaks -- unless 
            * you encounter a Huge word that is longer than your margin limit.
            */
            if(max_line_len > 0 )
                    percent_remain = (double)space_remaining/max_line_len;
            if(percent_remain < TOLERANCE) 
            {
                /* Not much space remaining, we can newline here */
                dest[dest_idx++]='\n';
                cur_line_len = 0;
            }
        }
        /* Since we are habitually ignoring new line requests made by the cases,
        * -- AND because it is possible to get some long character sequence or word
        * which may exceed our margin -- 
        * ... check for margin overflow with every loop. */ 
        if(cur_line_len >= max_line_len)
        {
            /* We have or will overflow with next char.
            * This is called breaking the word ugly.  Sorry babe.*/
            dest[dest_idx++]='\n';
            cur_line_len = 0;
            /* Track ugly breaks for tolerance & adjusting newline rejections*/
            (*ugly_breaks)++;  
        }
        src_idx++;
    }
    dest[dest_idx++]='\0';  /* cap it */
    return dest;
}
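
The tab handling above advances the current column to the next tab stop. As a small illustration of that arithmetic, here is a sketch with a hypothetical helper name and the tab width passed in as a parameter (the answer hard-codes 4):

```c
/* spaces_to_next_tab_stop: hypothetical helper; number of fill characters
   needed to move from column cur_line_len to the next multiple of tab_width.
   Assumes tab_width > 0 and cur_line_len >= 0. */
int spaces_to_next_tab_stop(int cur_line_len, int tab_width)
{
    return tab_width - (cur_line_len % tab_width);
}
```

A tab at column 1 with a width of 4 therefore expands to 3 fill characters, and a tab that already sits on a tab stop expands to a full 4.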
Modred answered 8/1 at 23:44 Comment(1)
Richrichara: Nicely done :). My code will never look that clean.
