Trouble reading a line using fscanf()
A

8

6

I'm trying to read a line using the following code:

while(fscanf(f, "%[^\n\r]s", cLine) != EOF )
{
    /* do something with cLine */
}

But somehow I get only the first line every time. Is this a bad way to read a line? What should I fix to make it work as expected?

Asbestosis answered 14/5, 2009 at 6:12 Comment(0)
P
20

It's almost always a bad idea to use the fscanf() function as it can leave your file pointer in an unknown location on failure.

I prefer to use fgets() to get each line in and then sscanf() that. You can then continue to examine the line read in as you see fit. Something like:

#define LINESZ 1024
char buff[LINESZ];
FILE *fin = fopen ("infile.txt", "r");
if (fin != NULL) {
    while (fgets (buff, LINESZ, fin)) {
        /* Process buff here. */
    }
    fclose (fin);
}

fgets() does what you appear to be trying to do: it reads a string until it encounters a newline character (or fills the buffer).

Penza answered 14/5, 2009 at 6:19 Comment(3)
How could I use the sscanf function to read only a line (BTW, is 1024 the size of a line?) thanks!Asbestosis
fgets reads one line "or less". fgets(buffer, 1024, file) will read a line, as much as is left in the file, or 1023 characters, whichever comes first. If you read a whole line, then buffer[strlen(buffer) - 1] == '\n'. If you reach EOF before reading anything, it returns NULL; otherwise, there's more text on the line.Contagium
#865835Clave
L
2

Your loop has several issues. You wrote:

while( fscanf( f, "%[^\n\r]s", cLine ) != EOF ) 
    /* do something */;

Some things to consider:

  1. fscanf() returns the number of items stored. It can return EOF if it reads past the end of file or if the file handle has an error. You need to distinguish a valid return of zero, in which case there is no new content in the buffer cLine, from a successful read.

  2. You do have a problem when a failure to match occurs, because it is difficult to predict where the file handle is now pointing in the stream. This makes recovery from a failed match harder than might be expected.

  3. The pattern you wrote probably doesn't do what you intended. It is matching any number of characters that are not CR or LF, and then expecting to find a literal s.

  4. You haven't protected your buffer from an overflow. Any number of characters may be read from the file and written to the buffer, regardless of the size allocated to that buffer. This is an unfortunately common error that in many cases can be exploited by an attacker to run arbitrary code of the attacker's choosing.

  5. Unless you specifically requested that f be opened in binary mode, line-ending translation happens in the library, so you will generally never see CR characters when reading text files.

You probably want a loop more like the following:

while(fgets(cLine, N_CLINE, f)) {
    /* do something */ ;
}

where N_CLINE is the number of bytes available in the buffer starting at cLine.

The fgets() function is a much preferred way to read a line from a file. Its second parameter is the size of the buffer, and it reads up to 1 less than that size bytes from the file into the buffer. It always terminates the buffer with a nul character so that it can be safely passed to other C string functions.

It stops on the first of end of file, newline, or buffer_size-1 bytes read.

It leaves the newline character in the buffer, and that fact allows you to distinguish a single line longer than your buffer from a line shorter than the buffer.

It returns NULL if no bytes were copied due to end of file or an error, and the pointer to the buffer otherwise. You might want to use feof() and/or ferror() to distinguish those cases.

Legerdemain answered 14/5, 2009 at 7:22 Comment(2)
thanks, I did so but I wonder what if my line is greater than the size I set will it cut part of the next line or might it cause any other issuesAsbestosis
If an input line is longer than the buffer passed to fgets(), it will stop reading before the end of the input line and give you what it read so far in the buffer. You know that happened because there isn't a \n at the end of the buffer. Each call to fgets() will keep reading, so you can handle a long line a buffer at a time by looping until a buffer ends with \n. The only issue is a matter of sensibly parsing your input when it's broken at an arbitrary place.Legerdemain
P
2

Using fscanf to read/tokenise a file always results in fragile code or pain and suffering. Reading a line, and tokenising or scanning that line is safe, and effective. It needs more lines of code - which means it takes longer to THINK about what you want to do (and you need to handle a finite input buffer size) - but after that life just stinks less.

Don't fight fscanf. Just don't use it. Ever.

Pendent answered 8/12, 2009 at 12:29 Comment(0)
F
2

If you want to read a file line by line (here, the line separator is '\n'), just do this:

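#include <stdio.h>
#include <stdlib.h>

int main(void)
{
        FILE *fp;
        char *buffer;
        int ret;
        int c;

        // Open a file ("test.txt")
        if ((fp = fopen("test.txt", "r")) == NULL) {
                fprintf(stderr, "Error: Can't open file !\n");
                return EXIT_FAILURE;
        }
        // Alloc buffer size (Set your max line size)
        buffer = malloc(4096);
        if (buffer == NULL) {
                fclose(fp);
                return EXIT_FAILURE;
        }
        // Read a line; test the fscanf() result instead of feof(),
        // which only becomes true after a read has already failed
        while ((ret = fscanf(fp, "%4095[^\n]", buffer)) != EOF)
        {
                // ret == 0 means an empty line: nothing was stored
                if (ret == 0)
                        buffer[0] = '\0';
                // Print line
                fprintf(stdout, "%s\n", buffer);
                // Consume the newline that ended the line, if present
                // (a line longer than 4095 characters is split instead)
                c = fgetc(fp);
                if (c != '\n' && c != EOF)
                        ungetc(c, fp);
        }
        // Free buffer
        free(buffer);
        // Close file
        fclose(fp);
        return 0;
}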

Enjoy :)

Flopeared answered 15/6, 2011 at 11:59 Comment(0)
K
1

It looks to me like you're trying to use regex operators in your fscanf string. The string [^\n\r] doesn't mean anything to fscanf, which is why your code doesn't work as expected.

Furthermore, fscanf() doesn't return EOF if the item doesn't match. Rather, it returns an integer that indicates the number of matches--which in your case is probably zero. EOF is only returned at the end of the stream or in case of an error. So what's happening in your case is that the first call to fscanf() reads all the way to the end of the file looking for a matching string, then returns 0 to let you know that no match was found. The second call then returns EOF because the entire file has been read.

Finally, note that the %s scanf format operator only captures to the next whitespace character, so you don't need to exclude \n or \r in any case.

Consult the fscanf documentation for more information: http://www.cplusplus.com/reference/clibrary/cstdio/fscanf/

Kwh answered 14/5, 2009 at 6:17 Comment(2)
[^a-z] does actually exclude a-z in scanf. However, as pointed out above, the format string looks for "a run of characters that are not line breaks, followed by a literal s".Contagium
The fscanf documentation at cplusplus.com is incomplete. Google 'fscanf scanset'.Lemcke
D
1

If you try while( fscanf( f, "%27[^\n\r]", cLine ) == 1 ) you might have a little more luck. The three changes from your original:

  • length-limit what gets read in - I've used 27 here as an example, and unfortunately the scanf() family require the field width literally in the format string and can't use the * mechanism that the printf() can for passing the value in
  • get rid of the s in the format string - %[ is the format specifier for "all characters matching or not matching a set", and the set is terminated by a ] on its own
  • compare the return value against the number of conversions you expect to happen (and for ease of management, ensure that number is 1)

That said, you'll get the same result with less pain by using fgets() to read in as much of a line as will fit in your buffer.

Dree answered 14/5, 2009 at 7:27 Comment(2)
This will still leave him with his original problem of only reading the first line. Better would be "%27[^\n\r]%*[\n\r]" so the non-matching character is consumed.Lemcke
I'm afraid this does not work: fscanf() will return 0 if the stream contains an empty line. Using fgets() or getline() is a much better approach.Mcdavid
M
-1

I think the problem with this code is that when you read with %[^\n\r]s you read until you reach '\n' or '\r', but you do not consume the '\n' or '\r' itself. So you need to consume that character before you call fscanf again in the loop. Do something like this:

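do {
    /* Only use cLine when fscanf() actually stored something. */
    if (fscanf(f, "%[^\n\r]", cLine) == 1) {
        /* Do something here */
    }
} while (fgetc(f) != EOF); /* consume the '\n' or '\r' left in the stream */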
Morgun answered 30/4, 2017 at 22:55 Comment(0)
W
-1

I know this is a very old post, but when using fscanf to read a file line by line you also need to consume the line ending that stopped the match.

I usually do:

int c = 0;
while( EOF != (c = fscanf(fp, "%[^\n]", line)))
{
    fgetc(fp);
    /* For blank lines you could ... */
    if(c != 0)
    {
        /* do something */
    }

    /* OR blank the buffer before the next read */
    memset(line, '\0', <buffer_size>);
    /* Or .... */
    strcpy(line, "");
    
}
Wimble answered 7/8, 2024 at 7:57 Comment(5)
I'm afraid this does not work: fscanf() will return 0 (not EOF) if the stream contains an empty line, leaving line unmodified, causing the /* do something */ to process invalid input. Furthermore a sufficiently long input line will cause a buffer overflow. Using fgets() or getline() is a much better approach.Mcdavid
Linux man pages say that fscanf returns EOF on error, so that works. This pattern has worked fine for me for over 30 years. You can always do fscanf(fp,"%100[^\n]", line) to match your buffer size, then process as needed if the fgetc() doesn't return '\n'Wimble
OH ... and I always memset the receiving buffer to all zeroes before reading anything.Wimble
@Peter, if memset() is necessary, it should be in the code. As @Mcdavid wrote, if you do not test for zero, having input file as "abc\n\nbcd\n", you will get "abc", then "abc" again (fscanf returned "zero" on empty line), then "bcd". Just tested exactly that case, with a printf("%s\n", line) in the loop.Etymology
That's what the memset is for BUT is not necessary. There are other ways of dealing with that eventuality. Like paying attention to the fscanf return value. However, the question asked was 'why do I only read the first line' the answer to which is 'consume the termination character'Wimble
