When/why is it a bad idea to use the fscanf() function?

In an answer there was an interesting statement: "It's almost always a bad idea to use the fscanf() function as it can leave your file pointer in an unknown location on failure. I prefer to use fgets() to get each line in and then sscanf() that."

Could you expand upon when/why it might be better to use fgets() and sscanf() to read some file?

Fogy answered 14/5, 2009 at 19:49 Comment(0)

Imagine a file with three lines:

   1
   2b
   c

Using fscanf() to read integers, the first line would read fine, but on the second line fscanf() would read the 2 and then stop at the 'b', leaving the file pointer there with no obvious way to recover. You would need some extra mechanism to skip past the garbage input before you could read the third line.

If you use fgets() and then sscanf(), you can guarantee that your file pointer advances a line at a time, which is a little easier to deal with. In general, you should still look at the whole string so you can report any stray characters in it.
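
A minimal sketch of that loop (the file name is hypothetical, and the "%n" check is one way to catch trailing garbage like the 'b' above):

    #include <stdio.h>

    int main(void)
    {
        FILE *fp = fopen("input.txt", "r");  /* hypothetical file name */
        if (fp == NULL) {
            perror("fopen");
            return 1;
        }

        char line[256];
        int lineno = 0;
        while (fgets(line, sizeof line, fp) != NULL) {
            lineno++;
            int value, consumed = 0;
            /* "%n" records how much of the line was consumed, so "2b"
               is rejected even though "%d" alone would happily match. */
            if (sscanf(line, "%d %n", &value, &consumed) == 1
                    && line[consumed] == '\0') {
                printf("line %d: %d\n", lineno, value);
            } else {
                /* The file position still advanced a whole line, so we
                   can report the error and keep reading. */
                fprintf(stderr, "line %d: malformed input: %s", lineno, line);
            }
        }

        fclose(fp);
        return 0;
    }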

I prefer the latter approach myself, although I wouldn't agree with the statement that "it's almost always a bad idea to use fscanf()"... fscanf() is perfectly fine for most things.

Swastika answered 14/5, 2009 at 19:55 Comment(2)
Please change gets() to fgets(). gets() should never ever be used. – Kilowatt
Must'a been a typo :) Thanks for catching that. – Swastika

The case where this comes into play is when you match character literals. Suppose you have:

int n = fscanf(fp, "%d,%d", &i1, &i2);

Consider two possible inputs "323,A424" and "323A424".

In both cases fscanf() will return 1 and the next character read will be an 'A'. There is no way, from the return value alone, to determine whether the comma was matched.

That being said, this only matters if pinpointing the actual source of the error is important. In cases where knowing that the input is malformed is enough, fscanf() is actually superior to writing custom parsing code.
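
A small demonstration, using sscanf() on string literals to stand in for reading from a file:

    #include <stdio.h>

    int main(void)
    {
        const char *inputs[] = { "323,A424", "323A424" };

        for (int i = 0; i < 2; i++) {
            int i1, i2;
            int n = sscanf(inputs[i], "%d,%d", &i1, &i2);
            /* Both inputs yield n == 1; the return value alone cannot
               tell us whether the ',' literal was matched. */
            printf("input \"%s\": matched %d item(s)\n", inputs[i], n);
        }
        return 0;
    }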

Kalinda answered 14/5, 2009 at 21:38 Comment(0)

When fscanf() fails, due to either an input failure or a matching failure, the file pointer (that is, the position in the file from which the next byte will be read) is left in a position other than where it would be had the fscanf() succeeded. This is typically undesirable in sequential file reads. Reading one line at a time keeps the input position predictable, while failures on a single line can be handled individually.

Ina answered 14/5, 2009 at 19:57 Comment(0)

There are two reasons:

  • scanf() can leave stdin in a state that's difficult to predict; this makes error recovery difficult if not impossible (this is less of a problem with fscanf()); and
  • The entire scanf() family takes pointers as arguments, but no length limit, so they can overrun a buffer and alter unrelated variables that happen to sit after the buffer, causing seemingly random memory corruption errors that are very difficult to understand, find, and debug, particularly for less experienced C programmers (see the sketch after this list).
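
A quick illustration of the second point, together with the standard fix (the field width also mentioned in the comment below):

    #include <stdio.h>

    int main(void)
    {
        char word[100];

        /* Unsafe: scanf("%s", word) writes past 'word' on long input,
           corrupting whatever happens to live after the buffer.
           Safe: the field width caps the write at 99 characters plus
           the terminating NUL. */
        if (scanf("%99s", word) == 1)
            printf("read: %s\n", word);

        return 0;
    }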

Novice C programmers are often confused about pointers and the “address-of” operator, and frequently omit the & where it's needed, or add it “for good measure” where it's not. This causes “random” segfaults that can be hard for them to find. This isn't scanf()'s fault, so I leave it off my list, but it is worth bearing in mind.
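
For example, the classic mistake (shown commented out so this sketch stays safe to run):

    #include <stdio.h>

    int main(void)
    {
        int age;

        /* Wrong: passes the (uninitialized) value of 'age' where a
           pointer is expected: undefined behavior, typically a
           segfault:
               scanf("%d", age);
           Right: pass the address of 'age': */
        if (scanf("%d", &age) == 1)
            printf("age = %d\n", age);

        return 0;
    }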

After 23 years, I still remember it being a huge pain when I started C programming and didn't know how to recognize and debug these kinds of errors, and (as someone who spent years teaching C to beginners) it's very hard to explain them to a novice who doesn't yet understand pointers and the stack.

Anyone who recommends scanf() to a novice C programmer should be flogged mercilessly.

OK, maybe not mercilessly, but some kind of flogging is definitely in order ;o)

Elvera answered 14/3, 2014 at 18:42 Comment(1)
The statement "take pointers as arguments, but no length limit" is misleading: for most types, the sizes are fixed (%i, %d, %lf), so length limits are not needed. One exception is reading strings with %s. But even with that, a limit can be specified by adding a number between % and s: %99s for a character string declared as char s[100]. – Longobard

Basically, there's no way to tell that function not to go out of bounds of the memory area you've allocated for it.

A number of replacements have come out, like the bounds-checked scanf_s family in C11's Annex K, which attempt to fix those functions by requiring a maximum length for each string conversion, so the read cannot overflow the buffer.

Playa answered 14/5, 2009 at 19:56 Comment(5)
While buffer overflows are one problem with the scanf() family of functions, they are unrelated to the problem asked about here. -1 – Ina
"Could you expand upon why it might be better to use fgets() and sscanf() to read some file." I was expanding upon his question. I reject your overambitious "-1". – Playa
I take "expand on why" to mean that your answer should be based on the premise already presented, that being the file pointer issue. If he wanted OTHER reasons, he would not have linked to the origin of the question, or quoted the relevant part of it. – Ina
Ahh, I understood 'other reasons' as truly other reasons, since he explained that reason already in the question ;) Different read on the same question, I guess. – Playa
Since the original question referenced had to do with using fscanf() to read whole lines, there is a lot more relevance to the comparison to fgets() and concerns about the buffer than just the question of where the file pointer landed on failure to match, although that was the example cited in the other thread. – Deportation

"It's almost always a bad idea to use the fscanf() function as it can leave your file pointer in an unknown location on failure. I prefer to use fgets() to get each line in and then sscanf() that."

You can always use ftell() to find out the current position in the file, and then decide what to do from there. Basically, if you know what you can expect, then feel free to use fscanf().
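
A sketch of that recovery idea (the file name is hypothetical):

    #include <stdio.h>

    int main(void)
    {
        FILE *fp = fopen("data.txt", "r");  /* hypothetical file name */
        if (fp == NULL) {
            perror("fopen");
            return 1;
        }

        int value;
        while (fscanf(fp, "%d", &value) != 1) {
            if (feof(fp) || ferror(fp))
                break;
            /* ftell() reports where the failed conversion left us... */
            fprintf(stderr, "parse failed at offset %ld\n", ftell(fp));
            /* ...and one option is to skip the rest of the line and retry. */
            int c;
            while ((c = fgetc(fp)) != EOF && c != '\n')
                ;
        }

        fclose(fp);
        return 0;
    }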

Moralist answered 14/5, 2009 at 20:9 Comment(0)
