Using fseek to backtrack
S

5

5

Is using fseek to backtrack character fscanf operations reliable?

For example, if I have just read 10 characters with fscanf and would like to go back over them, can I just call fseek(infile, -10, SEEK_CUR)?

For most situations it works, but I seem to have problems with the character ^M. Apparently fseek counts it as a character but fscanf doesn't, so in my previous example a 10-character block containing a ^M would require fseek(infile, -11, SEEK_CUR) instead; fseek(infile, -10, SEEK_CUR) would leave it short by 1 character.

Why is this so?

Edit: I was using fopen in text mode

Slinky answered 23/4, 2009 at 4:21 Comment(0)
E
8

You're seeing the difference between a "text" and a "binary" file. When a file is opened in text mode (no 'b' in the fopen second argument), the stdio library may (indeed, must) interpret the contents of the file according to the operating system's conventions for text files. For example, in Windows, a line ends with \r\n, and this gets translated to a single \n by stdio, since that is the C convention. When writing to a text file, a single \n gets output as \r\n.
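A minimal sketch of how to observe this translation yourself: write a string through a text-mode stream, then count the bytes on disk by reading the file back in binary mode. The helper name `bytes_on_disk` and the file path are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write `text` through a text-mode stream, then
   count the bytes that actually landed on disk by re-reading the file
   in binary mode. Returns the byte count, or -1 on error. */
long bytes_on_disk(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);

    FILE *out = fopen(path, "w");   /* text mode: "\n" may be expanded */
    if (out == NULL)
        return -1;
    fputs(text, out);
    if (fclose(out) != 0)
        return -1;

    FILE *in = fopen(path, "rb");   /* binary mode: no translation */
    if (in == NULL)
        return -1;

    long count = 0;
    while (fgetc(in) != EOF)
        count++;
    fclose(in);
    return count;
}
```

Calling `bytes_on_disk("demo.txt", "ab\ncd\n")` should give 6 on a Unix-like system, where no translation happens. On a platform that expands each \n to \r\n, the count would be expected to be 8.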

This makes it easier to write portable C programs that handle text files. Some details become complicated, however, and fseeking is one of them. Because of this, the C standard only defines fseek in text files in a few cases: to the very beginning, to the very end, to the current position, and to a previous position that has been retrieved with ftell. In other words, you can't compute a location to seek to for text files. Or you can, but you have to take care of all the platform-specific details yourself.

Alternatively, you can use binary files and do the line-ending transformations yourself. Again, portability suffers.
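A rough sketch of the binary-mode route, assuming the only transformation you need is folding \r\n into \n on input. The function name `getc_normalized` is invented for this example, and the stream is assumed to have been opened with "rb".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read one character from a binary-mode stream,
   folding the pair "\r\n" into a single '\n'. A lone '\r' is returned
   unchanged. Returns EOF at end of file. */
int getc_normalized(FILE *fp)
{
    assert(fp != NULL);

    int c = fgetc(fp);
    if (c == '\r') {
        int next = fgetc(fp);
        if (next == '\n')
            return '\n';        /* CRLF pair: report it as one newline */
        if (next != EOF)
            ungetc(next, fp);   /* not a CRLF pair: put the byte back */
    }
    return c;
}
```

Note that the number of characters this returns can still differ from the number of bytes consumed, so computing an fseek offset from a character count runs into the same mismatch as in the question.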

In your case, if you just want to go back to where you last did fscanf, the easiest approach is to call ftell just before the fscanf and later fseek back to the saved position.
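The ftell approach might look something like this. It is only a sketch: the function name `peek_token` and the "%10s" format are assumptions for illustration, not taken from the question's code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read up to 10 non-whitespace characters into buf
   (which must hold at least 11 bytes), then move the stream back to
   where the read started. Returns 0 on success, -1 on failure. */
int peek_token(FILE *fp, char *buf)
{
    assert(fp != NULL && buf != NULL);

    long start = ftell(fp);     /* remember the position before reading */
    if (start == -1L)
        return -1;

    int matched = fscanf(fp, "%10s", buf);

    /* Seeking to a value obtained from ftell with SEEK_SET is one of the
       seeks the C standard defines for text streams. */
    if (fseek(fp, start, SEEK_SET) != 0)
        return -1;

    return matched == 1 ? 0 : -1;
}
```

Because the offset comes from ftell rather than from counting characters, it does not matter how many bytes a line ending occupies on disk.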

Englis answered 23/4, 2009 at 5:33 Comment(1)
Thanks, I didn't know about ftell... definitely a better way to implement this than to fseek manually. - Slinky
D
2

This is because fseek works with bytes, whereas fscanf intelligently handles that the carriage return and line feed are two bytes, and swallows them as one char.

Detruncate answered 23/4, 2009 at 4:41 Comment(3)
Yes, I think you're right; this matches what I observed. I forgot to consider text and binary modes; my fopen defaulted to text mode, if I'm not mistaken. - Slinky
I would question the use of the word "intelligently". How much harder is it to just process both \r and \n yourself in binary mode? That way you get uniform behavior across systems (for example, if your program is running on Unix but someone throws a DOS text file full of \r's at it, it will still work). I always go with "text mode considered harmful". - Positively
Sounds like you're saying that rather than use the built-in functionality of the library you'd duplicate it yourself, because it's not hard. By that logic, why use any libraries? - Detruncate
S
1

fseek has no understanding of the file's contents and just moves the file position 10 bytes back.

Depending on the OS, fscanf may interpret newlines differently; it may even be that the library inserts a ^M on DOS where none appears in your data. Check the manual that came with your C compiler.

Spendthrift answered 23/4, 2009 at 4:38 Comment(0)
P
1

Just tried this with VS2008 and found that fscanf and fseek treated the CR and LF characters in the same way (as a single character).

So with two files:

0000000: 3132 3334 3558 3738 3930 3132 3334 3536 12345X7890123456

and

0000000: 3132 3334 350d 0a37 3839 3031 3233 3435 12345..789012345

If I read 15 characters I get to the second '5', then seek back 10 characters, my next character read is the 'X' in the first case and the CRLF in the second.

This seems like a very OS/compiler specific problem.

Persiflage answered 23/4, 2009 at 4:48 Comment(0)
P
0

Did you test the return value of fscanf? Post some code.

Take a look at ungetc. You may have to run a loop over it.
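For reference, here is a small sketch of ungetc used to look at the next character without consuming it. The name `peek_char` is made up for this example. One caveat: the C standard guarantees only one character of pushback, so a loop of ungetc calls to undo 10 characters is not portable.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: return the next character without consuming it,
   or EOF at end of file. */
int peek_char(FILE *fp)
{
    assert(fp != NULL);

    int c = fgetc(fp);
    if (c != EOF)
        ungetc(c, fp);  /* only one pushed-back character is guaranteed */
    return c;
}
```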

Padus answered 23/4, 2009 at 5:32 Comment(0)
