carriage return by fgets

I am running the following code:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>   /* exit */

int main(void){
    FILE *fp;
    if((fp=fopen("test.txt","r"))==NULL){
        printf("File can't be read\n");
        exit(1);
    }
    char str[50];
    if(fgets(str,sizeof str,fp)!=NULL)   /* print only if a line was read */
        printf("%s",str);
    fclose(fp);
    return 0;
}

test.txt contains: I am a boy\r\n

Since I am on Windows, it takes \r\n as a new line character, so if I read this from a file it should store "I am a boy\n\0" in str, but I get "I am a boy\r\n". I am using the MinGW compiler.

Bruton answered 7/10, 2012 at 13:32 Comment(5)
How are you determining the content of str? Your program never seems to investigate that.Titer
By printing and it must not show '\r' right?Bruton
fgets will include the new line characters plus a null terminator, so the output you get is perfectly fine.Lesslie
@Lesslie But I am getting carriage return also which I should not get.Bruton
No, you should get the carriage return; that is the correct behaviour. "It takes \r\n as a new line character" is incorrect: it takes \n as a new line character.Circumlunar
L
6

Since I am on Windows, it takes \r\n as a new line character...

This assumption is wrong. The C standard treats carriage return and new line as two different things, as evidenced in C99 §5.2.1/3 (Character sets):

[...] In the basic execution character set, there shall be control characters representing alert, backspace, carriage return, and new-line. [...]

The fgets function description is as follows, in C99 §7.19.7.2/2:

The fgets function reads at most one less than the number of characters specified by n from the stream pointed to by stream into the array pointed to by s. No additional characters are read after a new-line character (which is retained) or after end-of-file. A null character is written immediately after the last character read into the array.

Therefore, when encountering the string I am a boy\r\n, a conforming implementation should read up to the \n character. There is no possibly sane reason why the implementation should discard \r based on the platform.
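Because fgets retains the new-line, and the carriage return before it when no translation takes place, a caller who wants only the text can trim the line ending after reading. A minimal sketch, assuming the buffer came from fgets; `strip_line_ending` is a hypothetical helper name, not a standard function:

```c
#include <string.h>

/* Hypothetical helper: cut buf at the first '\r' or '\n'.
   Returns the length of what remains. */
size_t strip_line_ending(char *buf)
{
    /* strcspn gives the index of the first CR or LF,
       or strlen(buf) if neither is present. */
    size_t len = strcspn(buf, "\r\n");
    buf[len] = '\0';
    return len;
}
```

Calling `strip_line_ending(str)` right after a successful `fgets(str, 50, fp)` leaves "I am a boy" in str whether the line ended in \n or in \r\n.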

Lesslie answered 7/10, 2012 at 14:23 Comment(1)
ok thanx @netcoder. I read that Windows takes \r\n as a new line character, so I was in confusion.Bruton
O
10

The behavior depends on the C library implementation and on the mode you pass to fopen. See this quote from the MSDN documentation on fopen (fopen on MSDN):

b - Open in binary (untranslated) mode; translations involving carriage-return and linefeed characters are suppressed.

This means that if you use the Microsoft C library and open your file without the 'b', carriage return-linefeed pairs are translated to a single linefeed on input, so the carriage return never reaches your buffer.

Since you're using MinGW, your compiler probably links against the GNU C library, which follows the POSIX standard. This is what the GNU documentation says about fopen (fopen on gnu.org):

The character ‘b’ in opentype has a standard meaning; it requests a binary stream rather than a text stream. But this makes no difference in POSIX systems (including GNU systems).

In conclusion: you omit the 'b' mode character, which opens your stream in text mode. You are on Windows, but if your C library is the GNU one, it makes no distinction between text and binary mode, which would explain why fgets reads both the carriage return and the new line.
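One way to check which behaviour a given C library has is to count the characters fgets stores for the same file under different fopen modes. A rough sketch; `write_sample` and `first_line_length` are hypothetical helper names, and the file name in the usage note is an arbitrary choice:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "I am a boy\r\n" in binary mode, so the bytes
   on disk are exactly CR LF on any platform. Returns 0 on success. */
int write_sample(const char *path)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    int ok = fputs("I am a boy\r\n", fp) != EOF;
    return (fclose(fp) == 0 && ok) ? 0 : -1;
}

/* Hypothetical helper: read the first line of path with the given fopen
   mode and return how many characters fgets stored, or -1 on failure. */
int first_line_length(const char *path, const char *mode)
{
    char str[50];
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;
    char *line = fgets(str, sizeof str, fp);
    fclose(fp);
    return line ? (int)strlen(str) : -1;
}
```

After `write_sample("crlf_sample.txt")`, the call `first_line_length("crlf_sample.txt", "rb")` gives 12: the ten visible characters plus \r and \n. With "r", a library that translates CR-LF pairs would give 11, while one that does not translate would give 12.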

Outbrave answered 27/9, 2013 at 6:37 Comment(0)
T
2

The C standard says this about text streams (among other things):

Characters may have to be added, altered, or deleted on input and output to conform to differing conventions for representing text in the host environment. Thus, there need not be a one-to-one correspondence between the characters in a stream and those in the external representation. Data read in from a text stream will necessarily compare equal to the data that were earlier written out to that stream only if: the data consist only of printing characters and the control characters horizontal tab and new-line; no new-line character is immediately preceded by space characters; and the last character is a new-line character.

In other words, if a file is opened in text mode, an implementation is free to add, remove, and modify control characters as it needs to when going to and from disk. That is apparently what the Microsoft implementation does with the carriage return, and what the GNU implementation does not do.
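Since the translation is left to the implementation, it helps to look at what actually ended up in the buffer rather than at how the terminal renders it. A small sketch that spells out the line-ending characters; `show_line_endings` is a hypothetical helper name:

```c
/* Hypothetical helper: copy src into dst, writing CR and LF as the visible
   two-character sequences "\r" and "\n".
   dst must have room for at least 2 * strlen(src) + 1 bytes. */
void show_line_endings(const char *src, char *dst)
{
    for (; *src != '\0'; ++src) {
        if (*src == '\r') {
            *dst++ = '\\';
            *dst++ = 'r';
        } else if (*src == '\n') {
            *dst++ = '\\';
            *dst++ = 'n';
        } else {
            *dst++ = *src;
        }
    }
    *dst = '\0';
}
```

Passing the buffer filled by fgets through this function and printing the result makes it obvious whether the carriage return was kept or removed.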

Tolerable answered 19/3, 2021 at 8:57 Comment(0)
