Is the default mode of ofstream implementation defined?

Given the following code:

std::ofstream stream("somefile");

if (!stream)
{
   return 1;
}

When invoking .write(....) and using stdc++ and libc++ the stream is in binary mode (std::ios::binary).

However, when using MSVC (2015/2017 RC1), the stream seems to be in text mode or something similar, because the resulting file is larger than what is actually written.

But if I explicitly set the mode to std::ios::binary, MSVC behaves like the std::ofstream implementations of the other standard libraries mentioned earlier.


Example code:

#include <vector>
#include <cstdio>
#include <fstream>

std::size_t fsz(const char* filename) {
    std::ifstream in(filename, std::ifstream::ate | std::ifstream::binary);
    return static_cast<std::size_t>(in.tellg());
}

int main() {
   std::ofstream stream("filename");

   if (!stream)
      return 1;

   std::vector<unsigned long long int> v = {0x6F1DA2C6AC0E0EA6, 0x42928C47B18C31A2, 0x95E20A7699DC156A, 0x19F9C94F27FFDBD0};

   stream.write(reinterpret_cast<const char*>(v.data()),v.size() * sizeof(unsigned long long int));

   stream.close();

   // %zu is the matching format specifier for std::size_t values
   printf("expect: %zu\n", v.size() * sizeof(unsigned long long int));
   printf("file size: %zu\n", fsz("filename"));

   return 0;
}

Output for the above code when run with msvc:

expect: 32 
file size: 33

Output for the above code when run with libc++, stdc++:

expect: 32 
file size: 32

The difference can get much larger; it depends on how much data is written and on the contents of the data.

In the end my question is still the same: is it undefined or unspecified behavior?


Changing the above vector to the following makes it more obvious what's going on.

std::vector<unsigned long long int> v = {0x0A0A0A0A0A0A0A0A, 0x0A0A0A0A0A0A0A0A, 0x0A0A0A0A0A0A0A0A, 0x0A0A0A0A0A0A0A0A};
Jerkwater answered 23/2, 2017 at 10:15 Comment(2)
Please describe exactly what you see. Larger by how much? What does the file contain? Where is your write() call? Present a minimal reproducible example. - Acknowledge
@LightnessRacesinOrbit I've added an example. - Jerkwater

When I run your code on Windows using g++ and libstdc++, I get the following result:

expect: 32
file size: 33

So the problem is not compiler specific, but rather OS specific.

While C++ uses a single character \n to represent a line ending in a string, Windows uses two bytes 0x0D and 0x0A for a line ending in a file. This means that if you write a string into a file in text mode, all occurrences of the single character \n are written using those two bytes. That's why you get additional bytes in the file size of your examples.

Engrave answered 23/2, 2017 at 12:58 Comment(1)
'\n' in a character or a string literal ends a line in every C and C++ program. In a file under Windows the end of a line is represented by two bytes, with the values 0x0D and 0x0A. By coincidence, those values happen to be the same as the values that compilers use when they see '\r' and '\n' in a source file. There is no required connection between '\r' or '\n' and any particular value in compiled code or in a text file. - Sandell

The default mode used by the stream constructor is ios_base::out. As there is no explicit text mode flag, this implies the stream is opened in text mode. Text mode only has an effect on Windows systems, where it converts \n characters to CR/LF pairs. On POSIX systems it has no effect, and text and binary modes are synonymous on these systems.

Hysteric answered 23/2, 2017 at 11:11 Comment(2)
0x0A bytes are converted to 0x0D 0x0A, so the 0x0A byte in the 3rd element of the vector is 'doubled' under Windows. - Jerkwater
So @Torrie how did you come to the following conclusion? "When invoking .write(....) and using stdc++ and libc++ the stream is in binary mode (std::ios::binary)." That does not seem to be true. - Acknowledge
