I want to interpret stdin as a binary file. Why is freopen failing on Windows?

TL;DR: Why does freopen(NULL, "rb", stdin) always fail on Windows?

I'm trying to re-implement a base64 encoder in C that takes input from stdin and writes the encoded equivalent to stdout. I had a problem in my previous post where fread was signalling EOF prematurely. Here is what my main function looks like:

int main(void)
{
    unsigned char buffer[BUFFER_SIZE];
    unsigned char base64_buffer[BASE64_BUFFER];

    while (1)
    {
        TRACE_PUTS("Reading in data from stdin...");
        size_t read = fread(buffer, 1, sizeof(buffer), stdin); /* Read the data in using fread(3) */

        /* Process the buffer */

        TRACE_PRINTF("Amount read: %zu\n", read);
        TRACE_PUTS("Beginning base64 encode of buffer");
        size_t encoded = base64_encode(buffer, read, base64_buffer, sizeof(base64_buffer));

        /* Write the data to stdout */
        TRACE_PUTS("Writing data to standard output");
        ...

        if (read < sizeof(buffer))
        {
            break; /* We reached EOF or had an error during the read */
        }
    }

    if (ferror(stdin))
    {
        /* Handle errors */
        fprintf(stderr, "%s\n", "There was a problem reading from the file.");
        exit(1);
    }

    puts(""); /* Output a newline before finishing */

    return 0;
}

Essentially, it reads in data from stdin using fread, encodes to base64, writes it to stdout, then checks if EOF has been reached at the end of the loop.

When I piped the contents of a binary file to this app's stdin, it would only read a fraction of the total bytes in the file. For example:

$ cat /bin/echo | my_base64_program >/dev/null # only view the trace output
TRACE: C:/Users/James/Code/c/base64/main.c:23: Reading in data from stdin...
TRACE: C:/Users/James/Code/c/base64/main.c:28: Amount read: 600
TRACE: C:/Users/James/Code/c/base64/main.c:29: Beginning base64 encode of buffer
TRACE: C:/Users/James/Code/c/base64/main.c:43: Writing data to standard output
TRACE: C:/Users/James/Code/c/base64/main.c:23: Reading in data from stdin...
TRACE: C:/Users/James/Code/c/base64/main.c:28: Amount read: 600
TRACE: C:/Users/James/Code/c/base64/main.c:29: Beginning base64 encode of buffer
TRACE: C:/Users/James/Code/c/base64/main.c:43: Writing data to standard output
TRACE: C:/Users/James/Code/c/base64/main.c:23: Reading in data from stdin...
TRACE: C:/Users/James/Code/c/base64/main.c:28: Amount read: 600
TRACE: C:/Users/James/Code/c/base64/main.c:29: Beginning base64 encode of buffer
TRACE: C:/Users/James/Code/c/base64/main.c:43: Writing data to standard output
TRACE: C:/Users/James/Code/c/base64/main.c:23: Reading in data from stdin...
TRACE: C:/Users/James/Code/c/base64/main.c:28: Amount read: 569
TRACE: C:/Users/James/Code/c/base64/main.c:29: Beginning base64 encode of buffer
TRACE: C:/Users/James/Code/c/base64/main.c:43: Writing data to standard output

$ cat /bin/echo | wc -c
28352

As you can see, /bin/echo is 28352 bytes long, but only ~2400 of them are being processed. I believe the reason is that stdin is not being treated as a binary file, so certain control characters (like Control-Z, as mentioned in the linked post's answer) were prematurely signalling EOF.
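To check the Control-Z theory in isolation, here is a minimal sketch that writes a few bytes (including 0x1A) to a scratch file and counts how many of them fread delivers in a given mode. The helper names write_sample and count_bytes, and whatever file name you pass in, are made up for illustration; they are not part of the program above.

```c
#include <stdio.h>

/* Write a 6-byte sample: 'A', 0x1A (Control-Z), 'B', CR, LF, 'C'.
   Returns 0 on success, -1 on failure. */
static int write_sample(const char *path)
{
    static const unsigned char sample[] = { 'A', 0x1A, 'B', '\r', '\n', 'C' };
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;

    size_t written = fwrite(sample, 1, sizeof(sample), fp);
    if (fclose(fp) != 0 || written != sizeof(sample))
        return -1;
    return 0;
}

/* Count the bytes fread delivers when the file is opened with the given
   mode ("r" or "rb"). Returns the count, or -1 on an open or read error. */
static long count_bytes(const char *path, const char *mode)
{
    unsigned char buffer[64];
    long total = 0;
    size_t n;

    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return -1;

    /* Keep reading until fread returns 0 (EOF or error). */
    while ((n = fread(buffer, 1, sizeof(buffer), fp)) > 0)
        total += (long)n;

    if (ferror(fp))
        total = -1;
    fclose(fp);
    return total;
}
```

Calling count_bytes on the sample file with "rb" should report all 6 bytes on any platform. With "r", a Windows C runtime is expected to report fewer, because text mode treats Control-Z as end of input and translates CR LF pairs; on Linux or macOS both modes should give the same count.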

I took a look at the base64 source code and it looks like they're using xfreopen (which is just a wrapper for freopen) to tell fread to interpret stdin as binary. So I went ahead and did that before the while-loop:

if (!freopen(NULL, "rb", stdin))
{
    fprintf(stderr, "freopen failed. error: %s\n", strerror(errno));
    exit(1);
}

However, now my app always exits at that point with:

$ cat /bin/echo | my_base64_program
freopen failed. error: Invalid argument

So why is freopen at that point failing, when it works for base64? I'm using MinGW-w64 with GCC on Windows if that's relevant.

Retractor answered 6/9, 2016 at 0:42 Comment(3)
I remember I saw that here somewhere... somewhere... yes: #7053396. Looks tricky but possible. - Whitmire
@Whitmire Ah, so that's why it's failing on Windows: the second answer in the question you linked to states that NULL is not accepted for the filename on Windows, and that freopen returns NULL if that's the case. (I'm using MinGW with GCC.) - Retractor
https://mcmap.net/q/246973/-c-read-binary-stdin .. although I guess you saw that already since you edited that question (but apparently didn't read the answer comments that already explain that freopen(NULL, ...) does not work on Windows). - Kristankriste

Why freopen() might fail in general

The C standard says:

If filename is a null pointer, the freopen function attempts to change the mode of the stream to that specified by mode, as if the name of the file currently associated with the stream had been used. It is implementation-defined which changes of mode are permitted (if any), and under what circumstances.

Presumably, your implementation doesn't allow the changes you're trying to make. On Mac OS X, for example, the man page for freopen() adds:

The new mode must be compatible with the mode that the stream was originally opened with:

  • Streams originally opened with mode "r" can only be reopened with that same mode.
  • Streams originally opened with mode "a" can be reopened with the same mode, or mode "w".
  • Streams originally opened with mode "w" can be reopened with the same mode, or mode "a".
  • Streams originally opened with mode "r+", "w+", or "a+" can be reopened with any mode.

With that said, on Mac OS X (where b is a no-op anyway), you'd be OK.

Why freopen() fails on Windows specifically

However, you're on Windows. You need to learn how to find and read the documentation. I use a Google search with the term 'site:msdn.microsoft.com freopen' for whatever function I'm looking for. That specific search yields the manual for freopen() where it says:

If path, mode, or stream is a null pointer, or if filename is an empty string, these functions invoke the invalid parameter handler, as described in Parameter Validation. If execution is allowed to continue, these functions set errno to EINVAL and return NULL.

That's the documented behaviour: it is also the behaviour you are seeing. The manual for your system is helpful. It basically says "thou shalt not".

How to fix the input mode of standard input on Windows

I note that in my answer to your previous question, I pointed to _setmode():

However, it is more likely that you need _setmode():

_setmode(_fileno(stdin), O_BINARY);

This is the advice that is given in the answers to the question that deamentiaemundi pointed to.
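As a rough sketch of how that call might be wrapped so the same source also builds on non-Windows systems, something like the following could work. The helper name set_binary_mode is made up for illustration, and I'm assuming the _WIN32 macro is a suitable test for "this toolchain has _setmode()" (it is defined by both MSVC and MinGW-w64). I've used the underscore-prefixed _O_BINARY from <fcntl.h>; the plain O_BINARY spelling shown above is the older name for the same constant.

```c
#include <stdio.h>

#ifdef _WIN32
#include <fcntl.h>  /* _O_BINARY */
#include <io.h>     /* _setmode, _fileno */
#endif

/* Switch a stream's underlying file descriptor to binary mode.
   Returns 0 on success, -1 on failure.
   On non-Windows systems there is no text/binary distinction,
   so this does nothing and reports success. */
static int set_binary_mode(FILE *stream)
{
#ifdef _WIN32
    /* _setmode returns the previous translation mode, or -1 on failure. */
    return _setmode(_fileno(stream), _O_BINARY) == -1 ? -1 : 0;
#else
    (void)stream;  /* unused on this platform */
    return 0;
#endif
}
```

In your program you would call set_binary_mode(stdin) once at the top of main(), before the first fread, and exit with an error message if it returns -1. If stdout needs to carry raw bytes as well, the same call with stdout should apply.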

I note in passing that the Microsoft manual page for setmode() says:

This POSIX function is deprecated. Use the ISO C++ conformant _setmode instead.

This is a curious comment because POSIX does not standardize a function setmode() in the first place.

You can find Microsoft's documentation for fileno(). It too has the spiel about POSIX (but this time it is accurate; POSIX does specify fileno()) and refers you to _fileno().

Dost answered 6/9, 2016 at 1:25 Comment(6)
Microsoft doesn't specify who has deprecated the functions. Those functions are deprecated by Microsoft on just Microsoft platforms. - Filmy
I've produced diatribes about the way Microsoft treats POSIX functions as deprecated even if you explicitly request their presence using the (C and C++) standard approved techniques. I couldn't be bothered to find or repeat them this time. All else apart, SO is supposed to be 'family friendly' in its content, and my main thoughts on the subject are not. - Dost
Jonathan, the "POSIX function" line is almost certainly boilerplate which they just put on the documentation of every runtime library function which needed to be renamed to conform with ISO C++. Most of them were POSIX or at least POSIX-y; doubtless nobody thought it worthwhile tracking down the exceptions. May I ask whether that part of your answer is really relevant to the question? The OP just needs to know to use _setmode and _fileno; the relationship or lack thereof between those functions and POSIX seems unimportant. - Branch
@HarryJohnston: I deem it relevant. If you want to produce an answer with distinctively new information in it that doesn't contain that, be my guest. And note that all I said in the answer is that Microsoft specifies that it is deprecated (on their system, since that is what they are documenting), and that claiming setmode() is a POSIX function is inaccurate; it is not standardized by POSIX. Nothing contentious about that, I think. (Oh, the other answer talks about setmode() and not _setmode(), so it is relevant to explain that discrepancy.) - Dost
Not contentious per se, just liable to attract comment. That's your prerogative, of course; no criticism was intended. As for the documentation, on second thoughts it isn't so much a question of boilerplate as abridgment; what it means, IMO, is something like "This function, one of a collection of functions which was based upon parts of the original de-facto C standard that later became a part of POSIX, is deprecated." Which, while more accurate, is a bit of a mouthful. :-) Perhaps they should have said "POSIX-related function" rather than "POSIX function" though. - Branch
IoW, while it is true that the setmode() function isn't standardized in POSIX, the function acts on a file descriptor, which is. So it is only imprecise to call it a POSIX function, not outright wrong. (Although I have this nagging recollection that there's another example of the same boilerplate on a function that really has nothing at all to do with POSIX. I'm not sure, perhaps I'm imagining things.) - Branch
