Proper error handling for fclose impossible (according to manpage)?
So I've been studying the fclose manpage for quite a while, and my conclusion is that if fclose is interrupted by some signal, according to the manpage there is no way to recover...? Am I missing some point?

Usually, with unbuffered POSIX functions (open, close, write, etc.) there is ALWAYS a way to recover from signal interruption (EINTR) by restarting the call; in contrast, the documentation of the buffered calls states that another fclose attempt after a failed one has undefined behavior... with no hint about HOW to recover instead. Am I just "unlucky" if a signal interrupts fclose? Data might be lost, and I can't be sure whether the file descriptor is actually closed or not. I do know that the buffer is deallocated, but what about the file descriptor? Think about large-scale applications that use lots of fds simultaneously and would run into problems if fds are not properly freed -> I would assume there must be a CLEAN solution to this problem.

So let's assume I'm writing a library, I'm not allowed to use sigaction and SA_RESTART, and lots of signals are sent: how do I recover if fclose is interrupted? Would it be a good idea to call close in a loop (instead of fclose) after fclose failed with EINTR? The documentation of fclose simply doesn't mention the state of the file descriptor; UNDEFINED is not very helpful though... if the fd is already closed and I call close again, weird hard-to-debug side effects could occur, so naturally I would rather ignore this case than risk doing the wrong thing... then again, there is not an unlimited number of file descriptors available, and resource leakage is some sort of bug (at least to me).

Of course I could check one specific implementation of fclose, but I can't believe someone designed stdio without thinking about this problem. Is it just the documentation that is bad, or the design of this function?

This corner case really bugs me :(

Tinney answered 20/7, 2015 at 22:59 Comment(11)
I'm guessing you could get more (if not all) of the guarantees you're looking for by calling fflush first (and re-calling it if interrupted) before calling fclose.Damiandamiani
Where did you read about this in the manpages? I can't seem to find it.Melodeemelodeon
I have ubuntu 14.04 linux and the man page for fclose() has no mention of problems with being interrupted with a signal. Where did you see this info about a problem with a signal interrupting the fclose function?Dropwort
AFAIK, fclose(fileptr); does up to three things: ensures all data is written to the file in files open for write/update, (possibly) unlocks the file before closing the file descriptor if it is locked (e.g. C11's fopen(filename, "r+x"); for exclusive read/write access without truncating the file), and ensures that the file is closed. To that extent, fclose() could fail during fflush() (basically a loop that calls write() until all buffered data is written or possibly fsync()); one of flock(), lockf(), fcntl(), or an OS-specific lock (if a lock is used at all); or close().Uppermost
One easy fix is to call fsync() before calling fclose(), as fsync() ensures that all data has actually been written to the hard disk. First call fsync() on the file descriptor, then call fsync() on the directory containing the file. However, hardly anyone ever bothers with those two calls to fsync()Dropwort
(continued...) See their man pages if you want more information about how to handle things. Notice that EINTR appears for write(), fsync(), and close(), so it's impossible to determine which one failed. Perhaps you'd prefer avoiding fclose() on POSIX systems to work around the issue?Uppermost
@ChronoKitsune, fclose flushes the user-space buffers but not the OS buffers, the hard disk cache, or the directory that contains the file. However, as I stated above, two appropriate calls to fsync() will ensure that everything is updated 'now'. Otherwise, the OS will handle the actual updates when it 'gets around to it'Dropwort
@Dropwort True, but there is no verbiage preventing the OS from updating the file immediately.Uppermost
@user3629249: That might help if the OS crashes, but as I read it, OP is not concerned about that. And I would not recommend doing this, as it actively contradicts OS buffering.Wigan
You run into the same problem with close(). Essentially, unless the return code is 0 (success) or the error is EBADF (bad file descriptor — one which was not already open), there is no way to 'recover' from any other errors. The state of the underlying file descriptor is indeterminate. It might still be usable; it might simply be made available to another thread that happens to call open() at the wrong time — and you can't safely re-close the file descriptor because you don't know (in general).Gomorrah
The problematic part of the manpage can be found here f.e.: linux.die.net/man/3/fclose: The fclose() function may also fail and set errno for any of the errors specified for the routines close(2), write(2) or fflush(3).Tinney

EINTR and close()

In fact, there are also problems with close(), not only with fclose().

POSIX states that close() can fail with EINTR, which usually means that the application may retry the call. However, things are more complicated on Linux. See this article on LWN and also this post.

[...] the POSIX EINTR semantics are not really possible on Linux. The file descriptor passed to close() is de-allocated early in the processing of the system call and the same descriptor could already have been handed out to another thread by the time close() returns.

This blog post and this answer explain why it is not a good idea to retry a close() that failed with EINTR. So on Linux, there is nothing meaningful you can do if close() fails with EINTR (or EINPROGRESS).

Also note that close() is asynchronous on Linux. E.g., sometimes umount may return EBUSY immediately after you close the last open descriptor on a filesystem, because the descriptor has not yet been released in the kernel. See the interesting discussion here: page 1, page 2.
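The consequence for plain file descriptors can be captured in a small wrapper. This is a sketch under the Linux semantics described above (the descriptor is released early even when close() is interrupted); the name `safe_close` is mine, not a standard API:

```c
#include <errno.h>
#include <unistd.h>

/* Close fd exactly once. On Linux the descriptor is de-allocated
 * early in the close() system call, so after EINTR the fd is gone
 * and may already belong to another thread; retrying would risk
 * closing someone else's descriptor. Treat EINTR as "closed". */
static int safe_close(int fd)
{
    int rc = close(fd);
    if (rc == -1 && errno == EINTR)
        return 0;   /* descriptor already released; nothing to retry */
    return rc;      /* 0 on success, -1 on a real error (e.g. EBADF) */
}
```

Note this is the opposite of the usual EINTR retry loop used for read() and write(); on HP-UX, where close() leaves the descriptor open on EINTR, this wrapper would leak.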


EINTR and fclose()

POSIX states for fclose():

After the call to fclose(), any use of stream results in undefined behavior.

Whether or not the call succeeds, the stream shall be disassociated from the file and any buffer set by the setbuf() or setvbuf() function shall be disassociated from the stream. If the associated buffer was automatically allocated, it shall be deallocated.

I believe this means that even if close() fails, fclose() should free all resources and produce no leaks. That is true at least for the glibc and uClibc implementations.


Reliable error handling

  • Call fflush() before fclose().

    Since you can't determine whether fclose() failed in its internal fflush() or in close(), you have to call fflush() explicitly before fclose() to ensure that the userspace buffer was successfully handed to the kernel.

  • Don't retry after EINTR.

    If fclose() failed with EINTR, you cannot retry close(), and you cannot retry fclose() either.

  • Call fsync() if you need it.

    • If you care about data integrity, you should call fsync() or fdatasync() before calling fclose() [1].
    • If you don't, just ignore EINTR from fclose().
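Putting the first two rules together, a flush-then-close helper might look like the sketch below. The name `checked_fclose` is mine; the optional fsync() step from the third rule is marked with a comment rather than implemented:

```c
#include <errno.h>
#include <stdio.h>

/* Flush explicitly, then close, so that an EINTR from fclose()
 * can be ignored safely. Returns 0 on success, -1 on error.
 * If you need the data on stable storage, call fsync(fileno(fp))
 * after the flush loop and before fclose(). */
static int checked_fclose(FILE *fp)
{
    /* Retry fflush() on EINTR: the stream is still valid here,
     * unlike after a failed fclose(). */
    while (fflush(fp) == EOF) {
        if (errno != EINTR) {
            fclose(fp);   /* still release the FILE and its fd */
            return -1;
        }
    }
    int rc = fclose(fp);
    if (rc == EOF && errno == EINTR)
        return 0;   /* buffer was already flushed; fd is gone anyway */
    return rc == EOF ? -1 : 0;
}
```

Per the "don't retry after EINTR" rule, the final fclose() is never repeated: by that point the buffer has been flushed, so an interrupted close loses no data.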

Notes

  • If fflush() and fsync() succeeded and fclose() failed with EINTR, no data is lost and no leaks occur.

  • You should also ensure that the FILE object is not used between the fflush() and fclose() calls from another thread [2].


[1] See "Everything You Always Wanted to Know About Fsync()" article which explains why fsync() may also be an asynchronous operation.

[2] You can call flockfile() before calling fflush() and fclose(). It should work with fclose() correctly.
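For note [2], the locking can be sketched as below. This assumes, as the note says, that fclose() may be called on a stream the caller has locked with flockfile() (the stream lock is recursive for the owning thread in glibc); the name `locked_fclose` is mine:

```c
#include <stdio.h>

/* Hold the stream lock across the flush so no other thread can
 * touch the stream between fflush() and fclose(). The lock is
 * released together with the stream by fclose() itself, so there
 * is no matching funlockfile() call. */
static int locked_fclose(FILE *fp)
{
    flockfile(fp);
    fflush(fp);          /* add the EINTR retry loop here if needed */
    return fclose(fp);   /* destroys the stream, lock included */
}
```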

Edieedification answered 28/7, 2015 at 8:44 Comment(1)
Archived blog post web.archive.org/web/20190926012553/https://alobbs.com/post/…Regretful

Think about large-scale applications that use lots of fds simultaneously and would run into problems if fds are not properly freed -> I would assume there must be a CLEAN solution to this problem.

The possibility of retrying fflush() and then close() on the underlying file descriptor was already mentioned in the comments. For a large-scale application, I would favour a threaded design with one dedicated signal-handling thread, while all other threads have signals blocked using pthread_sigmask(). Then, if fclose() still fails, you have a real problem.

Chlamydeous answered 21/7, 2015 at 6:18 Comment(6)
Of course I personally would always prefer to simply block signals and read them synchronously, but there are cases (library) where I might not be in control of how signals are handled and still don't want my code to leak resources.Tinney
Is this a general concern? Because I'd argue a sane library should either leave sigmasks alone or explicitly document that.Chlamydeous
If I leave the sigmask alone, there is a risk that a user of my library runs into this problem, and depending on how the user of my library uses signals, it might start leaking resources. This is a corner case, and usually I don't dig into every manual in detail; I stumbled over it. It's a general concern: to me, either there is a severe bug in fclose(), or the documentation should state that fclose() will never return EINTR (as it is complete nonsense for it to do so), or it should at least mention the actions that have to be taken. EINTR is not a real error; it's just a coincidence that has to be dealt with.Tinney
"To me, either there is a severe bug in fclose() or the documentation should state that fclose() will never return EINTR (as it is complete nonsense to do so)" -- I don't think so, because this is part of the standard library and so shouldn't set constraints on its usage, users might decide timely delivered signals are more important. Restarting syscalls inside fclose() would be a bad decision, too, because users expect they CAN interrupt library calls with a signal. It MIGHT be wise to elaborate a bit in the documentation... that said, it's more the concept of signals I'd consider "broken"Chlamydeous
Well okay, so does that mean I can safely assume that fd is valid after EINTR from fclose(), so if I don't care about data loss or minimize the risk by flushing before (and repeating on EINTR) I can close the fd via close() (possibly repeating on EINTR)?Tinney
You can't on Linux. fclose() will free memory and call close() even if fflush() failed, and close() will release the file descriptor even if it fails with EINTR. So both the fd and the FILE become invalid after fclose().Edieedification
