Does fwrite buffer the output?
In C++, there is std::fwrite(), which writes a buffer to a file on disk.

Can you please tell me if there is any buffer inside that fwrite implementation?

i.e., if I call fwrite() multiple times (say, 10 times), does it actually invoke file I/O 10 different times?

I am asking this for Ubuntu 10.04 environment.

Matthaus answered 10/5, 2010 at 20:12 Comment(0)

Yes, it is buffered. The size of the buffer is defined by BUFSIZ. (Thanks @ladenedge.) Ten sufficiently large operations will result in ten system calls.

You are also allowed to set your own buffer using std::setbuf and std::setvbuf.

The functions in cstdio are inherited from C. The preferred interface in C++ is fstream and filebuf.

Rifleman answered 10/5, 2010 at 20:17 Comment(2)
I believe the default buffer size is BUFSIZ, as defined in stdio.h, though of course callers shouldn't need that information. setbuf may be used to control the size on some systems.Finnougrian
Lol, you posted that literally at the exact second I made an edit: "ladenedge 1 sec ago"Rifleman

It depends. On most platforms, file I/O is buffered, which is why fflush() exists: it writes buffered data out to disk. In that respect, the answer to your first question is (usually) "yes", and to your second "no". That said, it's not guaranteed to be that way by any stretch of the imagination for any particular platform; generally, the standards only specify the interface for such functions, not their implementation. Also, it's quite possible that calling fwrite() will cause an implicit flush if the buffer becomes full, in which case calling fwrite() may in fact trigger file I/O, especially when calling fwrite() with large amounts of data.

Intuition answered 10/5, 2010 at 20:19 Comment(8)
The library I/O functions are buffered, but the system I/O functions usually are too; not only that, the disk controller has a cache as well. Actually, the usual problem with disks is not making sure that your output gets cached, but making sure it's really written to disk when you need it. :)Usm
@Matteo, I know what you mean - for example, using fwrite to write an error log can be interesting if your program crashes before the caches are written to disk. Luckily there's fflush for that...Intuition
"standards only specify the interface for such functions, not their implementation" is not accurate. While the standard does not specify implementation, it does specify behavior. And in this case, the standard does specify that FILE* is buffered (C++ incorporates the C standard in 17.3.1.4/1 and the C standard define that FILE* is buffered in 7.19.5.6/2).Brewer
Worst case, assume 1:1 write operations with fwrite(). Profile your code to see if there is only 1 file write operation. There is generally a difference in execution speed between 1 file write operation and 10. I prefer to perform buffering in my code rather than rely on the OS or Run-Time Library.Counterproductive
Profile your code to see what the bottleneck is, rather than worrying about what you think it is.Desiccated
@R Samuel Klatchko: while I might concede that the sentence you highlight could be a little more accurate, I still stand by it - note the use of "generally" preceding the portion you quoted. One might also consider behaviour of an API part of the interface, in that a programmer may need to know about it to successfully use said API. That said, thanks for taking the time to find out what the standard actually does say - I've certainly learnt something I didn't know before.Intuition
@Intuition - okay, I could go with the definition that behavior is part of the interface and not the implementation. But your answer is still incorrect in that it says buffering is not guaranteed even though it is required behavior.Brewer
@R Samuel Klatchko: Sure, on any standards-compliant platform you'll be guaranteed buffering. That won't stop a non-compliant platform from not buffering output though. I don't know of any such platforms offhand, but I can easily imagine (say) a microcontroller-based platform using immediate file IO rather than buffered IO due to memory constraints. You're absolutely right - that would not be standard behaviour, but that doesn't mean it can't happen.Intuition

FILE* I/O can be buffered, but it does not have to be (it can differ for each stream). Furthermore, there are multiple ways to do buffering: fully buffered, where the buffer is not flushed until it is full or an explicit call to fflush is made, or line buffered, where the buffer is flushed whenever it sees an end-of-line. It is also possible to turn off buffering altogether.

You can change the type of buffering with the setvbuf call.

Brewer answered 10/5, 2010 at 20:48 Comment(0)

It depends on the library implementation. You might take a look at something like dtruss or strace to see what system calls are actually invoked by the implementation.

Why do you care about this?

Desiccated answered 10/5, 2010 at 20:20 Comment(2)
My program is doing a lot of fwrite(), and it is running on Ubuntu 10.04. I would like to know if I need to buffer up the I/O myself before calling fwrite() for performance reasons.Matthaus
@michael: Yes, buffer up the I/O. Reducing the number of calls to fwrite will at the minimum reduce the number of branches in your code. Branches tend to slow down performance as many processors reload their instruction cache after a branch. Also, less code to execute makes a program faster.Counterproductive

I don't believe the standard says anything about this, but in general: no, the operating system will decide how many I/O operations to do. It could be 1, it could be 10. It could even be more if the data size is large. Most operating systems allow you to control how I/O should behave, but AFAIK there's no portable way to do it (and often you don't want to anyway).

Aurora answered 10/5, 2010 at 20:20 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.