StringIO
has the following notes in its code:
Notes:
- Using a real file is often faster (but less convenient).
- There's also a much faster implementation in C, called cStringIO, but
it's not subclassable.
The "real file is often faster" line seemed really odd to me: how could writing to disk beat writing to memory? I tried profiling these different cases and got results that contradict these docs, as well as the answer to this question. This other question does explain why cStringIO is slower under some circumstances, though I'm not doing any concatenating here. The test writes a given amount of data to a file, then seeks to the beginning and reads it back out. On the "new" tests, I created a new object each time, and on the "same" ones I truncate and reuse the same object for each repetition to rule out that source of overhead. That overhead mattered for using tempfiles with small data sizes but not large ones.
Code is here.
Using 1000 passes with size 1.0KiB
New StringIO: 0.0026 0.0025 0.0034
Same StringIO: 0.0026 0.0023 0.0030
New cStringIO: 0.0009 0.0010 0.0008
Same cStringIO: 0.0009 0.0009 0.0009
New tempfile: 0.0679 0.0554 0.0542
Same tempfile: 0.0069 0.0064 0.0070
==============================================================
Using 1000 passes with size 100.0KiB
New StringIO: 0.0093 0.0099 0.0108
Same StringIO: 0.0109 0.0090 0.0086
New cStringIO: 0.0130 0.0139 0.0120
Same cStringIO: 0.0118 0.0115 0.0124
New tempfile: 0.1006 0.0905 0.0899
Same tempfile: 0.0573 0.0526 0.0523
==============================================================
Using 1000 passes with size 1.0MiB
New StringIO: 0.0727 0.0700 0.0717
Same StringIO: 0.0740 0.0735 0.0712
New cStringIO: 0.1484 0.1399 0.1470
Same cStringIO: 0.1493 0.1393 0.1465
New tempfile: 0.6576 0.6750 0.6821
Same tempfile: 0.5951 0.5870 0.5678
==============================================================
Using 1000 passes with size 10.0MiB
New StringIO: 1.0965 1.1129 1.1079
Same StringIO: 1.1206 1.2979 1.1932
New cStringIO: 2.2532 2.2162 2.2482
Same cStringIO: 2.2624 2.2225 2.2377
New tempfile: 6.8350 6.7924 6.8481
Same tempfile: 6.8424 7.8114 7.8404
==============================================================
The two StringIO
implementations were pretty comparable, though cStringIO
slowed down significantly for large data sizes. But the tempfile.TemporaryFile
always took 3 times as long as the slowest StringIO
.
file
-- nottempfile.TemporaryFile
. – ArleenTemporaryFile
to be much different thanfile
for reading and writing. And looking attempfile
,TemporaryFile
is actually a function that returnsfile
s, at least on POSIX (which is what I used). Any overhead from creation differences should be taken care of by the "Same tempfile" cases. – Bagasse