StringIO is the file-like string buffer object we use when reading a pandas DataFrame from text, e.g. in "How to create a Pandas DataFrame from a string?"
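For context, here is a minimal sketch of that use case, assuming Python 3 and an installed pandas. The sample data and the variable names `csv_text` and `df` are made up for illustration.

```python
from io import StringIO  # Python 3 standard-library location

import pandas as pd

# Hypothetical sample data, made up for illustration.
csv_text = """name,score
alice,90
bob,85
"""

# StringIO wraps the string in a file-like object that read_csv can consume.
df = pd.read_csv(StringIO(csv_text))
print(df)
```

The same call works whichever import supplies `StringIO`, since `read_csv` only needs a file-like object.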
Which of these two imports should we use for StringIO (within pandas)? This is a long-running question that has gone unresolved for over four years.
StringIO.StringIO (Python 2) / io.StringIO (Python 3)
- Advantages: more stable for futureproofing code, but forces us to version-fork, e.g. see code at bottom from EmilH.
pandas.compat.StringIO
- pandas.compat is a 2/3 compatibility package ("without the need for 2to3") introduced back in 0.13.0 (Jan 2014)
- The pandas.compat package is still marked 'private' as of 0.22, with no plans to make it 'public'. The docs say: "Warning The pandas.core, pandas.compat, and pandas.util top-level modules are considered to be PRIVATE. Stability of functionality in those modules is not guaranteed." In practice, though, they essentially haven't broken since 0.13.
- pandas.compat source defines the imports builtins, StringIO/cStringIO, BytesIO, cPickle, httplib, iterator versions of range, filter, map and zip, plus other necessary elements for Python 3 compatibility - see the 0.13.0 whatsnew
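Since pandas.compat is documented as private, one defensive option is to try it first and fall back to the standard library. This is only a sketch, not a recommendation from the pandas developers. It assumes Python 3 for the fallback branch, and the names `buffer` and `first_line` are illustrative.

```python
# Prefer the pandas alias when it is available; otherwise use the stdlib.
try:
    from pandas.compat import StringIO  # private module, may not expose this name
except ImportError:
    # Raised when pandas is not installed or does not provide StringIO here.
    from io import StringIO

# Either import yields a file-like text buffer with the same basic interface.
buffer = StringIO("a,b\n1,2\n")
first_line = buffer.readline()
print(first_line)
```

The trade-off is that the code quietly depends on whichever import succeeds, so it suits scripts better than libraries that need a single, predictable source for `StringIO`.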
Version 2/3 forking code for imports from the standard library (from EmilH):
import sys
if sys.version_info[0] < 3:
    from StringIO import StringIO  # Python 2
else:
    from io import StringIO  # Python 3
# Note: this is very much a poor man's version of pandas.compat, which contains much more
Note: pandas.compat has existed since pandas 0.13.0 (Jan 2014) as a subpackage within pandas. It also seems to have been released as a standalone package: 0.1.0 (Jun 10, 2017) and 0.1.1 (Jun 10, 2017)
[...] pandas.compat. We don't even need everything inside pandas.compat to be stable, only the identifiers I named, but in any case it has been stable since late 2015, so their warning is overly severe – Jameson

[...] StringIO). But, no matter the pandas developers' opinion on "Which of these two imports should we use for StringIO (within pandas)?", the answer is still based on opinion. If the question was "Is there an officially recommended way to use StringIO (within pandas)?", that would not be opinion-based, but reading the docs, the recommendation would still currently be to not use pandas.compat (despite that being an arguably cleaner way of getting hold of StringIO). – Iphlgenia

[...] pandas.compat, per the pandas devel crew. (Not the doc's old 'official recommendation' and the doc on that, which are clearly 3+ years out-of-date.) Please look at the github links for code and issues, which I cited you. Where the current code disagrees with the 4-year-old doc, ignore the doc. This wouldn't be the first time that a package's doc lagged the reality of its code, or github, by years. – Jameson