Should we use pandas.compat.StringIO or Python 2/3 StringIO?

StringIO is the file-like string buffer object we use when reading a pandas DataFrame from text, e.g. "How to create a Pandas DataFrame from a string?"

Which of these two imports should we use for StringIO (within pandas)? This is a long-running question that has gone unresolved for over four years.

  1. StringIO.StringIO (Python 2) / io.StringIO (Python 3)
    • Advantage: more stable, so better for futureproofing code. Disadvantage: it forces us to version-fork; see the code at the bottom from EmilH.
  2. pandas.compat.StringIO

Python 2/3 version-forking code for importing from the standard library (from EmilH):

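import sys

# Pick the StringIO implementation that matches the running interpreter.
if sys.version_info[0] < 3:
    from StringIO import StringIO  # Python 2: standalone StringIO module
else:
    from io import StringIO  # Python 3: StringIO lives in the io module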

# Note: this is very much a poor man's version of pandas.compat, which contains much more.
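
For what it's worth, a common alternative to checking sys.version_info is to attempt the Python 2 import and fall back on ImportError. This is only a minimal sketch and is not taken from the thread; the `sample` and `first_line` names and the sample text are made up for illustration.

```python
# Try the Python 2 module first; on Python 3 it does not exist,
# so the ImportError branch picks up io.StringIO instead.
try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3

# Hypothetical sample text, only to show the buffer behaves like a file.
sample = StringIO(u"a,b\n1,2\n")
first_line = sample.readline()
print(first_line.strip())
```

Either way the rest of the code just uses the name StringIO and does not need to know which branch supplied it.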


Jameson answered 11/5, 2018 at 0:27 Comment(6)
cc: @Jeff jreback ...Jameson
The lack of a standard causes both confusion and breakage.Jameson
I think this is primarily opinion based. All the approaches work so use the one that you feel most comfortable with. When I answered the referenced question I used the snippet to show what to use in both Python 3 and Python 2. Today, 4 years later I'm only using Python 3 so it's a non-issue for me. Stackoverflow is probably not the place to push for a standard on this. If it's important raise an issue on the pandas issue tracker perhaps?Iphlgenia
@EmilH: it's not opinion-based, it depends on whether the pandas developers plan to change their guidance on pandas.compat. We don't even need everything inside pandas.compat to be stable, only the identifiers I named, but in any case it has been stable since late 2015, so their warning is overly severeJameson
@Jameson I agree that today their warning is overly severe (at least for StringIO). But, no matter the pandas developers' opinion on: Which of these two imports should we use for StringIO (within pandas)? the answer is still based on opinion. If the question was: Is there an officially recommended way to use StringIO (within pandas)? That would not be opinion based, but reading the docs the recommendation would still currently be to not use the pandas.compat (despite that being an arguably cleaner way of getting hold of StringIO).Iphlgenia
@EmilH: it's not based on opinion, it's based on facts. Specifically, the current and future status of pandas.compat, per the pandas devel crew. (Not the doc's old 'official recommendation' and the doc on that which are clearly 3+ years out-of-date). Please look at the github links for code and issues, which I cited you. Where the current code disagrees with 4-year-old doc, ignore the doc. This wouldn't be the first time that a package's doc lagged the reality of its code, or github, by years.Jameson

I know this is an old question, but I followed breadcrumbs here, so it's perhaps still worth answering. It's not totally definitive, but the current pandas documentation suggests using the built-in StringIO rather than its own internal methods.

For examples that use the StringIO class, make sure you import it with from io import StringIO for Python 3.
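
As a rough illustration of that import in use (Python 3 only), here is a minimal sketch that reads a DataFrame from a string. The CSV text, column names and the `csv_text` / `df` variable names are made up for the example.

```python
from io import StringIO

import pandas as pd

# Illustrative CSV text; the column names and values are invented.
csv_text = "name,score\nalice,1\nbob,2\n"

# StringIO wraps the string in a file-like object that read_csv can consume.
df = pd.read_csv(StringIO(csv_text))
print(df.shape)
```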

Regulation answered 11/8, 2021 at 11:2 Comment(1)
Yes that's the answer these days. (I had meant to self-answer and close this years ago)Jameson

FYI, as of pandas 0.25, StringIO was removed from pandas.compat (PR #25954), so you'll now see:

from pandas.compat import StringIO

ImportError: cannot import name 'StringIO' from 'pandas.compat'

This means the only answer is to import from the io module.

Hautbois answered 21/12, 2021 at 23:22 Comment(1)
Thanks for the update MikeTJameson
