Say you read a row from a CSV:
from StringIO import StringIO
import csv
infile = StringIO('hello,"foo, bar"')
reader = csv.reader(infile)
row = reader.next() # row is ['hello', 'foo, bar']
The second value in the row is foo, bar (without quotes) instead of "foo, bar" (with quotes). This isn't some Python oddity; it's a reasonable interpretation of CSV syntax. The quotes probably weren't placed there to be part of a value, but rather to show that foo, bar is one value and shouldn't be split into foo and bar at the comma. An alternative solution would be to escape the comma when creating the CSV file, so the line would look like:
hello,foo\, bar
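As a rough sketch of how such an escaped line could be read and written, the csv module accepts an escapechar parameter. This example is written for Python 3 (unlike the Python 2 snippets in the rest of this answer) and assumes a backslash was used as the escape character when the file was created:

```python
import csv
import io

# Assumption: the file was written with a backslash as the escape character
escaped_line = 'hello,foo\\, bar'

# Tell the reader about the escape character so the comma is kept in the value
reader = csv.reader(io.StringIO(escaped_line), escapechar='\\')
row = next(reader)
print(row)

# Writing in the same style: no quoting at all, the delimiter is escaped instead
out = io.StringIO()
writer = csv.writer(out, quoting=csv.QUOTE_NONE, escapechar='\\', lineterminator='')
writer.writerow(row)
print(out.getvalue())
```

Either way, the parsed value is the same string with no quotes in it; only the on-disk representation differs.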
So it's quite a strange request to want to keep those quotes. If we knew more about your use case and the bigger picture, we could help you better. What are you trying to achieve? Where does the input file come from? Is it really a CSV, or is it some other syntax that looks similar? For example, if you know that every line consists of two values separated by a comma, and the first value never contains a comma, then you can just split on the first comma:
print 'hello,"foo, bar"'.split(',', 1) # => ['hello', '"foo, bar"']
But I doubt the input has such restrictions, which is why things like quotes are needed to resolve ambiguities.
If you're trying to write to a CSV again, then the quotes will be recreated as you're doing so. They don't have to be there in the intermediate list:
outfile = StringIO()
writer = csv.writer(outfile)
writer.writerow(row)
print outfile.getvalue()
This will print
hello,"foo, bar"
You can customise the exact CSV output by setting a new dialect.
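For instance, here is a minimal sketch (Python 3) of a dialect that quotes every field rather than only the ones that need it. The dialect name quote_everything is just a label made up for this example, and the line terminator is set to a plain newline only to keep the output easy to read:

```python
import csv
import io

# 'quote_everything' is a hypothetical dialect name chosen for this example
csv.register_dialect('quote_everything', quoting=csv.QUOTE_ALL, lineterminator='\n')

row = ['hello', 'foo, bar']
out = io.StringIO()
csv.writer(out, 'quote_everything').writerow(row)
print(out.getvalue())
```

With QUOTE_ALL, hello is quoted too, so whether that is what you want depends on what will consume the file.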
If you want to grab the individual values in the row with the appropriate quoting rules applied to them, it's possible, but it's a bit of a hack:
# We're going to write individual strings, so we don't want a line terminator
csv.register_dialect('no_line_terminator', lineterminator='')
def maybe_quote_string(s):
    out = StringIO()
    # writerow iterates over its argument, so don't give it a plain string
    # or it'll break it up into characters
    csv.writer(out, 'no_line_terminator').writerow([s])
    return out.getvalue()
print maybe_quote_string('foo, bar')
print map(maybe_quote_string, row)
The output is:
"foo, bar"
['hello', '"foo, bar"']
This is the closest I can come to answering your question. It's not really keeping the double quotes; rather, it's removing them and adding them back, likely with the same rules that put them there in the first place.
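To see why "likely" matters, here is a small Python 3 sketch in which two differently quoted inputs come out identical after a round trip. The helper round_trip is a name invented for this illustration, and it uses the default writer settings, which quote only the fields that need it:

```python
import csv
import io

def round_trip(line):
    """Parse one CSV line, then write it back with the default quoting rules."""
    row = next(csv.reader(io.StringIO(line)))
    out = io.StringIO()
    # Empty line terminator so we get back just the line itself
    csv.writer(out, lineterminator='').writerow(row)
    return out.getvalue()

minimal = round_trip('hello,"foo, bar"')
fully_quoted = round_trip('"hello","foo, bar"')
print(minimal)
print(fully_quoted)
```

The quotes around hello in the second input are lost, because the reader never records which fields were quoted in the original file.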
I'll say it again: you're probably headed down the wrong path with this question. Others will probably agree, which is why you're struggling to get good answers. What is the bigger problem that you're trying to solve? If we know that, we can help you better.