Correctly parsing a CSV file from an FTP server with App Engine

I'm trying to read a CSV file from an FTP server and parse it on App Engine. I can access the file and read it into a StringIO, but when I try to loop over the file's lines, it loops over every character instead. I'm not sure what I'm doing wrong here:

ftp = FTP('ftp.mydomain.com', 'username', 'pwd')
ftp.set_pasv(True)
r = StringIO()
ftp.retrbinary('RETR test.csv', r.write)

csvfile = csv.reader(r.getvalue(), delimiter=',')

for line in csvfile: 
    print line

This produces output like this:

['O']
['R']
['D']
['E']
['R']
['N']
['O']
['', '']
['O']
['R']
['D']
['E']
['R']
['D']
['A']
['T']
['E']
['', '']
['I']
['N']
['V']
['O']
['I']
['C']
['E']
['N']
['O']
['', '']
...

What is the correct way to fetch the file from FTP so that the csv module can parse it line by line?

Reserved answered 18/6, 2014 at 17:47

Split the long string on newlines. csv.reader() expects an iterable that yields one line per iteration. You are giving it a single string, and iterating over a string yields individual characters:

csvfile = csv.reader(r.getvalue().splitlines(), delimiter=',')
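
As a minimal illustration of the difference, the following sketch (written for Python 3, with made-up sample data standing in for the downloaded file) feeds the same text to csv.reader() once as a plain string and once as a list of lines:

```python
import csv

# Hypothetical sample standing in for the contents of test.csv.
data = "ORDERNO,ORDERDATE\r\n1001,2014-06-18\r\n"

# Iterating over a str yields single characters, so every character
# is treated as its own "line" and becomes its own row.
by_char = list(csv.reader(data, delimiter=','))

# splitlines() gives a list of lines, so each line becomes one row.
by_line = list(csv.reader(data.splitlines(), delimiter=','))

print(by_char[:3])
print(by_line)
```

The first result has one row per character, which matches the output in the question; the second has one row per CSV record.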

You don't show how StringIO() was imported. If it is the Python version (from StringIO import StringIO), you can simply seek back to the start and pass the object in directly:

r.seek(0)
csvfile = csv.reader(r, delimiter=',')
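
Why the seek matters can be sketched as follows. This is a Python 3 illustration using io.StringIO rather than the Python 2 StringIO module, and the loop writing str chunks is only a stand-in for the FTP callback; the sample data is made up:

```python
import csv
import io

buf = io.StringIO()
# Stand-in for the FTP callback filling the buffer chunk by chunk.
for chunk in ("ORDERNO,ORDERDATE\n", "1001,2014-06-18\n"):
    buf.write(chunk)

# After writing, the position is at the end of the buffer,
# so iterating from here finds no lines.
rows_before_seek = list(csv.reader(buf, delimiter=','))

buf.seek(0)  # rewind so iteration starts at the first line
rows_after_seek = list(csv.reader(buf, delimiter=','))

print(rows_before_seek)
print(rows_after_seek)
```

Passing the file-like object itself avoids building an intermediate list of lines, since csv.reader() pulls lines from it one at a time.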
Manganate answered 18/6, 2014 at 17:53

For Python 3.x and csv.DictReader:

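import csv
import io

# ftp is a connected ftplib.FTP instance; filename is the remote file name
bio = io.BytesIO()
resp = ftp.retrbinary("RETR " + filename, bio.write)
bio.seek(0)
csv_data = csv.DictReader(io.TextIOWrapper(bio, newline=None), delimiter=',')
for row in csv_data:
    ...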

It took me a while to find this solution, so I'm posting it. The answers I found did not address how to preserve the data in a form that DictReader accepts.
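
The bytes-to-DictReader step can be tried without an FTP server. The sketch below is an assumption-laden illustration: the helper name dict_rows_from_bytes, the UTF-8 encoding, and the sample bytes are all hypothetical, and it passes newline="" (the setting the csv documentation recommends for file objects) instead of newline=None:

```python
import csv
import io

def dict_rows_from_bytes(raw, encoding="utf-8"):
    """Parse CSV bytes into a list of dicts keyed by the header row.

    Hypothetical helper; the encoding default is an assumption.
    """
    bio = io.BytesIO(raw)
    # Decode the bytes as text; newline="" leaves line endings untouched
    # so the csv module can interpret them itself.
    text = io.TextIOWrapper(bio, encoding=encoding, newline="")
    return list(csv.DictReader(text, delimiter=","))

# Made-up bytes standing in for what retrbinary would have written.
sample = b"ORDERNO,ORDERDATE\r\n1001,2014-06-18\r\n"
rows = dict_rows_from_bytes(sample)
print(rows)
```

In real use, the raw argument would be the value of the BytesIO buffer filled by retrbinary.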

If you don't care about DictReader the following might work out:

sio = io.StringIO()
# retrlines strips the line ending from each line, so add a newline back
resp = ftp.retrlines("RETR " + filename, lambda line: sio.write(line + "\n"))
sio.seek(0)

Note that you need retrlines here because the Python 3 io.StringIO accepts only str, not bytes.

Concertino answered 21/7, 2017 at 12:23
