Convert large CSV file to excel using Python 3
Asked Answered
F

2

2

this is my code covert CSV file to .xlsx file, for small size CSV file this code is working fine, but when I tried for larger size CSV files, Its shows an error.

import os
import glob
import csv
from xlsxwriter.workbook import Workbook

for csvfile in glob.glob(os.path.join('.', 'file.csv')):
    workbook = Workbook(csvfile[:-4] + '.xlsx')
    worksheet = workbook.add_worksheet()
    with open(csvfile, 'r', encoding='utf8') as f:
        reader = csv.reader(f)
        for r, row in enumerate(reader):
            for c, col in enumerate(row):
                worksheet.write(r, c, col)
    workbook.close()

the error is

File "CsvToExcel.py", line 12, in <module>
for r, row in enumerate(reader):
_csv.Error: field larger than field limit (131072)
Exception ignored in: <bound method Workbook.__del__ of 
<xlsxwriter.workbook.Workbook object at 0x7fff4e731470>>
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/xlsxwriter/workbook.py", line 
153, in __del__
Exception: Exception caught in workbook destructor. Explicit close() may be 
required for workbook.
Faille answered 16/11, 2017 at 13:16 Comment(4)
Did you try to directly import the csv file into Excel?Chiba
yes I am trying to convert my local csv file into excel file directlyFaille
And is it correctly imported when you import it directly form the Excel application?Chiba
I find Out New Code with panda package, my new code is working fine now import pandas data = pandas.read_csv('Documents_2/AdvMedcsv.csv') data = data.groupby(lambda x: data['research_id'][x]).first() writer = pandas.ExcelWriter('Documents_2/AdvMed.xlsx',engine='xlsxwriter') data.to_excel(writer) writer.save()Faille
F
1

I found Out New Code with panda package, this code is working fine now

import pandas
data = pandas.read_csv('Documents_2/AdvMedcsv.csv') 
data = data.groupby(lambda x: data['research_id'][x]).first() 
writer = pandas.ExcelWriter('Documents_2/AdvMed.xlsx',engine='xlsxwriter')data.to_excel(writer) 
writer.save()
Faille answered 6/7, 2020 at 6:30 Comment(1)
Not sure why I am receiving a syntax error with this code (python 3.7). Any suggestions?Triennium
S
0

While using large files, it's better to use 'constant_memory' for controlled memory usage like:

workbook = Workbook(csvfile + '.xlsx', {'constant_memory': True}).

Ref: xlsxwriter.readthedocs.org/en/latest/working_with_memory.htm‌​l

Scornful answered 16/11, 2017 at 13:25 Comment(5)
_csv.Error: field larger than field limit (131072)Faille
Excel is limited to somewhat over 1 million rows ( 2^20 to be precise), and apparently you're trying to load more than that. I think that the technique you refer to as splitting is the built-in thing Excel has, but I'm afraid that only works for width problems, not for length problems. The really easiest way I see right away is to use some file splitting tool - there's tons of 'em and use that to load the resulting partial csv files into multiple worksheets.Scornful
is there any other way to convert the large CSV file to excel throw python code only.?Faille
why don't you convert it into chunks of data as you know the limitation, i am sure it won't failRackrent
I find Out New Code with panda package, my new code is working fine now import pandas data = pandas.read_csv('Documents_2/AdvMedcsv.csv') data = data.groupby(lambda x: data['research_id'][x]).first() writer = pandas.ExcelWriter('Documents_2/AdvMed.xlsx',engine='xlsxwriter') data.to_excel(writer) writer.save()Faille

© 2022 - 2024 — McMap. All rights reserved.