I am currently using PyExcelerator to read Excel files, but it is extremely slow. I regularly need to open Excel files larger than 100 MB, and loading a single file takes more than twenty minutes.
The functionality I need is:
- Open Excel files, select specific tables, and load them into a dict or list object.
- Sometimes: select specific columns and load only the rows in which those columns have specific values.
- Read password-protected Excel files.
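To make the second requirement concrete, here is a minimal sketch of the filtering I have in mind. It works on plain Python lists rather than on any particular Excel library, and the function name `select_rows`, the `criteria` mapping, and the sample data are all made up for illustration.

```python
def select_rows(rows, criteria):
    """Return the rows whose cells match every (column index -> value) pair.

    `rows` is any iterable of row lists; `criteria` is a hypothetical
    mapping such as {1: "DE"}, meaning "column 1 must equal 'DE'".
    """
    selected = []
    for row in rows:
        # Keep the whole row only if every requested column holds the wanted value.
        if all(col < len(row) and row[col] == wanted
               for col, wanted in criteria.items()):
            selected.append(row)
    return selected


# Made-up sample data standing in for one loaded table.
sample_rows = [
    ["id", "country", "amount"],
    [1, "DE", 10.5],
    [2, "FR", 7.0],
    [3, "DE", 3.2],
]

german_rows = select_rows(sample_rows, {1: "DE"})
```

Ideally the library would let me apply this kind of test while reading, so that rows that do not match are never kept in memory.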
The code I am using now is:
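import pyExcelerator
from collections import defaultdict
from types import StringType, UnicodeType

# parse_xls returns a list of (sheet_name, cells) pairs; take the first sheet's cells
book = pyExcelerator.parse_xls(filepath)
# cells are keyed by (row, column); missing cells default to ''
parsed_dictionary = defaultdict(lambda: '', book[0][1])
number_of_columns = 44
result_list = []
number_of_rows = 500000
for i in range(0, number_of_rows):
    ok = False
    result_list.append([])
    for h in range(0, number_of_columns):
        item = parsed_dictionary[i,h]
        # strip tabs and surrounding whitespace from text cells
        if type(item) is StringType or type(item) is UnicodeType:
            item = item.replace("\t","").strip()
        result_list[i].append(item)
        if item != '':
            ok = True
    # stop at the first completely empty row
    if not ok:
        break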
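To show what the loop above is meant to produce without needing a 100 MB workbook, here is a self-contained sketch of the same logic run on a small hand-made dictionary keyed by (row, column). The helper name `rows_from_cells` and the sample cells are made up, and it is written for Python 3, so `str` stands in for `StringType`/`UnicodeType`.

```python
from collections import defaultdict


def rows_from_cells(cells, number_of_columns, max_rows):
    """Build a list of rows from a sparse {(row, col): value} dict.

    Mirrors the loop above: missing cells become '', text cells are
    cleaned, and reading stops after the first completely empty row.
    """
    lookup = defaultdict(str, cells)  # missing cells default to ''
    result = []
    for i in range(max_rows):
        row = []
        has_value = False
        for h in range(number_of_columns):
            item = lookup[i, h]
            if isinstance(item, str):
                # strip tabs and surrounding whitespace from text cells
                item = item.replace("\t", "").strip()
            row.append(item)
            if item != '':
                has_value = True
        result.append(row)
        if not has_value:
            break  # first completely empty row ends the table
    return result


# Made-up cells: two filled rows, everything else missing.
sample_cells = {(0, 0): " name\t", (0, 1): "qty", (1, 0): "apple", (1, 1): 3.0}
table = rows_from_cells(sample_cells, 2, 10)
```

As in my real code, the empty row that ends the table is still appended to the result before the loop stops.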
Any suggestions?