Python + MySQLdb executemany
Asked Answered
A

2

7

I'm using Python and its MySQLdb module to import some measurement data into a Mysql database. The amount of data that we have is quite high (currently about ~250 MB of csv files and plenty of more to come).

Currently I use cursor.execute(...) to import some metadata. This isn't problematic as there are only a few entries for these.

The problem is that when I try to use cursor.executemany() to import larger quantities of the actual measurement data, MySQLdb raises a

TypeError: not all arguments converted during string formatting

My current code is

def __insert_values(self, values):
    cursor = self.connection.cursor()
    cursor.executemany("""
        insert into values (ensg, value, sampleid)
        values (%s, %s, %s)""", values)
    cursor.close()

where values is a list of tuples containing three strings each. Any ideas what could be wrong with this?

Edit:

The values are generated by

yield (prefix + row['id'], row['value'], sample_id)

and then read into a list one thousand at a time where row is and iterator coming from csv.DictReader.

Astrodynamics answered 10/6, 2009 at 10:9 Comment(2)
Have you verified that values contains the right data? As an aside, you should look at using LOAD LOCAL DATA INFILE. It can be a lot faster.Tambour
The problem is that the data needs to be preprocessed (which is currently done in the same python script) and it isn't very viable to create redundant copies of such large datasets. The values should be ok as I checked that with a debugger.Astrodynamics
A
8

In retrospective this was a really stupid but hard to spot mistake. Values is a keyword in sql so the table name values needs quotes around it.

def __insert_values(self, values):
    cursor = self.connection.cursor()
    cursor.executemany("""
        insert into `values` (ensg, value, sampleid)
        values (%s, %s, %s)""", values)
    cursor.close()
Astrodynamics answered 16/6, 2009 at 14:54 Comment(0)
S
3

The message you get indicates that inside the executemany() method, one of the conversions failed. Check your values list for a tuple longer than 3.

For a quick verification:

max(map(len, values))

If the result is higher than 3, locate your bad tuple with a filter:

[t for t in values if len(t) != 3]

or, if you need the index:

[(i,t) for i,t in enumerate(values) if len(t) != 3]
Shingles answered 10/6, 2009 at 11:2 Comment(1)
Thanks but that isn't the problem. All of the tuples have the correct length.Astrodynamics

© 2022 - 2024 — McMap. All rights reserved.