pysqlite2: ProgrammingError - You must not use 8-bit bytestrings

5

13

I'm currently persisting filenames in a SQLite database for my own purposes. Whenever I try to insert a filename that contains a non-ASCII character (like é), it throws the following error:

pysqlite2.dbapi2.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

When I do "switch my application over to Unicode strings" by wrapping the value sent to pysqlite in the unicode function, as in unicode(filename), it throws this error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 66: ordinal not in range(128)

Is there something I can do to get rid of this? Modifying all of my files to conform isn't an option.

UPDATE: If I decode the text via filename.decode("utf-8"), I still get the ProgrammingError above.

My actual code looks like this:

cursor.execute("select * from musiclibrary where absolutepath = ?;",
    [filename.decode("utf-8")])

What should my code here look like?

Wast answered 14/5, 2010 at 22:44 Comment(2)
Looks like this code, after you updated the question, wasn't actually the code producing the error, right? - Varhol
Right, it was similar code later on in the application. - Wast
14

You need to specify the encoding of filename when converting it to Unicode, for example: filename.decode('utf-8'). Calling unicode(...) without an encoding falls back to Python's default encoding, which is ascii in Python 2, so it fails on any non-ASCII byte (such as the 0xc3 in your traceback).
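To make that concrete, here is a minimal sketch of the decode-then-query pattern. It assumes the filename arrives as UTF-8 bytes and uses the standard-library sqlite3 module with an in-memory database in place of pysqlite2; the sample filename and the single-column table are made up for illustration.

```python
import sqlite3

# Hypothetical filename as raw bytes, assumed to be UTF-8 encoded.
raw_filename = b"02 - Neighborhood #2 (La\xc3\xafka).mp3"

# Decoding with the ascii codec fails on the first non-ASCII byte.
try:
    raw_filename.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Decoding with the encoding the bytes were actually written in succeeds.
filename = raw_filename.decode("utf-8")

# In-memory database with a simplified, assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute("create table musiclibrary (absolutepath text)")
conn.execute("insert into musiclibrary (absolutepath) values (?)", (filename,))

# Pass the decoded text, never the raw bytes, as the query parameter.
rows = conn.execute(
    "select absolutepath from musiclibrary where absolutepath = ?",
    (filename,),
).fetchall()
```

The important part is that every parameter handed to execute is already a Unicode string; one stray byte string elsewhere in the script is enough to trigger the same error.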

Ofori answered 14/5, 2010 at 22:47 Comment(2)
I tried doing that, but it seems that I'm still getting the errors mentioned above. I updated the post with what I'm doing now, so you can see what I'm doing. Thanks! - Wast
My bad, I had some more bad conversion happening later on in my script that was throwing the same error :) - Wast
3

You should pass the arguments of your SQL statement as Unicode.

Now, it all depends on how you obtain the list of filenames. Perhaps you're reading the filesystem using os.listdir or os.walk? If so, you can get the filenames back as Unicode directly, just by passing a Unicode argument to either of these functions:
Examples:

  • os.listdir(u'.')
  • os.walk(u'.')

Of course, you can substitute the u'.' directory with the actual directory whose contents you are reading. Just make sure it's a Unicode string.
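As a rough illustration of how the argument type drives the result type, the sketch below lists the same directory twice. It is written for Python 3, where str is the Unicode type and bytes plays the role of the Python 2 byte string; the temporary directory and the track.mp3 file are placeholders created only for the demonstration.

```python
import os
import tempfile

# Throwaway directory containing one placeholder file.
directory = tempfile.mkdtemp()
with open(os.path.join(directory, "track.mp3"), "w"):
    pass

# Text (Unicode) path in, text filenames out.
text_names = os.listdir(directory)

# Byte-string path in, byte-string filenames out.
byte_names = os.listdir(os.fsencode(directory))

# Clean up the placeholder file and directory.
os.remove(os.path.join(directory, "track.mp3"))
os.rmdir(directory)
```

Names obtained through the text path can be passed to cursor.execute as they are; names obtained through the bytes path would still need an explicit decode.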

Preconize answered 10/6, 2010 at 13:43 Comment(0)
1

Have you tried passing the Unicode string directly?

cursor.execute("select * from musiclibrary where absolutepath = ?;",(u'namé',))

Because the script then contains a non-ASCII literal, you will need to declare the source file's encoding at the top of the script:

# coding: utf-8
Kwangchowan answered 15/5, 2010 at 0:25 Comment(1)
If I try that, it seems to be working. I'm iterating over around 3000 files, and it fails on a filename like: 02 - Neighborhood #2 (Laïka).mp3. Is there a conversion technique that I'm missing somewhere? - Wast
1

You figured this out already, but:

I don't think you could actually get that ProgrammingError exception from cursor.execute("select * from musiclibrary where absolutepath = ?;", [filename.decode("utf-8")]), as the question currently states.

Either the utf-8 decode would explode, or the cursor.execute call would be happy with the result.
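A small sketch of those two outcomes, using the standard-library sqlite3 module rather than pysqlite2 and a made-up one-column table. The lookup helper is hypothetical and exists only to wrap the decode and the query in one call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table musiclibrary (absolutepath text)")
conn.execute(
    "insert into musiclibrary (absolutepath) values (?)",
    (u"La\xefka.mp3",),
)

def lookup(raw_bytes):
    """Hypothetical helper: decode as UTF-8, then run the query."""
    # Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    text = raw_bytes.decode("utf-8")
    cursor = conn.execute(
        "select absolutepath from musiclibrary where absolutepath = ?;",
        [text],
    )
    return cursor.fetchall()

# Valid UTF-8 bytes: the decode succeeds and the query runs normally.
good = lookup(b"La\xc3\xafka.mp3")

# Latin-1 bytes: the decode fails before execute is ever reached.
try:
    lookup(b"La\xefka.mp3")
    outcome = "no error"
except UnicodeDecodeError:
    outcome = "decode failed"
```

In neither branch does a byte string reach execute, which is why the ProgrammingError has to come from a different call.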

Varhol answered 11/1, 2011 at 5:23 Comment(0)
-1

Try changing it to this:

cursor.execute("select * from musiclibrary where absolutepath = ?;",
    [unicode(filename,'utf8')])

If your filenames are not encoded as UTF-8, replace utf8 with the encoding they actually use.
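If you don't know the encoding up front, one option is to try a short list of candidates in order. This is only a sketch: to_text is a hypothetical helper, and the choice of utf-8 followed by latin-1 is an assumption, not something taken from the question.

```python
def to_text(raw, encodings=("utf-8", "latin-1")):
    """Hypothetical helper: decode raw bytes with the first encoding that fits."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            # Not valid in this encoding; try the next candidate.
            continue
    raise ValueError("none of the candidate encodings matched")

from_utf8 = to_text(b"La\xc3\xafka.mp3")   # valid UTF-8
from_latin1 = to_text(b"La\xefka.mp3")     # invalid UTF-8, falls back to latin-1
```

Keep in mind that latin-1 accepts every possible byte sequence, so it never fails and should come last; a result obtained through it is a guess rather than a confirmed decoding.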

Rudderpost answered 19/8, 2017 at 3:19 Comment(0)
