Multiprocess sqlite INSERT: "database is locked"

(Please note: there is a question called "SQLite3 and Multiprocessing", but that question -- and its accepted answer -- is actually about multithreading, so this isn't a duplicate.)

I'm implementing a multiprocess script; each process needs to write some results to an SQLite table. My program keeps crashing with database is locked (SQLite allows only one write transaction at a time).

Here's an example of what I have (imports and connection setup filled in for completeness):

import sys
import sqlite3
from multiprocessing import Pool

con = sqlite3.connect("results.db")  # placeholder filename; created in the parent and inherited by every forked worker
cur = con.cursor()

def scan(n):
    n = n + 1  # Some calculation

    cur.execute("                      \
                    INSERT INTO hello  \
                    (n)                \
                    VALUES ('" + str(n) + "')   \
                ")

    con.commit()
    con.close()

    return True


if __name__ == '__main__':

    pool = Pool(processes=int(sys.argv[1]))

    for status in pool.imap_unordered(scan, range(0, 9999)):
        if status:
            print "ok"

    pool.close()

I've tried declaring a lock in the main block and using it as a global in scan() (a sketch of that attempt is below), but it didn't stop the database is locked errors.
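
Here's roughly what that attempt looked like (a reconstruction, not my exact code):

from multiprocessing import Lock

lock = Lock()  # created before the pool is forked, so workers inherit it

def scan(n):
    n = n + 1  # Some calculation
    with lock:  # meant to serialise the INSERT and commit across processes
        cur.execute("INSERT INTO hello (n) VALUES ('" + str(n) + "')")
        con.commit()
    return True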

What is the proper way to make sure only one INSERT statement is issued at a time in a multiprocess Python script?

EDIT:

I'm running on a Debian-based Linux.

Belief answered 10/11, 2014 at 2:2 Comment(10)
What's your operating system? SQLite supports advisory / region locking to arbitrate between competing writers, so even though only one write can happen at a time, on an appropriate operating system with appropriate configuration this just makes other writers slow / block until the current one is done, rather than erroring out.Stereochemistry
This is discussed in detail in sqlite.org/lockingv3.html. For that matter -- specifically, which version of SQLite is this? If it's pre-3.0 (which it shouldn't be -- that's over a decade old), then you don't have that feature.Stereochemistry
On a completely different note, by the way -- don't EVER use string concatenation to build your SQL queries. That way lies Bobby Tables...Stereochemistry
@CharlesDuffy I'm on a Debian-based system. I would never use that on anything with foreign input! Is it really ill-advised even for a small temporary script? I will take a look at your link.Belief
Even if it's not security-impacting, it's performance-impacting -- it means a cached query parse can't be (re)used.Stereochemistry
Noted! Thanks for the tip.Belief
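
For illustration, the parameterized form of the question's INSERT (a sketch; ? is the sqlite3 module's qmark placeholder, which sidesteps injection and lets the parsed statement be reused):

cur.execute("INSERT INTO hello (n) VALUES (?)", (n,))
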
What's the backend filesystem? If you're on vfat, I could see region locking not working properly.Stereochemistry
BTW, I asked about the sqlite version; since you're using the Python library, you can print sqlite3.sqlite_version to get that.Stereochemistry
SQLite 3.7.16.2, and I must admit I don't know how to check my backend filesystem.Belief
Reading the fine manual for the Python module, I'm thinking it's likely that the filesystem isn't involved in the issue -- see the provided answer.Stereochemistry

This will happen if the write lock can't be grabbed within (by default) a 5-second timeout. In general, make sure your code COMMITs its transactions with sufficient frequency, thereby releasing the lock and letting other processes have a chance to grab it. If you want to wait longer, you can do that:

db = sqlite3.connect(filename, timeout=30.0)

...waits up to 30 seconds for the lock before giving up.
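
Applied to the worker from the question, that might look like this (a sketch; the filename is a placeholder, and each process opens its own connection rather than inheriting one across fork):

import sqlite3

def scan(n):
    n = n + 1  # Some calculation
    con = sqlite3.connect("results.db", timeout=30.0)  # placeholder filename; wait up to 30s for the write lock
    cur = con.cursor()
    cur.execute("INSERT INTO hello (n) VALUES (?)", (n,))
    con.commit()  # commit promptly so the lock is released
    con.close()
    return True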

Stereochemistry answered 10/11, 2014 at 2:17 Comment(2)
Excellent! If I have multiple INSERTs, should I release the lock between each? Or issue them all in a row and then commit once?Belief
Ideally, for performance, you want to find a reasonable middle ground in terms of batch size. Too short and you take more time overall, too big and your individual transactions take too long (and hold the lock for an extended period). Of course, sometimes correctness concerns will dictate batch size more strongly than performance does.Stereochemistry
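
As a sketch of that middle ground, committing every BATCH rows so the write lock is released periodically (BATCH and the rows iterable here are hypothetical):

BATCH = 500  # hypothetical batch size; tune for your workload

for i, n in enumerate(rows, 1):
    cur.execute("INSERT INTO hello (n) VALUES (?)", (n,))
    if i % BATCH == 0:
        con.commit()  # release the write lock between batches

con.commit()  # flush the final partial batch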
