shelve, by default, is backed by the dbm module, in turn backed by some dbm implementation available on the system. Neither the shelve module nor the dbm module makes any effort to minimize writes; assigning a value to a key causes a write every time. Even with writeback=True, new assignments are both placed in the cache and immediately written to the backing dbm: they're written to make sure the original value is there, and the cache entry is made because the object assigned might change after assignment and needs to be handled just like a freshly read object (meaning it will be written again when synced or closed, in case it changed).

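To see the write-on-every-assignment behavior for yourself, here is a minimal sketch. CountingStore is a hypothetical in-memory stand-in for a dbm file (not part of shelve); it only counts how often shelve writes to its backing mapping.

```python
import shelve


class CountingStore(dict):
    """Hypothetical stand-in for a dbm file: a dict that counts writes."""

    def __init__(self):
        super().__init__()
        self.writes = 0

    def __setitem__(self, key, value):
        self.writes += 1
        super().__setitem__(key, value)


store = CountingStore()
shelf = shelve.Shelf(store, writeback=True)

# Each assignment is pickled and written straight through, cache or not.
shelf["a"] = [1]
shelf["a"] = [1, 2]
writes_after_assignments = store.writes  # 2

# Mutating the cached object writes nothing by itself...
shelf["a"].append(3)
writes_after_mutation = store.writes  # still 2

# ...the change only reaches the backing store on sync (or close).
shelf.sync()
writes_after_sync = store.writes  # 3
final_value = shelf["a"]  # [1, 2, 3]
```

With a real dbm backend each of those counted writes is a call into the dbm library, which is where any blocking would happen.
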
While it's possible that some implementations of the underlying dbm libraries include some caching, AFAICT most do try to write immediately (that is, they push data to the kernel immediately, without user-mode buffering); they just don't necessarily force immediate synchronization to disk (though it can be requested, e.g. with gdbm_sync).

writeback=True will make it worse, because when it does sync, it's a major effort (it literally rewrites every object read or written to the DB since the last sync, because it has no way of knowing which of them might have been modified), as opposed to the small effort of rewriting a single key/value pair at a time.

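A rough way to observe the cost of that sync, again using a hypothetical write-counting dict (CountingStore, my own helper) in place of a real dbm file: reading five keys without modifying any of them still produces five writes at sync time.

```python
import shelve


class CountingStore(dict):
    """Hypothetical stand-in for a dbm file: a dict that counts writes."""

    def __init__(self):
        super().__init__()
        self.writes = 0

    def __setitem__(self, key, value):
        self.writes += 1
        super().__setitem__(key, value)


store = CountingStore()

# Populate without writeback: one write per assignment, nothing cached.
plain = shelve.Shelf(store, writeback=False)
for i in range(5):
    plain[f"k{i}"] = [i]
writes_to_populate = store.writes  # 5

# Re-open the same data with writeback=True and only *read* the values.
cached = shelve.Shelf(store, writeback=True)
for i in range(5):
    _ = cached[f"k{i}"]
writes_after_reads = store.writes - writes_to_populate  # 0

# sync cannot tell which cached objects changed, so it rewrites all of them.
cached.sync()
writes_from_sync = store.writes - writes_to_populate  # 5
```
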
In short, if you're really concerned about blocking writes: you can't use shelve from unthreaded async code without potential blocking, but said blocking is likely short-lived as long as writeback=True is not involved (or as long as you don't sync/close it until performance considerations are no longer relevant). If you need truly non-blocking async behavior, all shelve interactions will need to occur under a lock in worker threads, and either writeback must be False (to avoid race conditions pickling data) or, if writeback is True, you must take care to avoid modifying any object that might be in the cache during the sync/close.

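One possible shape for the threaded approach, as a sketch rather than a recommendation. AsyncShelf is a hypothetical wrapper of my own, not a shelve API. Instead of an explicit lock it funnels every call through a single-worker ThreadPoolExecutor, which serializes access the same way and also keeps the shelf on the thread that opened it (some dbm backends may care about that). It assumes writeback=False.

```python
import asyncio
import os
import shelve
import tempfile
from concurrent.futures import ThreadPoolExecutor


class AsyncShelf:
    """Hypothetical wrapper: every shelve call runs on one worker thread."""

    def __init__(self, path):
        self._path = path
        self._shelf = None
        # A single worker serializes all calls, standing in for a lock.
        self._executor = ThreadPoolExecutor(max_workers=1)

    async def _run(self, func, *args):
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(self._executor, func, *args)

    async def open(self):
        # writeback=False: values are pickled once, at assignment time.
        self._shelf = await self._run(
            lambda: shelve.open(self._path, writeback=False)
        )

    async def set(self, key, value):
        # The caller should not mutate `value` until this await returns,
        # since pickling happens on the worker thread.
        await self._run(self._shelf.__setitem__, key, value)

    async def get(self, key):
        return await self._run(self._shelf.__getitem__, key)

    async def close(self):
        await self._run(self._shelf.close)
        self._executor.shutdown(wait=True)


async def main():
    with tempfile.TemporaryDirectory() as tmp:
        db = AsyncShelf(os.path.join(tmp, "demo"))
        await db.open()
        # Concurrent coroutines, but the writes themselves run one at a time.
        await asyncio.gather(*(db.set(f"k{i}", i * i) for i in range(5)))
        values = [await db.get(f"k{i}") for i in range(5)]
        await db.close()
        return values


result = asyncio.run(main())
```

The event loop never touches the shelf directly, so a slow write only stalls the worker thread, not the other coroutines.
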
The shelve module is not remotely thread-safe (the mechanism behind sync is to set writeback to False, redo all the assignments, then set it back to True, which means anything written after the sync begins will be written but not cached, so it won't benefit from writeback protections), so the locking will be important in any threaded scenario. – Westbrook
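The writeback toggling described in the comment above can be observed with a small sketch. It relies on how CPython's shelve.Shelf.sync is currently implemented, which is an implementation detail and not a documented guarantee; ObservingStore is a hypothetical helper that records the shelf's writeback flag each time the backing mapping is written.

```python
import shelve


class ObservingStore(dict):
    """Hypothetical backing store that records shelf.writeback on each write."""

    def __init__(self):
        super().__init__()
        self.shelf = None
        self.seen = []

    def __setitem__(self, key, value):
        if self.shelf is not None:
            self.seen.append(self.shelf.writeback)
        super().__setitem__(key, value)


store = ObservingStore()
shelf = shelve.Shelf(store, writeback=True)
store.shelf = shelf

shelf["a"] = [1]   # normal assignment: writeback is True during the write
shelf.sync()       # the rewrite inside sync happens with writeback off

flags_seen = list(store.seen)
flag_after_sync = shelf.writeback
cache_after_sync = dict(shelf.cache)
```

Under current CPython, flags_seen should show True for the assignment and False for the rewrite done by sync, with the flag restored and the cache emptied afterwards. Another thread assigning during that window would hit the writeback-off path, which is the race the comment is warning about.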