Packing Anatomy
ZODB FileStorage packing is the process of selectively copying data from one file to another: only transactions that are "younger" than a specified age are copied. Before the copying starts, some sort of index is built in memory to aid the process. The whole ZODB packing thus consists of the following steps:
- Building pack index
- Copying transactions to temporary file
- Appending transactions that were performed after packing started
- Replacing original FileStorage with packed one and reopening it in read/write mode
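The four steps above can be sketched as a toy model. This is only an illustration of the flow, not FileStorage's actual code or file format: the function name pack_toy, the one-record-per-line "<timestamp> <payload>" layout, and the late_records argument are all assumptions made for the example.

```python
import os
import tempfile


def pack_toy(path, pack_time, late_records=()):
    """Toy model of the pack steps (hypothetical, not ZODB code).

    Each line of `path` is assumed to look like '<timestamp> <payload>'.
    `late_records` stands in for transactions committed while packing ran.
    Returns the number of records kept from the original file.
    """
    # Step 1: build an in-memory index of the records to keep.
    keep = set()
    with open(path) as src:
        for lineno, line in enumerate(src):
            timestamp = float(line.split(" ", 1)[0])
            if timestamp >= pack_time:
                keep.add(lineno)

    # Step 2: copy the kept records to a temporary file in the same directory.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as dst, open(path) as src:
        for lineno, line in enumerate(src):
            if lineno in keep:
                dst.write(line)
        # Step 3: append records that arrived after packing started.
        for line in late_records:
            dst.write(line)

    # Step 4: replace the original file with the packed one.
    os.replace(tmp_path, path)
    return len(keep)
```

Creating the temporary file in the same directory keeps the final os.replace on one filesystem, so the swap is a rename rather than another copy.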
I usually monitor the process with a combination of top, vmstat/dstat, and watch ls -la var/filestorage.
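If watch is not available, the file-size part of that monitoring can be approximated with a few lines of Python. This is a minimal sketch; the function names snapshot_sizes, format_sizes and watch are made up for the example, and the directory is whatever holds your FileStorage files.

```python
import os
import time


def snapshot_sizes(directory):
    """Return {filename: size_in_bytes} for regular files in `directory`."""
    sizes = {}
    for entry in os.scandir(directory):
        if entry.is_file():
            sizes[entry.name] = entry.stat().st_size
    return sizes


def format_sizes(sizes):
    """Render a snapshot as one 'size  name' line per file, sorted by name."""
    return "\n".join(
        f"{size:>15,d}  {name}" for name, size in sorted(sizes.items())
    )


def watch(directory, interval=2.0, rounds=None):
    """Print a size listing every `interval` seconds (forever if rounds is None)."""
    done = 0
    while rounds is None or done < rounds:
        print(time.strftime("%H:%M:%S"))
        print(format_sizes(snapshot_sizes(directory)))
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
```

Calling watch("var/filestorage") from a separate shell shows the file sizes in that directory changing while the pack runs.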
As Geir mentioned, you can have a separate ZEO client dedicated to packing. This used to be reasonable, because the thread you invoked packing from blocked until packing finished. Now there is no need for that if you use ZEO. The ZEO server provides the zeopack utility, which connects directly to ZEO (no dedicated ZEO client needed) and initiates FileStorage packing. One of the benefits is that no password is needed, just proper permissions to access the ZEO control socket.
Packing progress
As packing is performed by the ZEO server (or, more precisely, by FileStorage itself), the possibility of properly communicating progress to a ZEO client is limited. The ZEO protocol was not designed to communicate that type of information.
IMHO FileStorage itself could be more verbose in reporting through the log file what it is currently doing. Some kind of progress reporting could be built in. And if you feel you need a progress indicator, you could design some kind of feedback channel through the logging module back to the ZEO client/Zope instance, to be passed on to the browser.
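One possible shape for the receiving end of such a feedback channel is a logging handler that collects progress messages so that another part of the application can poll them. This is a sketch under assumptions: the logger name "pack.progress" and the helper names are hypothetical, FileStorage is not known to emit such messages, and the handler only sees records logged in its own process, so getting them from the ZEO server to a client would need an additional transport.

```python
import logging
import queue


class ProgressHandler(logging.Handler):
    """Collect formatted log messages in a bounded queue for later polling."""

    def __init__(self, maxsize=100):
        super().__init__()
        self.messages = queue.Queue(maxsize=maxsize)

    def emit(self, record):
        try:
            self.messages.put_nowait(self.format(record))
        except queue.Full:
            # Drop the new message rather than block the logging thread.
            pass

    def drain(self):
        """Return all messages collected so far and empty the queue."""
        items = []
        while True:
            try:
                items.append(self.messages.get_nowait())
            except queue.Empty:
                return items


def attach_progress_handler(logger_name="pack.progress"):
    """Attach a ProgressHandler to a (hypothetical) progress logger."""
    handler = ProgressHandler()
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return handler
```

A web view could then call drain() on each poll and return the latest message to the browser.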
Performance while packing
As FileStorage packing is a quite disk-intensive operation, it reduces the throughput of the disk subsystem. Additionally, with a larger FileStorage it evicts the disk cache, which impacts disk performance even after packing has finished, as the caches have to be warmed up again. Possible improvements in FileStorage that would lead to a longer packing time but a smaller impact on the system are:
- resorting to O_DIRECT operations (so as not to touch the file cache)
- reducing disk scheduling priority (ionice on Linux) for the thread performing the packing
- throttling packing speed
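The last idea, throttling, can be illustrated with a rate-limited file copy. This is a generic sketch, not something FileStorage does today; the function name throttled_copy and its parameters are invented for the example.

```python
import time


def throttled_copy(src_path, dst_path, max_bytes_per_sec, chunk_size=64 * 1024):
    """Copy a file in chunks, keeping the average rate under the given limit.

    Returns the number of bytes copied.
    """
    copied = 0
    start = time.monotonic()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            copied += len(chunk)
            # Time the copy should have taken so far at the allowed rate.
            expected = copied / max_bytes_per_sec
            elapsed = time.monotonic() - start
            if expected > elapsed:
                # We are ahead of schedule: sleep off the difference.
                time.sleep(expected - elapsed)
    return copied
```

A lower limit leaves more disk bandwidth for regular requests at the cost of a longer pack, which is the trade-off described above.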