I have been assigned the task of reading a .txt file which is a log of various events and writing some of those events into a dictionary.
The problem is that the file can sometimes be larger than 3 GB, which means the dictionary gets too big to fit into main memory. It seems that shelve is a good way to solve this problem. However, since I will be constantly modifying the dictionary, I must have the writeback option enabled. This is where I am concerned: the tutorial says that this slows down reads and writes and uses more memory, but I am unable to find any statistics on how much speed and memory are affected.
Can anyone clarify by how much the read/write speed and memory are affected so that I can decide whether to use the writeback option or sacrifice some readability for code efficiency?
Thank you
If you're only assigning values (shelf['key'] = newvalue), you don't need writeback. If you're modifying mutable types in it (shelf['key'].append(x)), you need writeback. Of course, you can leave writeback off and always remember to modify and replace values in your shelf, if you prefer. – Yogi
[...] temp variable – Swanskin
[...] .sync(), at which point it's rewritten to disk and freed. So the hit depends on what sort of pattern you're accessing the file in. – Yogi
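The patterns mentioned in the comments can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name demo, the "events" key, and the sample values are made up for the example and do not come from the original log-processing code. It shows an in-place mutation being lost without writeback, the temp-variable workaround, and writeback=True combined with an explicit sync(), which writes the cached entries back to disk and empties the cache.

```python
import os
import shelve
import tempfile


def demo(path):
    """Hypothetical helper comparing shelve update patterns at `path`."""
    # Default mode (writeback=False): each read returns a fresh copy.
    with shelve.open(path) as shelf:
        shelf["events"] = []
        # This mutates a temporary copy, so the change never reaches disk.
        shelf["events"].append("login")
        lost = list(shelf["events"])

        # Temp-variable pattern: read, modify, then assign back explicitly.
        events = shelf["events"]
        events.append("login")
        shelf["events"] = events
        kept = list(shelf["events"])

    # writeback=True: accessed entries are cached in memory and
    # written back on sync() or close().
    with shelve.open(path, writeback=True) as shelf:
        shelf["events"].append("logout")
        # sync() flushes the cache to disk and empties it; calling it
        # periodically bounds how much the cache can grow.
        shelf.sync()

    # Reopen without writeback to confirm what was persisted.
    with shelve.open(path) as shelf:
        final = list(shelf["events"])

    return lost, kept, final


with tempfile.TemporaryDirectory() as tmp:
    lost, kept, final = demo(os.path.join(tmp, "events"))
    print(lost, kept, final)
```

With writeback off, memory use stays low but every in-place change has to be assigned back by hand; with writeback on, the cost depends on how many distinct keys are touched between sync() calls, so it is probably worth timing both variants on a sample of the real log rather than relying on general figures.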