I created a tmpfs filesystem in my home directory on Ubuntu using this command:
$ mount -t tmpfs -o size=1G,nr_inodes=10k,mode=0777 tmpfs space
$ df -h space .
Filesystem                   Size  Used Avail Use% Mounted on
tmpfs                        1,0G  100M  925M  10% /home/user/space
/dev/mapper/ubuntu--vg-root  914G  373G  495G  43% /
Then I wrote this Python program:
#!/usr/bin/env python3
import time
import pickle


def f(fn):
    start = time.time()
    with open(fn, "rb") as fh:
        data = pickle.load(fh)
    end = time.time()
    print(str(end - start) + "s")
    return data


obj = list(map(str, range(10 * 1024 * 1024)))  # approx. 100M


def l(fn):
    with open(fn, "wb") as fh:
        pickle.dump(obj, fh)


print("Dump obj.pkl")
l("obj.pkl")
print("Dump space/obj.pkl")
l("space/obj.pkl")

_ = f("obj.pkl")
_ = f("space/obj.pkl")
The result:
Dump obj.pkl
Dump space/obj.pkl
0.6715312004089355s
0.6940639019012451s
I am confused by this result. Isn't tmpfs a filesystem backed by RAM, and isn't RAM supposed to be notably faster than any disk, including SSDs?
Furthermore, I noticed that this program uses over 15 GB of RAM when I increase the target file size to approx. 1 GB.
How can this be explained?
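One way to look into the memory question is to compare the size of the pickled bytes with the size of the same objects once they live in the interpreter. The sketch below is only an illustration under stated assumptions: `in_memory_bytes` and `sample` are hypothetical names, the list is much smaller than the original so it runs quickly, and `sys.getsizeof` reports per-object sizes for CPython only. Keep in mind that a file stored on tmpfs occupies RAM as well.

```python
import pickle
import sys


def in_memory_bytes(items):
    """Shallow size of the list plus the size of every string it holds."""
    return sys.getsizeof(items) + sum(sys.getsizeof(s) for s in items)


# Assumption: a reduced sample (the question uses 10 * 1024 * 1024 items).
sample = list(map(str, range(100_000)))
pickled = pickle.dumps(sample)

ratio = in_memory_bytes(sample) / len(pickled)
print(f"pickled:   {len(pickled)} bytes")
print(f"in memory: {in_memory_bytes(sample)} bytes")
print(f"ratio:     {ratio:.1f}x")
```

If the ratio is well above 1, the on-disk size is a poor predictor of the RAM needed, and the program additionally holds the original `obj` while each loaded copy is being built.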
The background of this experiment is that I am trying to find alternative caching locations to the hard disk and Redis that are faster and available to multiple worker processes.
cpickle if in a hurry? – Inurn

$ time dd if=/dev/zero of=space/test.img bs=1048576 count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.0231555 s, 4.5 GB/s
real 0m0.030s
user 0m0.000s
sys 0m0.030s
– Epimorphosis

$ time dd if=/dev/zero of=test.img bs=1048576 count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.165582 s, 633 MB/s
real 0m0.178s
user 0m0.000s
sys 0m0.060s
– Epimorphosis

0m0.030s vs 0m0.178s ... seems like a clear winner for tmpfs ... – Epimorphosis

_pickle instead of pickle does not make any difference to the final time measurements. A library called cpickle apparently does not exist in Python3. – Rateable