I am using the shutil.disk_usage() function to find the current disk usage of a particular path (amount available, used, etc.). As far as I can tell, this is a wrapper around os.statvfs() calls. It is not giving the answers I'd expect when compared with the output of "du" in Linux.
I have obscured some of the paths below for company privacy reasons, but the output and code are otherwise undoctored. I am using Python 3.3.2 64-bit version.
#!/apps/python/3.3.2_64bit/bin/python3
# test of the shutil.disk_usage() function
import shutil
BytesPerGB = 1024 * 1024 * 1024
(total, used, free) = shutil.disk_usage("/data/foo/")
print ("Total: %.2fGB" % (float(total)/BytesPerGB))
print ("Used: %.2fGB" % (float(used)/BytesPerGB))
(total1, used1, free1) = shutil.disk_usage("/data/foo/utils/")
print ("Total: %.2fGB" % (float(total1)/BytesPerGB))
print ("Used: %.2fGB" % (float(used1)/BytesPerGB))
Which outputs:
/data/foo/drivecode/me % disk_usage_test.py
Total: 609.60GB
Used: 291.58GB
Total: 609.60GB
Used: 291.58GB
As you can see, the main problem is I would expect the second amount for "Used" to be much smaller, as it is a subset of the first directory.
/data/foo/drivecode/me % du -sh /data/foo/utils
2.0G /data/foo/utils
As much as I trust "du," I find it hard to believe the Python module would be incorrect either. So perhaps it is just my understanding of Linux filesystems that could be the issue. :)
I wrote a module (based heavily on someone's code here at SO) which recursively gets the disk_usage, which I was using until now. It appears to match the "du" output but is MUCH, much slower than the shutil.disk_usage() function, so I'm hoping I can make that one work.
Thanks much in advance.
shutil.disk_usage is giving your "disk" usage, not "directory" usage. What you get from it ought to be compared with df -h instead of du -sh. – Mccaslin
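To illustrate the distinction, here is a minimal sketch, not the asker's module. The helper names `directory_size` and `same_filesystem` are made up for this example. `shutil.disk_usage()` reports statistics for the whole filesystem containing a path, so any two paths on the same filesystem return the same totals. A per-directory figure has to be computed by walking the tree. Note that this sketch sums apparent file sizes (`st_size`), whereas `du` by default reports allocated blocks and counts hard-linked files once, so the numbers may differ somewhat from `du -sh`.

```python
import os
import shutil


def directory_size(path):
    """Sum the apparent sizes (st_size) of all non-directory entries under path.

    Symlinks are not followed; a symlink contributes the size of the link itself.
    """
    total = 0
    for root, dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                total += os.lstat(full).st_size
            except OSError:
                # The entry vanished or is unreadable; skip it.
                continue
    return total


def same_filesystem(path_a, path_b):
    """Return True if both paths live on the same device (matching st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

With the paths from the question, `same_filesystem("/data/foo/", "/data/foo/utils/")` would presumably return True, which is why both `shutil.disk_usage()` calls print identical numbers, while `directory_size("/data/foo/utils/")` gives a figure comparable to the `du` output. The walk is inevitably slower than a single `statvfs` call because it has to stat every file.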