Is it possible to get the compressed and uncompressed sizes of a file on a btrfs file system?

7

31

Is it possible to determine the compressed size (I assume that is what ls -l lists) and the uncompressed size of files on a btrfs filesystem with transparent compression enabled?

Omen answered 20/10, 2013 at 0:40 Comment(3)
ls -l will show you the uncompressed size, not the compressed size. Albertinealbertite
@Albertinealbertite Thanks, good to know. Then how can I get the compressed size? Omen
Dunno. My guess would be looking at the extents (maybe via filefrag), but it's not really my area. This question is probably more appropriate on Super User than Stack Overflow, but TBH I'd try a btrfs-specific forum instead (like their IRC room or mailing list). Albertinealbertite
30

There is a third-party tool that can do this:

https://github.com/kilobyte/compsize

Usage:

ayush@devbox:/code/compsize$ sudo compsize /opt
Processed 54036 files, 42027 regular extents (42028 refs), 27150 inline.
Type       Perc     Disk Usage   Uncompressed Referenced  
Data        82%      5.3G         6.4G         6.4G       
none       100%      4.3G         4.3G         4.3G       
zlib        37%      427M         1.1G         1.1G       
lzo         56%      588M         1.0G         1.0G  
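
compsize also accepts individual files as arguments, which gives the per-file figures the question asks about. As a rough sketch, the Perc column for one row of the table can be pulled out with awk. The helper name compsize_perc and the placeholder path are made up for illustration, and the column layout is assumed to match the sample output above.

```shell
# Hypothetical helper: print the "Perc" column of one row of a compsize table.
# $1 is the row label (for example Data, none, zlib or lzo); the table is read from stdin.
compsize_perc() {
    awk -v row="$1" '$1 == row { print $2 }'
}

# Example call (needs root and a btrfs mount; the path is a placeholder):
#   sudo compsize /path/to/file | compsize_perc zlib
```

If the row label does not appear in the table, the helper prints nothing, which is what happens for a file that contains no extents of that type.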
Ramsgate answered 12/11, 2017 at 18:37 Comment(1)
This is the correct answer, even if it is more involved to get this running. On SuSE Linux Leap 42.3 I had to install the libbtrfs-devel package to get it to compile, but then it worked great! Traceable
8

On Ubuntu 18:

apt install btrfs-compsize
compsize /mnt/btrfs-partition
Exterminatory answered 27/3, 2019 at 10:5 Comment(0)
7

I cannot answer on a file-by-file basis, and @catlover2 gave the answer for a whole filesystem. But you should distinguish between the space used on disk and the size seen through the (virtual) filesystem layer. ls and du cannot see below the filesystem, so they give no information on how many disk blocks are used, and the --apparent-size option suggested by @jiliagre is of no help here.

To illustrate this, I ran a test on a btrfs filesystem holding a single 23G file, first uncompressed, then lzo-compressed. The example file is a virtual machine image and has a compression ratio of only about 0.5. The test shows that only df and btrfs filesystem df reveal the compression.

$ lvcreate -n test_btrfs -L 30G vg0
Logical volume "test_btrfs" created
$ mkfs.btrfs /dev/vg0/test_btrfs
...
fs created label (null) on /dev/vg0/test_btrfs
    nodesize 16384 leafsize 16384 sectorsize 4096 size 30.00GiB
$ mount /dev/vg0/test_btrfs /tmp/test_btrfs
$ btrfs filesystem df /tmp/test_btrfs
Data, single: total=8.00MiB, used=256.00KiB
System, DUP: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=1.00GiB, used=112.00KiB
Metadata, single: total=8.00MiB, used=0.00
$ cp bigfile /tmp/test_btrfs
$ btrfs filesystem df /tmp/test_btrfs
Data, single: total=24.01GiB, used=22.70GiB
System, DUP: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=1.00GiB, used=23.64MiB
Metadata, single: total=8.00MiB, used=0.00
$ btrfs filesystem df /tmp/test_btrfs
... unchanged!
$ cd /tmp/test_btrfs/
$ ls -l bigfile
-rw------- 1 root root 24367940096 May  4 15:03 bigfile
$ du -B1 --apparent-size bigfile
24367940096 bigfile
$ du -B1 bigfile
24367943680 bigfile
$ btrfs filesystem defragment -c bigfile
$ ls -l bigfile
-rw------- 1 root root 24367940096 May  4 15:03 bigfile
$ du -B1 --apparent-size bigfile
24367940096 bigfile
$ du -B1 bigfile
24367943680 bigfile
$ btrfs filesystem df /tmp/test_btrfs
Data, single: total=24.01GiB, used=12.90GiB
System, DUP: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=1.00GiB, used=39.19MiB
Metadata, single: total=8.00MiB, used=0.00
$ df -BG /tmp/test_btrfs
Filesystem                 1G-blocks  Used Available Use% Mounted on
/dev/mapper/vg0-test_btrfs       30G   13G       16G  47% /tmp/test_btrfs

@gandalf3's question is still unanswered; maybe we need to wait for further btrfs development (or help with it!) to get a du that reports the underlying disk blocks for a particular file. That would be very useful: when I mount a btrfs filesystem with compression (without force), I find it very frustrating not to know whether my files are compressed, and by how much.

Infertile answered 4/5, 2014 at 14:10 Comment(0)
4

I was trying to answer this question too, and here's what I've found: du -s and df produce different numbers. So I did some testing:

  1. I put a test directory of about 3TB in /home. It is a partial copy of the whole /home directory, with a typical mix of documents, text files, images and programs.

  2. I compressed this directory into a .tar.gz file, which resulted in a file of this size:

# du -s ./test.tar.gz
1672083116 ./test.tar.gz

  3. With this file present in the file system, I ran:

# du -s /home
11017624664 /home

# du --apparent-size -s /home
11010709168 /home

# df /home
Filesystem     1K-blocks       Used   Available Use% Mounted on
/dev/md2     31230406656 9128594488 22095200200  30% /home

This means that du reports ((11017624664/(1024**2))/(9128594488/(1024**2))-1)*100 = about 20% more than df shows as used, which I take as the compression ratio.

  4. Then I deleted this file and got:

# du -s /home
9348284812 /home

# du --apparent-size -s /home
9340957158 /home

# df /home
Filesystem     1K-blocks       Used   Available Use% Mounted on
/dev/md2     31230406656 7455549036 23764949364  24% /home

This yields a compression ratio of about 25%. I also concluded from these numbers that test.tar.gz, whose real size is 1592 G, occupied 1595 G on disk. I also noted that the --apparent-size flag makes an insignificant difference, probably due to block-size rounding.

Side note, my fstab line for mounting this partition is:

UUID=be6...07fe /home btrfs defaults,compress=zlib 0 2

Summary:

To check compression ratio on whole partition use these two commands:

du -s /home
df /home

Then divide the du total by the Used figure from df. I guess that my 25% compression ratio is a typical result to expect from a zlib compressor.
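
The division can be scripted. The following is only a sketch: the function name du_over_df_pct is invented for illustration, and it reuses the same formula as above (du total divided by df Used, minus one, times 100) on the figures from this answer.

```shell
# Hypothetical helper: percentage by which the du total exceeds the df "Used" figure.
# $1 = du total, $2 = df used, both in the same unit (1K blocks in the runs above).
du_over_df_pct() {
    awk -v du="$1" -v df="$2" 'BEGIN { printf "%.1f\n", (du / df - 1) * 100 }'
}

du_over_df_pct 11017624664 9128594488   # with test.tar.gz present
du_over_df_pct 9348284812 7455549036    # after deleting it

# To feed live values (assumes GNU coreutils, where df supports --output):
#   du_over_df_pct "$(du -s /home | cut -f1)" "$(df --output=used /home | tail -n 1)"
```

This only makes sense when the directory passed to du covers essentially everything stored on the filesystem that df reports on, as /home does here.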

Aspasia answered 29/8, 2016 at 14:20 Comment(0)
2

The on-disk size of a file, regardless of the file system type, is given [1] by the du command, e.g.:

$ du -h *
732K    file
512     file1
4.0M    file2
$ du -B1 *
749568  file
512     file1
4091904 file2

The on-disk size is equal to the size of the file plus the size of its metadata, rounded up to the file system block size. Non-compressed files will usually have a slightly larger on-disk size than their actual (byte-count) size.

As already stated, the uncompressed size is shown by ls -l. It can also be reported by du with the --apparent-size option:

$ du --apparent-size -h *
826K    file
64M     file1
17M     file2
$ du --apparent-size -B 1  *
845708      file
67108864    file1
16784836    file2

Note that -B1 and --apparent-size are GNU-specific du extensions.

[1] It seems btrfs doesn't follow this rule. If this is really (or still) true, my understanding is that it should be considered a bug or, at least, a POSIX non-conformance.
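
The two du modes correspond to two fields of the stat structure: st_size (the apparent size) and st_blocks (the allocated blocks). As a sketch assuming GNU stat, both can be printed side by side; the helper name show_sizes is invented here for illustration.

```shell
# Hypothetical helper: print apparent size and allocated size (both in bytes) for a file.
# Assumes GNU stat: %s = st_size, %b = st_blocks, %B = bytes per block counted by %b.
show_sizes() {
    read -r size blocks blksz <<EOF
$(stat -c '%s %b %B' "$1")
EOF
    echo "apparent=$size allocated=$((blocks * blksz))"
}

# Example call (the file name is a placeholder):
#   show_sizes file1
```

On a sparse file the allocated figure comes out smaller than the apparent one. On btrfs with compression, the comments on this answer indicate that st_blocks does not reflect the compressed size, so this helper is not expected to reveal it either.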

Momus answered 3/1, 2014 at 8:28 Comment(4)
btrfs lies to du about the on-disk size :( Wire
It is not a bug, see the entry in the Btrfs FAQ titled: "Why does not du report the compressed size?" Example: There are utilities that determine sparseness of a file by comparing the nominal and block-allocated size, this behaviour might cause bugs if st_blocks contained the amount after compression. Traceable
@RayHulha Ok, not a bug because it is not a broken implementation but a design decision. It is still a non-conformance to POSIX and, IMHO, a design flaw with a fallacious explanation: this was done so as not to break bogus third-party programs, while ignoring the fact that doing so breaks non-bogus programs that expect a reliable value in st_blocks. What "number of disk blocks" means is clear, and btrfs doesn't comply with this definition. Momus
The feature is called transparent compression. If the size showed the compressed size, it wouldn't be very transparent, now would it? Traceable
1

You can create a Btrfs filesystem in a file, mount it, copy the files there and run df:

$ dd if=/dev/zero of=btrfs.data bs=1M count=1K
$ mkfs.btrfs btrfs.data
$ mkdir btrfs
$ mount btrfs.data btrfs -o compress
... copy the files to ./btrfs
$ sync
$ cd btrfs
$ btrfs filesystem df .

Example of a single file compressed from 17MiB to 5MiB:

$ cd btrfs
$ ls -l
-rwx------ 1 atom atom 17812968 Oct 27  2015 commands.bin
$ btrfs filesystem df .
Data, single: total=1.01GiB, used=5.08MiB
System, DUP: total=8.00MiB, used=16.00KiB
Metadata, DUP: total=1.00GiB, used=112.00KiB
GlobalReserve, single: total=16.00MiB, used=0.00B
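
In this setup the used= value on the Data line is the figure to compare against the ls -l size (5.08MiB versus 17812968 bytes above). A small sketch for extracting it follows; the helper name btrfs_data_used is hypothetical, and the output format is assumed to match what is shown above.

```shell
# Hypothetical helper: print the used= value of the Data line(s)
# from `btrfs filesystem df` output read on stdin.
btrfs_data_used() {
    sed -n 's/^Data.*used=\([^ ,]*\).*/\1/p'
}

# Example call, run from inside the mounted test filesystem:
#   btrfs filesystem df . | btrfs_data_used
```

The value covers all data stored on the filesystem, so it only stands for a single file when that file is the only content, as in this throwaway filesystem.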
Rescission answered 18/10, 2016 at 1:33 Comment(0)
-6

Run btrfs filesystem df /mountpoint.

Example output:

Data: total=2.01GB, used=1.03GB
System, DUP: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=2.52MB
Metadata: total=8.00MB, used=0.00

The key line starts with Data:; used= is the compressed size, and total= is the total size as if on an uncompressed filesystem. I created a test filesystem, mounted it with the compress_force=zlib option, and copied 1GB of zeroes to a file on the filesystem; at that point the Data: line was Data: total=1.01GB, used=32.53MB (zeroes are quite compressible!). Then I re-mounted the filesystem with compression disabled, copied another GB of zeroes to it, and at that point the Data: line read Data: total=2.01GB, used=1.03GB.

As nemequ mentioned above, ls -l, on the contrary, shows the uncompressed size.

Canonicals answered 8/11, 2013 at 7:44 Comment(2)
That is not what this means. It means that two blocks (of 1GB each) are allocated, but only 1.03GB of them is used. You could only see the uncompressed size at the beginning because the existing size was 0. Agitator
-1 This answer is plain wrong and based on false assumptions. total= does not quantify data usage. Btrfs reserves large chunks of empty space in advance: this improves data locality and reduces fragmentation. total quantifies the sum of the current chunks, that is, used + the sum of the empty space in the chunks. When you write and there is enough free space in the chunk, used changes, but total does not. When used approaches total, Btrfs puts a new placeholder over some space outside total, adds it to total, and does not touch used. Heaver
