Pretty file size in Ruby?

I'm trying to make a method that converts an integer that represents bytes to a string with a 'prettied up' format.

Here's my half-working attempt:

class Integer
  def to_filesize
    {
      'B'  => 1024,
      'KB' => 1024 * 1024,
      'MB' => 1024 * 1024 * 1024,
      'GB' => 1024 * 1024 * 1024 * 1024,
      'TB' => 1024 * 1024 * 1024 * 1024 * 1024
    }.each_pair { |e, s| return "#{s / self}#{e}" if self < s }
  end
end

What am I doing wrong?

Rumpus answered 15/4, 2013 at 23:2 Comment(0)
R
26

How about the Filesize gem? It seems to be able to convert from bytes (and other formats) into pretty-printed values:

example:

Filesize.from("12502343 B").pretty      # => "11.92 MiB"

http://rubygems.org/gems/filesize

Rianna answered 15/4, 2013 at 23:9 Comment(3)
I think the gem location should be here http://rubygems.org/gems/filesize Raving
Updated the gem location. Rianna
The gem is unmaintained and archived: github.com/dominikh/filesize Kenyatta
C
38

If you are using Rails, what about the standard Rails number helper?

http://api.rubyonrails.org/classes/ActionView/Helpers/NumberHelper.html#method-i-number_to_human_size

number_to_human_size(number, options = {})

?

Carolinecarolingian answered 31/1, 2014 at 0:34 Comment(1)
I needed the line include ActionView::Helpers::NumberHelper in my script before I could use number_to_human_size. Bilbrey
P
17

I agree with @David that it's probably best to use an existing solution, but to answer your question about what you're doing wrong:

  1. The primary error is dividing s by self rather than the other way around.
  2. You really want to divide by the previous s, so divide s by 1024.
  3. Doing integer arithmetic will give you confusing results, so convert to float.
  4. Perhaps round the answer.

So:

class Integer
  def to_filesize
    {
      'B'  => 1024,
      'KB' => 1024 * 1024,
      'MB' => 1024 * 1024 * 1024,
      'GB' => 1024 * 1024 * 1024 * 1024,
      'TB' => 1024 * 1024 * 1024 * 1024 * 1024
    }.each_pair { |e, s| return "#{(self.to_f / (s / 1024)).round(2)}#{e}" if self < s }
  end
end

lets you:

1.to_filesize
# => "1.0B"
1020.to_filesize
# => "1020.0B" 
1024.to_filesize
# => "1.0KB" 
1048576.to_filesize
# => "1.0MB"

Again, I don't recommend actually doing that, but it seems worth correcting the bugs.
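
For comparison, here is a minimal sketch of the same idea written as a plain loop over an array of units. It does not depend on hash iteration order, and it falls back to the largest unit instead of returning the hash when the number is too big. The method name pretty_bytes is hypothetical and the unit list is an assumption carried over from the code above.

```ruby
# Hypothetical helper, not part of the answer above.
def pretty_bytes(bytes)
  units = %w[B KB MB GB TB]
  value = bytes.to_f
  index = 0
  # Divide by 1024 until the value fits the current unit
  # or there are no larger units left.
  while value >= 1024 && index < units.size - 1
    value /= 1024
    index += 1
  end
  "#{value.round(2)}#{units[index]}"
end

puts pretty_bytes(1)        # 1.0B
puts pretty_bytes(1024)     # 1.0KB
puts pretty_bytes(1048576)  # 1.0MB
puts pretty_bytes(1024**5)  # 1024.0TB
```

Values beyond the last unit are reported in TB here, which is one possible design choice; raising an error would be another.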

Polito answered 15/4, 2013 at 23:18 Comment(6)
Thanks a bunch. I see what I was doing wrong. (I can be an idiot) Rumpus
You're very welcome. If bugs are evidence of idiocy, then we're all guilty of it. It was a fine start. Polito
This is awesome. I'd rather use this instead of the gem. Chavarria
@Darshan-JosiahBarber. This implementation will not always return the expected value, as it assumes that iteration will happen in the order it is written. But Ruby Hashes do not offer any guarantee on the order of the key-value pairs. Took me a while to figure out why I kept getting an incorrect result. Warmblooded
@DennyAbrahamCheriyan As of Ruby 1.9, released eight years ago, Hashes in fact do "enumerate their values in the order that the corresponding keys were inserted." If you are using an ancient version of Ruby, you really should upgrade! Polito
@Darshan-JosiahBarber. Good to know that. I was using 1.8.7. Warmblooded
I
11

This is my solution:

def filesize(size)
  units = %w[B KiB MiB GiB TiB PiB EiB ZiB]

  return '0.0 B' if size == 0
  exp = (Math.log(size) / Math.log(1024)).to_i
  exp += 1 if (size.to_f / 1024 ** exp >= 1024 - 0.05)
  exp = units.size - 1 if exp > units.size - 1

  '%.1f %s' % [size.to_f / 1024 ** exp, units[exp]]
end

Compared to the other solutions, it is simpler, more efficient, and produces more accurate output.

Format

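All the other methods have the problem that they report values just under a unit boundary, such as 1023.95 GiB, incorrectly. Moreover, to_filesize simply fails with big numbers: when no unit matches, each_pair returns the hash itself instead of a string.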

 -       method: [     filesize,     Filesize,  number_to_human,  to_filesize ]
 -          0 B: [        0.0 B,       0.00 B,          0 Bytes,         0.0B ]
 -          1 B: [        1.0 B,       1.00 B,           1 Byte,         1.0B ]
 -         10 B: [       10.0 B,      10.00 B,         10 Bytes,        10.0B ]
 -       1000 B: [     1000.0 B,    1000.00 B,       1000 Bytes,      1000.0B ]
 -        1 KiB: [      1.0 KiB,     1.00 KiB,             1 KB,        1.0KB ]
 -      1.5 KiB: [      1.5 KiB,     1.50 KiB,           1.5 KB,        1.5KB ]
 -       10 KiB: [     10.0 KiB,    10.00 KiB,            10 KB,       10.0KB ]
 -     1000 KiB: [   1000.0 KiB,  1000.00 KiB,          1000 KB,     1000.0KB ]
 -        1 MiB: [      1.0 MiB,     1.00 MiB,             1 MB,        1.0MB ]
 -        1 GiB: [      1.0 GiB,     1.00 GiB,             1 GB,        1.0GB ]
 -  1023.95 GiB: [      1.0 TiB,  1023.95 GiB,          1020 GB,    1023.95GB ]
 -        1 TiB: [      1.0 TiB,     1.00 TiB,             1 TB,        1.0TB ]
 -        1 EiB: [      1.0 EiB,     1.00 EiB,             1 EB,        ERROR ]
 -        1 ZiB: [      1.0 ZiB,     1.00 ZiB,          1020 EB,        ERROR ]
 -        1 YiB: [   1024.0 ZiB,  1024.00 ZiB,       1050000 EB,        ERROR ]
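
The boundary row in the table can be reproduced with a small worked example. This is a sketch under stated assumptions: it inlines the two steps of the method above, and it uses 1023.96 GiB rather than 1023.95 to stay clear of floating-point noise at the exact threshold.

```ruby
units = %w[B KiB MiB GiB TiB]
size = 1023.96 * 1024**3 # in bytes, just below 1 TiB

# Step 1: choose the unit from the logarithm alone.
exp = (Math.log(size) / Math.log(1024)).to_i
naive = format('%.1f %s', size / 1024**exp, units[exp])

# Step 2: the adjustment from the method above. If the value would
# round up to 1024.0 at one decimal place, move to the next unit.
exp += 1 if size / 1024**exp >= 1024 - 0.05
adjusted = format('%.1f %s', size / 1024**exp, units[exp])

puts naive    # 1024.0 GiB
puts adjusted # 1.0 TiB
```

Without the adjustment, one-decimal rounding produces "1024.0 GiB", a value that should never appear in a base-1024 display.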

Performance

Also, it has the best performance (seconds to process 1 million numbers):

 - filesize:           2.15
 - Filesize:          15.53
 - number_to_human:  139.63
 - to_filesize:        2.41
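
The answer does not show how these timings were taken. A minimal way to take a similar measurement, assuming a monotonic clock and a smaller iteration count, might look like the sketch below. The helper name seconds_for is hypothetical, and the absolute numbers will differ by machine and Ruby version.

```ruby
# Copy of the method above, with the PiB unit spelled consistently.
def filesize(size)
  units = %w[B KiB MiB GiB TiB PiB EiB ZiB]

  return '0.0 B' if size == 0
  exp = (Math.log(size) / Math.log(1024)).to_i
  exp += 1 if size.to_f / 1024**exp >= 1024 - 0.05
  exp = units.size - 1 if exp > units.size - 1

  '%.1f %s' % [size.to_f / 1024**exp, units[exp]]
end

# Hypothetical helper: calls the block `iterations` times
# and returns the elapsed wall-clock time in seconds.
def seconds_for(iterations)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { |i| yield i }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

elapsed = seconds_for(100_000) { |i| filesize(i * 1024) }
puts format('filesize: %.2f s for 100,000 calls', elapsed)
```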
Irwinirwinn answered 25/11, 2017 at 14:2 Comment(1)
Very cool, thank you. Is this still better than number_to_human_size, etc.? Has Rails incorporated these improvements into its helpers? Dunc
T
4

Here is a method using log10:

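def number_format(d)
   # Every 3 steps of the base-10 exponent move to the next prefix
   e = Math.log10(d).to_i / 3
   # to_f avoids integer division when d is an Integer
   return '%.3f' % (d.to_f / 1000 ** e) + ['', ' k', ' M', ' G'][e]
end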

s = number_format(9012345678.0)
puts s == '9.012 G'

https://ruby-doc.org/core/Math.html#method-c-log10
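
The method above assumes a positive input no larger than the giga range. A slightly more defensive sketch of the same log10 technique could clamp the exponent to the available prefixes. The name si_format and the handling of zero are assumptions, not part of the answer.

```ruby
# Hypothetical variant of the log10 approach with bounds checks.
def si_format(value)
  prefixes = ['', ' k', ' M', ' G']
  return '0.000' if value.zero?

  # Every 3 decimal orders of magnitude select the next prefix.
  exp = (Math.log10(value.abs) / 3).floor
  # Values below 1 or beyond the giga range reuse the nearest prefix.
  exp = exp.clamp(0, prefixes.size - 1)

  format('%.3f', value.to_f / 1000**exp) + prefixes[exp]
end

puts si_format(9012345678.0) # 9.012 G
puts si_format(1500)         # 1.500 k
puts si_format(0.5)          # 0.500
puts si_format(5e12)         # 5000.000 G
```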

Theta answered 17/11, 2020 at 19:37 Comment(0)
A
1

You get points for adding a method to Integer, but this seems more File specific, so I would suggest monkeying around with File, say by adding a method to File called .prettysize().

But here is an alternative solution that uses iteration, and avoids printing single bytes as float :-)

def format_mb(size)
  conv = [ 'b', 'kb', 'mb', 'gb', 'tb', 'pb', 'eb' ];
  scale = 1024;

  ndx=1
  if( size < 2*(scale**ndx)  ) then
    return "#{(size)} #{conv[ndx-1]}"
  end
  size=size.to_f
  [2,3,4,5,6,7].each do |ndx|
    if( size < 2*(scale**ndx)  ) then
      return "#{'%.3f' % (size/(scale**(ndx-1)))} #{conv[ndx-1]}"
    end
  end
  ndx=7
  return "#{'%.3f' % (size/(scale**(ndx-1)))} #{conv[ndx-1]}"
end
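
The suggestion at the top of this answer, attaching the formatting to File rather than Integer, might look like the sketch below. The name File.pretty_size is hypothetical. For simplicity it switches units at 1024 rather than at the 2 * 1024 threshold used in format_mb.

```ruby
require 'tempfile'

# Hypothetical singleton method on File; not a built-in.
def File.pretty_size(path)
  units = %w[b kb mb gb tb pb eb]
  size = File.size(path)
  # Plain bytes are printed as an integer, not as a float.
  return "#{size} #{units[0]}" if size < 1024

  value = size.to_f
  index = 0
  while value >= 1024 && index < units.size - 1
    value /= 1024
    index += 1
  end
  format('%.3f %s', value, units[index])
end

# Usage with a temporary 1536-byte file.
Tempfile.create('pretty') do |f|
  f.write('x' * 1536)
  f.flush
  puts File.pretty_size(f.path) # 1.500 kb
end
```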
Allyl answered 22/5, 2014 at 20:53 Comment(0)
B
1

Filesize may be dead, but now there is ByteSize.

require 'bytesize'

ByteSize.new(1210000000)       #=> (1.21 GB)
ByteSize.new(1210000000).to_s  #=> 1.21 GB
Bacteriostasis answered 23/8, 2022 at 19:31 Comment(2)
Cool when you posted this, but not maintained for 3 years now. Kenyatta
It works well, and does not require any changes, since no issues have been posted for the project. I continue to use this gem. If I ever needed a change, I would just fork the project. If you know of an alternative, please post it here. Bacteriostasis
K
1

Unfortunately, filesize has been abandoned and archived since September 2018, while bytesize has been unmaintained since June 2021.

An alternative is to use ActiveSupport which is:

A toolkit of support libraries and Ruby core extensions extracted from the Rails framework. Rich support for multibyte strings, internationalization, time zones, and testing.

It's meant to be used in Ruby on Rails, but it's possible to use it outside too.

Indeed, the NumberHelper module offers a ready-to-go number_to_human_size method that does just what we need:

Formats number as bytes into a more human-friendly representation. Useful for reporting file sizes to users.

To cherry-pick a specific Active Support feature and keep the footprint minimal, here is what you need to do:

# The first require is needed on Active Support 7+; it can be skipped on 6 and earlier
require 'active_support'
require 'active_support/core_ext/numeric/conversions'

as = ActiveSupport::NumberHelper

as.number_to_human_size(1_000_000_000) # => "954 MB"
as.number_to_human_size(23_897)        # => "23.3 KB"
as.number_to_human_size(1024)          # => "1 KB"
as.number_to_human_size(64)            # => "64 Bytes"

as.number_to_human_size(27_198_870_567) # => "25.3 GB"
as.number_to_human_size(27_198_870_567, precision: 5) # => "25.331 GB"
as.number_to_human_size(27_198_870_567, precision: 2, round_mode: :up) # => "26 GB"
as.number_to_human_size(27_198_870_567_000_000_000_000_000, separator: ',', delimiter: ' ') # => "23 000 ZB"

Active Support loads only the minimum dependencies by default thanks to an autoload mechanism, which is why both require lines are needed.

Kenyatta answered 22/3 at 16:37 Comment(0)
C
0

@Darshan Computing's solution is only partial here. Since hash keys are not guaranteed to be ordered (this applies to Ruby 1.8; from Ruby 1.9 onward hashes preserve insertion order), this approach will not work reliably there. You could fix this by doing something like this inside the to_filesize method,

 conv={
      1024=>'B',
      1024*1024=>'KB',
      ...
 }
 conv.keys.sort.each { |s|
     # skip units that are too small for this number
     next if self >= s
     e=conv[s]
     return "#{(self.to_f / (s / 1024)).round(2)}#{e}"
 }

This is what I ended up doing for a similar method inside Float,

 class Float
   def to_human
     conv={
       1024=>'B',
       1024*1024=>'KB',
       1024*1024*1024=>'MB',
       1024*1024*1024*1024=>'GB',
       1024*1024*1024*1024*1024=>'TB',
       1024*1024*1024*1024*1024*1024=>'PB',
       1024*1024*1024*1024*1024*1024*1024=>'EB'
     }
     conv.keys.sort.each { |mult|
        next if self >= mult
        suffix=conv[mult]
        return "%.2f %s" % [ self / (mult / 1024), suffix ]
     }
   end
 end
Camarata answered 1/10, 2013 at 17:29 Comment(4)
Is there any reason to have the hash keys ordered? The code that you have provided does not take the order of the hash keys into account, but spends additional resources on unnecessary sorting. Also, a benchmark shows that your solution is at least 1.5 times slower than the original one, as fixed by @Darshan-Josiah Barber and extended with PB and EB support. Teetotum
Here is a comparison of the 3 provided solutions (@Darshan-Josiah Barber's plus PB and EB, @Steeve McCauley's, and @ChuckCottrill's):

n = 10000000
Benchmark.bm do |x|
  x.report('to_filesize:') { 1.upto(n) do |i| ; (i.to_filesize); end }
  x.report('format_mb:') { 1.upto(n) do |i| ; (i.format_mb); end }
  x.report('to_human:') { 1.upto(n) do |i| ; (i.to_human); end }
end

                   user     system      total        real
to_filesize: 130.470000   0.740000 131.210000 (134.722463)
format_mb:    86.650000   0.620000  87.270000 ( 89.221210)
to_human:    199.590000   1.040000 200.630000 (203.380451)

Teetotum
@Teetotum Why do you say the order of the keys doesn't matter? If the hash is not ordered, iteration could start with 1024^7; if the number is 1 MiB, it wouldn't be >= mult, and the code would assume it's 'EB'. The keys need to be ordered for the code to work. Irwinirwinn
@FelipeC, Because in Ruby, Hashes enumerate their values in the order that the corresponding keys were inserted. docs.ruby-lang.org/en/2.0.0/Hash.html So if you have created a sorted hash, you do not need to sort it again. Teetotum
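
The insertion-order behavior discussed in these comments can be checked directly on Ruby 1.9 or later. This is a small sketch; the hash contents are shortened from the answer above.

```ruby
# Keys written in ascending order are enumerated in ascending order,
# so sorting them again changes nothing.
conv = {
  1024 => 'B',
  1024**2 => 'KB',
  1024**3 => 'MB'
}
puts conv.keys == conv.keys.sort # true

# The order comes from insertion, not from the key values:
# a hash built in descending order stays in descending order.
reversed = { 1024**3 => 'MB', 1024**2 => 'KB', 1024 => 'B' }
puts reversed.keys.first == 1024**3 # true
```

The extra sort only matters when the hash might have been built out of order, or on Ruby 1.8.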
