Why are binary files corrupted when zipping them?

I have a service that delivers zipped files over the web. The zip contains executable files for the Windows platform.

I'm using the RubyZip library to compress the files, but the process corrupts the binaries. On my local server we run the zip command via a system call and it works fine.

The zip command is not available on Heroku, and I'm simply out of options.

I'm using this class:

require 'zip/zip'

# This is a simple example which uses rubyzip to
# recursively generate a zip file from the contents of
# a specified directory. The directory itself is not
# included in the archive, rather just its contents.
#
# Usage:
#   directoryToZip = "/tmp/input"
#   outputFile = "/tmp/out.zip"   
#   zf = ZipFileGenerator.new(directoryToZip, outputFile)
#   zf.write()
class ZipFileGenerator

  # Initialize with the directory to zip and the location of the output archive.
  def initialize(inputDir, outputFile)
    @inputDir = inputDir
    @outputFile = outputFile
  end

  # Zip the input directory.
  def write()
    entries = Dir.entries(@inputDir); entries.delete("."); entries.delete("..") 
    io = Zip::ZipFile.open(@outputFile, Zip::ZipFile::CREATE); 

    writeEntries(entries, "", io)
    io.close();
  end

  # A helper method to make the recursion work.
  private
  def writeEntries(entries, path, io)

    entries.each { |e|
      zipFilePath = path == "" ? e : File.join(path, e)
      diskFilePath = File.join(@inputDir, zipFilePath)
      puts "Deflating " + diskFilePath
      if  File.directory?(diskFilePath)
        io.mkdir(zipFilePath)
        subdir =Dir.entries(diskFilePath); subdir.delete("."); subdir.delete("..") 
        writeEntries(subdir, zipFilePath, io)
      else
        io.get_output_stream(zipFilePath) { |f| f.puts(File.open(diskFilePath, "rb").read())}
      end
    }
  end

end
Synchronize answered 12/5, 2012 at 13:43

theglauber's answer is correct. As stated in the documentation of the IO class, which is File's superclass:

Binary file mode. Suppresses EOL <-> CRLF conversion on Windows. And sets external encoding to ASCII-8BIT unless explicitly specified.

Emphasis mine. On Windows, native line endings (\r\n) are implicitly converted to line feeds (\n) when a file is read in text mode, which is probably what is causing the corruption.

In addition, IO#puts appends a line separator whenever the data does not already end with one (written as \r\n in text mode on Windows), which adds a stray byte to binary file formats.
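
The effect of puts on binary data can be seen without any zip library. The following is a minimal sketch that uses StringIO as a stand-in for the zip output stream; the payload bytes are made up for illustration and are not taken from the question.

```ruby
require 'stringio'

# Hypothetical payload: a few bytes that do not end in a newline,
# standing in for the contents of an executable.
payload = "MZ\x90\x00\x03".b

with_puts = StringIO.new("".b)
with_puts.puts(payload)    # puts appends "\n" because payload has no trailing newline

with_write = StringIO.new("".b)
with_write.write(payload)  # write emits the bytes unchanged

puts with_puts.string.bytesize   # 6
puts with_write.string.bytesize  # 5
```

The single extra byte is enough to invalidate a checksum or a signed executable, which is why write is the safer call for binary content.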

You are also not closing the file returned by File.open. Here is a concise solution that fixes all of these problems (zip_file_path and disk_file_path correspond to zipFilePath and diskFilePath in the question):

io.get_output_stream(zip_file_path) do |out|
  out.write File.binread(disk_file_path)
end
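
As a rough illustration of the point about unclosed files, the sketch below compares three ways of reading a file in binary mode. The temporary file name and its bytes are hypothetical; only File.open, File.binread and File.binwrite from the standard library are used.

```ruby
require 'tmpdir'

bytes = "MZ\x90\x00".b  # hypothetical binary content
still_open = block_handle = via_block = via_binread = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, 'sample.exe')  # hypothetical file name
  File.binwrite(path, bytes)

  # Pattern from the question: the File object stays open until it is
  # closed explicitly (or garbage-collected).
  manual = File.open(path, 'rb')
  manual.read
  still_open = !manual.closed?
  manual.close

  # Block form: the file is closed automatically when the block returns.
  via_block = File.open(path, 'rb') do |f|
    block_handle = f
    f.read
  end

  # File.binread opens in binary mode, reads, and closes in one call.
  via_binread = File.binread(path)
end

puts still_open            # true
puts block_handle.closed?  # true
puts via_binread == bytes  # true
```

File.binread is the shortest of the three, at the cost of loading the whole file into memory at once.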
Heliolatry answered 12/5, 2012 at 15:5 Comment(1)
Thanks, Matheus! That solved the problem. Thanks to @theglauber too. I guess I was too desperate to RTFM :) Cheers! - Alterant

If this is Windows, you may have to open your output file in binary mode.

For example: io = File.open('foo.zip','wb')
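
A minimal sketch of that suggestion, assuming a hypothetical payload and a temporary directory: the block form of File.open with 'wb' writes the bytes untranslated and closes the file afterwards. On non-Windows platforms text mode would also leave these bytes alone, so this only shows the API usage, not the Windows-specific failure.

```ruby
require 'tmpdir'

# Hypothetical bytes containing \r\n and \x1A, which text mode on Windows
# may translate or treat specially.
payload = "PK\x03\x04\r\n\x1A\n".b
read_back = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, 'foo.zip')
  File.open(path, 'wb') { |io| io.write(payload) }     # binary write, auto-closed
  read_back = File.open(path, 'rb') { |io| io.read }   # binary read, auto-closed
end

puts read_back == payload  # true
```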

Maisel answered 12/5, 2012 at 13:47
