Dynamically created zip files by ZipStream in PHP won't open in OSX

I have a PHP site with a lot of media files and users need to be able to download multiple files at a time as a .zip. I'm trying to use ZipStream to serve the zips on the fly with "store" compression so I don't actually have to create a zip on the server, since some of the files are huge and it's prohibitively slow to compress them all.

This works great, and the resulting files can be opened without errors by every zip program I've tried except OS X's default unzipping program, Archive Utility. You double-click the .zip file, Archive Utility decides it doesn't look like a real zip, and instead compresses it into a .cpgz file.

Using unzip or ditto in the OS X terminal or StuffIt Expander unzips the file with no problem but I need the default program (Archive Utility) to work for the sake of our users.

What sort of things (flags, etc.) in otherwise acceptable zip files can trip Archive Utility into thinking a file isn't a valid zip?

I've read this question, which seems to describe a similar issue, but I don't have any of the general purpose bit flags set, so it's not the third-bit issue. I'm also pretty sure I have valid CRC-32s because when I don't, WinRAR throws a fit.

I'm happy to post some code or a link to a "bad" zip file if it would help but I'm pretty much just using ZipStream, forcing it into "large file mode" and using "store" as the compression method.

Edit - I've tried the "deflate" compression algorithm as well and get the same results, so I don't think it's the "store". It's also worth pointing out that I'm pulling down the files one at a time from a storage server and sending them out as they arrive, so a solution that requires all the files to be downloaded before sending anything isn't going to be viable. (An extreme example is 5GB+ of 20MB files: the user can't wait for all 5GB to transfer to the zipping server before their download starts, or they'll think it's broken.)

Here's a 140 byte, "store" compressed, test zip file that exhibits this behavior: http://teknocowboys.com/test.zip

Falkirk answered 6/4, 2011 at 21:31 Comment(0)
F
10

The problem was in the "version needed to extract" field, which I found by doing a hex diff on a file created by ZipStream vs a file created by Info-zip and going through the differences, trying to resolve them.

ZipStream by default sets it to 0x0603. Info-zip sets it to 0x000A. Zip files with the former value don't seem to open in Archive Utility. Perhaps it doesn't support the features at that version?

Forcing the "version needed to extract" to 0x000A made the generated files open as well in Archive Utility as they do everywhere else.
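To check which value a given file carries without a hex editor, something like the following might help. It is only a sketch: `versionNeededToExtract` is a hypothetical helper (not part of ZipStream), and it assumes the data begins with a local file header, where the 4-byte signature `PK\x03\x04` is followed by the 2-byte little-endian "version needed to extract" field.

```php
<?php
/**
 * Return the "version needed to extract" value from the first local file
 * header of a zip, or null if the data does not start with one.
 * Hypothetical helper for inspection only; not part of ZipStream.
 */
function versionNeededToExtract(string $zipBytes): ?int
{
    // Local file header: 4-byte signature "PK\x03\x04", then a 2-byte
    // little-endian "version needed to extract" field.
    if (strlen($zipBytes) < 6 || substr($zipBytes, 0, 4) !== "PK\x03\x04") {
        return null;
    }
    $fields = unpack('vversion', substr($zipBytes, 4, 2));
    return $fields['version'];
}
```

Passing it the first six bytes of a generated archive and printing the result with `printf("0x%04X\n", $version)` makes it easy to compare a ZipStream file against an Info-zip one.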

Edit: Another cause of this issue is if the zip file was downloaded using Safari (user agent version >= 537) and you under-reported the file size when you sent out your Content-Length header.

The solution we employ is to detect Safari >= 537 server side. For those clients, we determine the difference between the Content-Length value we sent and the actual size (how you do this depends on your specific application) and, after calling $zipStream->finish(), we echo chr(0) until we reach the advertised length. The resulting file is technically malformed and any comment you put in the zip won't be displayed, but all zip programs will be able to open it and extract the files.

IE requires the same hack if you're misreporting your Content-Length but instead of downloading a file that doesn't work, it just won't finish downloading and throws a "download interrupted".
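The padding step could be sketched roughly like this. `contentLengthPadding` is a hypothetical helper, and both `$declaredLength` (what went into the Content-Length header) and `$bytesSent` (what was actually written) are assumed to be tracked by your own application; ZipStream is not assumed to expose either.

```php
<?php
/**
 * Build the NUL padding needed to reach the advertised Content-Length.
 * Hypothetical helper; both byte counts must be tracked by the application.
 */
function contentLengthPadding(int $declaredLength, int $bytesSent): string
{
    $missing = $declaredLength - $bytesSent;
    // Nothing to pad if the full advertised length (or more) was already sent.
    return $missing > 0 ? str_repeat(chr(0), $missing) : '';
}
```

The result would be echoed once, right after `$zipStream->finish()`. Note that this only covers under-reported lengths; if more bytes were sent than advertised, padding cannot help.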

Falkirk answered 6/4, 2011 at 23:45 Comment(6)
See also this similar question: #1680486 - Hookah
Life saver. I had the EXACT same problem (was even using ZipStream). - De
Hi ZorroDeLaArena, I am very happy to see your question on Stack Overflow. Basically I am trying to use the ZipStream library to dynamically download and zip large files from Amazon S3, but with no success so far. If you don't mind, can you please share your code block for my reference to handle this scenario? - Dominickdominie
Really struggling to use the ZipStream library with files stored on S3. I think you are already using this library for handling files from a different server. I feel your code block will help me to handle my situation, so can you please help me? Thanks, Siva... - Dominickdominie
I found 2 matches for 'version needed to extract' in the zipstream.php file, so which one did you modify and what is the new value you set for that field? - Dominickdominie
Have you tried this more recently? I have a Symfony 2.8 app with ZipStream using the addFiletoStream method. I've set the "version needed to extract" to 0x000A and it fails when attempting to decompress with Archive Utility on OSX El Capitan. Error is: Unable to expand "test.zip" into "Downloads". (Error 2 - No such file or directory.) - Salcido
O
5

Use ob_clean() and flush() before sending the file.

Example:

    $file =  __UPLOAD_PATH . $projectname . '/' . $fileName;

    $zipname = "whatever.zip";
    $zip = new ZipArchive();
    $zip_full_path_name = __UPLOAD_PATH . $projectname . '/' . $zipname;
    $zip->open($zip_full_path_name, ZipArchive::CREATE);
    $zip->addFile($file); // Adding one file for testing
    $zip->close();

    if(file_exists($zip_full_path_name)){
        header('Content-type: application/zip');
        header('Content-Disposition: attachment; filename="'.$zipname.'"');
        ob_clean();
        flush();
        readfile($zip_full_path_name);
        unlink($zip_full_path_name);
    }
Oblation answered 13/11, 2012 at 16:17 Comment(0)
E
2

I've had this exact issue but with a different cause.

In my case the PHP-generated zip would open from the command line, but not via Finder in OS X.

I had made the mistake of allowing some HTML content into the output buffer prior to creating the zip file and sending that back as the response.

<some html></....>
<?php

// Output a zip file...

The command line unzip program was evidently tolerant of this but the Mac unarchive function was not.
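One way to spot this kind of leak is to look at how many bytes come before the first zip signature in the saved response. This is just a sketch: `bytesBeforeZipSignature` is a hypothetical helper, and it assumes the archive contains at least one file, so that it should begin with the local file header signature `PK\x03\x04`.

```php
<?php
/**
 * Return the number of bytes that precede the first local file header
 * signature, or null if no signature is found. Hypothetical helper.
 */
function bytesBeforeZipSignature(string $responseBody): ?int
{
    $offset = strpos($responseBody, "PK\x03\x04");
    return $offset === false ? null : $offset;
}
```

A result of 0 means the download starts with zip data; anything larger suggests HTML, whitespace, or warnings were emitted before the archive.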

Etan answered 11/8, 2014 at 2:11 Comment(0)
V
1

No idea. If the external ZipStream class doesn't work, try another option. The PHP ZipArchive extension won't help you, since it doesn't support streaming and only ever writes to files.

But you could try the standard Info-zip utility. It can be invoked from within PHP like this:

#header("Content-Type: application/zip");
passthru("zip -0 -q -r - *.*");

That would send an uncompressed zip file directly back to the client.

If that doesn't help, then the MacOS zip frontend probably doesn't like uncompressed stuff. Remove the -0 flag then.

Valli answered 6/4, 2011 at 21:52 Comment(4)
The passthru's a good idea if nothing else works. Thanks! The advantage to using the library is the files are actually hosted on a separate server and I pull them down temporarily before "zipping". Using "zip" requires all the files be pulled down before sending any data, giving the user a long, "is this website broken?" wait before anything visibly happens, while creating the zip myself lets me pass files as they come in, giving the appearance of constant progress. - Falkirk
Ah okay. That's a significant difficulty then. It would require too elaborate a workaround (nfs, sshfs or davfs) to make this work with the zip utility. Maybe you should try to enable the compression for that zipstream class then, at least for testing. Maybe that changes the ZIP format enough to make OSX understand it. - Valli
I've actually tried it with the "deflate" algorithm and I get the same results. (I probably should've mentioned that in the original question. Sorry. I'll update it.) I think it's some issue in the zip headers but I guess I don't really know. Maybe I can make the same file with "zip" and zipstream and do a binary diff to see what zip does differently. - Falkirk
Maybe you could post a test zip here (just a single uncompressed README.TEST), with a binary/base64 dump. Maybe someone is inclined to poke it with a hex editor. - Valli
E
1

The InfoZip commandline tool I'm using, both on Windows and Linux, uses version 20 for the zip's "version needed to extract" field. This is needed on PHP as well, as the default compression is the Deflate algorithm. Thus the "version needed to extract" field should really be 0x0014. If you alter the "(6 << 8) +3" code in the referenced ZipStream class to just "20", you should get a valid Zip file across platforms.

By writing 0x0603, the library's author is basically telling you that the zip file was created in OS/2 using the HPFS file system, and that the Zip version needed predates InfoZip 1.0. Not many implementations know what to do about that one any longer ;)
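To make that reading of the value concrete, here is a small sketch that splits the field the way described above. `decodeZipVersionField` is a hypothetical helper; it assumes the high byte is a host system code (6 being OS/2 HPFS in the zip spec's table) and the low byte encodes the version as major * 10 + minor.

```php
<?php
/**
 * Split a 16-bit version field into its high byte (host system code) and
 * the version encoded in its low byte (value / 10 = major, value % 10 = minor).
 * Hypothetical helper for illustration only.
 */
function decodeZipVersionField(int $field): array
{
    $low = $field & 0xFF;
    return [
        'host'  => ($field >> 8) & 0xFF,
        'major' => intdiv($low, 10),
        'minor' => $low % 10,
    ];
}

// decodeZipVersionField((6 << 8) + 3) gives host 6, version 0.3
// decodeZipVersionField(20)           gives host 0, version 2.0
```

Seen this way, 0x000A from the accepted answer is host 0, version 1.0, and 20 (0x0014) is host 0, version 2.0.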

Elide answered 21/5, 2011 at 9:13 Comment(0)
S
1

For those using ZipStream in Symfony, here's your solution: https://mcmap.net/q/1162297/-using-zipstream-in-symfony-streamed-zip-download-will-not-decompress-using-archive-utility-on-mac-osx

use Symfony\Component\HttpFoundation\StreamedResponse;
use Aws\S3\S3Client;    
use ZipStream;

//...

/**
 * @Route("/zipstream", name="zipstream")
 */
public function zipStreamAction()
{
    //test file on s3
    $s3keys = array(
      "ziptestfolder/file1.txt"
    );

    $s3Client = $this->get('app.amazon.s3'); //s3client service
    $s3Client->registerStreamWrapper(); //required

    $response = new StreamedResponse(function() use($s3keys, $s3Client) 
    {

        // Define suitable options for ZipStream Archive.
        $opt = array(
                'comment' => 'test zip file.',
                'content_type' => 'application/octet-stream'
              );
        //initialise zipstream with output zip filename and options.
        $zip = new ZipStream\ZipStream('test.zip', $opt);

        //loop keys useful for multiple files
        foreach ($s3keys as $key) {
            // Get the file name in S3 key so we can save it to the zip 
            //file using the same name.
            $fileName = basename($key);

            //concatenate s3path.
            $bucket = 'bucketname';
            $s3path = "s3://" . $bucket . "/" . $key;        

            //addFileFromStream
            if ($streamRead = fopen($s3path, 'r')) {
              $zip->addFileFromStream($fileName, $streamRead);        
            } else {
              die('Could not open stream for reading');
            }
        }

        $zip->finish();

    });

    return $response;
}

If your controller action's response is not a StreamedResponse, you are likely to get a corrupted zip containing HTML, as I found out.

Salcido answered 23/6, 2017 at 11:33 Comment(0)
M
1

It's an old question, but I'll leave what worked for me in case it helps someone else. When setting the options, you need to set zero header to true and enable Zip64 to false (this will limit the archive to 4 GB, though):

$options->setZeroHeader(true);
$options->setEnableZip64(false);

Everything else as described by Forer. Solution found on https://github.com/maennchen/ZipStream-PHP/issues/71

Munch answered 8/1, 2020 at 13:16 Comment(0)
