Create zip file in memory from bytes (text with arbitrary encoding)

The application I'm developing needs to compress XML files into zip files and send them through HTTP requests to a web service. As I don't need to keep the zip files, I'm performing the compression in memory. The web service is rejecting my requests because the zip files are apparently malformed.

I know there is a solution in this question which works perfectly, but it uses a StreamWriter. My problem with that solution is that StreamWriter requires an encoding or assumes UTF-8, and I do not need to know the encoding of the XML files. I just need to read the bytes from those files and store them inside a zip file, whatever encoding they use.

So, to be clear, this question has nothing to do with encodings, as I don't need to transform the bytes into text or the opposite. I just need to compress a byte[].

I'm using the following code to test how my zip file is malformed:

static void Main(string[] args)
{
    Encoding encoding = Encoding.GetEncoding("ISO-8859-1");

    string xmlDeclaration = "<?xml version=\"1.0\" encoding=\"" + encoding.WebName.ToUpperInvariant() + "\"?>";
    string xmlBody = "<Test>ª!\"·$%/()=?¿\\|@#~€¬'¡º</Test>";
    string xmlContent = xmlDeclaration + xmlBody;
    byte[] bytes = encoding.GetBytes(xmlContent);
    string fileName = "test.xml";
    string zipPath = @"C:\Users\dgarcia\test.zip";

    Test(bytes, fileName, zipPath);
}

static void Test(byte[] bytes, string fileName, string zipPath)
{
    byte[] zipBytes;

    using (var memoryStream = new MemoryStream())
    using (var zipArchive = new ZipArchive(memoryStream, ZipArchiveMode.Create, leaveOpen: false))
    {
        var zipEntry = zipArchive.CreateEntry(fileName);
        using (Stream entryStream = zipEntry.Open())
        {
            entryStream.Write(bytes, 0, bytes.Length);
        }

        // Edit: as the accepted answer states, the problem is here, because I'm reading from the memoryStream before the zipArchive is disposed.
        zipBytes = memoryStream.ToArray();
    }

    using (var fileStream = new FileStream(zipPath, FileMode.OpenOrCreate))
    {
        fileStream.Write(zipBytes, 0, zipBytes.Length);
    }
}

If I try to open that file, I get an "Unexpected end of file" error. So apparently, the web service is correctly reporting a malformed zip file. What I have tried so far:

  • Flushing the entryStream.
  • Closing the entryStream.
  • Both flushing and closing the entryStream.

Note that if I open the zipArchive directly on the fileStream, the zip file is created with no errors. However, the fileStream is just there as a test, and I need to create my zip file in memory.

Titivate answered 22/2, 2018 at 12:35 Comment(2)
Not sure if it matters, but your test input will be corrupt (€) if the other end does care about encoding and assumes, say, UTF-8. A string in C# is always UTF-16, so why not just go ahead and write it as UTF-8? - Pam
If I'm not wrong, the other end should care about encoding and should use exactly the encoding that the XML declaration states. So, in this case, all the symbols would be encoded using ISO-8859-1, and the other end should decode them using ISO-8859-1 as well. - Jerrold

You are reading the bytes from the MemoryStream too early: ZipArchive has not written all of them yet, because it only finishes the archive when it is disposed. Instead, do it like this:

using (var memoryStream = new MemoryStream()) {
    // note "leaveOpen" true, to not dispose memoryStream too early
    using (var zipArchive = new ZipArchive(memoryStream, ZipArchiveMode.Create, leaveOpen: true)) {
        var zipEntry = zipArchive.CreateEntry(fileName);
        using (Stream entryStream = zipEntry.Open()) {
            entryStream.Write(bytes, 0, bytes.Length);
        }                    
    }
    // now, after zipArchive is disposed - all is written to memory stream
    zipBytes = memoryStream.ToArray();
}
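
As a rough sketch of how the fix above could be packaged and checked, the snippet below wraps the same pattern in a method and adds a second method that reads the entry back as raw bytes. The class name `InMemoryZip` and the methods `Create` and `ReadEntry` are hypothetical names chosen for illustration, not part of the original code. Only `ZipArchive`, `ZipArchiveEntry` and `MemoryStream` come from the framework.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class InMemoryZip
{
    // Hypothetical helper: builds a zip archive with a single entry, entirely in memory.
    public static byte[] Create(byte[] content, string entryName)
    {
        using (var memoryStream = new MemoryStream())
        {
            // leaveOpen: true keeps memoryStream usable after the archive is disposed.
            using (var zipArchive = new ZipArchive(memoryStream, ZipArchiveMode.Create, leaveOpen: true))
            {
                ZipArchiveEntry entry = zipArchive.CreateEntry(entryName);
                using (Stream entryStream = entry.Open())
                {
                    // Raw bytes are written as-is; no text encoding is involved.
                    entryStream.Write(content, 0, content.Length);
                }
            }

            // The archive has been disposed, so it is complete at this point.
            return memoryStream.ToArray();
        }
    }

    // Hypothetical helper: reads one entry back as raw bytes, without decoding to text.
    public static byte[] ReadEntry(byte[] zipBytes, string entryName)
    {
        using (var memoryStream = new MemoryStream(zipBytes))
        using (var zipArchive = new ZipArchive(memoryStream, ZipArchiveMode.Read))
        {
            ZipArchiveEntry entry = zipArchive.GetEntry(entryName);
            if (entry == null)
            {
                throw new FileNotFoundException("Entry not found in archive.", entryName);
            }

            using (Stream entryStream = entry.Open())
            using (var output = new MemoryStream())
            {
                entryStream.CopyTo(output);
                return output.ToArray();
            }
        }
    }
}
```

Calling `InMemoryZip.Create(bytes, "test.xml")` and then `InMemoryZip.ReadEntry(zipBytes, "test.xml")` should return the original bytes unchanged, which is a quick way to confirm that the archive is well formed before sending it to the web service.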
Dumfries answered 22/2, 2018 at 12:44 Comment(2)
This was exactly the reason why my code was not working. Thank you. - Jerrold
Is there no need to do a Flush at the end? I'm not sure; some streams require it, others don't. - Chrominance

If you use a memory stream to load your text, you can control the encoding, and the result works across a WCF service. This is the implementation I am currently using, and it works on my WCF services:

    // Requires: using System.IO; using System.IO.Compression; using System.Text;

    private byte[] Zip(string text)
    {
        var bytes = Encoding.UTF8.GetBytes(text);

        using (var msi = new MemoryStream(bytes))
        using (var mso = new MemoryStream())
        {
            // Dispose the GZipStream before reading mso so that all compressed data is written.
            using (var gs = new GZipStream(mso, CompressionMode.Compress))
            {
                msi.CopyTo(gs);
            }

            return mso.ToArray();
        }
    }

    private string Unzip(byte[] bytes)
    {
        using (var msi = new MemoryStream(bytes))
        using (var mso = new MemoryStream())
        {
            using (var gs = new GZipStream(msi, CompressionMode.Decompress))
            {
                gs.CopyTo(mso);
            }

            return Encoding.UTF8.GetString(mso.ToArray());
        }
    }
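
One point worth checking before using this approach for the original question: `GZipStream` produces a gzip stream, not a zip archive, so a service that expects a zip file would presumably not accept its output. The sketch below compares the leading signature bytes of a buffer against the two formats. The class `ContainerFormat` and its methods are hypothetical names used only for illustration. The signature values (0x1F 0x8B for gzip, "PK" followed by 0x03 0x04 for a zip local file header) come from the respective format specifications.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class ContainerFormat
{
    // Hypothetical helper: true when the buffer starts with the zip local file header signature "PK\x03\x04".
    public static bool LooksLikeZip(byte[] data)
    {
        return data.Length >= 4
            && data[0] == 0x50 && data[1] == 0x4B
            && data[2] == 0x03 && data[3] == 0x04;
    }

    // Hypothetical helper: true when the buffer starts with the gzip magic bytes 0x1F 0x8B.
    public static bool LooksLikeGzip(byte[] data)
    {
        return data.Length >= 2 && data[0] == 0x1F && data[1] == 0x8B;
    }

    // Compresses raw bytes with GZipStream, mirroring the Zip method above but without any text encoding step.
    public static byte[] GzipBytes(byte[] content)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(content, 0, content.Length);
            }

            // MemoryStream.ToArray still works after the stream has been closed.
            return output.ToArray();
        }
    }
}
```

Running `ContainerFormat.LooksLikeZip` on the output of `GzipBytes` should return false, while `LooksLikeGzip` should return true, which makes the difference between the two container formats visible without opening the data in an archive tool.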
Duisburg answered 22/2, 2018 at 12:48 Comment(0)
