How to compress a buffer with zlib?

There is a usage example at the zlib website: http://www.zlib.net/zlib_how.html

However, the example compresses a file. I would like to compress binary data stored in a buffer in memory, and I don't want to save the compressed buffer to disk either.

Basically here is my buffer:

fIplImageHeader->imageData = (char*)imageIn->getFrame();

How can I compress it with zlib?

I would appreciate a code example showing how to do that.

Drennan answered 27/12, 2010 at 12:20 Comment(1)
What's wrong with the example you mentioned? It does compress a buffer in memory, only it reads data from a file first. You get data from somewhere else, but the rest is the same - why wouldn't it be? - Embarrass
L
38

This example packs a buffer with zlib and saves the compressed contents in a vector.

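#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>
#include <zlib.h>

void compress_memory(void *in_data, size_t in_data_size, std::vector<uint8_t> &out_data)
{
 std::vector<uint8_t> buffer;

 const size_t BUFSIZE = 128 * 1024;
 uint8_t temp_buffer[BUFSIZE];

 // Zero-initialize so that zalloc, zfree and opaque are all Z_NULL (default allocator).
 z_stream strm = {};
 strm.next_in = reinterpret_cast<uint8_t *>(in_data);
 strm.avail_in = in_data_size; // avail_in is a uInt: inputs of 4 GiB or more need chunking
 strm.next_out = temp_buffer;
 strm.avail_out = BUFSIZE;

 int init_res = deflateInit(&strm, Z_BEST_COMPRESSION);
 assert(init_res == Z_OK);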

 while (strm.avail_in != 0)
 {
  int res = deflate(&strm, Z_NO_FLUSH);
  assert(res == Z_OK);
  if (strm.avail_out == 0)
  {
   buffer.insert(buffer.end(), temp_buffer, temp_buffer + BUFSIZE);
   strm.next_out = temp_buffer;
   strm.avail_out = BUFSIZE;
  }
 }

 int deflate_res = Z_OK;
 while (deflate_res == Z_OK)
 {
  if (strm.avail_out == 0)
  {
   buffer.insert(buffer.end(), temp_buffer, temp_buffer + BUFSIZE);
   strm.next_out = temp_buffer;
   strm.avail_out = BUFSIZE;
  }
  deflate_res = deflate(&strm, Z_FINISH);
 }

 assert(deflate_res == Z_STREAM_END);
 buffer.insert(buffer.end(), temp_buffer, temp_buffer + BUFSIZE - strm.avail_out);
 deflateEnd(&strm);

 out_data.swap(buffer);
}
Lolanthe answered 27/12, 2010 at 12:55 Comment(4)
I just read an interesting approach that uses a flag to control the flush state and does it in one loop. Also, it's worth noting that this works equally well with std::string in place of the vector, which is nice for sending over the wire or to another function. - Enjoin
This seems very complicated. Why do all this instead of just using the compress function? - Chetnik
Compress requires you to know the output size and allocate a sufficiently large buffer; this method allows you to realloc() and have a dynamically expanding buffer. - Koralle
Nice code, but temp_buffer produces an "escapes the local scope" warning 3 times. You may want to fix that. - Microfarad
T
51

zlib.h has all the functions you need: compress (or compress2) and uncompress. See the source code of zlib for an answer.

ZEXTERN int ZEXPORT compress OF((Bytef *dest,   uLongf *destLen, const Bytef *source, uLong sourceLen));
/*
         Compresses the source buffer into the destination buffer.  sourceLen is
     the byte length of the source buffer.  Upon entry, destLen is the total size
     of the destination buffer, which must be at least the value returned by
     compressBound(sourceLen).  Upon exit, destLen is the actual size of the
     compressed buffer.

         compress returns Z_OK if success, Z_MEM_ERROR if there was not
     enough memory, Z_BUF_ERROR if there was not enough room in the output
     buffer.
*/

ZEXTERN int ZEXPORT uncompress OF((Bytef *dest,   uLongf *destLen, const Bytef *source, uLong sourceLen));
/*
         Decompresses the source buffer into the destination buffer.  sourceLen is
     the byte length of the source buffer.  Upon entry, destLen is the total size
     of the destination buffer, which must be large enough to hold the entire
     uncompressed data.  (The size of the uncompressed data must have been saved
     previously by the compressor and transmitted to the decompressor by some
     mechanism outside the scope of this compression library.) Upon exit, destLen
     is the actual size of the uncompressed buffer.

         uncompress returns Z_OK if success, Z_MEM_ERROR if there was not
     enough memory, Z_BUF_ERROR if there was not enough room in the output
     buffer, or Z_DATA_ERROR if the input data was corrupted or incomplete.  In
     the case where there is not enough room, uncompress() will fill the output
     buffer with the uncompressed data up to that point.
*/
Townscape answered 8/4, 2012 at 14:26 Comment(2)
+1. This is the damn EASY solution if you want all the default settings. Even if you don't want the default settings you can modify and use the source of these functions. - Gabfest
Unfortunately, before using uncompress() we have to know the size of the uncompressed data to allocate the buffer. If you haven't saved the size previously when you were compressing the data, you are out of luck. - Divide
G
14

You can easily adapt the example by replacing the fread() and fwrite() calls with direct pointers to your data. For zlib compression (referred to as deflate, as you "take out all the air of your data") you allocate a z_stream structure, call deflateInit() and then:

  1. point next_in at the next chunk of data you want to compress
  2. set avail_in to the number of bytes available in next_in
  3. set next_out to where the compressed data should be written which should usually be a pointer inside your buffer that advances as you go along
  4. set avail_out to the number of bytes available in next_out
  5. call deflate
  6. repeat steps 3-5 until avail_out is non-zero (i.e. there's more room in the output buffer than zlib needs - no more data to write)
  7. repeat steps 1-6 while you have data to compress

Eventually you call deflateEnd() and you're done.

You're basically feeding it chunks of input and output until you're out of input and it is out of output.

Grater answered 27/12, 2010 at 12:54 Comment(0)
M
5

The classic way, made more convenient with C++ features

Here's a full example which demonstrates compression and decompression using C++ std::vector objects:

#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <iomanip>
#include <iostream>
#include <vector>
#include <zlib.h>

void add_buffer_to_vector(std::vector<char> &vector, const char *buffer, uLongf length) {
    // Append the whole range at once; this also avoids a signed/unsigned comparison.
    vector.insert(vector.end(), buffer, buffer + length);
}

int compress_vector(std::vector<char> source, std::vector<char> &destination) {
    unsigned long source_length = source.size();
    uLongf destination_length = compressBound(source_length);

    char *destination_data = (char *) malloc(destination_length);
    if (destination_data == nullptr) {
        return Z_MEM_ERROR;
    }

    Bytef *source_data = (Bytef *) source.data();
    int return_value = compress2((Bytef *) destination_data, &destination_length, source_data, source_length,
                                 Z_BEST_COMPRESSION);
    add_buffer_to_vector(destination, destination_data, destination_length);
    free(destination_data);
    return return_value;
}

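int decompress_vector(std::vector<char> source, std::vector<char> &destination) {
    // The uncompressed size is not stored in the zlib stream, so compressBound()
    // cannot be used here. Start with a guess and double the buffer for as long
    // as uncompress() reports Z_BUF_ERROR (output buffer too small).
    uLongf capacity = source.size() * 4 + 64;
    const uLongf maximum_capacity = 1UL << 30; // arbitrary safety limit

    while (true) {
        char *destination_data = (char *) malloc(capacity);
        if (destination_data == nullptr) {
            return Z_MEM_ERROR;
        }

        uLongf destination_length = capacity;
        int return_value = uncompress((Bytef *) destination_data, &destination_length,
                                      (const Bytef *) source.data(), source.size());
        if (return_value == Z_OK) {
            add_buffer_to_vector(destination, destination_data, destination_length);
        }
        free(destination_data);

        if (return_value != Z_BUF_ERROR || capacity >= maximum_capacity) {
            return return_value;
        }
        capacity *= 2;
    }
}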

void add_string_to_vector(std::vector<char> &uncompressed_data,
                          const char *my_string) {
    int character_index = 0;
    while (true) {
        char current_character = my_string[character_index];
        uncompressed_data.push_back(current_character);

        if (current_character == '\00') {
            break;
        }

        character_index++;
    }
}

// https://mcmap.net/q/377894/-how-do-i-print-bytes-as-hexadecimal
void print_bytes(std::ostream &stream, const unsigned char *data, size_t data_length, bool format = true) {
    stream << std::setfill('0');
    for (size_t data_index = 0; data_index < data_length; ++data_index) {
        stream << std::hex << std::setw(2) << (int) data[data_index];
        if (format) {
            stream << (((data_index + 1) % 16 == 0) ? "\n" : " ");
        }
    }
    stream << std::endl;
}

void test_compression() {
    std::vector<char> uncompressed(0);
    auto *my_string = (char *) "Hello, world!";
    add_string_to_vector(uncompressed, my_string);

    std::vector<char> compressed(0);
    int compression_result = compress_vector(uncompressed, compressed);
    assert(compression_result == Z_OK);

    std::vector<char> decompressed(0);
    int decompression_result = decompress_vector(compressed, decompressed);
    assert(decompression_result == Z_OK);

    printf("Uncompressed: %s\n", uncompressed.data());
    printf("Compressed: ");
    std::ostream &standard_output = std::cout;
    print_bytes(standard_output, (const unsigned char *) compressed.data(), compressed.size(), false);
    printf("Decompressed: %s\n", decompressed.data());
}

In your main.cpp simply call:

int main(int argc, char *argv[]) {
    test_compression();
    return EXIT_SUCCESS;
}

The output produced:

Uncompressed: Hello, world!
Compressed: 78daf348cdc9c9d75128cf2fca495164000024e8048a
Decompressed: Hello, world!

The Boost way

#include <iostream>
#include <sstream>
#include <string>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/device/array.hpp>
#include <boost/iostreams/filter/zlib.hpp>
#include <boost/iostreams/filtering_streambuf.hpp>

std::string compress(const std::string &data) {
    boost::iostreams::filtering_streambuf<boost::iostreams::output> output_stream;
    output_stream.push(boost::iostreams::zlib_compressor());
    std::stringstream string_stream;
    output_stream.push(string_stream);
    boost::iostreams::copy(boost::iostreams::basic_array_source<char>(data.c_str(),
                                                                      data.size()), output_stream);
    return string_stream.str();
}

std::string decompress(const std::string &cipher_text) {
    std::stringstream string_stream;
    string_stream << cipher_text;
    boost::iostreams::filtering_streambuf<boost::iostreams::input> input_stream;
    input_stream.push(boost::iostreams::zlib_decompressor());

    input_stream.push(string_stream);
    std::stringstream unpacked_text;
    boost::iostreams::copy(input_stream, unpacked_text);
    return unpacked_text.str();
}

TEST_CASE("zlib") {
    std::string plain_text = "Hello, world!";
    const auto cipher_text = compress(plain_text);
    const auto decompressed_plain_text = decompress(cipher_text);
    REQUIRE(plain_text == decompressed_plain_text);
}
Microfarad answered 5/8, 2018 at 16:49 Comment(6)
Note that uLongf destination_length = compressBound(source_length); char destination_data[destination_length]; is not valid C++. It doesn't have VLAs. - Rootless
@HolyBlackCat: But the code compiles without warning with C++17. - Microfarad
@HolyBlackCat: Alright, I'll use malloc() then. I fixed the code. - Microfarad
Or even better std::vector<uint8_t>, or std::unique_ptr<uint8_t[]>. - Rootless
On decompression you can't simply use compressBound to compute destination length, as per zlib documentation: "The size of the uncompressed data must have been saved previously by the compressor and transmitted to the decompressor by some mechanism outside the scope of this compression library." - Angleworm
@sw0rdf1sh: So sorry, it worked for my simple example thus I assumed it was correct. Feel free to suggest an edit/improvement since I won't refactor this code now that I don't currently need it anymore. - Microfarad
E
3

This is not a direct answer to your question about the zlib API, but you may be interested in the boost::iostreams library paired with zlib.

It lets you use zlib-driven packing algorithms through the usual stream operations, so your data can be compressed by opening a memory stream and doing the << data operation on it.

With boost::iostreams, the corresponding packing filter is invoked automatically for all data that passes through the stream.

Eohippus answered 27/12, 2010 at 12:30 Comment(1)
Just a comment, but this means you have to include the boost iostreams library, boost zlib library, libz, and libzbz2, compared to only libz. So if bundling these libraries is a size issue then it is best to avoid boost in this case. ~Ben - Sholley
