C++: what in-memory compression library?

I already googled for in-memory compression and found quite a few libraries that offer this functionality. zlib seems to be widely used, but it also seems to be quite old. I'm asking here whether there are newer, better alternatives.

The data I want to compress in-memory are memory pools a few megabytes in size (2-16 MB), where each block contains data of two different structs as well as some arrays of pointers. Inside a block there is no particular order to the structs and arrays; they are simply allocated one after another as the application creates such elements.

Which compression library would you suggest for this? Both compression and decompression performance matter more to me than compression ratio.

Also, for compression purposes, would it be better to have separate pools for the two different structs and for the arrays, so that each block to be compressed contains only one kind of data?

This is the first time I intend to use in-memory compression, and I know my question may be too general to give a good answer, but every hint is welcome!

Thanks!

Transcendence answered 2/1, 2010 at 17:29 Comment(2)
I would be surprised if any compression library dealt correctly with pointers. – Fitter
The pointers are in fact only address offsets from the start of the pool. – Transcendence

zlib is good. Proven, performant, and understood by many. It's what I'd use by default in a new system like the one you describe. Its age should be seen as one of its greatest assets.
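
For reference, a minimal sketch of zlib's one-shot in-memory API (compressBound, compress2, uncompress). The 4 MB pool here is just a placeholder for one of your blocks, and real code should check every return value:

#include <zlib.h>
#include <cstdio>
#include <vector>

int main() {
    std::vector<unsigned char> pool(4 * 1024 * 1024, 0x42);  // stand-in for one memory pool

    // compressBound() gives the worst-case compressed size for the input.
    uLongf compressedLen = compressBound(pool.size());
    std::vector<unsigned char> compressed(compressedLen);

    // Z_BEST_SPEED (level 1) favors speed over ratio, matching the question.
    if (compress2(compressed.data(), &compressedLen,
                  pool.data(), pool.size(), Z_BEST_SPEED) != Z_OK)
        return 1;
    compressed.resize(compressedLen);

    // Decompress back into a buffer of the original size.
    std::vector<unsigned char> restored(pool.size());
    uLongf restoredLen = restored.size();
    if (uncompress(restored.data(), &restoredLen,
                   compressed.data(), compressed.size()) != Z_OK)
        return 1;

    std::printf("%zu -> %lu bytes\n", pool.size(), (unsigned long)compressedLen);
    return 0;
}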

Safko answered 2/1, 2010 at 17:58 Comment(0)

For something more modern than zlib, libbzip2 might be worth a look. It provides a similar interface to zlib, for compatibility. In a lot of cases, it offers better compression, but at a performance cost.

For something faster than zlib (but which doesn't compress as well), there's LZO.

Danedanegeld answered 2/1, 2010 at 18:05 Comment(1)
bzip2 is not appropriate where high speed is a requirement. – Safko

It makes no sense to do this on modern operating systems with a virtual memory manager. You'll create a blob of bytes that isn't useful for anything, taking space in your virtual memory address space for no good reason. The memory manager won't leave it in RAM for very long; it will notice that the pages occupied by the blob are not being accessed and swap them out to the paging file.

In addition, you'll have to translate the data if it contains pointers. The odds that you'll be able to decompress the data at the exact same virtual memory address, so that the pointers are still valid, are very close to zero. After all, you did this to free up virtual memory space; the hole previously used by the data will be occupied by something else. This translation will probably not be trivial, and it will take lots of additional memory.
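
One way to sidestep the translation problem (which, per the comments, the asker already uses) is to store offsets from the pool base instead of raw pointers, so a block stays valid wherever it is decompressed. A hypothetical sketch; the names PoolRef and makeRef are illustrative, not from any library:

#include <cstddef>
#include <cstdint>

struct PoolRef {
    std::uint32_t offset;  // byte offset from the start of the pool

    // Turn the offset back into a pointer relative to wherever the
    // pool currently lives in memory.
    template <typename T>
    T* resolve(unsigned char* poolBase) const {
        return reinterpret_cast<T*>(poolBase + offset);
    }
};

// Record a reference to an object that lives inside the pool.
template <typename T>
PoolRef makeRef(unsigned char* poolBase, T* object) {
    return PoolRef{static_cast<std::uint32_t>(
        reinterpret_cast<unsigned char*>(object) - poolBase)};
}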

If you are doing this to avoid OOM, look at operating system support for memory mapped files and consider switching to 64-bit code.

Figurehead answered 2/1, 2010 at 18:50 Comment(2)
The pointers are address offsets into the memory pools. I didn't really understand the issue with paging; I need compression because I have a realtime system that creates and reuses a huge amount of data. – Transcendence
Paging is not always available! E.g. I disabled it on my machine, since doing so makes it far more responsive and does not deplete my SSD! – Elocution

If compression/decompression speed is important to you, you should take a look at LZO:

http://www.oberhumer.com/opensource/lzo/

Compared to zlib, the code is smaller and easier to use as well.
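
A rough sketch of the fast lzo1x_1 codec (the pool contents are placeholders, and real code must check every return code):

#include <lzo/lzo1x.h>
#include <cstdio>
#include <vector>

int main() {
    if (lzo_init() != LZO_E_OK) return 1;

    std::vector<unsigned char> pool(4 * 1024 * 1024, 0x42);  // stand-in block

    // Worst-case output size, per the LZO documentation.
    lzo_uint outLen = pool.size() + pool.size() / 16 + 64 + 3;
    std::vector<unsigned char> compressed(outLen);

    // lzo1x_1 needs a caller-provided work buffer of this fixed size.
    std::vector<unsigned char> wrkmem(LZO1X_1_MEM_COMPRESS);

    if (lzo1x_1_compress(pool.data(), pool.size(),
                         compressed.data(), &outLen, wrkmem.data()) != LZO_E_OK)
        return 1;

    // Decompression does not need the work buffer.
    std::vector<unsigned char> restored(pool.size());
    lzo_uint restoredLen = restored.size();
    if (lzo1x_decompress(compressed.data(), outLen,
                         restored.data(), &restoredLen, NULL) != LZO_E_OK)
        return 1;

    std::printf("%zu -> %lu bytes\n", pool.size(), (unsigned long)outLen);
    return 0;
}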

Bontebok answered 2/1, 2010 at 22:44 Comment(0)

I'm not aware of anything newer/better than zlib... zlib works fine, despite its age. zlib's deflateInit() takes an argument that lets you trade off compression speed against compressed size, so you can experiment with that to find the setting that works best for your application.

There are probably C++ wrapper APIs that call the zlib C API for you, if you want something "prettier"... or if there aren't, it's easy enough to write your own.
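
For instance, a hand-rolled wrapper might look something like this (entirely hypothetical, just one possible shape for it):

#include <zlib.h>
#include <stdexcept>
#include <vector>

// Compress a buffer with zlib's one-shot API; throws on failure.
std::vector<unsigned char> zlibCompress(const std::vector<unsigned char>& in,
                                        int level = Z_DEFAULT_COMPRESSION) {
    uLongf outLen = compressBound(in.size());
    std::vector<unsigned char> out(outLen);
    if (compress2(out.data(), &outLen, in.data(), in.size(), level) != Z_OK)
        throw std::runtime_error("zlib compression failed");
    out.resize(outLen);
    return out;
}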

Asis answered 2/1, 2010 at 17:59 Comment(1)
In some (many?) applications, the compression strength knob of zlib is not all that useful. It can make compression take quite a bit longer, but may not reduce the output size as much as simply switching to a different system (like bzip2, which can compress more aggressively than zlib at its maximum setting, though at a large cost in speed). Still, good to point it out. – Safko

For compression, the data matters a lot. Compressing arbitrary binary data in memory is a complete waste of time: it will slow your performance immensely and will probably end up increasing your memory usage.

If you really need much more memory, you should look at using VirtualAlloc or sbrk to control the memory yourself. That way you can address all physical memory, not just 2-4 GB.
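
On Windows, that could look roughly like this reserve-then-commit pattern (sizes are placeholders):

#include <windows.h>

int main() {
    // Reserve a large range of address space without backing it yet.
    const SIZE_T reserveSize = 256 * 1024 * 1024;
    void* base = VirtualAlloc(NULL, reserveSize, MEM_RESERVE, PAGE_NOACCESS);
    if (base == NULL) return 1;

    // Commit the first 16 MB so it is actually backed by memory.
    if (VirtualAlloc(base, 16 * 1024 * 1024, MEM_COMMIT, PAGE_READWRITE) == NULL)
        return 1;

    // ... use the committed region as a pool ...

    VirtualFree(base, 0, MEM_RELEASE);
    return 0;
}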

Sisto answered 3/1, 2010 at 5:11 Comment(0)
