What are the real-world applications of huffman coding?
Asked Answered
O

6

29

I am told that Huffman coding is used as loseless data compression algorithm, but I am also told that real data compress software do not employ Huffman coding, because if the keys are not distributed decentralized enough, the compressed file could be even larger than the orignal file.

This leaves me wondering are there any real-world application of Huffman coding?

Oust answered 4/2, 2010 at 11:55 Comment(2)
someone is telling you porkies.Fecal
Honestly, "are there any real-world non-Huffman compression?" would be a more interesting question (there are, but it would be more interesting) seen the Real-World[TM] success of Huffman and Adaptive Huffman encoding/compression. The one who told you that "real data compress software do not employ huffman" is not right in his own mind.Tanker
F
39

Huffman is widely used in all the mainstream compression formats that you might encounter - from GZIP, PKZIP (winzip etc) and BZIP2, to image formats such as JPEG and PNG.

All compression schemes have pathological data-sets that cannot be meaningfully compressed; the archive formats I listed above simply 'store' such files uncompressed when they are encountered.

Newer arithmetic and range coding schemes are often avoided because of patent issues, meaning Huffman remains the work-horse of the compression industry.

Fecal answered 4/2, 2010 at 12:4 Comment(4)
So you mean huffman is actually the 'basis' if not 'core' of compression industry?Oust
Absolutely. That is exactly what I mean.Fecal
Yeah, your question was like asking "Give me an example of a car made out of steel."Deft
I've never heard of a compression industry. Don't you mean software industry?Deft
T
5

See Wikipedia article on the subject:

Huffman coding today is often used as a "back-end" to some other compression method. DEFLATE (PKZIP's algorithm) and multimedia codecs such as JPEG and MP3 have a front-end model and quantization followed by Huffman coding.

Trujillo answered 4/2, 2010 at 11:57 Comment(3)
What is "back-end"?what is "front-end"?Oust
@jcyang: they are just two different parts of the system. Back-end is probably closer to writing the file and front-end probably near where it reads the file.Deft
'backend' means the encoding of values that have been first preprocessed and possibly compressed with another algorithm. For example, DEFLATE uses LZ77 to encode duplicate sequences, before Huffman to encode those characters not in sequences.Fecal
R
3

There are quite a lot of real-world applications of Huffman Encoding. ZIP is perhaps the most widely used compression tool that uses Huffman Encoding as its basis. The latest of the most efficient lossless compression algorithms, Brotli Compression, released by Google last month also uses Huffman Coding. Apart from that, Brotli also uses LZ77 and a few other fundamental lossless compression algorithms. Refer to Brotli.

Rehm answered 15/10, 2015 at 9:30 Comment(0)
D
2

When one considers compression algorithms there are often benefits and disadvantages to each. It is the nature of compression that given a set of input, there exists better and worse compression algorithms for that data.

Huffman is really, really good at some things. Most notably with data that repeats order a lot and contains a sub-set of the character space. For example english language text files. The english language tends to have the same letters followed by the same other letters.

If your professor or book gave you the impression that Huffman is not used, they are wrong. For example almost all communications with and from the internet are at some point Huffman encoded. (A number of communication protocols use it.) Most image files (jpegs) are Huffman encoded. Most music files (mp3s) are Huffman encoded. There are many other examples.

One reason Huffman is used is because it can be "discovered" via a slightly different algorithm called adaptive Huffman. As you read the file you learn the Huffman code and "compress as you go". This is a simplified overview , but you get the idea.

To solve the use the best algorithm for the situation problem, zip files allow a number of different compressions to be used depending on what the best one is for a given file.

Deft answered 4/2, 2010 at 12:6 Comment(3)
Huffman is not 'discovered' - it's not stream based. There are stream-based 'adaptive' Huffman variations, but they are sufficiently different that nobody would assume you meant an adaptive variation if you simply said 'Huffman'.Fecal
Which internet protocols use it?Fecal
internet protocols was the wrong term -- communication protocols is what I meant. Changing it.Deft
A
2

A very widespread application is the encoding of strings in HPACK, the header compression technique of http/2.

The RFC does directly provide a Huffman Code Table that is optimized for compressing HTTP headers.

Alembic answered 2/4, 2019 at 18:43 Comment(0)
A
0

Huffman code is used to convert fixed length codes into varible length codes, which results in lossless compression. Variable length codes may be further compressed using JPEG and MPEG techniques to get the desired compression ratio.

Angleaangler answered 30/7, 2015 at 15:54 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.