Decoding a JPEG Huffman block (table)
Asked Answered
O

1

15

The following block is nested by Huffman block markers

-HUFF---------------------------------------------------------------------0084-
  10    0    1    2    4    3    4    6    5    6    8    a    9    4    2    3
   0    1    2   11    0    3    4   21    5   12   31    6   41   51   61   13
  22   71   81   91   a1   14   32   b1   d1   f0   15   23   35   42   b2   c1
   7   16   24   33   52   72   73   e1   25   34   43   53   62   74   82   94
  a2   f1   26   44   54   63   64   92   93   c2   d2   55   56   84   b3   45
  83   46   a3   e2
-------------------------------------------------------------------------------

0084 is the length of the table as an integer and is not included in the block here

according to the JPEG standard, the first address aparently makes it an AC table at destination 0 (0x10)

and aparently from there onwards it's a huffman table.

So, how is it decoded?

Orton answered 14/10, 2009 at 1:54 Comment(4)
"0084 is the length of the table as an integer" - you mean as an integer in decimal (base 10). All these numbers are integers. The length was originally 0x0054 - an interger in hex - and you converted it to decimal for us.Yaupon
What should have been obvious by my note was that this was printed using a format specifier in the stdio.h's printf function containing %04i which is the specifier for signed integers with leading zeroes and at least four numerals.Orton
I agree that it must be obvious if one is accustomed to equating the concept of integers with that of whole-number decimals. In the context of a sea of bytes, representing (whole) numbers written in hex, and in the very position where an actual DHT would have the same length, itself represented as two bytes in the same format as the rest of the data, it was less obvious.Yaupon
(In a DHT, the length of the table is the two bytes immediately preceeding the table data.)Yaupon
R
23

The next 16 bytes after the 0x10 tell you how many codes of each length. In your example, there are 0 codes of length 1 bit, 1 code of length 2 bits, 2 codes of length 3 bits, 4 codes of length 4 bits, 3 codes of length 5 bits, and so on.

These are then followed by the values that are encoded by those codes, in order. Again from your example:

Code length | Number | Symbol(s)
------------+--------+----------
1 bit       | 0      |
2 bits      | 1      | 0x01
3 bits      | 2      | 0x02 0x11
4 bits      | 4      | 0x00 0x03 0x04 0x21
5 bits      | 3      | 0x05 0x12 0x31
... etc

You then build a binary tree from the top down, assigning the symbols in order. In this example, you get:

Symbol | Code 
-------+------
0x01   | 00
0x02   | 010
0x11   | 011
0x00   | 1000
0x03   | 1001
0x04   | 1010
0x21   | 1011
...etc
Rennarennane answered 14/10, 2009 at 8:13 Comment(2)
thanks a lot man this has helped in my understanding a lot :)Orton
Edit: Reread the answer, it is first length of codes, then codewords! (Is it always 16 codewords?), then I know the answer to my question! Thx -- Original Comment: How do you get from the Number of Symbols to the actual Symbols. How do I know that the one 2bits symbol is 0x01, and not 0x00, 0x10 or 0x11? ...Okay

© 2022 - 2024 — McMap. All rights reserved.