Character to Glyph mapping table

I am following the documentation on apple.com.

I managed to read the 'cmap' encoding subtables. I am certain that platformID and platformSpecificID are correct, but the offset looks suspicious. Here is the data:

array(3) {
  [0]=>
  array(3) {
    ["platform_id"]=>
    int(0)
    ["specific_id"]=>
    int(3)
    ["offset"]=>
    int(532)
  }
  [1]=>
  array(3) {
    ["platform_id"]=>
    int(1)
    ["specific_id"]=>
    int(0)
    ["offset"]=>
    int(28)
  }
  [2]=>
  array(3) {
    ["platform_id"]=>
    int(3)
    ["specific_id"]=>
    int(1)
    ["offset"]=>
    int(532)
  }
}

The offset for two of the subtables is the same, 532. Can anyone explain this to me? And is this offset relative to the current position or to the beginning of the file?

Part 2

Ok. So I managed to get to the format tables using this:

private function parseCmapTable($table)
{
    $this->position         = $table['offset'];

    // http://developer.apple.com/fonts/ttrefman/RM06/Chap6cmap.html
    // General table information

    $data   = array
    (
        'version'           => $this->getUint16(),
        'number_subtables'  => $this->getUint16(),
    );

    $sub_tables = array();

    for($i = 0; $i < $data['number_subtables']; $i++)
    {

        // http://developer.apple.com/fonts/ttrefman/RM06/Chap6cmap.html
        // The 'cmap' encoding subtables

        $sub_tables[]   = array
        (
            'platform_id'       => $this->getUint16(),
            'specific_id'       => $this->getUint16(),
            'offset'            => $this->getUint32(),
        );

    }

    // http://developer.apple.com/fonts/ttrefman/RM06/Chap6cmap.html
    // The 'cmap' formats

    $formats                = array();

    foreach($sub_tables as $t)
    {
        // https://mcmap.net/q/1781407/-character-to-glyph-mapping-table/5322267#5322267

        $this->position = $table['offset'] + $t['offset'];

        $format = array
        (
            'format'                    => $this->getUint16(),
            'length'                    => $this->getUint16(),
            'language'                  => $this->getUint16(),
        );

        if($format['format'] == 4)
        {
            $format     += array
            (
                'seg_count_X2'                  => $this->getUint16(),
                'search_range'                  => $this->getUint16(),
                'entry_selector'                => $this->getUint16(),
                'range_shift'                   => $this->getUint16(),
                'end_code[segCount]'            => $this->getUint16(),
                'reserved_pad'                  => $this->getUint16(),
                'start_code[segCount]'          => $this->getUint16(),
                'id_delta[segCount]'            => $this->getUint16(),
                'id_range_offset[segCount]'     => $this->getUint16(),
                'glyph_index_array[variable]'   => $this->getUint16(),
            );

            $backup = $format;

            $format['seg_count_X2']     = $backup['seg_count_X2']*2;
            $format['search_range']     = 2 * (2 * floor(log($backup['seg_count_X2'], 2)));
            $format['entry_selector']   = log($backup['search_range']/2, 2);
            $format['range_shift']      = (2 * $backup['seg_count_X2']) - $backup['search_range'];
        }

        $formats[$t['offset']]  = $format;
    }       

    die(var_dump( $sub_tables, $formats ));
}

The output:

array(3) {
[0]=>
  array(3) {
    ["platform_id"]=>
    int(0)
    ["specific_id"]=>
    int(3)
    ["offset"]=>
    int(532)
  }
  [1]=>
  array(3) {
    ["platform_id"]=>
    int(1)
    ["specific_id"]=>
    int(0)
    ["offset"]=>
    int(28)
  }
  [2]=>
  array(3) {
    ["platform_id"]=>
    int(3)
    ["specific_id"]=>
    int(1)
    ["offset"]=>
    int(532)
  }
}
array(2) {
  [532]=>
  array(13) {
    ["format"]=>
    int(4)
    ["length"]=>
    int(658)
    ["language"]=>
    int(0)
    ["seg_count_X2"]=>
    int(192)
    ["search_range"]=>
    float(24)
    ["entry_selector"]=>
    float(5)
    ["range_shift"]=>
    int(128)
    ["end_code[segCount]"]=>
    int(48)
    ["reserved_pad"]=>
    int(58)
    ["start_code[segCount]"]=>
    int(64)
    ["id_delta[segCount]"]=>
    int(69)
    ["id_range_offset[segCount]"]=>
    int(70)
    ["glyph_index_array[variable]"]=>
    int(90)
  }
  [28]=>
  array(3) {
    ["format"]=>
    int(6)
    ["length"]=>
    int(504)
    ["language"]=>
    int(0)
  }
}

Now, how do I get from here to the Unicode character codes? I tried reading the documentation, but it is too vague for a novice.

http://developer.apple.com/fonts/ttrefman/RM06/Chap6cmap.html
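
For reference, the format 4 fields after rangeShift (endCode, startCode, idDelta, idRangeOffset) are arrays of segCount entries each, so reading a single uint16 per field, as the code above does, only picks up the first few endCode values. The following is a minimal sketch of reading those arrays and mapping a character code to a glyph index. It assumes the subtable bytes are already in a PHP string; the function names (`readUint16Array`, `parseFormat4`, `glyphIndex`, `mappedCodes`) and the tiny hand-built subtable are hypothetical and only for illustration.

```php
<?php
// Read $count big-endian uint16 values from $bytes starting at $pos.
function readUint16Array(string $bytes, int $pos, int $count): array
{
    if ($count <= 0) {
        return [];
    }
    // unpack() returns a 1-indexed array; array_values() re-indexes from 0.
    return array_values(unpack('n*', substr($bytes, $pos, 2 * $count)));
}

// Parse a format 4 subtable that starts at byte $pos of $bytes.
function parseFormat4(string $bytes, int $pos): array
{
    $h = unpack('nformat/nlength/nlanguage/nsegCountX2', substr($bytes, $pos, 8));
    if ($h['format'] !== 4) {
        throw new InvalidArgumentException('not a format 4 subtable');
    }
    $segCount = intdiv($h['segCountX2'], 2);
    $p = $pos + 14; // 7 header words; searchRange etc. are not needed here
    $end = readUint16Array($bytes, $p, $segCount);
    $p += 2 * $segCount + 2; // endCode array plus reservedPad
    $start = readUint16Array($bytes, $p, $segCount);
    $p += 2 * $segCount;
    $delta = readUint16Array($bytes, $p, $segCount);
    $p += 2 * $segCount;
    $rangeOffset = readUint16Array($bytes, $p, $segCount);
    $p += 2 * $segCount;
    // Whatever remains of the subtable is the glyph index array.
    $glyphs = readUint16Array($bytes, $p, intdiv($pos + $h['length'] - $p, 2));
    return compact('end', 'start', 'delta', 'rangeOffset', 'glyphs');
}

// Map one character code to a glyph index (0 means "missing glyph").
function glyphIndex(array $t, int $code): int
{
    $segCount = count($t['end']);
    for ($i = 0; $i < $segCount; $i++) {
        if ($t['end'][$i] < $code) {
            continue; // first segment whose endCode >= code wins
        }
        if ($t['start'][$i] > $code) {
            return 0;
        }
        if ($t['rangeOffset'][$i] === 0) {
            return ($code + $t['delta'][$i]) & 0xFFFF; // modulo 65536
        }
        // idRangeOffset is a byte offset measured from its own array slot.
        $idx = intdiv($t['rangeOffset'][$i], 2) + ($code - $t['start'][$i]) - ($segCount - $i);
        $glyph = $t['glyphs'][$idx] ?? 0;
        return $glyph === 0 ? 0 : (($glyph + $t['delta'][$i]) & 0xFFFF);
    }
    return 0;
}

// List every character code that maps to a real glyph.
function mappedCodes(array $t): array
{
    $codes = [];
    foreach ($t['start'] as $i => $s) {
        for ($c = $s; $c <= $t['end'][$i] && $c < 0xFFFF; $c++) {
            if (glyphIndex($t, $c) !== 0) {
                $codes[] = $c;
            }
        }
    }
    return $codes;
}

// Hand-built example: 'A'..'C' map to glyphs 1..3, plus the 0xFFFF end segment.
$bytes = pack('n*', 4, 32, 0, 4, 4, 1, 0, 67, 0xFFFF, 0, 65, 0xFFFF, 65472, 1, 0, 0);
$table = parseFormat4($bytes, 0);
$glyphA = glyphIndex($table, ord('A')); // 1
$glyphD = glyphIndex($table, ord('D')); // 0, not covered by any segment
$codes  = mappedCodes($table);          // [65, 66, 67]
```

In a real font, `$pos` would be the 'cmap' table offset plus the subtable offset (532 in the data above), and the segments give the Unicode ranges directly: every code from startCode[i] through endCode[i] is a candidate character.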

Phyllis answered 16/3, 2011 at 7:3 Comment(0)

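The offset is from the beginning of the 'cmap' table, not from the current read position or the start of the file. What your data is saying is that the Mac subtable (platformId 1) starts at offset 28, while the Unicode (platformId 0) and Windows (platformId 3) encodings share one subtable that starts at byte offset 532. Two encoding records pointing at the same offset is normal: both platforms use the same mapping, so the font stores it only once.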
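
As a rough illustration of the offset arithmetic, the sketch below resolves each record to an absolute file position and groups records that share a subtable, so the shared one only needs to be parsed once. The value of `$cmapOffset` is made up; in practice it comes from the font's table directory (it is `$table['offset']` in the question's code).

```php
<?php
// Hypothetical start of the 'cmap' table within the font file.
$cmapOffset = 1000;

// The three encoding records from the question.
$subTables = [
    ['platform_id' => 0, 'specific_id' => 3, 'offset' => 532],
    ['platform_id' => 1, 'specific_id' => 0, 'offset' => 28],
    ['platform_id' => 3, 'specific_id' => 1, 'offset' => 532],
];

// Group records by the absolute position of the subtable they point to.
$byPosition = [];
foreach ($subTables as $record) {
    $absolute = $cmapOffset + $record['offset']; // table start + relative offset
    $byPosition[$absolute][] = $record['platform_id'] . '/' . $record['specific_id'];
}

foreach ($byPosition as $absolute => $encodings) {
    echo $absolute, ': ', implode(', ', $encodings), "\n";
}
// 1532: 0/3, 3/1
// 1028: 1/0
```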

Gumshoe answered 16/3, 2011 at 7:36 Comment(2)
Thank you Gabe. You seem to know this stuff. Can you take a look at part 2 of this question? - Phyllis
@Guy: Rather than turning this question into a completely different one, please ask a second question and post links from each to the other. - Gumshoe
