decode 7-bit GSM
5

7

I found this post on how to encode ASCII data to the 7-bit GSM character set. How would I decode 7-bit GSM characters again (reverse them back to ASCII)?

Techno answered 29/10, 2012 at 23:5 Comment(0)
2

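For Python 2 (this maps between text and unpacked GSM codes, one code per byte; it does not unpack septets from a packed PDU):

# -*- coding: utf-8 -*-
# The tables must be indexed by character, not by UTF-8 byte, so make
# every string literal below a unicode string.
from __future__ import unicode_literals
import binascii

gsm = ("@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
       "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà")
# In ext, '`' is only a filler for slots that have no extension character.
ext = ("````````````````````^```````````````````{}`````\\````````````[~]`"
       "|````````````````````````````````````€``````````````````````````")

def gsm_encode(plaintext):
    result = []
    for c in plaintext:
        idx = gsm.find(c)
        if idx != -1:
            result.append(chr(idx))
            continue
        # Never match the filler character itself in the extension table.
        idx = ext.find(c) if c != '`' else -1
        if idx != -1:
            # Extension characters are sent as ESC (27) plus the table index.
            result.append(chr(27) + chr(idx))
    return binascii.hexlify(b''.join(result))

def gsm_decode(hexstr):
    res = iter(binascii.unhexlify(hexstr))
    result = []
    for c in res:
        if c == chr(27):
            # ESC: the next code is looked up in the extension table.
            c = next(res)
            result.append(ext[ord(c)])
        else:
            result.append(gsm[ord(c)])
    return ''.join(result)

code = gsm_encode("Hello World {}")
print(code)
# 48656c6c6f20576f726c64201b281b29
print(gsm_decode(code))
# Hello World {}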
Gambrinus answered 30/10, 2012 at 0:40 Comment(0)
10

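For example, C7F7FBCC2E03 stands for 'Google'.

Python 3.4:

def gsm7bitdecode(f):
    # Octets as 8-bit strings, last octet first, so that the first septet
    # ends up in the rightmost 7 bits.
    bits = ''.join(["{0:08b}".format(int(f[i:i+2], 16)) for i in range(0, len(f), 2)][::-1])
    # Take 7-bit groups from right to left; fewer than 7 leftover bits are padding.
    # chr() only matches the GSM table for the characters it shares with ASCII.
    return ''.join([chr(int(bits[end-7:end], 2)) for end in range(len(bits), 6, -7)])

print(gsm7bitdecode('C7F7FBCC2E03'))

Google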

Forbis answered 16/6, 2015 at 11:31 Comment(1)
Also works in Python 2.7.11. - Starve
4

There is a very easy solution:

1. Convert the hex string into binary octets.
2. Put the octets in an array in reverse order (reverse whole octets, not the bits inside them), because the septets are packed starting from the least significant bits of the first octet.
3. Read the resulting bit string from right to left in groups of 7 bits.
4. The value of each group is the character code in the GSM 7-bit table.

For example:

C7F7FBCC2E03 stands for 'Google'

The string in reverse order is

03-2E-CC-FB-F7-C7

The six octets are

00000011-00101110-11001100-11111011-11110111-11000111

The septets are

000000-1100101-1101100-1100111-1101111-1101111-1000111

Read from right to left, they are:

septet - decimal value - character in the GSM 7-bit table

1000111-71-G

1101111-111-o

1101111-111-o

1100111-103-g

1101100-108-l

1100101-101-e

Discard the leftover 000000: six bits are not a full septet, so they are padding.
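As a cross-check of the steps above, here is a minimal Python 3 sketch. The name unpack_septets is made up for this illustration. It relies on the observation that reversing the octets and reading 7-bit groups from the right is the same as treating the octets as one little-endian integer. It returns raw codes only, so the chr() call in the usage lines is valid just for characters that GSM shares with ASCII.

```python
def unpack_septets(hex_str):
    """Return the 7-bit codes packed in hex_str, first character first."""
    data = bytes.fromhex(hex_str)
    # Reversed octets read right to left == one little-endian integer
    # from which 7 bits are peeled off at a time.
    value = int.from_bytes(data, "little")
    # Only complete septets count; fewer than 7 leftover bits are padding.
    count = len(data) * 8 // 7
    return [(value >> (7 * i)) & 0x7F for i in range(count)]


codes = unpack_septets("C7F7FBCC2E03")
print(codes)                           # [71, 111, 111, 103, 108, 101]
print("".join(chr(c) for c in codes))  # Google
```

Note that when the number of octets is a multiple of 7, the last septet may itself be padding; without the length field of the PDU it cannot be told apart from a real '@' (code 0).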

Arissa answered 14/1, 2014 at 14:18 Comment(0)
2

I found that noiam's solution does not work when the padding length isn't a multiple of seven.

After a bit of work and close examination of GSM 03.38, I modified noiam's efforts to come up with this solution, which works with all the data I have tried it with.

gsm = ("@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
       "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà")
ext = ("````````````````````^```````````````````{}`````\\````````````[~]`"
       "|````````````````````````````````````€``````````````````````````")

def gsm7bitdecode(f):
    """
    https://mcmap.net/q/1413560/-decode-7-bit-gsm

    We make sure our hex string has an even number of digits, prepend a
    zero if necessary to make it so.

    Take one pair of hex digits at a time, convert each octet to
    a binary string, then reverse the list of octets, and join these strings
    of binary digits together to create a string of zeros and ones.  This is f.

    Remove the padding zeros from the beginning of f.

    Then starting from the beginning of f, take seven of these bits at a time,
    and convert to an integer.

    Reverse this array.

    We go through these integers, and if the value is not 27 (escape), we
    use that integer as an index into the gsm array for our character.

    If we find an escape character, we look up the following integer in the
    ext array as our character.
    """
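    if len(f) == 0:
        return ''
    if len(f) % 2 == 1:
        f = f"0{f}"
    f = ''.join([f"{int(f[i:i+2], 16):08b}" for i in range(0, len(f), 2)][::-1])
    padlen = len(f) % 7
    f = f[padlen:]
    ints = [int(f[i:i+7], 2) for i in range(0, len(f), 7)][::-1]
    # With a multiple of 7 octets, the 7 spare bits form a trailing zero
    # septet. Treat it as padding (a real trailing '@' looks the same here).
    if padlen == 0 and ints and ints[-1] == 0:
        ints.pop()
    # Use an iterator so next() can consume the septet that follows an escape.
    septets = iter(ints)
    result = []
    for i in septets:
        if i == 27:
            i = next(septets, None)
            if i is not None:
                result.append(ext[i])
        else:
            result.append(gsm[i])
    return ''.join(result)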
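The escape handling described in the docstring can also be sketched on its own, separate from the bit unpacking. The version below is only an illustration: the names GSM_BASIC, GSM_EXT and septets_to_text are made up here, and it keeps the extension table as a dictionary so that no filler characters are needed. Falling back to the basic table for an unknown escape code is an assumption of this sketch.

```python
# Basic GSM 7-bit table, indexed by code (128 entries).
GSM_BASIC = ("@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
             "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà")

# Extension table: the code that follows ESC (0x1B) -> character.
GSM_EXT = {
    0x14: "^", 0x28: "{", 0x29: "}", 0x2F: "\\", 0x3C: "[",
    0x3D: "~", 0x3E: "]", 0x40: "|", 0x65: "€",
}


def septets_to_text(septets):
    """Translate a sequence of 7-bit codes into text, honouring ESC."""
    out = []
    it = iter(septets)
    for code in it:
        if code == 0x1B:
            following = next(it, None)
            if following is None:
                break  # dangling ESC at the end: nothing left to decode
            # Assumption: an unknown escape falls back to the basic table.
            out.append(GSM_EXT.get(following, GSM_BASIC[following]))
        else:
            out.append(GSM_BASIC[code])
    return "".join(out)


print(septets_to_text([72, 105, 32, 0x1B, 0x28, 0x1B, 0x29]))  # Hi {}
```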
Gurge answered 8/3, 2023 at 21:56 Comment(2)
Unfortunately, your solution does not work: NameError: name 'gsm' is not defined - Sokul
Sorry, I left them out. Added them back in. - Gurge
0

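I've written such a decoder in C for an OpenWrt device:

#include <stdint.h>
#include <stdio.h>

/* Convert one hex digit to its value; returns 0 on success, 1 on a bad digit. */
uint8_t get_data ( char input, uint8_t * output )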
{
    if ( input - '0' >= 0 && '9' - input >= 0 ) {
        * output = input - '0';
    } else if ( input - 'a' >= 0 && 'f' - input >= 0 ) {
        * output = input - 'a' + 10;
    } else if ( input - 'A' >= 0 && 'F' - input >= 0 ) {
        * output = input - 'A' + 10;
    } else {
        return 1;
    }
    return 0;
}

uint8_t get_data_pair ( const char * input, uint8_t * output )
{
    uint8_t data;
    if ( get_data ( * input, &data ) != 0 ) {
        return 1;
    }
    * output = data << 4;
    if ( get_data ( * ( input + 1 ), &data ) != 0 ) {
        return 2;
    }
    * output = * output | data;
    return 0;
}

int main ( int argc, char * argv [] )
{
    if ( argc != 2 ) {
        fputs ( "required argument: hex\n", stderr );
        return 1;
    }

    char * hex = argv[1];
    uint16_t data = 0;
    uint8_t data_length = 0;

    while ( *hex != '\0' ) {
        uint8_t new_data;
        if ( get_data_pair ( hex, &new_data ) != 0 ) {
            fprintf ( stderr, "invalid hex: bad pair %.2s\n", hex );
            putchar ( '\n' );
            return 2;
        }
        hex += 2;

        data = new_data << data_length | data;
        data_length += 8;

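        /* Emit every complete septet, lowest bits first. The raw code is
           written as-is, so only characters shared with ASCII print correctly. */
        while ( data_length >= 7 ) {
            putchar ( data & 0x7f );
            data = data >> 7;
            data_length -= 7;
        }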
    }

    putchar ( '\n' );
    return 0;
}
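For comparison with the Python answers, the accumulator loop of the C program can be sketched in Python 3 as a generator. The name stream_septets is hypothetical, and like the C code it yields raw codes without any table lookup.

```python
def stream_septets(octets):
    """Yield 7-bit codes from packed octets, one octet at a time."""
    acc = 0    # bits received but not yet emitted
    nbits = 0  # number of valid bits currently held in acc
    for octet in octets:
        # New octets go above the bits that are still pending.
        acc |= octet << nbits
        nbits += 8
        while nbits >= 7:
            yield acc & 0x7F
            acc >>= 7
            nbits -= 7
    # Fewer than 7 bits may remain in acc here; they are padding.


print(list(stream_septets(bytes.fromhex("C7F7FBCC2E03"))))
# [71, 111, 111, 103, 108, 101]
```

Because it never holds more than 14 bits, this approach works on input of any length without building one large bit string.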
Sleeve answered 1/11, 2015 at 20:17 Comment(0)
