SMPP, SMS, GSM, Data Encodings, and Locking Shift Tables [closed]

I'm working with a corporation which is attempting to send SMS messages to people in countries all over the world using various languages.

The corporation has a custom-written application that communicates over the SMPP protocol with the SMSCs of various telcos.

Different telcos have told us which data_coding to use when submitting SMPP PDUs to their SMSCs.

Currently we are using GSM 7-bit, Latin-1, and UCS-2 encodings, choosing whichever one each telco has told us to use. The payload of the SMPP PDU is submitted in that encoding, and the data_coding parameter is set accordingly (0x00 for GSM, 0x03 for Latin-1, and 0x08 for UCS-2).
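For reference, here is a minimal sketch of how the encoding and the data_coding value are paired on our side. The names DATA_CODING and encode_payload are made up for this post and are not from any library. The GSM case is left out on purpose because Ruby has no built-in GSM 03.38 encoding.

```ruby
# data_coding values as listed above (hypothetical constant name).
DATA_CODING = {
  gsm:    0x00, # "SMSC default alphabet", which our telcos say is GSM 7-bit
  latin1: 0x03, # ISO-8859-1
  ucs2:   0x08  # UCS-2
}.freeze

# Convert a UTF-8 string to the bytes for the chosen scheme.
# Returns [data_coding, payload] where payload is a binary string.
# String#encode raises Encoding::UndefinedConversionError if a character
# cannot be represented in the target encoding.
def encode_payload(text, scheme)
  payload =
    case scheme
    when :latin1 then text.encode("ISO-8859-1")
    when :ucs2   then text.encode("UTF-16BE") # same bytes as UCS-2 for BMP characters
    else raise ArgumentError, "no built-in conversion for #{scheme}"
    end
  [DATA_CODING.fetch(scheme), payload.b]
end
```

For example, encode_payload("ã", :ucs2) returns 0x08 together with the two bytes 0x00 0xE3.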

Question 1: Should it really matter which encoding we use for submitting SMPP PDUs to the SMSC? Shouldn't the SMSC be able to convert from the submitted SMPP encoding to the appropriate encoding based upon the contents of the data_coding parameter? Shouldn't we be able to submit all messages via SMPP as UCS-2, set the data_coding parameter to 0x08, and have the telco take care of the conversion to the SMS PDU for us?

Currently, we want to send Portuguese-language SMS messages. The telco has told us to use the "SMSC Default Alphabet" when submitting the messages over SMPP. Pressed further, they said this was the same as the GSM default alphabet. This is concerning, as the Portuguese alphabet isn't fully represented by the GSM default alphabet. It seems that the telco is simply transliterating the Portuguese letters to English equivalents. The telco informed us that "if you send a SMS with a special character that the SMSC does not recognize (á,ó,ã for instance) the SMSC will encode those characters to the closest character possible." I find this hard to believe, since the GSM default alphabet doesn't support such characters in the first place, so they could never be submitted in that encoding.

Question 2: How can special characters be submitted, and then not be recognized, if one uses the GSM default alphabet? Shouldn't all characters submitted in the GSM default alphabet conform to the 7-bit, 128-character alphabet defined in the GSM 03.38 standard?

Question 3: Since the telco has requested that we use the "GSM Default Alphabet", we should submit our SMPP payload encoded as 7-bit packed octets, correct?
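To be concrete about what I mean by "7-bit packed octets", the sketch below packs septets the way I understand GSM 03.38 to describe it: each 7-bit value is appended starting at the least significant free bit, so 8 characters fit in 7 octets. pack_septets is a helper name I made up. Whether the SMSC expects this packed form over SMPP, or one septet per octet, is exactly what I am unsure about.

```ruby
# Pack an array of 7-bit values (0..127) into octets, least significant bits first.
# Returns an array of integers in 0..255.
def pack_septets(septets)
  acc   = 0  # bit accumulator
  nbits = 0  # number of valid bits currently held in acc
  out   = []
  septets.each do |s|
    acc |= (s & 0x7F) << nbits
    nbits += 7
    while nbits >= 8
      out << (acc & 0xFF)
      acc >>= 8
      nbits -= 8
    end
  end
  out << (acc & 0xFF) if nbits > 0 # flush remaining bits, zero-padded
  out
end
```

With this, pack_septets("hello".bytes) gives the five octets E8 32 9B FD 06 (lowercase ASCII letters have the same codes in the GSM default alphabet).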

Our application stores text as UTF-8. Since the Portuguese telco is requesting that we submit SMPP with a payload containing the GSM default alphabet, I presume that we will need to convert from UTF-8 to the 7-bit GSM default alphabet. My current strategy is to map each character that has an equivalent in the GSM default alphabet (128 characters total) to its GSM code, transliterate any other character to the closest GSM default alphabet equivalent, and fall back to a question mark when there is none.
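Here is a rough sketch of that strategy. The table is the basic GSM 03.38 character set as I read it from the standard. GSM_BASIC, GSM_INDEX and utf8_to_gsm_septets are names I made up. For "closest equivalent" I am assuming that stripping the accent via Unicode NFD decomposition is acceptable. The extension table (which holds characters such as €) is not handled here.

```ruby
# Basic GSM 03.38 character set; the position in the string is the 7-bit code.
# 0x1B is the escape to the extension table and is represented here as "\e".
GSM_BASIC = (
  "@£$¥èéùìòÇ\nØø\rÅå" \
  "Δ_ΦΓΛΩΠΨΣΘΞ\eÆæßÉ" \
  " !\"#¤%&'()*+,-./" \
  "0123456789:;<=>?" \
  "¡ABCDEFGHIJKLMNO" \
  "PQRSTUVWXYZÄÖÑÜ§" \
  "¿abcdefghijklmno" \
  "pqrstuvwxyzäöñüà"
).freeze

# Lookup from character to its GSM code.
GSM_INDEX = GSM_BASIC.each_char.with_index.to_h.freeze

# Convert a UTF-8 string to an array of GSM septets (unpacked).
# 1. Use the direct mapping when the character is in the table.
# 2. Otherwise decompose (NFD) and try the base letter, e.g. "ã" becomes "a".
# 3. Otherwise substitute a question mark.
def utf8_to_gsm_septets(text)
  text.each_char.map do |ch|
    GSM_INDEX[ch] ||
      GSM_INDEX[ch.unicode_normalize(:nfd)[0]] ||
      GSM_INDEX.fetch("?")
  end
end
```

For example, "São" comes out as the codes for "S", "a", "o", while "é" maps directly to 0x05 because it is part of the default alphabet.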

Question 4: Is this the appropriate way to handle conversion from UTF-8 to the GSM default alphabet? There don't seem to be many other approaches. The application in question uses Ruby in a Unix environment. No existing libraries supporting GSM seem to be available, so a custom library seems to be the only approach.

My research has uncovered details of the GSM locking shift tables, which support other languages while still using only 7 bits per character. The locking shift table in use is specified in the UDH portion of the SMS PDU.

Question 5: How would one send SMS messages using the locking shift tables via SMPP? Does the SMPP PDU payload need to be modified to contain a UDH which specifies the locking shift table? What should the data_coding parameter be set to?
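To show what I think this would look like, here is a sketch based on my reading of 3GPP TS 23.038 and TS 23.040, where the national language locking shift is information element 0x25 and Portuguese is language identifier 0x03. I am assuming the UDH would be prepended to short_message with the UDHI bit (0x40) set in esm_class, as for other UDH uses over SMPP. The helper names are made up, and I have not confirmed any of this against a real SMSC.

```ruby
IEI_NATIONAL_LOCKING_SHIFT = 0x25 # information element identifier (TS 23.040)
LANG_PORTUGUESE            = 0x03 # national language identifier (TS 23.038)
ESM_CLASS_UDHI             = 0x40 # esm_class bit announcing a UDH in short_message

# Build the UDH as an array of octets: UDHL, then IEI, IE length, language id.
def locking_shift_udh(language_id)
  element = [IEI_NATIONAL_LOCKING_SHIFT, 0x01, language_id]
  [element.length, *element]
end

# Prepend the UDH to the already-encoded message body (array of octets)
# and return a binary string suitable for the short_message field.
def short_message_with_udh(udh, body_octets)
  (udh + body_octets).pack("C*")
end
```

For Portuguese, locking_shift_udh(LANG_PORTUGUESE) gives the octets 03 25 01 03. I don't know which data_coding value should accompany this, or how the body should be aligned after the header when 7-bit packing is used.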

I'd be thrilled if anyone could answer any of these questions authoritatively.
