NodeJS, basE91, & little endian
M

4

5

How would I go about decoding the string below from basE91 to readable text using NodeJS?

8D7Hh-9D*.n-!DZrG-#DE-$DD-%DC-sl-tl-BEp2m-CE^Ul-DE}CH-EEE-FED-GEC-<l-=l-hE(.K-iEvqS-jEB-kEB-lEB-mEB-Rm-Sm-%E!{Q-&EDgN-(EG:K-)EE-*EE-+EB-xm-ym-GF{}U-HF()Q-IFt%D-JFE-KFB-LFD-[m-]m-mF;JG-nF7]Q-oF2-pFB-qFC-rFB-Wn-Xn-+FD-,FE-.FB- FE-:FD-;FC-2n-3n-

* EDIT *

Using the basE91 table I managed to convert the string above into a hex string

Hex String

5668557210457684246110336890114713568693668683768671151081161086669112501096769948510868691567726969697069687169676010861108104694046751056911811383106696610769661086966109696682109810937693312381386968103784069715875416969426969436966120109121109717012312585727040418173701637687470697570667670689110993109109705974711107055938111170501127066113706711470668711088104370684470694670664770695870685970675011051110

I then fed that into the buffer

var buf = Buffer.from(hex, 'hex'); // new Buffer() is deprecated in current Node versions
console.log(buf.toString('utf8'));

This gives me:

VhUr►Ev?$a►3h?◄G‼V??f??v?q§►?▬►?f?↕P►?v?Hhi↕Vw&???♠??▬?va►?♦i@Fu►V?↑◄81♠if►v?if►??h!    ?►??1#?8ih►7?♠?§?T▬??&??6?a ►?!►?↨☺#↕XW'♦♦↑↨7☺▬7htpiupfvph?►?1 ►?♣?G◄►p    U??◄↨♣☺↕pf◄7♠q¶pf??►CphDpiFpfGpiXphYpgP◄♣◄►

How do I get that into something I can use? I'm suspecting it's a JSON object...

Monocular answered 3/7, 2012 at 3:47 Comment(11)
This might get you started: base91.sourceforge.net - Najera
Are you sure about that string? Using the reference decoder that bryanmac linked to, I don't get anything that looks like readable text when I decode it, just high-ASCII garbage. - Teri
Something's definitely wrong with that string. The basE91 homepage specifically notes that '-' (dash) is not part of the encoding alphabet, but there are lots of dashes in your text. (Oh the irony if basE91 isn't compatible with Markup.) - Teri
The dash, I believe, is used to separate it, like a space character. It's a string copied directly from an MMO server. The forums there say it is encoded BasE91 LE and is data used to draw the world map. - Monocular
So, Lord of Ultima? As this isn't plain vanilla basE91, it would help if you added a lot more context to your question. Also, basE91 is used to encode binary data so it's not exactly clear how to turn that into readable text. More information about the problem will help in determining the correct output format. - Teri
Apparently, from the forum post, BasE91 LE is all that's needed, but I can't figure it out as I don't have any training/experience in binary/hex. - Monocular
@PastorBones So, you have a hex string with UTF-8 in it? Are you sure the encodings are OK (I mean, the source string was UTF-8 and all)? Could you please show the result of converting the basE91 string to hex? - Simonize
@elmigranto I updated my post to show the hex. - Monocular
@PastorBones your string is not valid basE91-encoded data. basE91 strings do not have "-" in them. I still tried your input string with the original basE91 CLI, compiled from source, to decode and encode back. The decoder skips "-" characters and the re-encoded string does not have them, naturally. The decoded data is in no way a UTF-8/UTF-16/UTF-32 or any ISO-8859-* string. The output from decoding is only valid UTF-8 until the 10th byte, and that is just gibberish. Where did you get this string? Your hex conversion also does not look like a hex representation. The correct one is: 014d 4057 4557... - Prenotion
@Prenotion The string came directly from a game server... - Monocular
This might be helpful: npmjs.org/package/base91 - Ezzo
T
5

You have two very separate problems which are getting conflated in this discussion.

How can I decode basE91 using JavaScript/NodeJS/CommonJS?

This is what your question appears to be at a glance and you've gotten various responses in this vein. The answer appears to be: There isn't an existing solution, but basE91 is small and simple enough that you should be able to port it to JS without too much trouble.

Your second less explicit question appears to be:

How can I reverse-engineer Lord of Ultima's game server communication protocol?

You mentioned that your "basE91 LE" string came from an MMO game server, and this forum posting about Lord of Ultima is pretty much the only other hit for "basE91 little endian". (Plus it looks like you posted there a few days ago.) As variously noted, the data you posted is not plain vanilla basE91. BasE91 has a clearly defined set of characters, and '-' and ' ' (space) are not among them, but both appear in your data. You mentioned in comments that you thought '-' was being used as a separator, and it'd be easier to answer this second question if more information like that were provided up front.
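One way to confirm that the posted string is not plain basE91 is to list the characters that fall outside the standard alphabet. The following is only a minimal sketch: it assumes the standard table from the basE91 reference implementation, and the helper name `foreignChars` is made up for illustration.

```javascript
// Standard basE91 alphabet (91 characters), as used by the reference implementation.
const STANDARD_TABLE =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789' +
  '!#$%&()*+,./:;<=>?@[]^_`{|}~"';

// Hypothetical helper: returns the distinct characters of `text` that are
// not part of `table`, in order of first appearance.
function foreignChars(text, table = STANDARD_TABLE) {
  const seen = new Set();
  for (const ch of text) {
    if (!table.includes(ch)) seen.add(ch);
  }
  return [...seen];
}

// A short excerpt of the string from the question.
const sample = '8D7Hh-9D*.n-.FB- FE-:FD-';
console.log(foreignChars(sample)); // [ '-', ' ' ]
```

Anything this reports is either a separator, a sign of a different table, or transport noise; the check alone cannot tell which.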

Some notes on this question:

  • BasE91 uses a standard encoding table. If you were making an online game and wanted to obfuscate your traffic a bit, it'd be trivial to use a different or scrambled table. That would account for characters that aren't legal basE91 appearing in your data. Plus, this post from that forum has what looks to be exactly that: a different de/encoding table. Have you tried using that table to decode part of your data by hand to see if it makes sense?

  • You request conversion to "readable text", but basE91 is used for transmitting binary data so it's not clear why one would use it for plain text. Even assuming that it is encoding text, there are various common ways to encode plain text and the decoded data doesn't appear to be doing so. More likely, the data is actual binary data and, in a very real sense, basE91 is the textual representation. Without more information about the output you're expecting, it's hard to know how you wanted it translated to text. This is the basE91 encoding of a 1 pixel transparent gif:

    JaQGWo*HBtAARDBtB"B"B"S|QtAAAA$M)Bc4v(#AsAAABtAACABtlBLHBtd
    

    Can that be converted to readable text? You mention that you think it's JSON, can you give us some hints about why you think that? (And again, why use a binary encoder for plain text?)

  • Working from the same forum post, it sounds like you're working with a series of 5-bit coordinates, so maybe those numbers are what you're looking for? However, there's still more to the puzzle because those basE91 groups encode numbers of different sizes. To wit (warning: wall of inscrutable text):

    echo '8D7Hh-9D*.n-!DZrG-#DE-$DD-%DC-sl-tl-BEp2m-CE^Ul-DE}CH-EEE-FED-GEC-<l-=l-hE(.K-iEvqS-jEB-kEB-lEB-mEB-Rm-Sm-%E!{Q-&EDgN-(EG:K-)EE-*EE-+EB-xm-ym-GF{}U-HF()Q-IFt%D-JFE-KFB-LFD-[m-]m-mF;JG-nF7]Q-oF2-pFB-qFC-rFB-Wn-Xn-+FD-,FE-.FB- FE-:FD-;FC-2n-3n-' \
    | while IFS='' read -d - a; do echo -n "'$a' => "; echo -n "$a" \
    | ./base91 -d | hexdump | head -1 | cut -d ' ' -f 2-; done
    # head and cut are easier than understanding hexdump's formatting system
    '8D7Hh' => 4d 01 57 84                                    
    '9D*.n' => 4e a1 3b 9f                                    
    '!DZrG' => 4f 41 ec 19                                    
    '#DE' => 50 81                                          
    '$DD' => 51 61                                          
    '%DC' => 52 41                                          
    'sl' => 53                                             
    'tl' => 54                                             
    'BEp2m' => 6d 61 6b 9a                                    
    'CE^Ul' => 6e e1 ed 94                                    
    'DE}CH' => 6f c1 21 1c                                    
    'EEE' => 70 81                                          
    'FED' => 71 61                                          
    'GEC' => 72 41                                          
    '<l' => 73                                             
    '=l' => 74                                             
    'hE(.K' => 8d 61 3b 2b                                    
    'iEvqS' => 8e a1 e3 49                                    
    'jEB' => 8f 21                                          
    'kEB' => 90 21                                          
    'lEB' => 91 21                                          
    'mEB' => 92 21                                          
    'Rm' => 93                                             
    'Sm' => 94                                             
    '%E!{Q' => ad 01 da 43                                    
    '&EDgN' => ae 61 6c 35                                    
    '(EG:K' => af 81 4a 2b                                    
    ')EE' => b0 81                                          
    '*EE' => b1 81                                          
    '+EB' => b2 21                                          
    'xm' => b3                                             
    'ym' => b4                                             
    'GF{}U' => cd c1 f3 53                                    
    'HF()Q' => ce e1 0d 43                                    
    'IFt%D' => cf 01 e9 0e                                    
    'JFE' => d0 81                                          
    'KFB' => d1 21                                          
    'LFD' => d2 61                                          
    '[m' => d3                                             
    ']m' => d4                                             
    'mF;JG' => ed c1 6f 18                                    
    'nF7]Q' => ee 21 ac 43                                    
    'oF2' => ef c1                                          
    'pFB' => f0 21                                          
    'qFC' => f1 41                                          
    'rFB' => f2 21                                          
    'Wn' => f3                                             
    'Xn' => f4                                             
    '+FD' => 0d 62                                          
    ',FE' => 0e 82                                          
    '.FB' => 0f 22                                          
    ' FE' => 71                                             
    ':FD' => 11 62                                          
    ';FC' => 12 42                                          
    '2n' => 13                                             
    '3n' => 14           
    

    There's certainly a pattern there, and it even looks little-endian if you squint right. But the values don't mean anything to me; do they look sensible to you?
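The shell pipeline above can be approximated in Node without the external `base91` binary. This is a sketch only: it assumes '-' is a group separator (as suggested in the comments) and that each group uses the standard basE91 table. `decodeGroup` and `toUIntLE` are hypothetical helpers, and reading the bytes as an unsigned little-endian integer is a guess at what "LE" means, not a confirmed part of the protocol.

```javascript
const TABLE =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789' +
  '!#$%&()*+,./:;<=>?@[]^_`{|}~"';

// Decode one basE91 group into an array of byte values (standard algorithm).
function decodeGroup(text) {
  const out = [];
  let b = 0; // bit accumulator
  let n = 0; // number of valid bits in the accumulator
  let v = -1; // first character of the current pair, or -1 if none pending
  for (const ch of text) {
    const p = TABLE.indexOf(ch);
    if (p === -1) continue; // skip characters outside the table
    if (v < 0) {
      v = p;
      continue;
    }
    v += p * 91;
    b |= v << n;
    n += (v & 8191) > 88 ? 13 : 14;
    do {
      out.push(b & 0xff);
      b >>= 8;
      n -= 8;
    } while (n > 7);
    v = -1;
  }
  if (v > -1) out.push((b | (v << n)) & 0xff);
  return out;
}

// Interpret bytes as an unsigned little-endian integer (an assumption about "LE").
function toUIntLE(bytes) {
  return bytes.reduce((acc, byte, i) => acc + byte * 2 ** (8 * i), 0);
}

// Three groups taken from the question's string.
const input = '8D7Hh-#DE-sl-';
for (const group of input.split('-').filter(Boolean)) {
  const bytes = decodeGroup(group);
  const hex = bytes.map((x) => x.toString(16).padStart(2, '0')).join(' ');
  console.log(`'${group}' => ${hex} (LE value: ${toUIntLE(bytes)})`);
}
```

For these three groups the hex part agrees with the table above (`4d 01 57 84`, `50 81`, `53`), which at least suggests the port behaves like the reference decoder on this input.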

Teri answered 14/7, 2012 at 5:23 Comment(4)
Very nice answer, thanks for taking the time. Yes, I am expecting coordinates (not sure what 5-bit means), but also data on the object at that location, such as city name, city id, etc. I expect it to be in a JSON format as the rest of the communication is in JSON format. I suspect this data was encoded to save bandwidth as the game polls every second. To answer any suspicions, I'm simply trying to grab world data for a 3rd-party tool to record growth, alliance change, etc. I'm doing it now through single calls, which takes forever. Other sites are doing the same and I guess they're using this data. - Monocular
If you don't mind, could you reference a website where I can read up and learn about byte encoding? I'm uneducated on the topic and everything I google makes no sense as I have no foundation to work from. - Monocular
For binary encoding, have a look at the two's complement article on WP. I don't think that this is encoded JSON; it just wouldn't make sense, as basE91 is used for converting binary data to a more easily transmissible format, not especially for compression. Plain text would generally be bigger if it were basE91 encoded. - Teri
Well, there's a lot of data that should be coming across, like x:y coordinates, city name, city id, and everything else in the popup on this screencap: bit.ly/NAxj6n. This is why I think it's JSON, and I've read about BSON (binary-encoded JSON). How else would I use this binary data? (BTW, thanks for the link.) - Monocular
E
3

There is a quite compact Java implementation of the basE91 encoder/decoder on GitHub. This should be quite easy to translate to JS, I think.

Direct link to source file.

Equivalency answered 9/7, 2012 at 21:34 Comment(3)
There's also a JavaScript CommonJS library to decode/encode basE91, but how do I then get the bytes into string format? - Monocular
Strings in JavaScript have a charCodeAt(index) method that returns the UTF-16 code unit of the character at the given index (the same as its ASCII code for ASCII characters). Just load the data as a string in JS, and then process it char by char. - Equivalency
You can also check out the NodeJS Buffer class. It's specifically created for handling binary data. - Equivalency
W
0

The short answer is that there is no easy/quick way to do this in node.js (and apparently currently no modules either).

Adding to @bryanmac's comment: using base91 as a starting point (the main source file is only 160 lines including copyright!), you could store the data in a node.js Buffer, and once it is converted from base91 to bytes, use the built-in node.js methods to convert it to a string.
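To illustrate that last step: once a decoder has produced raw byte values, `Buffer.from` accepts an array of them directly, and `toString` renders them in a chosen encoding. This is a minimal sketch; the byte values are made-up sample data, not output from the question's string.

```javascript
// Sample byte values, standing in for whatever a basE91 decoder returns.
const bytes = [0x48, 0x65, 0x6c, 0x6c, 0x6f];

// Buffer.from accepts an array of integers in the range 0-255.
const buf = Buffer.from(bytes);

console.log(buf.toString('utf8')); // Hello
console.log(buf.toString('hex')); // 48656c6c6f

// The reverse direction: a genuine hex string has two hex digits per byte.
const roundTrip = Buffer.from('48656c6c6f', 'hex');
console.log(roundTrip.equals(buf)); // true
```

If the decoded bytes are binary data rather than text, `toString('utf8')` will mostly yield replacement characters, so `toString('hex')` is usually the safer way to inspect them.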

Whiteeye answered 9/7, 2012 at 21:23 Comment(5)
I've used a CommonJS module to decode the basE91 to bytes, but how to get that into a NodeJS buffer? All the docs I've read say to pass strings, not bytes. - Monocular
Hmmm, node uses the CommonJS module system... Do you have a link to the module? It should be fairly trivial to integrate. As for bytes vs strings... in JS strings are the better choice... The Buffer allows you to work with both (as well as base64), and convert between the formats. - Whiteeye
I can't find the original link, but here is a pastebin of it: pastebin.com/64nK7acE. I've already converted it to a NodeJS module. It outputs binary, but I still don't know how to get that into a string. - Monocular
It seems there is simply an error in the source code :P By adding this.d_v = -1; into the dReset function, the following sample code works: var tb91 = new basE91(); tb91.Make(); tb91.Encode("Hello World!"); var estr = tb91.endEnc(); console.log(estr); tb91.Decode(estr); var dstr = tb91.endDec(); console.log(dstr); - Whiteeye
Just to make sure, here is an edited version: pastebin.com/JbJeFthb. I have added the test code at the end. - Whiteeye
S
-1

For "How can I decode basE91 using JavaScript/NodeJS/CommonJS?"

I migrated the original basE91 into JavaScript, supporting String, Buffer and Stream at the moment. You may have a try: Equim-chan/base91

First we need a table:

const table = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!#$%&()*+,./:;<=>?@[]^_`{|}~"';

The core part of encode:

// `raw` is the input, asserted as `Buffer`
const len = raw.length;
let ret = '';
let n = 0;
let b = 0;

for (let i = 0; i < len; i++) {
  b |= raw[i] << n;
  n += 8;

  if (n > 13) {
    let v = b & 8191;
    if (v > 88) {
      b >>= 13;
      n -= 13;
    } else {
      v = b & 16383;
      b >>= 14;
      n -= 14;
    }
    ret += table[v % 91] + table[v / 91 | 0];
  }
}

if (n) {
  ret += table[b % 91];
  if (n > 7 || b > 90) ret += table[b / 91 | 0];
}

return ret;  // basE91 encoded string

The core part of decode:

// `raw` is the input, asserted as `String`
const len = raw.length;
const ret = [];
let b = 0;
let n = 0;
let v = -1;

for (let i = 0; i < len; i++) {
  const p = table.indexOf(raw[i]);
  if (p === -1) continue;
  if (v < 0) {
    v = p;
  } else {
    v += p * 91;
    b |= v << n;
    n += (v & 8191) > 88 ? 13 : 14;
    do {
      ret.push(b & 0xff);
      b >>= 8;
      n -= 8;
    } while (n > 7);
    v = -1;
  }
}

if (v > -1) {
  ret.push((b | v << n) & 0xff);
}

return Buffer.from(ret);  // basE91 decoded Buffer

The above is for standard basE91 encoding/decoding, but as @blahdiblah mentioned, you evidently received a non-standard basE91-encoded string from the server (there is neither '-' nor ' ' (space) in the standard table).
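If the server really does use a different alphabet, the decoder above only needs a different `table`. The sketch below wraps both directions in functions that take the table as a parameter. `CUSTOM_TABLE` is purely hypothetical (the standard table with its last character swapped for '-'); it is not the table the game uses, which is unknown here.

```javascript
const STANDARD_TABLE =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789' +
  '!#$%&()*+,./:;<=>?@[]^_`{|}~"';

// Hypothetical custom table for illustration: replace the final '"' with '-'.
const CUSTOM_TABLE = STANDARD_TABLE.slice(0, 90) + '-';

// Encode a Buffer (or any iterable of byte values) with the given 91-character table.
function encode(raw, table = STANDARD_TABLE) {
  let ret = '';
  let b = 0;
  let n = 0;
  for (const byte of raw) {
    b |= byte << n;
    n += 8;
    if (n > 13) {
      let v = b & 8191;
      if (v > 88) {
        b >>= 13;
        n -= 13;
      } else {
        v = b & 16383;
        b >>= 14;
        n -= 14;
      }
      ret += table[v % 91] + table[(v / 91) | 0];
    }
  }
  if (n) {
    ret += table[b % 91];
    if (n > 7 || b > 90) ret += table[(b / 91) | 0];
  }
  return ret;
}

// Decode a string with the given table; characters not in the table are skipped.
function decode(text, table = STANDARD_TABLE) {
  const out = [];
  let b = 0;
  let n = 0;
  let v = -1;
  for (const ch of text) {
    const p = table.indexOf(ch);
    if (p === -1) continue;
    if (v < 0) {
      v = p;
      continue;
    }
    v += p * 91;
    b |= v << n;
    n += (v & 8191) > 88 ? 13 : 14;
    do {
      out.push(b & 0xff);
      b >>= 8;
      n -= 8;
    } while (n > 7);
    v = -1;
  }
  if (v > -1) out.push((b | (v << n)) & 0xff);
  return Buffer.from(out);
}

// Round trip with the hypothetical table.
const encoded = encode(Buffer.from('Hello World!'), CUSTOM_TABLE);
console.log(decode(encoded, CUSTOM_TABLE).toString('utf8')); // Hello World!
```

Passing the table explicitly keeps the algorithm untouched, so trying a candidate table from the forum post would only mean supplying a different 91-character string.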

Scevour answered 28/5, 2017 at 7:41 Comment(2)
While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - From Review - Maemaeander
@Maemaeander Sorry, it is my first answer and I am not yet that familiar with the rules. Edited. - Scevour
