Convert a string to number and back to string?
C

3

8

I would like to know how I can convert a short ASCII string to a number (int, float, or numeric string). I saw a couple of posts here that mentioned perfect hashes, which seem like they might be what I need. However, I don't quite understand the math behind them.

How could you convert an ASCII string into a sequence of numbers and then back to a string?

As a side note, breaking a string down into its ASCII character codes is easy enough.

foreach(str_split($string) as $char) $number .= ord($char);

Update

After more reading I came up with this. However, I'm wondering if there are any ways to shorten the number sequence so it's not quite as long.

class intnum
{
    public static $charset = array(
        32 => ' ', 33 => '!', 34 => '"', 35 => '#', 36 => '$',
        37 => '%', 38 => '&', 39 => "'", 40 => '(', 41 => ')',
        42 => '*', 43 => '+', 44 => ',', 45 => '-', 46 => '.',
        47 => '/', 48 => '0', 49 => '1', 50 => '2', 51 => '3',
        52 => '4', 53 => '5', 54 => '6', 55 => '7', 56 => '8',
        57 => '9', 58 => ':', 59 => ';', 60 => '<', 61 => '=',
        62 => '>', 63 => '?', 64 => '@', 65 => 'A', 66 => 'B',
        67 => 'C', 68 => 'D', 69 => 'E', 70 => 'F', 71 => 'G',
        72 => 'H', 73 => 'I', 74 => 'J', 75 => 'K', 76 => 'L',
        77 => 'M', 78 => 'N', 79 => 'O', 80 => 'P', 81 => 'Q',
        82 => 'R', 83 => 'S', 84 => 'T', 85 => 'U', 86 => 'V',
        87 => 'W', 88 => 'X', 89 => 'Y', 90 => 'Z', 91 => '[',
        92 => '\\', 93 => ']', 94 => '^', 95 => '_', 96 => '`',
        97 => 'a', 98 => 'b', 99 => 'c', 100 => 'd', 101 => 'e',
        102 => 'f', 103 => 'g', 104 => 'h', 105 => 'i', 106 => 'j',
        107 => 'k', 108 => 'l', 109 => 'm', 110 => 'n', 111 => 'o',
        112 => 'p', 113 => 'q', 114 => 'r', 115 => 's', 116 => 't',
        117 => 'u', 118 => 'v', 119 => 'w', 120 => 'x', 121 => 'y',
        122 => 'z', 123 => '{', 124 => '|', 125 => '}'
    );

    public static function fromNumber($number)
    {
        $string = '';
        while($number)
        {
            $value = substr($number, 0, 2);
            $number = substr($number, 2);

            if($value < 32)
            {
                $value .= substr($number, 0, 1);
                $number = substr($number, 1);
            }

            $string .= self::$charset[ (int) $value];
        }
        return $string;
    }

    public static function fromString($string)
    {
        $number = '';
        foreach(str_split($string) as $char) $number .= ord($char);
        return $number;
    }
}

$string = 'this is my test string to convert';

$number = intnum::fromString($string);
$string = intnum::fromNumber($number);
Curie answered 10/11, 2011 at 22:45 Comment(2)
Why not use the code you just posted? - Yam
@Brad, how do I get the string back? - Curie
L
14

A string-to-number encoder as one-liner (PHP 5.3 style):

$numbers = implode(array_map(function ($n) { return sprintf('%03d', $n); },
                          unpack('C*', $str)));

It simply converts every byte into its decimal number equivalent, zero-padding it to a fixed length of 3 digits so it can be unambiguously converted back.

The decoder back to a string:

$str = implode(array_map('chr', str_split($numbers, 3)));

Example text:

Wörks wíth all ストリングズ
087195182114107115032119195173116104032097108108032227130185227131136227131170227131179227130176227130186
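To check the round trip, the two one-liners can be wrapped in functions. This is only a minimal sketch: the names `str_to_digits` and `digits_to_str` are made up for illustration, and the empty-string guards are an addition that the one-liners do not have. Because it works on raw bytes, the decoded string should match the input byte for byte, whatever encoding the input used.

```php
<?php
// Hypothetical helper names, for illustration only.

// Encode: each byte becomes its decimal value, zero-padded to 3 digits.
function str_to_digits($str)
{
    if ($str === '') {
        return '';
    }
    // unpack('C*', ...) returns the bytes as unsigned integers (0-255).
    $bytes = unpack('C*', $str);
    return implode('', array_map(function ($n) {
        return sprintf('%03d', $n);
    }, $bytes));
}

// Decode: read 3 digits at a time and turn each block back into a byte.
function digits_to_str($numbers)
{
    if ($numbers === '') {
        return '';
    }
    return implode('', array_map(function ($block) {
        return chr((int) $block);
    }, str_split($numbers, 3)));
}

$digits = str_to_digits('Hi!');
echo $digits, "\n";                 // 072105033
echo digits_to_str($digits), "\n";  // Hi!
```

The output is always exactly three times as long as the input in bytes, which is the price of keeping the block boundaries unambiguous.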

Lucre answered 10/11, 2011 at 23:25 Comment(4)
Hmmm, doesn't seem to work with the Unicode characters you provided. - Curie
Works fine, as long as you make sure you interpret the result in the same encoding as it was input. - Lucre
See codepad.org/wXA9ViFu (written in PHP 5.2 style, since Codepad doesn't run 5.3 yet). - Lucre
Works in PHP 8? - Houser
R
2

You can't just ord() the characters into one string of digits and expect to get the original back, because some character codes have 2 digits and others have 3.

For example:

Kang-HO will give you: 7597110103457279

Now how do you know it's not: 75-97-11-0-10-34-57-27-9?

You need to pad every code to 3 digits, which gives 075097110103045072079, then break it apart into blocks of 3 digits and chr() each block back into a character.
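The ambiguity is easy to reproduce on arbitrary bytes. In the sketch below, `encode_unpadded` and `encode_padded` are hypothetical helpers written for this illustration, and the three inputs were picked because their byte values concatenate to the same digits.

```php
<?php
// Hypothetical helpers, for illustration only.

// Unpadded: byte values are concatenated as-is, so block boundaries are lost.
function encode_unpadded($str)
{
    $out = '';
    foreach (str_split($str) as $char) {
        $out .= ord($char);
    }
    return $out;
}

// Padded: every byte value takes exactly 3 digits.
function encode_padded($str)
{
    $out = '';
    foreach (str_split($str) as $char) {
        $out .= str_pad((string) ord($char), 3, '0', STR_PAD_LEFT);
    }
    return $out;
}

$a = '{';          // one byte:  123
$b = "\x0C\x03";   // two bytes: 12, 3
$c = "\x01\x17";   // two bytes: 1, 23

// Without padding, all three collapse to the same digits: "123".
var_dump(encode_unpadded($a), encode_unpadded($b), encode_unpadded($c));

// With padding they stay distinct: "123", "012003", "001023".
var_dump(encode_padded($a), encode_padded($b), encode_padded($c));
```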

Retrorocket answered 10/11, 2011 at 22:58 Comment(1)
@PhilLello, do you see any way to alter my code above using hex? - Curie
S
0

Well, if you want to convert your string into a sequence of integers, you must always use fixed-size blocks of digits. In this case 3, since each character is stored in one 8-bit byte, so the maximum possible integer is 2^8-1 = 255. (Plain ASCII is 7-bit and stops at 127, which still needs 3 digits.)

You should fill the unused space with 0:

function zero_fill($num){
    if($num <= 9) $num = "00".$num;
    elseif($num <= 99) $num = "0".$num;
    return $num;
}

You can use the function you have created in conjunction with this one. To recover the string, take blocks of 3 digits and convert each one back to its corresponding ASCII character.

$stringBack = '';
foreach(str_split($numberSeq, 3) as $asciiIntValue) $stringBack .= chr((int) $asciiIntValue);
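A decoder built this way may also want to reject malformed input before calling chr(), since chr() does not complain about values above 255. The following is a sketch under that assumption; `from_number_seq` is a hypothetical name, and throwing `InvalidArgumentException` is just one possible way to report the problem.

```php
<?php
// Hypothetical helper, for illustration only.
// Decodes a string of 3-digit blocks and rejects malformed input.
function from_number_seq($numberSeq)
{
    $numberSeq = (string) $numberSeq;

    // Only digits, and a whole number of 3-digit blocks.
    // Note: an empty string is rejected here as well.
    if (!ctype_digit($numberSeq) || strlen($numberSeq) % 3 !== 0) {
        throw new InvalidArgumentException('Expected a multiple of 3 digits.');
    }

    $stringBack = '';
    foreach (str_split($numberSeq, 3) as $asciiIntValue) {
        $value = (int) $asciiIntValue;
        if ($value > 255) {
            throw new InvalidArgumentException("Block $asciiIntValue is above 255.");
        }
        $stringBack .= chr($value);
    }
    return $stringBack;
}

echo from_number_seq('072105033'), "\n"; // prints "Hi!"
```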
Scarfskin answered 10/11, 2011 at 23:18 Comment(2)
"11bit words"? 2^11-1 = 127? I think you derailed a little there. - Lucre
Was about to write it too, 2^11 = 2048... you meant 2^7 but then again, ASCII has 255 characters buddy so it's 2^8... - Retrorocket
