Convert var_dump of array back to array variable
P

7

50

I have never really thought about this until today, but after searching the web I didn't really find anything. Maybe I wasn't wording it right in the search.

Given an array (of multiple dimensions or not):

$data = array('this' => array('is' => 'the'), 'challenge' => array('for' => array('you')));

When var_dumped:

array(2) { ["this"]=> array(1) { ["is"]=> string(3) "the" } ["challenge"]=> array(1) { ["for"]=> array(1) { [0]=> string(3) "you" } } }

The challenge is this: what is the best-optimized method for rebuilding a usable PHP array from that output? Something like an undump_var() function. It should work whether the data is all on one line (as output in a browser) or contains line breaks (as output to a terminal).

Is it just a matter of regex? Or is there some other way? I am looking for creativity.

UPDATE: Note: I am familiar with serialize and unserialize, folks. I am not looking for alternative solutions. This is a code challenge to see if it can be done in an optimized and creative way, so serialize and var_export are not solutions here, nor are they the best answers.

Praedial answered 20/8, 2010 at 14:33 Comment(7)
Yes, it's possible by parsing it. No, it's not something you'd usually want to bother with, since you're doing something wrong if you really need this. Maybe make a Community Wiki Code Golf question out of this, then there's something to it.Db
It's definitely possible, but it's not going to be trivial since the syntax is not meant to be machine parsable. When you have things like string(8) "Foo"bar"" and other weird edge cases, it's going to be relatively messy to implement in a reliable manner... If there are elegant solutions, I'd love to see them. But realize that most fully working solutions will likely be rather lengthy and have a fair bit of logic inside...Guadalupeguadeloupe
What's wrong with var_export()?Canon
Nothing... except this question is not about using alternatives to var_dump. It's about taking an already var_dumped string and returning it to the state it was in before being var_dumped.Praedial
Is it just me or is the "When var_dumped:" example not actually what would be dumped?Fridell
I've merged in another question to here, just FYI.Jinnah
I think this can help #4346054Toddle
G
74

var_export or serialize is what you're looking for. var_export will render a PHP parsable array syntax, and serialize will render a non-human readable but reversible "array to string" conversion...

Edit Alright, for the challenge:

Basically, I convert the output into a serialized string (and then unserialize it). I don't claim this to be perfect, but it appears to work on some pretty complex structures that I've tried...

function unvar_dump($str) {
    if (strpos($str, "\n") === false) {
        //Add new lines:
        $regex = array(
            '#(\\[.*?\\]=>)#',
            '#(string\\(|int\\(|float\\(|array\\(|NULL|object\\(|})#',
        );
        $str = preg_replace($regex, "\n\\1", $str);
        $str = trim($str);
    }
    $regex = array(
        '#^\\040*NULL\\040*$#m',
        '#^\\s*array\\((.*?)\\)\\s*{\\s*$#m',
        '#^\\s*string\\((.*?)\\)\\s*(.*?)$#m',
        '#^\\s*int\\((.*?)\\)\\s*$#m',
        '#^\\s*bool\\(true\\)\\s*$#m',
        '#^\\s*bool\\(false\\)\\s*$#m',
        '#^\\s*float\\((.*?)\\)\\s*$#m',
        '#^\\s*\[(\\d+)\\]\\s*=>\\s*$#m',
        '#\\s*?\\r?\\n\\s*#m',
    );
    $replace = array(
        'N',
        'a:\\1:{',
        's:\\1:\\2',
        'i:\\1',
        'b:1',
        'b:0',
        'd:\\1',
        'i:\\1',
        ';'
    );
    $serialized = preg_replace($regex, $replace, $str);
    $func = create_function(
        '$match', 
        'return "s:".strlen($match[1]).":\\"".$match[1]."\\"";'
    );
    $serialized = preg_replace_callback(
        '#\\s*\\["(.*?)"\\]\\s*=>#', 
        $func,
        $serialized
    );
    $func = create_function(
        '$match', 
        'return "O:".strlen($match[1]).":\\"".$match[1]."\\":".$match[2].":{";'
    );
    $serialized = preg_replace_callback(
        '#object\\((.*?)\\).*?\\((\\d+)\\)\\s*{\\s*;#', 
        $func, 
        $serialized
    );
    $serialized = preg_replace(
        array('#};#', '#{;#'), 
        array('}', '{'), 
        $serialized
    );

    return unserialize($serialized);
}

I tested it on a complex structure such as:

array(4) {
  ["foo"]=>
  string(8) "Foo"bar""
  [0]=>
  int(4)
  [5]=>
  float(43.2)
  ["af"]=>
  array(3) {
    [0]=>
    string(3) "123"
    [1]=>
    object(stdClass)#2 (2) {
      ["bar"]=>
      string(4) "bart"
      ["foo"]=>
      array(1) {
        [0]=>
        string(2) "re"
      }
    }
    [2]=>
    NULL
  }
}
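
For reference, here is a small sketch of the format the function above is aiming for. It is not part of the original answer: it captures var_dump() output with output buffering and prints what serialize() emits for the question's array. It assumes plain var_dump() (no Xdebug overriding its output).

```php
<?php
$data = array('this' => array('is' => 'the'), 'challenge' => array('for' => array('you')));

// Capture what var_dump() prints instead of sending it to the browser/terminal.
ob_start();
var_dump($data);
$dump = ob_get_clean();

// The raw dump contains line breaks, which the regexes above rely on.
var_dump(strpos($dump, "\n") !== false); // bool(true)

// The serialized form carries the same information in a machine-readable layout.
$serialized = serialize($data);
echo $serialized, "\n";
// a:2:{s:4:"this";a:1:{s:2:"is";s:3:"the";}s:9:"challenge";a:1:{s:3:"for";a:1:{i:0;s:3:"you";}}}

// unserialize() rebuilds the original array from that string.
var_dump(unserialize($serialized) === $data); // bool(true)
```

Each var_dump token has a direct counterpart here (array(2) { becomes a:2:{, string(3) "the" becomes s:3:"the";), which is why a line-by-line rewrite is feasible at all.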
Guadalupeguadeloupe answered 20/8, 2010 at 14:36 Comment(17)
@Gordon you beat me to it. I was just going back to edit those links in. Thanks!Guadalupeguadeloupe
I think you misunderstood the question. The challenge is to reverse the var_dump into an array. I am familiar with serialize() and unserialize()... and yes, they are by far better options. This is a code challenge. Maybe it's not worth the effort, but I wanted to see if it could be done in an optimized and creative way. I am not looking for an alternative solution.Praedial
@cdburgess: It is strange, what do you want to do exactly?Jaclin
The challenge is to take the output of var_dump and print out the rebuilt array. So going from array(2) { ["this"]=> array(1) {... back to array('this' => array(Praedial
@cdburgess: So the title of your question should be Code Challenge - Convert var_dump back to array/variableJaclin
Looks great. However, When I paste your code into a file, it will not execute.Praedial
Are you on php 5.2? Because that code is written for 5.3+ (If you want to change it back, you'll need to change the $foo = function calls to create_function). I'll whip up the quick change and edit back in...Guadalupeguadeloupe
And I just edited back in a far more robust version of the regexps that should account for strings with serialized tokens inside of them...Guadalupeguadeloupe
PHP Notice: unserialize(): Error at offset 0 of 208 bytes in /home/y/share/htdocs/test.php on line 51 ... however, I am using a slightly different version of the var_dump. $export = 'array(2) { ["this"]=> array(2) { ["is"]=> string(3) "the" [0]=> array(2) { [0]=> string(3) "one" [1]=> string(4) "only" } } ["challenge"]=> array(1) { ["for"]=> array(2) { [0]=> string(3) "you" [1]=> int(2) } } }';Praedial
Are there new lines (like var_dump provides)? Or did you just make it into a single line string (which makes the parsing a lot harder to do as robust)...Guadalupeguadeloupe
@cdburgess: But that's not how PHP outputs a var_dump. There are linebreaks in it. And my solution depends upon those linebreaks. try doing:ob_start(); var_dump($var); $data = ob_get_clean(); and then calling my function with $data...Guadalupeguadeloupe
If it is output to a webpage it does. But thanks for the clarification. I will update the question so it is more clear.Praedial
@cdburgess: Wrap it in <pre> tags. You'll see that there are new lines... Otherwise, it's not truly output of var_dump (Since its output includes new lines, and removing them changes the output)...Guadalupeguadeloupe
@cdburgess: Ok, I added some support for the dump on a single line. Be aware that this may wind up changing the strings if they have any of the "tokens" inside of them (and hence break the serialized output)... It'll be robust if there are new lines, but if there are not it may stumble more...Guadalupeguadeloupe
hi your method is not working on Flickr var_dump array. Warning: strpos() expects parameter 1 to be string, array given in /opt/lampp/htdocs/phpflickr/example.php on line 28 Notice: Array to string conversion in /opt/lampp/htdocs/phpflickr/example.php on line 59 Warning: unserialize() expects parameter 1 to be string, array given in /opt/lampp/htdocs/phpflickr/example.php on line 84Shaylynn
@Guadalupeguadeloupe Hi there, I tried to test your code but couldn't get it to work, would you mind shedding some light? sandbox.onlinephpfunctions.com/code/…Jerrome
Not compatible since PHP 7.2.4Unmoving
A
16

There's no other way than manual parsing, depending on the type. I didn't add support for objects at first, but it's very similar to the array case; you just need some reflection magic to populate non-public properties as well and to avoid triggering the constructor.

EDIT: Added support for objects... Reflection magic...

function unserializeDump($str, &$i = 0) {
    $strtok = substr($str, $i);
    switch ($type = strtok($strtok, "(")) { // get type, before first parenthesis
         case "bool":
             return strtok(")") === "true"?(bool) $i += 10:!$i += 11;
         case "int":
             $int = (int)substr($str, $i + 4);
             $i += strlen($int) + 5;
             return $int;
         case "string":
             $i += 11 + ($len = (int)substr($str, $i + 7)) + strlen($len);
             return substr($str, $i - $len - 1, $len);
         case "float":
             return (float)($float = strtok(")")) + !$i += strlen($float) + 7;
         case "NULL":
             return NULL;
         case "array":
             $array = array();
             $len = (int)substr($str, $i + 6);
             $i = strpos($str, "\n", $i) - 1;
             for ($entries = 0; $entries < $len; $entries++) {
                 $i = strpos($str, "\n", $i);
                 $indent = -1 - (int)$i + $i = strpos($str, "[", $i);
                 // get key int/string
                 if ($str[$i + 1] == '"') {
                     // use longest possible sequence to avoid key and dump structure collisions
                     $key = substr($str, $i + 2, - 2 - $i + $i = strpos($str, "\"]=>\n  ", $i));
                 } else {
                     $key = (int)substr($str, $i + 1);
                     $i += strlen($key);
                 }
                 $i += $indent + 5; // jump line
                 $array[$key] = unserializeDump($str, $i);
             }
             $i = strpos($str, "}", $i) + 1;
             return $array;
         case "object":
             $reflection = new ReflectionClass(strtok(")"));
             $object = $reflection->newInstanceWithoutConstructor();
             $len = !strtok("(") + strtok(")");
             $i = strpos($str, "\n", $i) - 1;
             for ($entries = 0; $entries < $len; $entries++) {
                 $i = strpos($str, "\n", $i);
                 $indent = -1 - (int)$i + $i = strpos($str, "[", $i);
                 // use longest possible sequence to avoid key and dump structure collisions
                 $key = substr($str, $i + 2, - 2 - $i + $i = min(strpos($str, "\"]=>\n  ", $i)?:INF, strpos($str, "\":protected]=>\n  ", $i)?:INF, $priv = strpos($str, "\":\"", $i)?:INF));
                 if ($priv == $i) {
                     $ref = new ReflectionClass(substr($str, $i + 3, - 3 - $i + $i = strpos($str, "\":private]=>\n  ", $i)));
                     $i += $indent + 13; // jump line
                 } else {
                     $i += $indent + ($str[$i+1] == ":"?15:5); // jump line
                     $ref = $reflection;
                 }
                 $prop = $ref->getProperty($key);
                 $prop->setAccessible(true);
                 $prop->setValue($object, unserializeDump($str, $i));
             }
             $i = strpos($str, "}", $i) + 1;
             return $object;

    }
    throw new Exception("Type not recognized...: $type");
}

(There are a lot of "magic" numbers when incrementing the string position counter $i; they are mostly just the string lengths of the keywords plus some parentheses etc.)
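
To show where one of those numbers comes from, here is a minimal sketch of the string case only. readDumpedString() is a hypothetical helper written for this illustration, not part of the code above. It spells out the offset arithmetic: 7 bytes for `string(`, the digits of the length, 3 bytes for `) "`, the payload, and 1 byte for the closing quote, which adds up to the 11 + $len + strlen($len) used in the answer.

```php
<?php
// Hypothetical helper (an illustration, not the answer's code): reads one
// `string(N) "..."` token starting at offset $i and advances $i past it.
function readDumpedString(string $str, int &$i): string
{
    $prefix = 'string(';
    if (substr($str, $i, strlen($prefix)) !== $prefix) {
        throw new InvalidArgumentException("No string token at offset $i");
    }
    $lenStart = $i + strlen($prefix);          // first digit of N
    $lenEnd   = strpos($str, ')', $lenStart);  // position of the ")"
    if ($lenEnd === false) {
        throw new InvalidArgumentException("Unterminated length at offset $i");
    }
    $len   = (int) substr($str, $lenStart, $lenEnd - $lenStart);
    $start = $lenEnd + 3;                      // skip `) "` (3 bytes)
    $value = substr($str, $start, $len);       // take exactly N bytes, inner quotes included
    $i     = $start + $len + 1;                // +1 for the closing quote
    return $value;
}

$i = 0;
$dump = 'string(6) "ab};cd"';
var_dump(readDumpedString($dump, $i)); // string(6) "ab};cd"
var_dump($i);                          // int(18)
```

Reading by declared length rather than scanning for a closing quote is what lets payloads such as ab};cd or Foo"bar" pass through untouched.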

Amerce answered 12/5, 2014 at 12:13 Comment(7)
Thanks! I like your approach, but some strings don't get parsed correctly, for example: 'string(6) "ab};cd"' returns d".Jacobson
@georg oh, that was a dumb error and wrote just a strlen() too much at the wrong place. Better? — I just didn't notice it as I always tested with strings of length 1...Amerce
@Amerce It seems bool(true) is not parsed correctly. I had included a fix for that in my edit.Abshire
@Abshire yep, I saw that, but your fix wasn't exactly what it should have been… the solution was just to not pass the vars again to strtok().Amerce
@Amerce Yes, the bool case looks cleaner now and works. But the float case seems wrong now, try float(1.5).Abshire
@Amerce I tried to test your code with var_dump of various arrays but couldn't get it to work, would you mind shedding some light? sandbox.onlinephpfunctions.com/code/…Jerrome
@Jerrome you are using \r\n linebreaks … you'll need to replace the input's linebreaks with \n. (or update all the offsets in the code responsible for line counting...)Amerce
E
6

If you want to encode/decode an array like this, you could use var_export(), which generates output in PHP's array format; for instance:

array(
  1 => 'foo',
  2 => 'bar'
)

could be the result of it. You would have to use eval() to get the array back, though, which is potentially dangerous: eval() really executes PHP code, so a simple code injection could let an attacker gain control over your PHP script.

Better options are serialize(), which creates a serialized version of an array or object, and json_encode(), which encodes an array or object as JSON (generally preferred for data exchange between different languages).
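
As a rough sketch of two of the options mentioned here (the sample $data is made up for illustration), this round-trips an array through var_export() plus eval() and through json_encode() plus json_decode():

```php
<?php
$data = array('a' => 'foo', 'b' => array(1, 2));

// var_export(): PHP source text; getting the value back means evaluating code.
$code = var_export($data, true);
$fromExport = eval('return ' . $code . ';'); // only acceptable for text you generated yourself

// json_encode(): language-neutral text; passing true makes json_decode() return arrays.
$json = json_encode($data);
$fromJson = json_decode($json, true);

echo $json, "\n"; // {"a":"foo","b":[1,2]}
var_dump($fromExport === $data); // bool(true)
var_dump($fromJson === $data);   // bool(true)
```

The JSON route never executes anything, which is the main reason to prefer it whenever the text may come from outside your own code.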

Extraction answered 20/8, 2010 at 14:39 Comment(0)
F
5

The trick is to match by chunks of code and "strings", and on strings do nothing but otherwise do the replacements:

$out = preg_replace_callback('/"[^"]*"|[^"]+/','repl',$in);

function repl($m)
{
    return $m[0][0]=='"'?
        str_replace('"',"'",$m[0])
    :
        str_replace("(,","(",
            preg_replace("/(int\((\d+)\)|\s*|(string|)\(\d+\))/","\\2",
                strtr($m[0],"{}[]","(), ")
            )
        );
}

outputs:

array('this'=>array('is'=>'the'),'challenge'=>array('for'=>array(0=>'you')))

(removing ascending numeric keys starting at 0 takes a little extra accounting, which can be done in the repl function.)

ps. this doesn't solve the problem of strings containing ", but as it seems that var_dump doesn't escape string contents, there is no way to solve that reliably. (you could match \["[^"]*"\] but a string may contain "] as well)
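
A quick sketch of that limitation, assuming plain var_dump() output (Xdebug and similar extensions change the format): a dumped string that contains double quotes is printed unescaped, so a quote-delimited pattern stops too early.

```php
<?php
// Capture var_dump() output for a string that itself contains double quotes.
ob_start();
var_dump('Foo"bar"');
$dump = trim(ob_get_clean());

echo $dump, "\n"; // string(8) "Foo"bar""

// The quote-delimited pattern from the answer stops at the first inner quote.
preg_match('/"[^"]*"/', $dump, $m);
echo $m[0], "\n"; // "Foo"
```

For values, the declared length in string(8) could in principle be used to read past the inner quotes; array keys carry no such length, so they stay ambiguous.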

Facility answered 20/8, 2010 at 15:14 Comment(3)
This is great! You are one of the few who actually read and understood the question. Thanks for taking the challenge and providing a working solution. Now what if there is an INT(5) as the value? (i.e. array('you',5)) It will be displayed as int(5) but should return from your function as 5.Praedial
I just took your example to make it work. Replacing int\(\d+\) with the number doesn't sound like much of a challenge. see updated answer.Facility
Superb! Very well done and in small optimized code! FYI: There is a missing comma after "\\2".Praedial
V
1

Use a regexp to change array(.) { (.*) } to array($1) and eval the code. This is not as easy as it sounds, because you have to deal with matching brackets etc.; it's just a clue on how to find a solution ;)

  • This will be helpful if you can't change var_dump to var_export or serialize.
Valency answered 20/8, 2010 at 14:36 Comment(6)
A regexp solution is going to be very difficult because you can have nested braces... So it's more likely to involve a string parser than a regexp (considering you have state to worry about due to the nesting)...Guadalupeguadeloupe
No, you do not have to deal with a string parser; regexps have some superb features such as ungreedy/global flags etc. It can be done with one single regexp with correctly set flags :)Valency
The BBcode parsers are built on top of regexps and work well without a state machine ;) Just consider 'array(.) {' and '}' as open/close tags :)Valency
Then show me a single regex that will convert all valid var_dumped data back into native parsable php... I'll admit I'm wrong if you can show me an example of a regex that can deal with: array(1) { ["foo}[bar]"] => string(4) "baz{" }Guadalupeguadeloupe
You're probably right that it can't be done with just one regexp, but you can still use one regexp per "tag", where a tag is one of array(.), string(.), integer(.) etc., and parse the output in the correct order (simple types -> arrays). It is still not possible to "reparse" var_dumped objects and other non-standard structures, though; for this we have serialize and other stuff.Valency
note that cdburgess is looking for a code challenge, so i'm putting some clues on how it can be achieved :)Valency
J
1

Updated to NOT USE create_function, as it is DEPRECATED as of PHP 7.2.0 (and removed in PHP 8.0). It is replaced with anonymous functions:



    function unvar_dump($str) {
        if (strpos($str, "\n") === false) {
            //Add new lines:
            $regex = array(
                '#(\[.*?\]=>)#',
                '#(string\(|int\(|float\(|array\(|NULL|object\(|})#',
            );
            // "\\1" is the backreference; a bare "\1" in double quotes is the octal escape for chr(1)
            $str = preg_replace($regex, "\n\\1", $str);
            $str = trim($str);
        }
        $regex = array(
            '#^\040*NULL\040*$#m',
            '#^\s*array\((.*?)\)\s*{\s*$#m',
            '#^\s*string\((.*?)\)\s*(.*?)$#m',
            '#^\s*int\((.*?)\)\s*$#m',
            '#^\s*bool\(true\)\s*$#m',
            '#^\s*bool\(false\)\s*$#m',
            '#^\s*float\((.*?)\)\s*$#m',
            '#^\s*\[(\d+)\]\s*=>\s*$#m',
            '#\s*?\r?\n\s*#m',
        );
        $replace = array(
            'N',
            'a:\1:{',
            's:\1:\2',
            'i:\1',
            'b:1',
            'b:0',
            'd:\1',
            'i:\1',
            ';'
        );
        $serialized = preg_replace($regex, $replace, $str);
        $func = function($match) {
            return 's:'.strlen($match[1]).':"'.$match[1].'"';
        };
        $serialized = preg_replace_callback(
            '#\s*\["(.*?)"\]\s*=>#', 
            $func,
            $serialized
        );
        $func = function($match) {
            return 'O:'.strlen($match[1]).':"'.$match[1].'":'.$match[2].':{';
        };
        $serialized = preg_replace_callback(
            '#object\((.*?)\).*?\((\d+)\)\s*{\s*;#', 
            $func, 
            $serialized
        );
        $serialized = preg_replace(
            array('#};#', '#{;#'), 
            array('}', '{'), 
            $serialized
        );

        return unserialize($serialized);
    }

    $test = 'array(10) {
      ["status"]=>
      string(1) "1"
      ["transactionID"]=>
      string(14) "1532xxx"
      ["orderID"]=>
      string(10) "1532xxx"
      ["value"]=>
      string(8) "0.73xxx"
      ["address"]=>
      string(1) "-"
      ["confirmations"]=>
      string(3) "999"
      ["transaction_hash"]=>
      string(64) "internxxx"
      ["notes"]=>
      string(0) ""
      ["txCost"]=>
      string(1) "0"
      ["txTimestamp"]=>
      string(10) "1532078165"
    }';
    var_export(unvar_dump($test));


Jorgenson answered 25/7, 2018 at 7:48 Comment(1)
confirm this worked with PHP 7.2.0+, ThanksDispute
J
0

I think you are looking for the serialize function:

serialize — Generates a storable representation of a value

It allows you to save the contents of an array as a storable string, and later you can read the array back with the unserialize function.

Using these functions, you can store and retrieve arrays in text/flat files as well as in a database.
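
A minimal sketch of that flat-file round trip follows. The temporary file is a stand-in for whatever file or database column you would actually use, and the allowed_classes option assumes PHP 7.0 or later.

```php
<?php
$data = array('this' => array('is' => 'the'), 'challenge' => array('for' => array('you')));

// Hypothetical location: a temporary file stands in for your text file or DB column.
$path = tempnam(sys_get_temp_dir(), 'dump');

// Store the serialized string.
file_put_contents($path, serialize($data));

// Read it back. Only unserialize data you trust; allowed_classes => false
// refuses to instantiate objects from the stored string.
$restored = unserialize(file_get_contents($path), array('allowed_classes' => false));
unlink($path);

var_dump($restored === $data); // bool(true)
```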

Jaclin answered 20/8, 2010 at 14:36 Comment(0)
