Is there a way to detect circular arrays in pure PHP?
Asked Answered
J

5

19

I'm trying to implement my own serialization / var_dump style function in PHP. It seems impossible if there is the possibility of circular arrays (which there is).

In recent PHP versions, var_dump seems to detect circular arrays:

php > $a = array();
php > $a[] = &$a;
php > var_dump($a);
array(1) {
  [0]=>
  &array(1) {
    [0]=>
    *RECURSION*
  }
}

How would I implement my own serialization type of method in PHP that can detect similarly? I can't just keep track of which arrays I've visited, because strict comparison of arrays in PHP returns true for different arrays that contain the same elements and comparing circular arrays causes a Fatal Error, anyways.

php > $b = array(1,2);
php > $c = array(1,2);
php > var_dump($b === $c);
bool(true)
php > $a = array();
php > $a[] = &$a;
php > var_dump($a === $a);
PHP Fatal error:  Nesting level too deep - recursive dependency? in php shell code on line 1

I've looked for a way to find a unique id (pointer) for an array, but I can't find one. spl_object_hash only works on objects, not arrays. If I cast multiple different arrays to objects they all get the same spl_object_hash value (why?).

EDIT:

Calling print_r, var_dump, or serialize on each array and then using some mechanism to detect the presence of recursion as detected by those methods is an algorithmic complexity nightmare and will basically render any use too slow to be practical on large nested arrays.

ACCEPTED ANSWER:

I accepted the answer below that was the first to suggest temporarily altering the an array to see if it is indeed the same as another array. That answers the "how do I compare two arrays for identity?" from which recursion detection is trivial.

Jodiejodo answered 2/2, 2012 at 1:15 Comment(3)
The answer is going to be: you can't. See check if object/array is a reference. There's no pointer-like reference comparisons possible, so detecting a loop not possible either. Better workaround in your case might be to cast it around through one of the native functions (json_decode(json_encode())) to get rid of references, and only afterwards apply your own serialization thingy.Inheritrix
Now even PHPUnit is using the temporary "marking" method for detecting array recursion.Jodiejodo
Relevant Bug #55564Lynch
D
6

The isRecursiveArray(array) method below detects circular/recursive arrays. It keeps track of which arrays have been visited by temporarily adding an element containing a known object reference to the end of the array.

If you want help writing the serialization method, please update your topic question and provide a sample serialization format in your question.

function removeLastElementIfSame(array & $array, $reference) {
    if(end($array) === $reference) {
        unset($array[key($array)]);
    }
}

function isRecursiveArrayIteration(array & $array, $reference) {
    $last_element   = end($array);
    if($reference === $last_element) {
        return true;
    }
    $array[]    = $reference;

    foreach($array as &$element) {
        if(is_array($element)) {
            if(isRecursiveArrayIteration($element, $reference)) {
                removeLastElementIfSame($array, $reference);
                return true;
            }
        }
    }

    removeLastElementIfSame($array, $reference);

    return false;
}

function isRecursiveArray(array $array) {
    $some_reference = new stdclass();
    return isRecursiveArrayIteration($array, $some_reference);
}



$array      = array('a','b','c');
var_dump(isRecursiveArray($array));
print_r($array);



$array      = array('a','b','c');
$array[]    = $array;
var_dump(isRecursiveArray($array));
print_r($array);



$array      = array('a','b','c');
$array[]    = &$array;
var_dump(isRecursiveArray($array));
print_r($array);



$array      = array('a','b','c');
$array[]    = &$array;
$array      = array($array);
var_dump(isRecursiveArray($array));
print_r($array);
Drucill answered 15/2, 2012 at 4:1 Comment(2)
I think temporarily altering the array is probably the only reasonable way to test if two arrays are the same. This is the first answer to suggest that approach, so I'll accept it.Jodiejodo
Could a circular reference also be detected if the variable referencing it is lost? For example, this causes an infinite loop (until memory exhaustion): $array = [[&$array]]; $copy = $array; unset($array); isRecursiveArray($copy);Lynch
C
0

It's not elegant, but solves your problem (at least if you dont have someone using *RECURSION* as a value).

<?php
$a[] = &$a;
if(strpos(print_r($a,1),'*RECURSION*') !== FALSE) echo 1;
Conakry answered 2/2, 2012 at 2:15 Comment(4)
Just doesn't work if you happen to have the string "*RECURSION*" anywhere in your array... Admittedly rare, but possible.Reeve
Yeap, that what I meant when telling about *RECURSION* as a value. Anyway you can add a check for that case iterating the array and checking for that value and you got a limit of iterations based in how many lines print_r output gave you.Conakry
Well, that detects (in an extremely inefficient way) that there is recursion somewhere maybe... doesn't really solve my use case.Jodiejodo
As Mario said earlier, there is no way to detect references, print_r is using internals that you don't have access, it is inefficient, but is as good as you can get :(Conakry
F
0

Funny method (I know it is stupid :)), but you can modify it and track the "path" to the recursive element. This is just an idea :) Based on the property of the serialized string, when recursion starts in will be the same as the string for the original array. As you can see - I tried it on many different variations and might be something is able to 'fool' it, but it 'detects' all listed recursions. And I did not try recursive arrays with objects.

$a = array('b1'=>'a1','b2'=>'a2','b4'=>'a3','b5'=>'R:1;}}}');
$a['a1'] = &$a;
$a['b6'] = &$a;
$a['b6'][] = array(1,2,&$a);
$b = serialize($a); 
print_r($a);
function WalkArrayRecursive(&$array_name, &$temp){
    if (is_array($array_name)){
        foreach ($array_name as $k => &$v){
           if (is_array($v)){
                if (strpos($temp, preg_replace('#R:\d+;\}+$#', '', 
                               serialize($v)))===0) 
                { 
                  echo "\n Recursion detected at " . $k ."\n"; 
                  continue; 
                }
                WalkArrayRecursive($v, $temp);
            }
        }
    }
}
WalkArrayRecursive($a, $b);

regexp is for the situation when element with recursion is at the 'end' of the array. and, yes, this recursion is related to the whole array. It is possible to make recursion of the subelements, but it is too late for me to think about them. Somehow every element of the array should be checked for the recursion in its subelements. The same way, like above, through the output of the print_r function, or looking for specific record for recursion in serialized string (R:4;} something like this). And tracing should start from that element, comparing everything below by my script. All that is only if you want to detect where recursion starts, not just whether you have it or not.

ps: but the best thing should be, as I think, to write your own unserialize function from serailized string created by php itself.

Faroff answered 2/2, 2012 at 3:42 Comment(0)
O
0

My approach is to have a temp array that holds a copy of all objects that were already iterated. like this here:

// We use this to detect recursion.
global $recursion;
$recursion = [];

function dump( $data, $label, $level = 0 ) {
    global $recursion;

    // Some nice output for debugging/testing...
    echo "\n";
    echo str_repeat( "  ", $level );
    echo $label . " (" . gettype( $data ) . ") ";

    // -- start of our recursion detection logic
    if ( is_object( $data ) ) {
        foreach ( $recursion as $done ) {
            if ( $done === $data ) {
                echo "*RECURSION*";
                return;
            }
        }

        // This is the key-line: Remember that we processed this item!
        $recursion[] = $data;
    }
    // -- end of recursion check

    if ( is_array( $data ) || is_object( $data ) ) {
        foreach ( (array) $data as $key => $item ) {
            dump( $item, $key, $level + 1 );
        }
    } else {
        echo "= " . $data;
    }
}

And here is some quick demo code to illustrate how it works:

$obj = new StdClass();
$obj->arr = [];
$obj->arr[] = 'Foo';
$obj->arr[] = $obj;
$obj->arr[] = 'Bar';
$obj->final = 12345;
$obj->a2 = $obj->arr;

dump( $obj, 'obj' );

This script will generate the following output:

obj (object) 
  arr (array) 
    0 (string) = Foo
    1 (object) *RECURSION*
    2 (string) = Bar
  final (integer) = 12345
  a2 (array) 
    0 (string) = Foo
    1 (object) *RECURSION*
    2 (string) = Bar
Overprint answered 1/5, 2017 at 14:31 Comment(1)
The question is about circular arrays, not objects. If such an array is passed to your dump(), it loops indefinitely.Lynch
V
0

This is my approach. The key is to pass the array by reference to the recursive function simple_var_dump(), and use a tag (in this case "iterating_in_a_higher_level") to distinguish the arrays that are being iterated in a higher nesting level.

#!/usr/bin/php
<?php

   function simple_var_dump(&$var, $depth = 0)
   {
      if (!is_array($var)) {
         if (is_scalar($var)) {
            return (string)$var;
         } else {
            return '?';
         }
      }
      if (isset($var['__iterating_in_a_higher_level__'])) {
         $r = 'array(' . (count($var)-1) . ')';
         return $r . ' *RECURSION*';
      }
      $r = 'array(' . count($var) . ')';
      $var['__iterating_in_a_higher_level__'] = true;
      foreach ($var as $key => &$value) {
         if ($key !== '__iterating_in_a_higher_level__') {
            $r .= "\n" . str_repeat('  ', $depth + 1) . '[' . $key . '] => ' . simple_var_dump($value, $depth + 1);
         }
      }
      unset($var['__iterating_in_a_higher_level__']);
      return $r;
   }

   // example:
   //
   $a = [new stdClass(), &$a, 30, [40, [[&$a]]], [1, true, &$a], []];
   echo simple_var_dump($a) . "\n";

Output:

array(6)
  [0] => ?
  [1] => array(6) *RECURSION*
  [2] => 30
  [3] => array(2)
    [0] => 40
    [1] => array(1)
      [0] => array(1)
        [0] => array(6) *RECURSION*
  [4] => array(3)
    [0] => 1
    [1] => 1
    [2] => array(6) *RECURSION*
  [5] => array(0)
Volumetric answered 8/11, 2021 at 21:35 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.