Why does a PHP array get modified when its element is reference-assigned?

When ref-assigning an array's element, the contents of the array are modified:

$arr = array(100, 200);
var_dump($arr);
/* shows:
array(2) {
  [0]=>
  int(100)  // ← ← ← int(100)
  [1]=>
  int(200)
}
*/

$r = &$arr[0];
var_dump($arr);
/* shows:
array(2) {
  [0]=>
  &int(100)  // ← ← ← &int(100)
  [1]=>
  int(200)
}
*/

Live run. (Zend Engine will do fine, while HHVM shows "Process exited with code 153".)

Why is the element modified?

Why do we see &int(100) instead of int(100)?

This seems totally bizarre. What's the explanation for this oddity?

Hospitium answered 8/7, 2013 at 13:45 Comment(7)
I am unable to reproduce this with the provided code. Using PHP 5.4.6. - Carlina
I've checked it and it's really strange, because there is no assignment. I checked on writecodeonline.com/php - Maledict
@Maledict I can reproduce it here; PHP_VERSION is 5.4.15. - Carlina
Same behavior even in PHP 4. - Maledict
@Orangepill, I'm using 5.3.26 but I'm pretty sure it's not specific to this version. - Hospitium
3v4l.org/B2DgI - Domash
Maybe var_dump() also marks values that are referenced, but I have no idea why; I cannot find anything particular in the manual. - Maledict

I have answered this a while back, but cannot find the answer right now. I believe it went like this:

References are simply "additional" entries in the symbol table for the same value. The symbol table can only have values it points to, not values in values. The symbol table cannot point to an index in an array, it can only point to a value. So when you want to make a reference to an array index, the value at that index is taken out of the array, a symbol is created for it and the slot in the array gets a reference to the value:

$foo = array('bar');

symbol | value
-------+----------------
foo    | array(0 => bar)

$baz =& $foo[0];

symbol | value
-------+----------------
foo    | array(0 => $1)
baz    | $1
$1     | bar              <-- pseudo entry for value that can be referenced

Because this is not possible:

symbol | value
-------+----------------
foo    | array(0 => bar)
baz    | &foo[0]          <-- not supported by symbol table

The $1 above is just an arbitrarily chosen "pseudo" name, it has nothing to do with actual PHP syntax or with how the value is actually referenced internally.
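One observable consequence of this model, as a minimal sketch: once the array slot has been turned into a reference, both $baz and the slot see every write, and the slot stays a reference even when the array is copied. The variable $copy is introduced here purely for illustration.

```php
<?php
$foo = array('bar');
$baz =& $foo[0];            // slot 0 of $foo now shares one value with $baz

$baz = 'changed via baz';   // a write through $baz is visible in the array
var_dump($foo[0]);          // string(15) "changed via baz"

$copy = $foo;               // copies the array, but slot 0 is still a reference
$copy[0] = 'changed via copy';

var_dump($foo[0]);          // string(16) "changed via copy"
var_dump($baz);             // string(16) "changed via copy"
```

This is why a reference to an array element is best treated as a change to the array itself, not just as a new variable next to it.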

As requested in the comments, here is how the symbol table usually behaves with references:

$a = 1;

symbol | value
-------+----------------
a      | 1


$b = 1;

symbol | value
-------+----------------
a      | 1
b      | 1


$c =& $a;

symbol | value
-------+----------------
a, c   | 1
b      | 1
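The last table can be checked with a short script. This is only a sketch of the observable behavior, not of the engine internals: $a and $c are two names for one value, while $b holds its own separate value that merely happens to be equal.

```php
<?php
$a = 1;
$b = 1;
$c =& $a;       // $c becomes a second name for the value of $a

$c = 2;         // writing through $c changes the shared value

var_dump($a);   // int(2)
var_dump($b);   // int(1)
var_dump($c);   // int(2)
```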
Yu answered 8/7, 2013 at 14:1 Comment(7)
@deceze, the characters you use in the symbol-value table are confusing... How would you draw the symbol-value table after this line $a = 1; $b = 1; $c =& $a;? (I simply need it as a reference to properly understand what you mean here.) - Hospitium
@deceze. I see, so after the line $baz =& $foo[0];, instead of baz | $1 $1 | 'bar', you actually meant that we get baz, $1 | 'bar', right? - Hospitium
@Hospitium That's another way to look at it. I don't know whether it's more correct to say that $baz refers to the value and $foo[0] is a pseudo link which also refers to the same value; or whether both $baz and $foo[0] refer to a pseudo link which refers to the value. But yeah, you get the idea. - Yu
@deceze, hmm, I don't quite get what you mean by a "pseudo link". Isn't it just $baz and $foo[0] being two different symbols pointing to the same zval container [type='string', value='bar', refcount=2, is_ref=true]? - Hospitium
@Hospitium Almost. $foo[0] cannot be a symbol. foo is a symbol which holds an array. Index [0] of that array is a reference to an entry in the symbol table which holds your mentioned zval, with baz also referring to that zval (whether directly or indirectly). - Yu
@deceze, I think you're wrong here because arrays themselves have their own symbol tables separate from the global symbol table (derickrethans.nl/talks/phparch-php-variables-article.pdf), so [0] is a symbol in the array's symbol table pointing to the zval container 'bar', and baz is the symbol in the global symbol table pointing to that same zval container. - Hospitium
@Hospitium That's basically what I said, just adding that arrays use a symbol table internally. That's sort of an irrelevant detail though. The point is that $foo[0] cannot be an entry in the "global" symbol table as is, so there's an intermediate value being introduced that $foo[0] points to instead. - Yu
