Compare PHP Arrays Using Memory References
Is it possible to see if two array variables point to the same memory location? (they are the same array)

Mantissa answered 5/11, 2010 at 23:33 Comment(8)
Why is this tagged zend-engine by someone else?Benzoate
@Benzoate Because for a qualified answer that goes beyond a mere Yes or No, you will need some knowledge of how the Zend Engine, i.e. the thing that drives PHP, handles variables and memory.Burgett
Why do you want to do that? Maybe we can help if you explain your concrete problem.Mucoid
c0rnh0li0 - He just wants to do that. So what if he doesn't have a good reason? It's understanding and fixing the issue that is important, not fixing someone's way of thinking.Chancy
@Burgett - Not necessarily. I find my answer reliable and relevant, however, I'm no Zend Engineer myself.Chancy
Actually, sometimes what is required is fixing someone's thinking. Programmers spend a lot of time reinventing wheels that others have already built. Not sure if that's the case here, but it often is.Phenica
Related: https://mcmap.net/q/226161/-detecting-whether-a-php-variable-is-a-reference-referenced/632951Querulous
@TobyAllen It is very annoying when people assume they need to fix my way of thinking on a website. Especially when there is nothing in this question that indicates that the OP is an inexperienced programmer. (I'm aware that your comment is very old, but it is still relevant).Gourmand
B
16

Actually, this can be done through a PHP extension.

File: config.m4

PHP_ARG_ENABLE(test, whether to enable test Extension support, [ --enable-test   Enable test ext support])

if test "$PHP_TEST" = "yes"; then
  AC_DEFINE(HAVE_TEST, 1, [Enable TEST Extension])
  PHP_NEW_EXTENSION(test, test.c, $ext_shared)
fi

File: php_test.h

#ifndef PHP_TEST_H
#define PHP_TEST_H 1

#define PHP_TEST_EXT_VERSION "1.0"
#define PHP_TEST_EXT_EXTNAME "test"

PHP_FUNCTION(getaddress4);
PHP_FUNCTION(getaddress);

extern zend_module_entry test_module_entry;
#define phpext_test_ptr &test_module_entry

#endif

File: test.c

#ifdef HAVE_CONFIG_H
#include "config.h"
#endif

#include "php.h"
#include "php_test.h"

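/* The second macro argument (pass_rest_by_reference) is 1, so every
   parameter is received by reference, as described further down. */
ZEND_BEGIN_ARG_INFO_EX(func_args, 1, 0, 0)
ZEND_END_ARG_INFO()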

/* zend_function_entry: the older function_entry alias was removed in PHP 5.4 */
static zend_function_entry test_functions[] = {
    PHP_FE(getaddress4, func_args)
    PHP_FE(getaddress, func_args)
    {NULL, NULL, NULL}
};

zend_module_entry test_module_entry = {
#if ZEND_MODULE_API_NO >= 20010901
    STANDARD_MODULE_HEADER,
#endif
    PHP_TEST_EXT_EXTNAME,
    test_functions,
    NULL,
    NULL,
    NULL,
    NULL,
    NULL,
#if ZEND_MODULE_API_NO >= 20010901
    PHP_TEST_EXT_VERSION,
#endif
    STANDARD_MODULE_PROPERTIES
};

#ifdef COMPILE_DL_TEST
ZEND_GET_MODULE(test)
#endif

PHP_FUNCTION(getaddress4)
{
    zval *var1;
    zval *var2;
    zval *var3;
    zval *var4;
    char r[500];
    if( zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "aaaa", &var1, &var2, &var3, &var4) == FAILURE ) {
      RETURN_NULL();
    }
    sprintf(r, "\n%p - %p - %p - %p\n%p - %p - %p - %p", var1, var2, var3, var4, Z_ARRVAL_P(var1), Z_ARRVAL_P(var2), Z_ARRVAL_P(var3), Z_ARRVAL_P(var4) );
    RETURN_STRING(r, 1);
}

PHP_FUNCTION(getaddress)
{
    zval *var;
    char r[100];
    if( zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "a", &var) == FAILURE ) {
      RETURN_NULL();
    }
    sprintf(r, "%p", Z_ARRVAL_P(var));
    RETURN_STRING(r, 1);
}

Then all you have to do is run phpize, configure, and make. Add an "extension=/path/to/so/file/modules/test.so" line to your php.ini file. Finally, restart the web server, just in case.

<?php
  $x = array("123"=>"123");
  $w = $x;
  $y = $x;
  $z = &$x;
  var_dump(getaddress4($w,$x,$y,$z));
  var_dump(getaddress($w));
  var_dump(getaddress($x));
  var_dump(getaddress($y));
  var_dump(getaddress($z));
?>

Returns (at least for me; your memory addresses will probably be different):

string '
0x9efeb0 - 0x9effe0 - 0x9ef8c0 - 0x9efeb0
0x9efee0 - 0x9f0010 - 0x9ed790 - 0x9efee0' (length=84)

string '0x9efee0' (length=8)

string '0x9f0010' (length=8)

string '0x9ed790' (length=8)

string '0x9efee0' (length=8)

Thanks to Artefacto for pointing this out: my original code passed the arrays by value, so it recreated the arrays, including the referenced one, and reported misleading memory addresses. I have since changed the code to force all parameters to be passed by reference. This allows references, arrays, and objects to be passed in untouched by the PHP engine. $w/$z are the same thing, but $w/$x/$y are not.

The old code actually showed the reference breakage: the memory addresses would change or match depending on whether all variables were passed in one call or in multiple calls to the same function, because PHP would reuse the same memory across calls. Comparing the results of the original function would therefore be useless. The new code should fix this problem.

FYI - I'm using php 5.3.2.

Broadwater answered 26/11, 2010 at 8:21 Comment(20)
Interesting answer; I was always curious about how PHP extensions work. I don't see the getaddress() function body in here.Vida
It's in test.c, use the scroll back to get to see all the text.Broadwater
Considering the simple request by the OP, I really don't see this practical since (as in my code) it can be found via PHP code as well. However, +1 for getting us some nice C code on how PHP extensions work :)Chancy
That may be, but this is the only sure way to answer the OP's question, yours (and others) methods manipulate the array, this could be very very slow if working with large lists, where you are O(n) this is O(1). But then again, I don't see any practical use for knowing if its the same array in memory.Broadwater
+1 @Jeremy: best answer so far IMHO, it beats mine. Practical or not, this one answers the question perfectly: not only there is a way, but you demonstrate it with this fine example. Thumbs up.Faustena
+1, but you should be using Z_ARRVAL_P instead of breaking the zval abstraction (and yes, sometimes the field names do change, like in 5.3 refcount and is_ref).Banas
You are right, I didn't know about the Z* preprocessor defines. PHP isn't exactly very forthcoming with information on its internals, and I'll be the first to say that I'm not an expert in it. It was easier to get the structure value than it was to find some obscure preprocessor macro. I will change the code to reflect this though, thank you.Broadwater
and btw, you don't need assign by reference in your example. Doing $z = $x; would give the same result. You could also compare the address of the zval itself, since two arrays cannot share the same hash table.Banas
@Artefacto: And no, to your second one ($z = $x) gives you a new hash value (aka new array). In the case of your last comment, no, because references have a new zval but its hash values are equal. See updated example.Broadwater
@Jeremy If you did $x = array(); $z = $x;, then $z and $x would indeed point to the same zval * until a separation would be forced (copy-on-write mechanism). Your example is slightly different -- when you do $w = $x;, $w and $x point to the same zval * (refcount=2, is_ref=0). But when you do $z =& $x;, you're forcing a separation because $z and $x are references and $w is not and a zval cannot have two different values for the is_ref field. So the 1st zval ($w/$x) is copied and its refcount decremented. The new copy ($z/$x; after) has refcount set to 2, and is_ref=1.Banas
not true though, my zval * is on the stack (note the declaration), not only that, if they were linked going in though, why do they NOT have the same HashTable memory address, if not the same array. Even their bucket addresses are completely different. And from TESTING, your statement although should be accurate, is completely false, see above code.Broadwater
@Jeremy I don't see what the zval * being on the stack is relevant here; what matters is the value of var, not its address. What I said was true. Let me explain. After your assignments, you have 2 zvals -- $y/$w with refcount=2, is_ref=0 and $x/$z with refcount=2, is_ref=1. The thing is, since your function receives by value, for $y and $w the refcount is simply incremented when entering the function (and you see 3). For $x and $z, there has to be a separation, because the function must end up with a non-reference. So you see refcount=1, is_ref=0 inside the function.Banas
@Jeremy The addresses being the same is coincidence. When calling the function with the is_ref variables, you force a separation. The new zval that results is destroyed when the function ends. When a new separation is done on the second call with a reference and a new separation is made, the new zval gets the same memory block from the zend memory manager.Banas
@Jeremy You might also be interested in my answer.Banas
Even so, you can't just check the zval *'s because their address values are different because they are different variables. The OPs question was not that the variables are the same, but if the arrays are the same in memory, the only way to really verify that is to check the array to which it is being pointed to. In my second validation, you can see that $x/$z point to one array and $w/$y point to a completely new array. Where as $w/$y have the same zval, $x/$z do not have the same zval.Broadwater
Coincidence or not, this answers the OPs question, where is shows the arrays being held in the same memory address. Even if refcount or reference being separated, they still share the same memory for the array.Broadwater
@Jeremy No. They do not share the same memory address, because they do not exist at the same time. Checking the zval * or checking the HashTable * is completely indifferent -- hash tables are not shared between zvals. But zval * pointers are shared between several variables. Symbol tables are hash tables that store, as values, these pointers, the keys being the names of the variables.Banas
So you are saying for an array every time a function is called or a variable is assigned to, a HashTable is copied? That is horribly inefficient. Looking at the HashTable format. It would have to copy all the array including all of the buckets in order to NOT pass between zvals. But here, let me verify that then.Broadwater
@Jeremy Yes, that's what I'm saying. The hashtables (unlike objects and resource) do not store a refcount, so they cannot be shared between zvals. Of course, if you don't use references, you don't run into this problem, as the only side effect of calling the function is incrementing the refcount of the zval-argument. You can see e.g. this bogus bug report as an example of a performance degradation caused by separating an array zval when passing it to a function.Banas
Yes, you are correct with my current code. I'm going to update my code so that it is correct then.Broadwater
H
9

References in PHP are a means to access the same variable content by different names. They are not like C pointers; for instance, you cannot perform pointer arithmetic using them, they are not actual memory addresses, and so on.

Conclusion: No, you can not

From: http://www.php.net/manual/en/language.references.whatare.php
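
A minimal sketch of what "the same variable content by different names" means in practice. The variable names are made up for illustration and do not come from the manual page:

```php
<?php
// Illustrative names only.
$original = ['foo'];
$alias = &$original;   // $alias and $original now name the same content
$copy = $original;     // $copy is an independent value

$alias[] = 'bar';      // writing through one name is visible through the other
var_dump($original === ['foo', 'bar']); // bool(true)
var_dump($copy === ['foo']);            // bool(true)

unset($alias);         // removes the name only, not the content
var_dump($original === ['foo', 'bar']); // bool(true)
```

Note that unset() on a reference only drops that name; this is one of the ways references behave unlike C pointers.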

Halophyte answered 23/11, 2010 at 7:7 Comment(3)
You also might wanna have a look at: php.net/manual/en/function.spl-object-hash.phpHalophyte
On another note, PHP (the language) does not have access to the engine's memory. Remember that PHP (the language) is parsed by the engine, so at no point is your script machine code; it just runs the engine's machine code. Your script file therefore does not use memory the way a program does; only the engine does. I doubt they will ever let a script access raw memory, as you could trash a server or write a virus into it, and then shared hosts could not use it.Teetotaler
-1: The answer is incorrect. Although references are not pointers, and one cannot do pointer arithmetic, they still have a feature which make them references, so the "no, you can not" is incorrect, and in fact, "you can". See my response on how it works.Chancy
F
8

Your question is actually a bit misleading. "point to the same memory location" and "are the same array" (which to me means is a reference to, at least in PHP) are not the same thing.

Memory locations refer to pointers. Pointers are not available in PHP. References are not pointers.

Anyway, if you want to check if $b is in fact a reference of $a, this is the closest you can get to an actual answer:

function is_ref_to(&$a, &$b) {
    if (is_object($a) && is_object($b)) {
        return ($a === $b);
    }

    $temp_a = $a;
    $temp_b = $b;

    $key = uniqid('is_ref_to', true);
    $b = $key;

    // If writing to $b also changed $a, both names refer to the same value.
    $return = ($a === $key);

    $a = $temp_a;
    $b = $temp_b;
    return $return; 
}

$a = array('foo');
$b = array('foo');
$c = &$a;
$d = $a;

var_dump(is_ref_to($a, $b)); // false
var_dump(is_ref_to($b, $c)); // false
var_dump(is_ref_to($a, $c)); // true
var_dump(is_ref_to($a, $d)); // false
var_dump($a); // is still array('foo')
Faustena answered 24/11, 2010 at 3:18 Comment(10)
Good point, updated the answer. Just a clarification, PHP5 objects are references, not pointers (because pointer actually points to a memory address, while a reference doesn't).Faustena
The ONLY way to see if two array variables point to the same memory location is by using pointer arithmetic. PHP does NOT support this, so it's NOT possible. And this script does NOT prove that the same memory location is used. As Hamish already explained.. @stereofrog, your link has also nothing to do with memory locations. It only shows whether 2 variables are aliases or not. This does NOT prove anything about their memory locations, in fact.. aliases can use different memory blocks.Jamie
@Inga: the question is misleading, so I did my best to answer it. I know C, I know Zend Engine, and I know what you say is right. But OP talks about memory locations, then says "if they are the same array". In PHP, "if they are the same array" means references. My answer does not say how to check memory locations (because it's impossible), it's all about references.Faustena
Yes.. Basically the question should be: Is it possible to see if two array variables point to the same memory location? without the (they are the same array)-remark. And then the answer is; no.Jamie
@stereofrog: you might also take php5 objects (which are pointers), no, they are not pointers.Jamie
@stereofrog: Objects are not references, they are objects. Objects are passed by reference.Faustena
@netcoder, if you try to write a function above, you will see that they are not passed by reference.Gogetter
@Inga Johansson - That is absolutely untrue. netcoder showed exactly how it would work out, as I did myself.Chancy
@Inga Yes, in PHP 5 objects are pointers. The variable only holds an id of the object, which does the turn of an address. The object is stored elsewhere. This doesn't mean "objects are passed by reference"; objects are (by default) passed by value like everything else, it's that what's passed by value is a kind of pointer.Banas
@netcoder, What do you think about the function below at https://mcmap.net/q/225549/-compare-php-arrays-using-memory-references ? I tested it and it seems to work. If it works, isn't it much cleaner since it doesn't use uniqid?Querulous
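
The point made in the comments above, that an object variable holds a handle which is itself passed by value, can be checked with a small sketch. The function names here are hypothetical and exist only for this illustration:

```php
<?php
// Hypothetical helper names, for illustration only.
function reassign($obj) {
    // Rebinds the local parameter only; the caller's variable is untouched.
    $obj = new stdClass();
    $obj->name = 'replaced';
}

function mutate($obj) {
    // Follows the handle and changes the object the caller also sees.
    $obj->name = 'mutated';
}

function reassign_by_ref(&$obj) {
    // With a real reference, rebinding does reach the caller's variable.
    $obj = new stdClass();
    $obj->name = 'replaced';
}

$o = new stdClass();
$o->name = 'original';

reassign($o);
$afterReassign = $o->name;      // 'original'

mutate($o);
$afterMutate = $o->name;        // 'mutated'

reassign_by_ref($o);
$afterReassignByRef = $o->name; // 'replaced'

var_dump($afterReassign, $afterMutate, $afterReassignByRef);
```

If objects were passed by reference by default, the first call would already have replaced the caller's object.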
C
3
function check(&$a, &$b) {
    // perform first basic check, if both have different values
    // then they're definitely not the same.
    if ($a !== $b) return false;
    // backup $a into $c
    $c = $a;
    // get some form of opposite to $a
    $a = !$a;
    // compare $a and $b, if both are the same thing,
    // this should be true
    $r = ($a === $b);
    // restore $a from $c
    $a = $c;
    // return result
    return $r;
}

$a = (object)array('aaa' => 'bbb'); $b = &$a;
echo check($a, $b) ? 'yes' : 'no'; // yes
$c = 'aaa'; $d = 'aaa';
echo check($c, $d) ? 'yes' : 'no'; // no
$e = 'bbb'; $f = 'ccc';
echo check($e, $f) ? 'yes' : 'no'; // no

The function "check" was created in 2 minutes or so. It assumes that if you change a reference's value, a second reference will have the newly assigned value as well. This function works on variables only. You cannot use it on constant values, function return values (unless returned by reference), etc.

Edit: During testing, I had some initial confusion. I kept reusing the same variable names ($a and $b) which resulted in all the conditionals being "yes". Here's why:

$a='aaa'; $b=&$a;     // a=aaa b=aaa
$a='ccc'; $b='ddd';   // a=ddd b=ddd   <= a is not ccc!

To correct the issue, I gave them a different name:

$a='aaa'; $b=&$a;     // a=aaa b=aaa
$c='ccc'; $d='ddd';   // c=ccc d=ddd   <= c is now correct

Edit: Why the answer is "yes" and not "no"

PHP does not reveal pointer information through scripting (nor does it allow pointer manipulation, etc.). However, it does allow alias variables (references), created with the reference operator '&'. This feature is typically associated with pointers, which explains the general confusion. That said, pointers are not aliases.

However, if we look at the original question, the person wanted to know if $a is the same as $b, not where in memory $a (or $b) is found. Whereas the former requirement applies to both references and pointers, the latter applies only to pointers.

Chancy answered 25/11, 2010 at 14:47 Comment(0)
B
2

First, your question is vague. It can mean several different things:

  • Do the variables have the same content? For this, you can use ===.
  • Do the variables use internally the same memory?
  • Are these variables in the same reference set? I.e., given two variables, $a and $b, if I change $a, will it change $b?

The answer to the second question is not easy to determine. Jeremy Walton's answer has one significant problem -- his function receives by value, so if you pass it a reference, you force a separation and get the address of a new temporary value. You could make the function receive the parameter by reference, but then you'd have the opposite problem -- if you passed a value (with refcount >= 2), you would also force a separation.

More importantly, the second question is an irrelevant internal detail. Consider the following script:

$a = 1;
$b = $a; //addresses of $a and $b are the same
function force_sep(&$a) { }
force_sep($b);
//force_sep is a no-op, but it forced a separation; now addresses are not equal

So the important question is the third one. Unfortunately, there is no straightforward way to determine this. This has been requested several times; see e.g. this request.

However, there are a few options:

  • You could receive the name of the variable and look it up in the symbol table. This is also what makes xdebug_debug_zval much more interesting than the flawed debug_zval_dump. This is a simple lookup in EG(active_symbol_table) for simple variables (but would get more complex if you wanted to include object properties and dimensions etc.), and this would also allow you to implement a solution for the 2nd question.
  • You could also modify Jeremy Walton's answer to make the function receive by reference (you'd need an arginfo structure) and receive the two values at the same time. Receiving them at the same time can avoid false positives due to reused memory addresses (though whether it's a problem depends on the usage of the function; on the other hand, Jeremy Walton's function always suffers from this problem when receiving references -- I can elaborate on this if necessary, but see my comment under his answer).
  • netcoder's answer, although hackish, also works. The idea is to receive two variables by reference, change one, and see if the other one changed, restoring the values in the end.
Banas answered 27/11, 2010 at 22:24 Comment(0)
G
2

Reference comparison in PHP

I know the question is old, but this is still relevant - which is why I ended up here. There are probably several ways to test this, but I came up with a couple of other methods.

PHP 7.4 reference equality test

ReflectionReference provides a reference id for array elements:

function is_same(&$a, &$b): bool {
  $_ = [ &$a, &$b ];
  return
    \ReflectionReference::fromArrayElement($_, 0)->getId() ===
    \ReflectionReference::fromArrayElement($_, 1)->getId();
}
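
A short usage sketch for the ReflectionReference approach, assuming PHP 7.4 or later. The function is repeated under the hypothetical name refs_equal so the block runs on its own:

```php
<?php
// Requires PHP >= 7.4. refs_equal is an illustrative name, not a built-in.
function refs_equal(&$a, &$b): bool {
    // Both elements of $pair are references, so fromArrayElement()
    // returns a ReflectionReference for each of them.
    $pair = [&$a, &$b];
    return ReflectionReference::fromArrayElement($pair, 0)->getId()
        === ReflectionReference::fromArrayElement($pair, 1)->getId();
}

$a = ['foo'];
$b = ['foo'];   // same content, separate variable
$c = &$a;       // reference to $a
$d = $a;        // copy of $a

var_dump(refs_equal($a, $c)); // bool(true)
var_dump(refs_equal($a, $b)); // bool(false)
var_dump(refs_equal($a, $d)); // bool(false)
```

Unlike the swap-and-compare functions in other answers, this never writes to either variable.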

PHP version 5, 7 and 8

This function will spot an actual reference by relying on the fact that PHP serialization detects circular references. The downside is that for big arrays it will temporarily need memory and time to serialize the data. For big arrays it may be better to use the pragmatic array equality test below.

function is_same(&$a, &$b) {
    $_ = [ &$a, &$b ];
    // PHP >= 7.4
    if (\class_exists(\ReflectionReference::class)) {
      return
        \ReflectionReference::fromArrayElement($_, 0)->getId() ===
        \ReflectionReference::fromArrayElement($_, 1)->getId();
    }

    // Faster, for objects
    if (\is_object($a) && \is_object($b) && $a === $b) return true;

    // Stop if they aren't identical, this is much faster.
    if ($a !== $b) return false;

    // Resources can't be serialized
    if (\is_resource($a) && \is_resource($b) && "".$a === "".$b) return true;

    // Serialization supports references, so we utilize that
    return \substr(\serialize($_), -5) === 'R:2;}';
}
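
To see why the final serialize() check works, it can help to look at what serialization produces for a pair of references. This is a minimal sketch with made-up variable names; it only inspects the trailing marker that the function above tests for:

```php
<?php
// Illustrative variable names only.
$value = ['foo'];
$other = ['foo'];

// Both elements reference the same variable: the second element is
// written as a back-reference ("R:2;") to the first one.
$sameRef = serialize([&$value, &$value]);

// Two different variables with equal content: both are written out in full.
$different = serialize([&$value, &$other]);

var_dump(substr($sameRef, -5) === 'R:2;}');   // bool(true)
var_dump(substr($different, -5) === 'R:2;}'); // bool(false)
```

The cost is that the whole value is serialized just to read the last five characters, which is the memory and time overhead mentioned above.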

Memory friendly PHP < 7.4 array reference checking

This test should do the job without wasting too much memory. Note one side effect: PHP uses copy-on-write to save memory on arrays, so when this function adds its temporary key to the array, it will trigger that mechanism.

function array_is_same(array &$a, array &$b): bool {
  // Fastest test first
  if ($a !== $b) {
    return false;
  }
  // Then the reference test
  try {
    // Need a unique key
    while (
      array_key_exists($key = '#! '.mt_rand(PHP_INT_MIN, PHP_INT_MAX), $a) || 
      array_key_exists($key, $b)
    );
    $a[$key] = true;
    return isset($b[$key]);
  } finally {
    // cleanup
    unset($a[$key], $b[$key]);
  }
}
Gourmand answered 25/2, 2021 at 23:6 Comment(0)
T
-1
function var_name(&$ref){
    foreach($GLOBALS as $key => $val){
       if($val === $ref) return $key;
    }
}

This is untested, but from what I know of PHP, variables are added to $GLOBALS as they are loaded into the system, so the first occurrence where they are identical should be the original variable. If you have two variables that are exactly the same, though, I'm not sure how it would react.
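
The open question above, what happens with two variables holding exactly the same value, can be tried out with a small sketch. The names var_name_guess, demo_first and demo_second are hypothetical and chosen only for this illustration:

```php
<?php
// Same idea as var_name() above, under an illustrative name.
function var_name_guess(&$ref) {
    foreach ($GLOBALS as $key => $val) {
        // === compares type and content, not identity of the variable.
        if ($val === $ref) return $key;
    }
    return null;
}

// Two distinct globals that happen to hold the same string.
$GLOBALS['demo_first'] = 'same text';
$GLOBALS['demo_second'] = 'same text';

function demo() {
    global $demo_second;
    // We pass the second variable, but the first match wins.
    return var_name_guess($demo_second);
}

var_dump(demo()); // string(10) "demo_first"
```

So the lookup returns the first global with equal content, which is not necessarily the variable that was passed in.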

Teetotaler answered 24/11, 2010 at 14:10 Comment(9)
This will not work. The === operator checks for type, but not for reference. Which means if you have $a = 1; $b = 1; $c = &$a, then the following is true: $a === $b === $c, even if $b is not a reference. The only ways to know if a variable is a reference is to a) look at the code; or b) modify it and see if the original variable changes. As for $GLOBALS, it doesn't have anything to do with this. A reference declared in a function is still a reference, but will not be part of $GLOBALS.Faustena
Thanks for pointing out what I said, numpty, but === does not just check type; it checks that it's exactly the same, so type and content. But again, I said I was not sure how it would work if you had two exactly the same.Teetotaler
@Teetotaler - The identical operator works like that only when comparing objects. For the rest, "aaa"==="aaa" is true (even though they are different constant values). As to the GLOBALS idea you mentioned, that only works when in the global scope (afaik).Chancy
"aaa"==="aaa" is exactly the same; what you're trying to prove would be "1" === 1 or "1" === "hello", and both return false. Go learn PHP.Teetotaler
Barkermn01 - I find it offensive that you rudely issue an answer without even verifying your own facts. Now go learn programming basics and how two different constants are actually stored in two different locations. The problem above isn't content and type, but location. The identical operator works using location on objects, but does not work the same way on other variables. Here's an example: $o1=new stdClass(); $o2=new stdClass(); var_dump($o1===$o2); <= that results in false. $s1="aaa"; $s2="aaa"; var_dump($s1===$s2); <= that results in true.Chancy
$o1 = new stdClass(); $o2 = new stdClass(); var_dump($o1 === $o2); //bool(false) $c1 = new stdClass(); $c2 = &$c1; var_dump($c1 === $c2); //bool(true) $o1 = "aaa"; $o2 = "aaa"; var_dump($o1 === $o2); //bool(true) So that means my code does work: for two separate objects it is false, but if one is a reference to the object it does work.Teetotaler
NO it does NOT. Your code does not even check the object type and if you haven't noticed, the identity operator works differently on objects than other variables. The identity checks contents, not reference, so your code ultimately fails in certain circumstances.Chancy
I'm quite incredulous you start with "what i know of php" next you go insulting people that they don't know any PHP. This is my last answer since you don't even want to understand. Idiots are those that don't want to learn, not those that eventually do. Remember there are 3 people saying the same thing and you're saying otherwise...Chancy
Oh, so you thought I was a noob. I have been programming PHP since PHP 3 and have kept up to date with it. It's called modesty, but apparently anyone outside of the UK thinks it's rude; they don't notice it. And you're saying it does not work: run the code I have given you on an Apache PHP system and I think you will find it works. As you said I should test it, I did; I even tested it with your code.Teetotaler
G
-2
 $a["unique-thing"] = 1;
 if($b["unique-thing"] == 1) // a and b are the same
Gogetter answered 6/11, 2010 at 0:43 Comment(9)
Just because a = b and b = c means a = c does not mean they point to the same memory block.Burgett
@Kirk: isn't this the answer you're looking for?Gogetter
@stereofrog Kind of. Feels really dirty though. Is there a more official way to get memory references?Mantissa
this seems to be the only way. Check out isReference method here simpletest.svn.sourceforge.net/viewvc/simpletest/simpletest/…Gogetter
what on earth are you talking about? what on earth does this have to do with memory locations?Countdown
@Hamish: changing one variable affects another one, that is, two variables are referring to the same thing. This is exactly what OP has asked about.Gogetter
@stereofrog this answer is wrong in so many ways. Firstly, if you assign an array by value, it is actually a reference until a value is changed in one array. That is, your solution will actually cause an array to 'de-reference' itself in some situations. Also, to do it properly, you'd need to check your test value for uniqueness, then overwrite the value, then test equality, then rewrite the old value back. It's full of fail.Countdown
@Hamish: sorry, that doesn't make any sense to me. You might be better off posting your own answer, so that OP and community can benefit from your point of view.Gogetter
@sterofrot I've already upvoted the correct answer. It does make sense if you're familiar with how PHP manages memory under the hood. If you create $a as an array, then do $b = $a, $b will in fact be a reference to $a, as in, no extra memory is assigned. If you then do $b[] = "new item", only then does PHP copy the entire array and make the change. You can test this by checking the memory usage of a large array if you like: memory usage is only bumped up after changing the second array, not during assignment. Try it.Countdown
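
The memory test suggested in the last comment can be sketched as follows. The array size and variable names are arbitrary choices for this illustration, and the exact byte counts depend on the PHP version, so none are shown:

```php
<?php
// Arbitrary size; large enough that a real copy is clearly visible.
$big = range(1, 200000);

$base = memory_get_usage();

$copy = $big;                  // assignment: no element data is copied yet
$afterAssign = memory_get_usage();

$copy[] = 'new item';          // first write: PHP now duplicates the array
$afterWrite = memory_get_usage();

$assignCost = $afterAssign - $base;
$writeCost = $afterWrite - $afterAssign;

printf("assignment: %d bytes, first write: %d bytes\n", $assignCost, $writeCost);
var_dump($writeCost > $assignCost); // bool(true)
```

This is the copy-on-write behavior that makes the naive "write a key and check the other array" test trigger a copy of its own.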
