PHP Performance : Copy vs. Reference
A

6

8

Hey there. Today I wrote a small benchmark script to compare the performance of copying variables vs. creating references to them. I expected that creating references to large arrays, for example, would be significantly slower than copying the whole array. Here is my benchmark code:

<?php
    $array = array();

    for($i=0; $i<100000; $i++) {
        $array[] = mt_rand();
    }

    function recursiveCopy($array, $count) {
        if($count === 1000)
            return;

        $foo = $array;
        recursiveCopy($array, $count+1);
    }

    function recursiveReference($array, $count) {
        if($count === 1000)
            return;

        $foo = &$array;
        recursiveReference($array, $count+1);
    }

    $time = microtime(1);
    recursiveCopy($array, 0);
    $copyTime = (microtime(1) - $time);
    echo "Took " . $copyTime . "s \n";


    $time = microtime(1);
    recursiveReference($array, 0);
    $referenceTime = (microtime(1) - $time);
    echo "Took " . $referenceTime . "s \n";

    echo "Reference / Copy: " . ($referenceTime / $copyTime);

The actual result was that recursiveReference took about 20 times (!) as long as recursiveCopy.

Can somebody explain this PHP behaviour?

Anthropography answered 28/10, 2010 at 13:37 Comment(3)
Aside from the incorrect recursion, why bother recursing at all? Why not just set up a for loop and unset the vars after each iteration (which will have FAR less overhead than a recursive call, and not eat up all of your memory)? But in the end, the difference will be so small that in 99.9999% of cases it makes more sense to use the semantically appropriate assignment (reference where you need one, normal where you don't) rather than trying to micro-optimize. - Designation
This wasn't about trying to optimize stuff, I was just curious. And I used recursion because I didn't want to unset the vars and loop, etc. It was much faster to write a recursion, and I didn't care about the overhead because it was the same in both. - Anthropography
Just avoid large arrays to get good performance. That's all. - Firmament
W
17

PHP implements copy-on-write for its arrays, meaning that when you "copy" an array, PHP doesn't do the work of physically copying the memory until you modify one of the copies and your variables can no longer share the same internal representation.

Your benchmarking is therefore fundamentally flawed, as your recursiveCopy function doesn't actually copy the array; if it did, you would run out of memory very quickly.

Try this: By assigning to an element of the array you force PHP to actually make a copy. You'll find you run out of memory pretty quickly as none of the copies go out of scope (and aren't garbage collected) until the recursive function reaches its maximum depth.

function recursiveCopy($array, $count) {
    if($count === 1000)
        return;

    $foo = $array;
    $foo[9492] = 3; // Force PHP to copy the array
    recursiveCopy($array, $count+1);
}
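As a rough way to see the copy-on-write behaviour described above without recursion, the following sketch compares memory usage after a plain assignment and after the first write. The variable names and the array size are illustrative assumptions, and the exact byte counts will vary by PHP version, so none are shown here.

```php
<?php
// Sketch only: observe copy-on-write through memory usage.
// The array size is arbitrary and chosen just to make the effect visible.
$array = range(1, 100000);

$before = memory_get_usage();

$copy = $array;                 // plain assignment: both variables share one array
$afterAssign = memory_get_usage();

$copy[0] = 0;                   // first write: PHP now has to separate the two arrays
$afterWrite = memory_get_usage();

$assignCost = $afterAssign - $before;
$writeCost  = $afterWrite - $afterAssign;

echo "Memory added by assignment:  $assignCost bytes\n";
echo "Memory added by first write: $writeCost bytes\n";
```

The assignment should add little or nothing, while the first write should add roughly the size of the whole array, which is why a benchmark that never writes to the "copy" does not measure copying at all.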
Whatley answered 28/10, 2010 at 13:46 Comment(0)
D
3

In recursiveReference you're calling recursiveCopy... This doesn't make any sense; in this case you're calling recursiveReference just once. Correct your code, run the benchmark again, and come back with your new results.

In addition, I don't think it's useful for a benchmark to do this recursively. A better solution would be to call a function 1000 times in a loop - once with the array directly and once with a reference to that array.
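A minimal sketch of the loop-based benchmark suggested above might look like this. The helper names readByValue and readByReference are hypothetical (they are not from the question), and the timings depend heavily on the PHP version, so no expected numbers are given.

```php
<?php
// Hypothetical helpers for illustration: both only read the array.
function readByValue(array $data) {
    return count($data);
}

function readByReference(array &$data) {
    return count($data);
}

$array = array();
for ($i = 0; $i < 100000; $i++) {
    $array[] = mt_rand();
}

// Pass the array by value 1000 times.
$start = microtime(true);
for ($i = 0; $i < 1000; $i++) {
    readByValue($array);
}
$byValue = microtime(true) - $start;

// Pass the array by reference 1000 times.
$start = microtime(true);
for ($i = 0; $i < 1000; $i++) {
    readByReference($array);
}
$byReference = microtime(true) - $start;

printf("By value:     %.6fs\n", $byValue);
printf("By reference: %.6fs\n", $byReference);
```

Because nothing goes out of scope late and nothing recurses, memory stays flat and the call overhead is identical in both loops.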

Demy answered 28/10, 2010 at 13:44 Comment(1)
Sorry, I don't really know how that got there; that was a copy & paste wtf... In the version I ran, that wasn't there. - Anthropography
S
3

You don't need to (and thus shouldn't) assign or pass variables by reference just for performance reasons. PHP does such optimizations automatically.

The test you ran is flawed because of these automatic optimizations. I ran the following test instead:

<?php
for($i=0; $i<100000; $i++) {
    $array[] = mt_rand();
}

$time = microtime(1);
for($i=0; $i<1000; $i++) {
    $copy = $array;
    unset($copy);
}
$duration = microtime(1) - $time;
echo "Normal Assignment and don't write: $duration<br />\n";

$time = microtime(1);
for($i=0; $i<1000; $i++) {
    $copy =& $array;
    unset($copy);
}
$duration = microtime(1) - $time;
echo "Assignment by Reference and don't write: $duration<br />\n";

$time = microtime(1);
for($i=0; $i<1000; $i++) {
    $copy = $array;
    $copy[0] = 0;
    unset($copy);
}
$duration = microtime(1) - $time;
echo "Normal Assignment and write: $duration<br />\n";

$time = microtime(1);
for($i=0; $i<1000; $i++) {
    $copy =& $array;
    $copy[0] = 0;
    unset($copy);
}
$duration = microtime(1) - $time;
echo "Assignment by Reference and write: $duration<br />\n";
?>

This was the output:

//Normal Assignment without write: 0.00023698806762695
//Assignment by Reference without write: 0.00023508071899414
//Normal Assignment with write: 21.302103042603
//Assignment by Reference with write: 0.00030708312988281

As you can see, there is no significant performance difference in assigning by reference until you actually write to the copy, i.e. when there is also a functional difference.

Sparse answered 26/7, 2011 at 16:31 Comment(0)
D
1

Generally speaking, in PHP, calling by reference is not something you'd do for performance reasons; it's something you'd do for functional reasons, i.e. because you actually want the referenced variable to be updated.

If you don't have a functional reason for calling by reference then you should stick with regular parameter passing, because PHP handles things perfectly efficiently that way.

(that said, as others have pointed out, your example code isn't exactly doing what you think it is anyway ;))
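To illustrate the "functional reason" mentioned above, here is a small sketch with two hypothetical functions (the names are made up for this example). Only the by-reference version changes the caller's array.

```php
<?php
// Hypothetical functions for illustration only.
function addItemByValue(array $list) {
    $list[] = 'x';          // modifies a local copy; the caller's array is untouched
    return count($list);
}

function addItemByReference(array &$list) {
    $list[] = 'x';          // modifies the caller's array directly
    return count($list);
}

$items = array('a', 'b');

addItemByValue($items);
$countAfterValue = count($items);
echo $countAfterValue, "\n";      // prints 2

addItemByReference($items);
$countAfterReference = count($items);
echo $countAfterReference, "\n";  // prints 3
```

If the caller does not need to see the change, regular parameter passing expresses the intent more clearly.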

Delisadelisle answered 28/10, 2010 at 13:55 Comment(1)
This is trying to measure assign by reference vs. assign by value. Calling anything doesn't really enter into the picture (other than that the test is recursive). Otherwise +1 - Designation
W
0
  1. In the recursiveReference() function you call the recursiveCopy() function. Is that what you really intended to do?
  2. You do nothing with the $foo variable. Was it supposed to be used in the further function call?
  3. Passing a variable by reference should generally save stack memory when passing large objects.
Whisenant answered 28/10, 2010 at 13:47 Comment(0)
E
0

recursiveReference is calling recursiveCopy. Not that that would necessarily harm performance, but that's probably not what you're trying to do.

Not sure why performance is slower, but it doesn't reflect the measurement you're trying to make.

Evanne answered 28/10, 2010 at 13:47 Comment(0)
