Memory leak?! Is Garbage Collector doing right when using 'create_function' within 'array_map'?
I found the following solution here on StackOverflow for getting an array of one specific property from an array of objects: PHP - Extracting a property from an array of objects

The proposed solution is to use array_map and, inside it, create a function with create_function, as follows:

$catIds = array_map(create_function('$o', 'return $o->id;'), $objects);

What happens? array_map runs through each array element, in this case a stdClass object. First it creates a function like this:

function($o) {
    return $o->id;
}

Second, it calls this function for the object in the current iteration. It works, and it works nearly the same as this similar solution:

$catIds = array_map(function($o) { return $o->id; }, $objects);

But this solution only runs on PHP >= 5.3, because it uses the Closure concept: http://php.net/manual/de/class.closure.php

Now the real problem:

The first solution, with create_function, increases memory usage, because the created function is written to memory and is neither reused nor destroyed. In the second solution, with a Closure, it is.

So the two solutions give the same result but behave differently with respect to memory.

Consider the following example:

// the following array is given (three stdClass objects with an id property)
$objects = array(
    (object) array('id' => 1),
    (object) array('id' => 2),
    (object) array('id' => 3),
);

BAD

while (true)
{
    $catIds = array_map(create_function('$o', 'return $o->id;'), $objects);
    // result: array(1, 2, 3);

    echo memory_get_usage() ."\n";

    sleep(1);
}

4235616
4236600
4237560
4238520
...

GOOD

while (true)
{
    $catIds = array_map(function($o) { return $o->id; }, $objects);
    // result: array(1, 2, 3);

    echo memory_get_usage() ."\n";

    sleep(1);
}

4235136
4235168
4235168
4235168
...

I spent a lot of time finding this out, and now I want to know whether it's a bug in the garbage collector or whether I made a mistake. And why does it make sense to keep the already created and called function in memory when it will never be reused?

Here is a running example: http://ideone.com/9a1D5g

Update: When I recursively search my code and its dependencies, e.g. PEAR and Zend, I find this BAD way far too often.

Update: When two functions are nested, the expression is evaluated from the inside out. In other words, create_function runs first (once), and the function name it returns is the argument for the single call of array_map. But because the GC fails to remove the function from memory (no pointer to the function is left) and PHP is unable to reuse the function already in memory, I think this is an error and not just a matter of "bad performance". This specific line of code is an example in PHPDoc and is reused in many big frameworks, e.g. Zend and PEAR. With one more line you can work around this "bug"; I checked. But I'm not looking for a workaround: I'm looking for the truth. Is it a bug, or is it just my approach? I have not been able to decide yet.

Disassembly answered 12/9, 2014 at 12:29 Comment(3)
"PHP is inefficient with memory." - every developer ever. - Northumbrian
@Northumbrian OK, then it's just a fact I have to accept?! - Disassembly
Your update contains an error. I think it's quite clear now that create_function() does not return a lambda-style function; it only returns the name of that function as a string. See the manual: nl3.php.net/manual/en/function.create-function.php It states: "Returns a unique function name as a string, or FALSE on error." - Donyadoodad
D
10

In the case of create_function() a lambda-style function is created using eval(), and a string containing its name is returned. That name is then passed as argument to the array_map() function.

This differs from the closure-style anonymous function where no string containing a name is used at all. function($o) { return $o->id; } IS the function, or rather an instance of the Closure class.
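The difference between the two kinds of callback can be checked directly. This is a minimal sketch, assuming PHP 5.3 or later; 'strtoupper' is only a stand-in for the generated name that create_function() would return, since any string naming an existing function behaves the same way as a callback:

```php
<?php
// A closure-style anonymous function is an object of class Closure.
$closure = function ($o) { return $o->id; };

// A function referred to by name is just a string. 'strtoupper' is a
// stand-in (assumption) for the name create_function() would return.
$byName = 'strtoupper';

var_dump($closure instanceof Closure); // bool(true)
var_dump(is_string($byName));          // bool(true)
var_dump(is_callable($byName));        // bool(true)

// array_map() accepts both kinds of callback.
$upper = array_map($byName, array('a', 'b'));
echo implode(',', $upper), "\n";       // A,B
```

The closure is an ordinary object value, whereas the string is only a label that points at an entry in PHP's function table.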

The eval() function, inside create_function(), executes a piece of PHP code which creates the wanted function. Somewhat like this:

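// Illustration only: the real create_function() is built in (and generates its own name).
function create_function($arguments, $code)
{
    static $counter = 0;
    $name = '_lambda_' . (++$counter); // just a unique string
    // Build the source of a named function and define it at runtime.
    eval('function ' . $name . '(' . $arguments . ') {' . $code . '}');
    return $name; // the caller only gets the name, not the function itself
}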

Note that this is a simplification.

So, once the function is created it will persist until the end of the script, just like normal functions in a script. In the above BAD example, a new function is created like this on every iteration of the loop, taking up more and more memory.
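The persistence described above can be observed without create_function() itself. The following is a rough sketch, not the actual implementation: make_named_lambda() and the demo_lambda_ name prefix are hypothetical and only imitate the eval-and-return-a-name pattern. It counts how many user-defined functions exist before and after a loop:

```php
<?php
// Hypothetical helper imitating the pattern: eval a uniquely named
// function and return its name as a string.
function make_named_lambda($arguments, $code)
{
    static $counter = 0;
    $name = 'demo_lambda_' . (++$counter);
    eval('function ' . $name . '(' . $arguments . ') { ' . $code . ' }');
    return $name;
}

$objects = array((object) array('id' => 1), (object) array('id' => 2));

$defined = get_defined_functions();
$before = count($defined['user']);

// The argument expression is evaluated again on every iteration,
// so every pass defines one more named function.
for ($i = 0; $i < 3; $i++) {
    $ids = array_map(make_named_lambda('$o', 'return $o->id;'), $objects);
}

$defined = get_defined_functions();
$after = count($defined['user']);

echo $after - $before, "\n";                               // 3
var_dump(function_exists('demo_lambda_1'));                // bool(true)
```

All three functions still exist after the loop, even though no variable holds any of their names.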

You can, however, intentionally destroy the lambda-style function. This is quite easy, just change the loop to:

while (true)
{
    $func = create_function('$o', 'return $o->id;');
    $catIds = array_map($func, $objects);
    echo memory_get_usage() ."\n";
    sleep(1);
}

The string containing the reference (= name) to the function is made explicit and accessible here. Now, every time create_function() is called, the old function is overwritten by a new one.

So, no, there's no 'Memory leak', it is meant to work this way.

Of course the code below is more efficient:

$func = create_function('$o', 'return $o->id;');

while (true)
{
    $catIds = array_map($func, $objects);
    echo memory_get_usage() ."\n";
    sleep(1);
}

And it should only be used when closure-style anonymous functions are not supported by your PHP version.
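Where closures are available (PHP 5.3 or later), the same "create once, call many times" idea can be written with a closure. This is a small sketch with a bounded loop in place of while (true); the variable names $getId and $catIds are chosen here for illustration:

```php
<?php
$objects = array(
    (object) array('id' => 1),
    (object) array('id' => 2),
    (object) array('id' => 3),
);

// Create the callback once, outside the loop.
$getId = function ($o) { return $o->id; };

for ($i = 0; $i < 3; $i++) {
    // Only the call is repeated; no new function is defined per pass.
    $catIds = array_map($getId, $objects);
    echo implode(',', $catIds), "\n"; // 1,2,3 on each pass
}
```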

Donyadoodad answered 15/9, 2014 at 9:30 Comment(15)
Yes, thanks, that's the same thing I found out. But as I understand it, writing the result of create_function to a variable should be the same as passing it directly as a parameter to array_map, shouldn't it? In the one-liner solution, create_function is called once, and the returned result (maybe lambda-style) is the parameter for a single call of array_map. Why is it created more than once? That makes no sense, does it? - Disassembly
It does make sense, and no, it's not the same. To create an anonymous function, create_function() only needs one call. That's not what you do. On every iteration of the loop you create a new anonymous function, and you never destroy it explicitly. The name of the anonymous function is transferred to array_map by reference, NOT the anonymous function itself. - Donyadoodad
No, I call create_function once, but inline as the parameter of array_map. As I understand it, that should be the same. In both solutions one function should be created, and the name of the function is the value of array_map's first parameter. Why does array_map create the function multiple times instead of just calling it multiple times? That's the question here. The other question is why these function instances aren't deleted, maybe after the complete array_map call, as happens with the other solutions, e.g. the Closure call or working with a variable first. - Disassembly
For better understanding: array_map(create_function(..., ...), ...); should be the same as $var = create_function(..., ...); array_map($var, ...); - Disassembly
No, on every iteration of the while () { } loop, the arguments of array_map() are evaluated. This means that create_function() is executed on every iteration, ... and so is function($o) { return $o->id; }. The difference is in what they return. The first returns a name reference to the function created, the other an instance of the Closure class (which is the function itself). Also note that when you do $var = ... many times, any old value that $var might have is explicitly unset. This is how the function gets destroyed. - Donyadoodad
array_map(create_function(..., ...), ...); does indeed do the same as $var = create_function(..., ...); array_map($var, ...);, unless you run the code several times, like you do. - Donyadoodad
Yeah, that's the point: if the garbage collector removed the created function from memory, the problem wouldn't exist, would it? The while (true) {} context just reveals the existing problem. My program runs into an 'Out of memory' error after hours. - Disassembly
The memory is released when the script that created all your create_function() functions ends. Do you run your script indefinitely? - Donyadoodad
OK, but does it make sense? In the case where it is written to a variable first, it's clear: on every iteration the variable is overwritten. But there is no possibility to overwrite it in my context with the inline solution. There is no possibility to reuse the function, because there is no pointer to it. That's why I think garbage collection has to remove it directly after the array_map call: there is no active pointer to the function, and isn't a missing pointer exactly the reason for the GC to remove an object from memory? - Disassembly
Yes, you can reuse the inline function quite easily. The function name could be assigned to a global variable inside the function. Just because it's an argument to a function doesn't mean it cannot be reused. Anyway, that's not the point; it is just the way that create_function() works. - Donyadoodad
That is what I wrote before. But if you don't write it to a variable and instead use it inline, then, and only in this case, I think you cannot reuse the function, because its output is the direct input of array_map, and it should then be removed by the garbage collector. That special variant is what confuses me. The others are clear. - Disassembly
Addition: If you mean something like this: $objects = array_map($func = create_function('$o', 'global $func; /* do stuff with $func */ return $o->id;'), $objects); then it's not my variant; it's the variant with the variable one line before, because you have a pointer to this function. - Disassembly
OK, but is that happening inside array_map? In your example you also have a pointer to the created function, called $globalFunction. Then the garbage collector shouldn't remove the function. But I'm pretty sure that array_map isn't doing something like this. Therefore there is no pointer, and so the GC has to remove it?! - Disassembly
I haven't got much left to say, nor are comments the place for such an extended discussion. I've changed my answer slightly to give you more insight into what happens inside the create_function() function. - Donyadoodad
I'm very grateful for the discussion with you and for your commitment. I'm considering giving you the points for that alone. However, the answer does not satisfy me yet. I already have a rough idea of how create_function and array_map work. But none of that explains why array_map doesn't just call the created function repeatedly instead of creating it multiple times, does it? I also updated my question. See the last sentences. - Disassembly
A
3

Don't use create_function() if you can avoid it. Particularly not repeatedly. Per the big yellow Caution box in the PHP manual:

...it has bad performance and memory usage characteristics.

Aixenprovence answered 15/9, 2014 at 9:19 Comment(1)
Thanks, good point, but on the same URL, Example #3 "Using anonymous functions as callback functions" recommends using it like this. - Disassembly
D
2

OK, I think the point is that the first solution, with create_function, runs on older versions of PHP, while the second solution doesn't increase memory unnecessarily. But let's look at the first solution. create_function is called inside array_map, namely on each while iteration. If we want a solution that works with older PHP versions without increasing memory, we have to do the following to overwrite the older function instance on each while iteration:

$func = create_function('$o', 'return $o->id;');
$catIds = array_map($func, $objects);

That's all. So simple.

But that doesn't answer the question at all. What remains is the question of whether this is a bug in PHP or a feature. As I understand it, writing the result of create_function to a variable SHOULD be the same as passing it directly as a parameter to array_map, shouldn't it?

Disassembly answered 15/9, 2014 at 9:7 Comment(0)
