Why are PHP function calls *so* expensive?
A function call in PHP is expensive. Here is a small benchmark to test it:

<?php
const RUNS = 1000000;

// create test string
$string = str_repeat('a', 1000);
$maxChars = 500;

// with function call
$start = microtime(true);
for ($i = 0; $i < RUNS; ++$i) {
    strlen($string) <= $maxChars;
}
echo 'with function call: ', microtime(true) - $start, "\n";

// without function call
$start = microtime(true);
for ($i = 0; $i < RUNS; ++$i) {
    !isset($string[$maxChars]);
}
echo 'without function call: ', microtime(true) - $start;

This tests functionally identical code, first using a function call (strlen) and then without one (isset is a language construct, not a function).

I get the following output:

with function call:    4.5108239650726
without function call: 0.84017300605774

As you can see the implementation using a function call is more than five (5.38) times slower than the implementation not calling any function.

I would like to know why a function call is so expensive. What's the main bottleneck? Is it the lookup in the hash table? Or what is so slow?
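As a sanity check that the two expressions in the benchmark really are interchangeable, here is a small standalone snippet (not part of the timing loop):

```php
<?php
// Both checks answer "is the string at most $maxChars characters long?"
$maxChars = 5;

foreach (['abc', 'abcde', 'abcdef'] as $s) {
    $viaStrlen = strlen($s) <= $maxChars;   // reads the stored length and compares
    $viaIsset  = !isset($s[$maxChars]);     // offset $maxChars exists only if the string is longer
    var_dump($viaStrlen === $viaIsset);     // bool(true) for every input
}
```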


I revisited this question and decided to run the benchmark again, with XDebug completely disabled (not just profiling disabled). This showed that my earlier results were fairly skewed; this time, with 10,000,000 runs, I got:

with function call:    3.152988910675
without function call: 1.4107749462128

Here the function call is only approximately twice (2.23×) as slow, so the difference is far smaller.


I just tested the above code on a PHP 5.4.0 snapshot and got the following results:

with function call:    2.3795559406281
without function call: 0.90840601921082

Here the difference got slightly bigger again (2.62×). (But on the other hand, the execution time of both methods dropped quite significantly.)

Interosculate answered 11/9, 2010 at 15:53 Comment(10)
That's a big assumption. How sure are you that strlen is 30% heavier than isset?Grano
do profiling, not "benchmarks"Silage
@Paco: This is a question out of interest. It is purely theoretical.Interosculate
@Col: It would be really nice if you could tell me where I can find the code responsible for this. Even better, together with an explanation of which parts take longest. (To say it more clearly: I'm not stupid. If I knew where the responsible code is, I wouldn't have asked here. I am asking here because I don't know and I hope that somebody else does.)Interosculate
@nicki: If you're really interested in this, you may want to browse the source code of PHP. You can find it here. Note, however, that PHP and its extensions are written in C, so if you do not know C or C++ (or a very similar language), you probably won't understand a lot of the code.Heida
@nikic I recant my statement, I thought I found something in a benchmark (turned out I was just measuring the overhead of a function call). BUT in my testing I think I stumbled upon something. You don't happen to have XDebug installed, do you? If you do, disable it temporarily and watch your benchmark become a lot saner. Mine went from 1.605, 1.5897, 0.0355 (with XDebug) to 0.0363, 0.0323, 0.0129 (without XDebug). I assume the HUGE runtime difference is due to the large amount of data that must be written to disk by XDebug.Doan
@Kendall: I have XDebug installed, but haven't got profiling enabled. Thus there shouldn't be much slowdown in my test from XDebug. Your benchmark is probably so different from mine because your machine is much faster. Try using 100000000 runs and see if the data is more comparable. (In benchmarks under a second, kernel scheduling may largely influence the results, I have heard.)Interosculate
@Kendall: After all, you were right. I just tested without XDebug and got very different results (see edit). Thanks very much for pointing out.Interosculate
The author is now a core developer of PHP. I hope people here are a bit more understanding that sometimes people ask questions to learn.Ragman
This question is now nearly 7 years old and both PHP and hardware performance in general has improved by orders of magnitude, as have cache sizes. my runs of your test on PHP 7 on modern hardware actually have the version without function calls being (slightly) more expensive than the version with.Kuth

Function calls are expensive in PHP because there's a lot of work being done.

Note that isset is not a function (it has a special opcode for it), so it's faster.
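You can confirm that isset has no function-table entry at all, while strlen does (a quick standalone check):

```php
<?php
// strlen is a real function: it lives in the function table and can be
// called dynamically through a variable. isset compiles to its own opcode
// and never goes through the function-call machinery.
var_dump(function_exists('strlen')); // bool(true)
var_dump(function_exists('isset'));  // bool(false)

$f = 'strlen';
var_dump($f('abc'));                 // int(3)
// $f = 'isset'; $f($s); would throw: there is no function named isset to look up
```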

For a simple program like this:

<?php
func("arg1", "arg2");

There are six (four + one for each argument) opcodes:

1      INIT_FCALL_BY_NAME                                       'func', 'func'
2      EXT_FCALL_BEGIN                                          
3      SEND_VAL                                                 'arg1'
4      SEND_VAL                                                 'arg2'
5      DO_FCALL_BY_NAME                              2          
6      EXT_FCALL_END                                            

You can check the implementations of the opcodes in zend_vm_def.h. Prepend ZEND_ to the opcode names and search, e.g. for ZEND_INIT_FCALL_BY_NAME.

ZEND_DO_FCALL_BY_NAME is particularly complicated. Then there's the implementation of the function itself, which must unwind the stack, check the types, convert the zvals and possibly separate them, and do the actual work...

Swarthy answered 11/9, 2010 at 18:21 Comment(1)
Thanks Artefacto, this really helps me. I will have a look at those definitions. +1Interosculate

Is the overhead for calling a user function really that big? Or rather is it really that big now? Both PHP and computer hardware have advanced in leaps and bounds in the nearly 7 years since this question was originally asked.

I've written my own benchmarking script below which calls mt_rand() in a loop both directly and via a user-function call:

<?php
const LOOPS = 10000000;

function myFunc ($a, $b)
{
    return mt_rand ($a, $b);
}

// Call mt_rand, simply to ensure that any costs for setting it up on first call are already accounted for
mt_rand (0, 1000000);

$start = microtime (true);
for ($x = LOOPS; $x > 0; $x--)
{
    mt_rand (0, 1000000);
}
echo "Inline calling mt_rand() took " . (microtime(true) - $start) . " second(s)\n";

$start = microtime (true);
for ($x = LOOPS; $x > 0; $x--)
{
    myFunc (0, 1000000);
}
echo "Calling a user function took " . (microtime(true) - $start) . " second(s)\n";

Results on PHP 7 on a 2016 vintage i5 based desktop (More specifically, Intel® Core™ i5-6500 CPU @ 3.20GHz × 4) are as follows:

Inline calling mt_rand() took 3.5181620121002 second(s)
Calling a user function took 7.2354700565338 second(s)

The overhead of calling a user function appears to roughly double the runtime. But it took 10 million iterations for it to become particularly noticeable. This means that in most cases the differences between inline code and a user function are likely to be negligible. You should only really worry about that kind of optimisation in the innermost loops of your program, and even then only if benchmarking demonstrates a clear performance problem there. Anything else would be a micro-optimisation that yields little to no meaningful performance benefit in exchange for added complexity in the source code.

If your PHP script is slow, then the odds are that it comes down to I/O or a poor choice of algorithm rather than function call overhead. Connecting to a database, making a cURL request, writing to a file or even just echoing to stdout are all orders of magnitude more expensive than calling a user function. If you don't believe me, have mt_rand and myFunc echo their output and see how much slower the script runs!

In most cases the best way to optimise a PHP script is to minimise the amount of I/O it has to do (only select what you need in DB queries rather than relying on PHP to filter out unwanted rows, for example), or to cache I/O operations through something such as memcache to reduce the cost of I/O to files, databases, remote sites, etc.
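As a minimal sketch of that caching idea: pay the expensive I/O cost once and serve repeats from memory. A static array stands in for memcache here, and fetchUserFromDb() is a hypothetical placeholder for a real database query:

```php
<?php
// Hypothetical stand-in for an expensive I/O operation (e.g. a SELECT).
function fetchUserFromDb(int $id): array
{
    // Imagine a database round-trip here; we just fabricate a row.
    return ['id' => $id, 'name' => "user$id"];
}

// Memoizing wrapper: the expensive call only runs on a cache miss.
function fetchUserCached(int $id): array
{
    static $cache = [];
    if (!isset($cache[$id])) {
        $cache[$id] = fetchUserFromDb($id);
    }
    return $cache[$id];
}
```

With an actual memcache/Memcached setup you would use its get/set methods with a TTL instead of a static array, so the cached data also survives across requests.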

Kuth answered 1/2, 2017 at 14:21 Comment(1)
On a Xeon E5 server running PHP 5.6.30, inline took 15.9 second(s) and the user function took 30.9 second(s).Measure

I would contend that they are not. You're not actually testing a function call at all. You're testing the difference between a low-level out-of-bounds check (isset) and walking through a string to count the number of bytes (strlen).

I can't find any info specific to PHP, but a C-style strlen is usually implemented something like this (pseudocode, including the function-call overhead):

// function prologue: set up a stack frame, pass the argument
$sp += 128;
$str->address = 345;
// walk the string byte by byte until the NUL terminator
$i = 0;
while ($str[$i] != 0) {
    $i++;
}
// compare the computed length
return $i < $length;

An out of bounds check would typically be implemented something like:

return $str->length < $length;

The first one is looping. The second one is a simple test.

Cynth answered 25/7, 2013 at 8:7 Comment(5)
strlen is just a lookup of a struct member. It doesn't loop.Interosculate
Perhaps, but the original test is still not testing the time for a function call. isset is a language construct. Things like empty, isset, etc. all perform much better than other parts of the language. A valid test would simply be a function and an inlined loop. These tests are meaningless.Cynth
I just performed some tests on my MBP. A function call on this overloaded laptop takes 0.00000107109547s. That's not expensive. What's expensive is poorly factored code that got that way because someone THINKS a function call is expensive, so they avoid using functions.Cynth
I'm aware that isset is not a function - that's the whole point of using it in the example: It tests functionally equivalent code once with a function call (strlen) and once without (isset).Interosculate
But why not test the same thing in the function loop as is being tested in the non-function loop? It adds a mismatch that makes the test invalid. A real test would be testing the difference between F() { isset($a); } and isset($a); At the very least, the strlen version is pushing the result onto the stack and then performing a lte op, which is outside the scope of the test.Cynth

I think rich remer's answer is actually pretty accurate. You're comparing apples to oranges with your original example. Try this one instead:

<?php
$RUNS = 100000;
// with function call
$x = "";
$start = microtime(true);
for ($i = 0; $i < $RUNS; ++$i) {
    $x = $i.nothing($x);
}
echo 'with function call: ', microtime(true) - $start, "\n<br/>";

// without function call
$x = "";
$start = microtime(true);
for ($i = 0; $i < $RUNS; ++$i) {
    $x = $i.$x;
}
echo 'without function call: ', microtime(true) - $start;

function nothing($x) {
    return $x;
}

The only difference in this example is the function call itself. With 100,000 runs (as given above) we see a <1% difference in using the function call from our output:

with function call: 2.4601600170135 
without function call: 2.4477159976959

Of course this all varies depending on what your function does and what you consider expensive. If nothing() returned $x*2 (and we replaced the non-function call of $x = $i.$x with $x = $i.($x*2)) we'd see a ~4% loss in using the function call.

Selfeffacement answered 31/10, 2016 at 21:8 Comment(0)

Function calls are expensive for the reasons perfectly explained by @Artefacto above. Note that their cost is directly tied to the number of parameters/arguments involved. This is one area that I've paid close attention to while developing my own application framework. When it makes sense and it's possible to avoid a function call, I do.

One such example is a recent replacement of is_numeric() and is_integer() calls with a simple boolean test in my code, especially when several calls to these functions may be made. While some may think that such optimizations are meaningless, I've noticed a dramatic improvement in the responsiveness of my websites through this type of optimization work.

The following quick test will be TRUE for a number and FALSE for anything else.

if ($x == '0'.$x) { ... }

Much faster than is_numeric() and is_integer(). Again, it's perfectly valid to use such optimizations when they make sense.
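For illustration, here is how the test behaves on a few inputs (note it is stricter than is_numeric() in some corner cases, such as leading whitespace):

```php
<?php
// The trick: prepending '0' keeps a numeric string numerically equal
// ('5' vs '05' compare as numbers), while any non-numeric string changes
// under byte-wise comparison ('abc' vs '0abc').
var_dump(5     == '0'.5);      // bool(true)
var_dump('42'  == '0'.'42');   // bool(true)
var_dump('abc' == '0'.'abc');  // bool(false)

// Stricter than is_numeric() for strings with leading whitespace:
var_dump(is_numeric(' 5'));    // bool(true)
var_dump(' 5' == '0'.' 5');    // bool(false)
```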

Dian answered 28/5, 2014 at 5:18 Comment(5)
That test is liable to be quite slow if $x could be an object with a __toString method. Do you have a method that's still fast in that situation?Casemate
@Casemate I use the boolean check ($var*1) for typecasting from an xml. It was about twice as fast as is_numeric.Rendition
Also the ($x == '0'.$x) method fails for values like '12.3k' being assumed numeric. For full context and your interest, this is the optimized function-less typecasting method I use for an xml parser class: ((($var = $data*1) && --$data+1 === $var) || (string)($var = (float)$data) == $data) ? $data === "0" ? (int)$var:$var: ($data === 'TRUE' ? TRUE: ($data === 'FALSE' ? FALSE: $data)); /cc: @CasemateRendition
@Rendition I just ran a benchmark on that - it's vastly faster than is_numeric on my machine, but it produces errors for certain operand types, such as arrays. (It's also about twice as slow as is_numeric on our production server, which has some function optimization I don't understand.)Casemate
@Casemate OK, interesting. Definitely not intended for mixed types. :) Works well for the context of auto-typecasting strings.Rendition

© 2022 - 2024 — McMap. All rights reserved.