Parentheses altering semantics of function call result

It was noted in another question that wrapping the result of a PHP function call in parentheses can somehow convert the result into a fully-fledged expression, such that the following works:

<?php
error_reporting(E_ALL | E_STRICT);

function get_array() {
   return array();
}

function foo() {
   // return reset(get_array());
   //              ^ error: "Only variables should be passed by reference"

   return reset((get_array()));
   //           ^ OK
}

foo();

I'm trying to find anything in the documentation to explicitly and unambiguously explain what is happening here. Unlike in C++, I don't know enough about the PHP grammar and its treatment of statements/expressions to derive it myself.

Is there anything hidden in the documentation regarding this behaviour? If not, can somebody else explain it without resorting to supposition?


Update

I first found this EBNF purporting to represent the PHP grammar, and tried to decode my scripts myself, but eventually gave up.

Then, using phc to generate a .dot file of the two foo() variants, I produced AST images for both scripts using the following commands:

$ yum install phc graphviz
$ phc --dump-ast-dot test1.php > test1.dot
$ dot -Tpng test1.dot > test1.png
$ phc --dump-ast-dot test2.php > test2.dot
$ dot -Tpng test2.dot > test2.png

In both cases the result was exactly the same:

[Image: parse tree of snippets 1 and 2]

Pluralism answered 17/7, 2011 at 20:31 Comment(16)
It looks like this applies exclusively to expressions that consist of a single function call. – Oddity
Array() with uppercase A? AFAIK, the language construct is written array(). – Microdont
PHP, hence not case-sensitive. – Beauvoir
@knittl: It's not case-sensitive, and I prefer Array. – Pluralism
@wrikken @tomalak: Only variables (user code) are case-sensitive? Didn't know that! Learnt something new today. – Microdont
@knittl: Yeah, pretty much just variable names. – Pluralism
The reason why only a single function call can have this is that only a variable or a single function returning by reference can be correct input for reset. A variable obviously will always work by reference, which leaves us with the function call, which is only checked at execution because of the possibility of having something like $variablewithafunctionname(). Why the () would make reset not complain... That would mean that at the time reset gets its input it is a reference (refcount > 1), which would mean the expression (get_array()) leaves some zval in memory... – Beauvoir
Digging a bit further, the strict warning is coming out of the VM part/runtime. The fatal errors (not in the Q's example; one would be: return reset((get_array()?:0));) are already raised at compile time, and the wording is much harsher: "Fatal error: Only variables can be passed by reference" (and wrong: if a function returns a reference it's all fine). Many flags are checked prior to giving the strict notice. I suspect the answer lies somewhere in there, but I do not know much about PHP internals: php-trunk/Zend/zend_vm_execute.h line 10853~ – Oddity
@Wrikken: For the zval idea: It needs no refcount at all; a normal variable wouldn't have any either. So just a zval (refcount >= 0) should do it. – Oddity
Hmm, point there. And a return from a function has a minimum refcount of 1... Dang. – Beauvoir
Of course reset((get_array()?:0)); is an error at compile time because of the 0. Try something like reset((get_array()?:$var));: that can have proper outcomes, but still yields a fatal. – Beauvoir
Some reference-handling internals of PHP are explained in deep detail here, if someone can find things there: derickrethans.nl/talks/phparch-php-variables-article.pdf – Didactic
@regilero: Good heavens; I'll definitely have to give that a read when I get a chance. Thanks! – Pluralism
@Wrikken: My fault, there is no refcount = 0 for a var; it's always 1 minimum. debug_zval_dump(get_array()); gives a refcount of one, btw. Using parentheses makes no difference, but this can be misleading. – Oddity
@Wrikken: debug_zval_dump always gives at least one refcount, the one from the function parameter. And effectively debug_zval_dump(get_array()); and debug_zval_dump((get_array())); give the same result, except that the first one generates a STRICT notice. – Didactic
Yup, and changing the return of get_array to a reference does not yield any usable results either. I don't think we can get any usable info in PHP, so some brave soul will have to delve through the spaghetti that is the PHP C source to get a definitive answer. – Beauvoir

This behavior could be classified as a bug, so you should definitely not rely on it.

The (simplified) conditions for the message not to be thrown on a function call are as follows (see the definition of the opcode ZEND_SEND_VAR_NO_REF):

  • the argument is not a function call (or if it is, it returns by reference), and
  • the argument is either a reference or it has reference count 1 (if it has reference count 1, it's turned into a reference).
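
The parenthetical in the first condition, a function call that returns by reference, can be illustrated with a minimal sketch. The function get_array_ref() and its static array are hypothetical names introduced here for illustration; the point is only that a reference returned from a function is a documented, valid argument for a by-reference parameter such as the one reset() takes.

```php
<?php
error_reporting(-1); // report everything, including strict notices

// Hypothetical helper: the & makes the function return a reference
// to its static array rather than a copy of it.
function &get_array_ref() {
    static $data = array(1, 2, 3);
    return $data;
}

// No parentheses trick involved: reset() is handed a reference.
$first = reset(get_array_ref());
var_dump($first); // int(1)
```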

Let's analyze these in more detail.

First point is true (not a function call)

Due to the additional parentheses, PHP no longer detects that the argument is a function call.

When parsing a non-empty function argument list there are three possibilities for PHP:

  • An expr_without_variable
  • A variable
  • (A & followed by a variable, for the removed call-time pass by reference feature)

When writing just get_array() PHP sees this as a variable.

(get_array()) on the other hand does not qualify as a variable. It is an expr_without_variable.

This ultimately affects the way the code compiles: the extended value of the opcode ZEND_SEND_VAR_NO_REF will no longer include the flag ZEND_ARG_SEND_FUNCTION, which is how the opcode implementation detects a function call.

Second point is true (the reference count is 1)

At several points, the Zend Engine allows non-references with reference count 1 where references are expected. These details should not be exposed to the user, but unfortunately they are here.

In your example you're returning an array that's not referenced from anywhere else. If it were, you would still get the message, i.e. this second point would not be true.

So the following very similar example does not work:

<?php

$a = array();
function get_array() {
   return $GLOBALS['a'];
}

return reset((get_array()));
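
A form that does not depend on parentheses or reference counts at all, as a minimal sketch: assign the return value to a local variable first, so that reset() receives a genuine variable. The array contents and the name $result are illustrative assumptions, not part of the original example.

```php
<?php
$a = array('first', 'second');

function get_array() {
    return $GLOBALS['a'];
}

// Store the return value in a named variable; reset() can then bind
// its by-reference parameter to that variable.
$result = get_array();
$first = reset($result);

var_dump($first); // string(5) "first"
```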
Tippet answered 18/7, 2011 at 12:30 Comment(1)
Fantastic. And I realise now that the AST isn't particularly relevant. Thank you :) – Pluralism

A) To understand what's happening here, one needs to understand PHP's handling of values/variables and references (PDF, 1.2MB). As stated throughout the documentation: "references are not pointers"; and you can only return variables by reference from a function - nothing else.

In my opinion, that means, any function in PHP will return a reference. But some functions (built in PHP) require values/variables as arguments. Now, if you are nesting function-calls, the inner one returns a reference, while the outer one expects a value. This leads to the 'famous' E_STRICT-error "Only variables should be passed by reference".

$fileName = 'example.txt';
$fileExtension = array_pop(explode('.', $fileName));
// will result in Error 2048: Only variables should be passed by reference in…

B) I found a line in the PHP-syntax description linked in the question.

expr_without_variable = "(" expr ")"

In combination with this sentence from the documentation: "In PHP, almost anything you write is an expression. The simplest yet most accurate way to define an expression is 'anything that has a value'.", this leads me to the conclusion that even (5) is an expression in PHP, which evaluates to an integer with the value 5.

(As $a = 5 is not only an assignment but also an expression, which evaluates to 5.)
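
Both statements about expressions can be checked with a short sketch; the variable names are only illustrative.

```php
<?php
// A parenthesized literal is an expression whose value is the literal.
$x = (5);

// An assignment is itself an expression: its value is the assigned
// value, so $b receives 5 as well.
$b = ($a = 5);

var_dump($x === 5, $a === 5, $b === 5); // bool(true) three times
```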

Conclusion

If you pass a reference to the expression (...), this expression will return a value, which then may be passed as argument to the outer function. If that (my line of thought) is true, the following two lines should work equivalently:

// what I've used over years: (spaces only added for readability)
$fileExtension = array_pop( ( explode('.', $fileName) ) );
// vs
$fileExtension = array_pop( $tmp = explode('.', $fileName) );
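
For comparison, here is a sketch of two forms that do not depend on how the engine compiles parentheses or assignments inside an argument list: an explicit temporary variable, and pathinfo() with PATHINFO_EXTENSION, a standard function that extracts the extension directly. The names $parts and $viaPathinfo are illustrative.

```php
<?php
$fileName = 'example.txt';

// Explicit temporary: array_pop() gets a real variable to modify.
$parts = explode('.', $fileName);
$fileExtension = array_pop($parts);

// Alternative that avoids the by-reference call entirely.
$viaPathinfo = pathinfo($fileName, PATHINFO_EXTENSION);

var_dump($fileExtension, $viaPathinfo); // string(3) "txt" for both
```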

See also PHP 5.0.5: Fatal error: Only variables can be passed by reference; 13.09.2005

Autosuggestion answered 17/7, 2011 at 20:31 Comment(9)
But from this doc page: php.net/manual/en/language.references.pass.php it seems expressions cannot be used, "as the result is undefined". I wonder whether the whole parenthesis trick is just bypassing internal checks and may, in the long term, lead to undefined results in applications. – Didactic
Well, this post is highly speculative. In the absence of documentation (and I've searched for more than an hour, knowing how to use search engines), this is the best I can provide. My idea was to collaboratively create documentation for that behavior as an SO wiki entry. – Autosuggestion
FWIW, (5) would be an expression in pretty much all C-like languages. – Pluralism
-1 "IMO, that means, any function in PHP will return a reference." is not true. The answer has some passages that are true, but the conclusions never follow from those passages. – Sitnik
I think "Bug" is not a classification at all. It's an imprecise term that is often congruent with "Feature". Would you classify the described behaviour as a "Fault" in the PHP programming language? – Oddity
@Oddity grin, no, it's a nice feature, also questioned on the PHP bug report: "Should we make use of this feature when programming PHP or not?" – Autosuggestion
@feeela: That was hakre's comment, was it not? – Pluralism
@Tomalak Geret'kal Not really sure what you mean, but yes, I've cited hakre's comment on the PHP bug page… (as linked above by himself). – Autosuggestion
@feeela: Just seemed like you were quoting something to hakre without realising that it was hakre who'd written it in the first place; seemed odd! :) – Pluralism
