Why don't PHP attributes allow functions?
S

5

42

I'm pretty new to PHP, but I've been programming in similar languages for years. I was flummoxed by the following:

class Foo {
    public $path = array(
        realpath(".")
    );
}

It produced a syntax error pointing at the realpath call: Parse error: syntax error, unexpected '(', expecting ')' in test.php on line 5

But this works fine:

$path = array(
    realpath(".")
);

After banging my head against this for a while, I was told you can't call functions in an attribute default; you have to do it in __construct. My question is: why?! Is this a "feature" or sloppy implementation? What's the rationale?
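For reference, the constructor workaround I was pointed to looks roughly like this. It is a minimal sketch that reuses the Foo class and $path property from above; the empty-array default is just a placeholder I chose.

```php
<?php
// Sketch of the constructor workaround; Foo and $path mirror the example above.
class Foo {
    // Only a constant value is accepted here, so start with an empty array.
    public $path = array();

    public function __construct() {
        // Function calls are fine once we are inside executable code.
        $this->path = array(
            realpath(".")
        );
    }
}

$foo = new Foo();
var_dump($foo->path);
```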

Substantialize answered 18/10, 2010 at 14:48 Comment(3)
@Substantialize well, you could have a look at the source code and judge for yourself whether it's sloppy or a feature (or both). I guess it's in zend_object.c but I am not that familiar with the Zend Engine, so you might have to dig a bit. I added zend-engine to the tags list. Maybe it attracts some more knowledgeable people.Gaygaya
Referenced from phpsadness.com/sad/37Blakey
PHP has a feature called attributes, but this isn't them. This is a property initialization.Afrika
H
54

The compiler code suggests that this is by design, though I don't know what the official reasoning behind that is. I'm also not sure how much effort it would take to reliably implement this functionality, but there are definitely some limitations in the way that things are currently done.

Though my knowledge of the PHP compiler isn't extensive, I'm going to try to illustrate what I believe goes on so that you can see where the issue lies. Your code sample makes a good candidate for this process, so we'll be using that:

class Foo {
    public $path = array(
        realpath(".")
    );
}

As you're well aware, this causes a syntax error. This is a result of the PHP grammar, which makes the following relevant definition:

class_variable_declaration: 
      //...
      | T_VARIABLE '=' static_scalar //...
;

So, when defining the values of variables such as $path, the expected value must match the definition of a static scalar. Unsurprisingly, this is somewhat of a misnomer given that the definition of a static scalar also includes array types whose values are also static scalars:

static_scalar: /* compile-time evaluated scalars */
      //...
      | T_ARRAY '(' static_array_pair_list ')' // ...
      //...
;
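To make the rule concrete, here is a small sketch of property defaults that fit the static scalar definition. The class name Defaults and its members are made up for illustration. As far as I know, PHP 5.6 and later relax this to allow constant expressions, but function calls are still rejected.

```php
<?php
// Hypothetical class: every default below is a compile-time value.
class Defaults {
    const LIMIT = 10;

    public $count  = 2;             // integer literal
    public $offset = -1;            // signed literal
    public $name   = 'foo';         // string literal
    public $file   = __FILE__;      // magic constant
    public $max    = PHP_INT_MAX;   // named constant
    public $limit  = self::LIMIT;   // class constant
    public $list   = array(1, 'two', array('k' => 3)); // array of the above
}

$d = new Defaults();
var_dump($d->list);
```

Replacing any of these values with a call such as realpath(".") brings back the parse error from the question.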

Let's assume for a second that the grammar was different, and that the noted line in the class variable declaration rule looked more like the following, which would match your code sample (despite breaking otherwise valid assignments):

class_variable_declaration: 
      //...
      | T_VARIABLE '=' T_ARRAY '(' array_pair_list ')' // ...
;

After recompiling PHP, the sample script would no longer fail with that syntax error. Instead, it would fail with the compile time error "Invalid binding type". Since the code is now valid based on the grammar, this indicates that there actually is something specific in the design of the compiler that's causing trouble. To figure out what that is, let's revert to the original grammar for a moment and imagine that the code sample had a valid assignment of $path = array( 2 );.

Using the grammar as a guide, it's possible to walk through the actions invoked in the compiler code when parsing this code sample. I've left some less important parts out, but the process looks something like this:

// ...
// Begins the class declaration
zend_do_begin_class_declaration(znode, "Foo", znode);
    // Set some modifiers on the current znode...
    // ...
    // Create the array
    array_init(znode);
    // Add the value we specified
    zend_do_add_static_array_element(znode, NULL, 2);
    // Declare the property as a member of the class
    zend_do_declare_property('$path', znode);
// End the class declaration
zend_do_end_class_declaration(znode, "Foo");
// ...
zend_do_early_binding();
// ...
zend_do_end_compilation();

While the compiler does a lot in these various methods, it's important to note a few things.

  1. A call to zend_do_begin_class_declaration() results in a call to get_next_op(). This means that it adds a new opcode to the current opcode array.
  2. array_init() and zend_do_add_static_array_element() do not generate new opcodes. Instead, the array is immediately created and added to the current class' properties table. Method declarations work in a similar way, via a special case in zend_do_begin_function_declaration().
  3. zend_do_early_binding() consumes the last opcode on the current opcode array, checking for one of the following types before setting it to a NOP:
    • ZEND_DECLARE_FUNCTION
    • ZEND_DECLARE_CLASS
    • ZEND_DECLARE_INHERITED_CLASS
    • ZEND_VERIFY_ABSTRACT_CLASS
    • ZEND_ADD_INTERFACE

Note that in step 3, if the opcode type is not one of the expected types, an error is raised: the "Invalid binding type" error. From this, we can tell that allowing non-static values to be assigned somehow causes the last opcode to be something other than expected. So, what happens when we use a non-static array with the modified grammar?

Instead of calling array_init(), the compiler prepares the arguments and calls zend_do_init_array(). This in turn calls get_next_op() and adds a new INIT_ARRAY opcode, producing something like the following:

DECLARE_CLASS   'Foo'
SEND_VAL        '.'
DO_FCALL        'realpath'
INIT_ARRAY

Herein lies the root of the problem. Because of these additional opcodes, zend_do_early_binding() gets an unexpected input and raises the "Invalid binding type" error. As the process of early binding class and function definitions seems fairly integral to the PHP compilation process, it can't just be ignored (though the DECLARE_CLASS production/consumption is kind of messy). Likewise, it's not practical to try to evaluate these additional opcodes inline (you can't be sure that a given function or class has been resolved yet), so there's no way to avoid generating the opcodes.

A potential solution would be to build a new opcode array that was scoped to the class variable declaration, similar to how method definitions are handled. The problem with doing that is deciding when to evaluate such a run-once sequence. Would it be done when the file containing the class is loaded, when the property is first accessed, or when an object of that type is constructed?
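The three timings can be imitated in userland today, which shows how different the results would be. This is only a sketch: compute_default() and the three class names are hypothetical, and the counter stands in for any initializer whose result depends on when it runs.

```php
<?php
// Hypothetical stand-in for an initializer whose result depends on timing.
function compute_default() {
    static $calls = 0;
    return ++$calls;
}

// Option 1: evaluate once, when the file defining the class is loaded.
class AtLoad {
    public static $value;
}
AtLoad::$value = compute_default();

// Option 2: evaluate every time an object is constructed.
class AtConstruct {
    public $value;

    public function __construct() {
        $this->value = compute_default();
    }
}

// Option 3: evaluate lazily, the first time the value is read.
class OnFirstAccess {
    private $value = null;

    public function value() {
        if ($this->value === null) {
            $this->value = compute_default();
        }
        return $this->value;
    }
}

$a = new AtConstruct();   // second call to compute_default()
$b = new AtConstruct();   // third call
$c = new OnFirstAccess(); // no call yet
echo AtLoad::$value, " ", $a->value, " ", $b->value, " ", $c->value(), "\n";
// prints: 1 2 3 4
```

Each option gives a different answer for the same declaration, which is the decision the language would have to make on the programmer's behalf.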

As you've pointed out, other dynamic languages have found a way to handle this scenario, so it's not impossible to make that decision and get it to work. From what I can tell though, doing so in the case of PHP wouldn't be a one-line fix, and the language designers seem to have decided that it wasn't something worth including at this point.

Hammon answered 22/10, 2010 at 20:4 Comment(4)
Thank you! The answer to when to evaluate points out the obvious flaw in PHP's attribute default syntax: you shouldn't be able to assign to it at all, it should be set in the object constructor. Ambiguity resolved. (Do objects try to share that constant?) As for static attributes, there is no ambiguity and they could be allowed any expression. That's how Ruby does it. I suspect they didn't remove object attrib defaults because, lacking a class constructor, there's no good way to set a class attrib. And they didn't want to have separate allowances for object vs class attrib defaults.Substantialize
@Schwern: Happy to help! This is something that I had been curious about in the past but never thought to check out in detail, so this was a good opportunity to figure out what exactly was going on. In regard to the assignment, allowing this kind of assignment avoids forcing you to create a constructor if you don't "need" one...which I feel would be a terrible justification, though in the case of PHP, not a shocking one. I think each instance will replicate default property values on creation, but I might be mistaken, so it's possible that they do try to share.Hammon
In any case, the savings gained by doing so (given the limited data you can assign in the first place) would be minimal, so I'm not sure it would be worth having this setup. As far as your comments about resolving the ambiguity go, I'm inclined to agree.Hammon
There must be a PHP core developer around here on SO. Who else would deal out a -1 to this answer?Kinematics
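On the question raised in these comments of whether objects share the default value: the observable behaviour is easy to check with a short script. This sketch says nothing about how the engine stores the default internally, only what user code can see.

```php
<?php
class Foo {
    public $path = array(2);
}

$a = new Foo();
$b = new Foo();

// Modify the array on one instance only.
$a->path[] = 3;

var_dump($a->path); // the default element plus the appended one
var_dump($b->path); // still only the default element
```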
O
23

My question is: why?! Is this a "feature" or sloppy implementation?

I'd say it's definitely a feature. A class definition is a code blueprint, and is not supposed to execute code at the time of its definition. Doing so would break the object's abstraction and encapsulation.

However, this is only my view. I can't say for sure what idea the developers had when defining this.

Orellana answered 18/10, 2010 at 14:50 Comment(12)
+1 I agree. For example, if I say public $foo = mktime(), will it save the time from when the class is parsed, when it is constructed, or when it is first accessed statically?Bruckner
As mentioned, it is not defined when the expression will be evaluated. However, you should be able to assign a closure to an attribute - which could return the time without ambiguity - but that yields a syntax error as well.Shoulder
So it's a bit of BDSM language design in an otherwise very permissive language, implemented as a syntax error?Substantialize
@Substantialize ahahahahahahaha! I guess you can put it that way :)Orellana
Sorry, I tried to edit it to be less argumentative but ran out of time. What I wanted to say was: I'd like to see a citation for that rationale. That level of BDSM seems wildly out of place in a dynamic language and in PHP in particular. Also, how does executing code at definition time break either abstraction or encapsulation? A class definition doesn't have to be exactly the same every run.Substantialize
@Substantialize good points. I don't have a citation for the rationale - as said, it makes architectural sense from my point of view and it's probably safe to say this was a conscious decision, seeing as function results can be used almost everywhere else. But I do not know what discussion there was internally when this was decided. If I get around to it, I'll do some digging on the PHP internals mailing listOrellana
@Bruckner That's like removing all the knives and stoves from the kitchen so none of the chefs cut themselves or get burnt. It's very safe, but you can't get much cooking done. Trust your chefs not to be complete idiots.Substantialize
@Substantialize I did some searching on the internals mailing list marc.info/?l=php-internals but didn't find anything. I bet it's somewhere in there but one has to find the right words....Orellana
@Orellana Thank you for your efforts.Substantialize
@Schwern: the thing that runs code when an instance of the class is created is called a "constructor" ;) That's the reason PHP does not allow running code in other places: you run it in the constructor. Each language has structure, and this is a part of it.Mashie
@Mashie what about static classes? (abstract final classes) they don't have a constructorChristcrossrow
@Christcrossrow if you have a static class, having data in it is usually not the best idea, but if you absolutely need it, then you have to have some code that initializes it, which would be a kind of constructor for it. Though I personally would then just recommend making it a regular class.Mashie
S
7

You can probably achieve something similar like this:

class Foo
{
    public $path = __DIR__;
}

IIRC __DIR__ needs PHP 5.3+; __FILE__ has been around longer.
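One caveat, sketched below: __DIR__ is the directory of the file containing the class, while realpath(".") resolves the current working directory, so the two only match when the script is run from its own directory.

```php
<?php
class Foo {
    public $path = __DIR__; // directory of the file that contains this line
}

$foo = new Foo();

// __DIR__ is equivalent to dirname(__FILE__).
var_dump($foo->path === dirname(__FILE__));

// realpath(".") depends on the working directory, so this comparison
// may be true or false depending on where the script is started from.
var_dump($foo->path === realpath("."));
```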

Sacellum answered 18/10, 2010 at 14:57 Comment(2)
Good point. This works because it's a magic constant and will be replaced at the time of parsingOrellana
Thank you, but the example was only for illustration.Substantialize
N
5

It's a sloppy parser implementation. I don't have the correct terminology to describe it (I think the term "beta reduction" fits in somehow...), but the PHP language parser is more complex and more complicated than it needs to be, and so all sorts of special-casing is required for different language constructs.

Nectareous answered 18/10, 2010 at 14:51 Comment(5)
Do other languages allow this? I'm curious because I honestly don't know. If I remember correctly, Pascal/Delphi doesn't.Orellana
@Pekka: Static languages usually don't, since a class in them is almost always only a compiler construct. But with dynamic languages, the class is created when the definition is executed so there's no reason that they can't use the return value of the function at that time as the value for the attribute.Nectareous
@Ignacio cheers. Okay, that's true. I still think it's a good thing overall, because it enforces good OOP principles.Orellana
@pekka Perl 6 can do this, here (dl.dropbox.com/u/7459288/Perl%206%20Examples/Person.p6 ) is an example.Lipoid
Yes, other dynamic languages allow this. Ruby, Perl 5 (via many means), Perl 6 and Python (I'm pretty sure). Either the PHP language designers got hit on the head and thought they were programming Java, or it's an implementation limitation.Substantialize
Y
2

My guess would be that you can't get a correct stack trace if the error does not occur on an executable line. Initializing values with constants can't produce an error, so there's no problem there, but functions can throw exceptions or raise errors, so they need to be called on an executable line rather than a declarative one.

Yesteryear answered 18/10, 2010 at 14:52 Comment(0)
