Why can't you inherit from a not-yet-defined class which inherits from a not-yet-defined class?

I am researching how PHP compiles classes: the sequence and the logic behind it.

If I declare a class before a simple parent:

 class First extends Second{}
 class Second{}

This will work OK. See live example across PHP versions.

But if the parent class also has some not-yet-declared parents (extends or implements), as in this example:

class First extends Second{}
class Second extends Third{}
class Third{}

I get an error:

Fatal error: Class 'Second' not found ...

See live example across PHP versions.

So why can't PHP find the Second class in the second example? Is it because PHP can't compile Second until it has also compiled Third, or is it something else?

I am trying to find out why PHP compiles class Second in the first example, but not when Second has parent classes of its own. I have researched this a lot but found nothing definitive.

  • I'm not trying to write code this way; I'm using this example to understand how compilation and its ordering work.
Pallua answered 14/4, 2015 at 13:42 Comment(3)
You have it the wrong way around. Second should extend First, and Third should extend Second. At least, that is how it is normally done. Cenozoic
Why was this voted to close? I've researched this and found nothing clear. I think there should be an exact answer. Pallua
I think this is an interesting question, actually. It probably has to do with the way PHP resolves dependencies, but given that it's consistent in PHP 4, 5, 7, and HHVM, it's presumably something more fundamental than an implementation detail in the Engine. (See 3v4l.org/9WJFq vs 3v4l.org/ZCVWQ) Acaroid

So, PHP uses something called "late binding". Basically, inheritance and class definition don't happen until the end of the file's compilation.

There are a number of reasons for this. The first is the example you showed (first extends second {} working). The second reason is opcache.

In order for compilation to work correctly in the realm of opcache, the compilation must occur with no state from other compiled files. This means that while it's compiling a file, the class symbol table is emptied.

Then, the result of that compilation is cached. Then at runtime, when the compiled file is loaded from memory, opcache runs the late binding which then does the inheritance and actually declares the classes.

class First {}

When that class is seen, it's immediately added to the symbol table, no matter where it is in the file. There's no need to late-bind anything, because it's already fully defined. This technique is called early binding and is what allows you to use a class or function prior to its declaration.

class Third extends Second {}

When that's seen, it's compiled, but not actually declared. Instead, it's added to a "late binding" list.

class Second extends First {}

When this is finally seen, it's compiled as well, and not actually declared. It's added to the late binding list, but after Third.

So now, when the late binding process occurs, it goes through the list of "late bound" classes one by one. The first one it sees is Third. It then tries to find the Second class, but can't (since it's not actually declared yet). So the error is thrown.

If you re-arrange the classes:

class Second extends First {}
class Third extends Second {}
class First {}

Then you'll see it works fine.
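The ordering logic above can be mimicked with a toy model in plain PHP. This is only a sketch of the behaviour as described in this answer, not how the engine is implemented: `bindInOrder()` and its array layout are made up for illustration, and it assumes a class is declared right away when it has no parent or when its parent is already known at that point.

```php
<?php
/**
 * Toy model of the binding order (hypothetical helper, not engine code).
 *
 * @param array $declarations class name => parent name (null if none), in file order
 * @return string[] class names in the order they ended up being declared
 */
function bindInOrder(array $declarations): array
{
    $declared = [];
    $deferred = [];

    // Pass 1 ("early binding"): declare a class immediately if it has no
    // parent or its parent is already declared; otherwise defer it.
    foreach ($declarations as $class => $parent) {
        if ($parent === null || isset($declared[$parent])) {
            $declared[$class] = true;
        } else {
            $deferred[$class] = $parent;
        }
    }

    // Pass 2 ("late binding"): walk the deferred list once, in file order.
    // A class whose parent is still missing fails, even if that parent
    // sits later in the same deferred list.
    foreach ($deferred as $class => $parent) {
        if (!isset($declared[$parent])) {
            throw new RuntimeException("Class '$parent' not found");
        }
        $declared[$class] = true;
    }

    return array_keys($declared);
}

// Failing order: Third is bound before Second has been declared.
try {
    bindInOrder(['Third' => 'Second', 'Second' => 'First', 'First' => null]);
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n"; // prints: Class 'Second' not found
}

// Re-arranged order: every parent is declared before its child is bound.
print_r(bindInOrder(['Second' => 'First', 'Third' => 'Second', 'First' => null]));
```

In the model, the only thing that changes between the two calls is the position of the deferred classes relative to each other, which is exactly what decides whether the lookup of the parent succeeds.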

Why do this at all???

Well, PHP is funny. Let's imagine a series of files:

<?php // a.php
class Foo extends Bar {}

<?php // b1.php
class Bar {
    //impl 1
}

<?php // b2.php
class Bar {
    //impl 2
}

Now, which Foo you end up with depends on which b file you loaded. If you required b2.php, you'll get Foo extends Bar (impl 2). If you required b1.php, you'll get Foo extends Bar (impl 1).

Normally we don't write code this way, but there are a few cases where it may happen.
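The same effect can be reproduced in a single file, which makes it easier to experiment with. This is a minimal sketch, assuming a conditional class declaration is a fair stand-in for requiring b1.php or b2.php; the `$implementation` switch and the `IMPL` constant are hypothetical names used only for illustration.

```php
<?php
// Hypothetical switch standing in for "which b file was required".
$implementation = 2;

// A conditional declaration only takes effect when its branch runs,
// much like requiring b1.php or b2.php at runtime.
if ($implementation === 1) {
    class Bar
    {
        const IMPL = 'impl 1';
    }
} else {
    class Bar
    {
        const IMPL = 'impl 2';
    }
}

// Foo names its parent only as "Bar"; which Bar it gets is decided
// when this declaration is actually bound, not when it is compiled.
class Foo extends Bar {}

echo Foo::IMPL, "\n";
```

Flipping `$implementation` changes what Foo inherits without touching Foo's source, which is why the compiled form of Foo cannot safely bake in any particular Bar.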

In a normal PHP request, this is trivial to deal with. The reason is that we can know about Bar while we are compiling Foo. So we can adjust our compilation process accordingly.

But when we bring an opcode cache into the mix, things get much more complicated. If we compiled Foo with the global state of b1.php, then later (in a different request) switched to b2.php, things would break in weird ways.

So instead, opcode caches null out the global state prior to compiling a file. So a.php would be compiled as if it was the only file in the application.

After compilation is done, it's cached into memory (to be reused by later requests).

Then, after that point (or after it's loaded from memory in a future request), the "delayed" steps happen. This then couples the compiled file to the state of the request.

That way, opcache can more efficiently cache files as independent entities, since the binding to global state occurs after the cache is read from.

The source code.

To see why, let's look at the source code.

In Zend/zend_compile.c we can see the function that compiles the class: zend_compile_class_decl(). About half way down you'll see the following code:

if (extends_ast) {
    opline->opcode = ZEND_DECLARE_INHERITED_CLASS;
    opline->extended_value = extends_node.u.op.var;
} else {
    opline->opcode = ZEND_DECLARE_CLASS;
}

So it initially emits an opcode to declare the inherited class. Then, after compilation occurs, a function called zend_do_early_binding() is called. This pre-declares functions and classes in a file (so they are available at the top). For normal classes and functions, it simply adds them to the symbol table (declares them).

The interesting bit is in the inherited case:

if (((ce = zend_lookup_class_ex(Z_STR_P(parent_name), parent_name + 1, 0)) == NULL) ||
    ((CG(compiler_options) & ZEND_COMPILE_IGNORE_INTERNAL_CLASSES) &&
    (ce->type == ZEND_INTERNAL_CLASS))) {
    if (CG(compiler_options) & ZEND_COMPILE_DELAYED_BINDING) {
        uint32_t *opline_num = &CG(active_op_array)->early_binding;

        while (*opline_num != (uint32_t)-1) {
            opline_num = &CG(active_op_array)->opcodes[*opline_num].result.opline_num;
        }
        *opline_num = opline - CG(active_op_array)->opcodes;
        opline->opcode = ZEND_DECLARE_INHERITED_CLASS_DELAYED;
        opline->result_type = IS_UNUSED;
        opline->result.opline_num = -1;
    }
    return;
}

The outer if basically tries to fetch the class from the symbol table and checks if it doesn't exist. The second if checks to see if we're using delayed binding (opcache is enabled).

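Then, it appends the opline that declares the class to the delayed early binding list (a chain that starts at the op array's early_binding field and is threaded through each opline's result.opline_num), and switches its opcode to ZEND_DECLARE_INHERITED_CLASS_DELAYED.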

Finally, the function zend_do_delayed_early_binding() is called (usually by an opcache), which loops through the list and actually binds the inherited classes:

while (opline_num != (uint32_t)-1) {
    zval *parent_name = RT_CONSTANT(op_array, op_array->opcodes[opline_num-1].op2);
    if ((ce = zend_lookup_class_ex(Z_STR_P(parent_name), parent_name + 1, 0)) != NULL) {
        do_bind_inherited_class(op_array, &op_array->opcodes[opline_num], EG(class_table), ce, 0);
    }
    opline_num = op_array->opcodes[opline_num].result.opline_num;
}
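The two C snippets share one data structure: a singly linked chain of opline indexes terminated by -1. The following PHP model is only an illustration of that chain, not engine code; `appendDelayed()`, `walkDelayed()` and the opline positions are invented for the example.

```php
<?php
const END_OF_LIST = -1; // plays the role of (uint32_t)-1 in the C code

/**
 * Append an opline index to the chain. $head plays the role of
 * op_array->early_binding, $next[$i] the role of
 * opcodes[$i].result.opline_num.
 */
function appendDelayed(int &$head, array &$next, int $oplineNum): void
{
    $next[$oplineNum] = END_OF_LIST; // the new entry becomes the tail
    if ($head === END_OF_LIST) {
        $head = $oplineNum;          // first entry: it is also the head
        return;
    }
    $current = $head;
    while ($next[$current] !== END_OF_LIST) {
        $current = $next[$current];  // walk to the current tail
    }
    $next[$current] = $oplineNum;
}

/** Follow the chain from the head and return the indexes in binding order. */
function walkDelayed(int $head, array $next): array
{
    $order = [];
    for ($i = $head; $i !== END_OF_LIST; $i = $next[$i]) {
        $order[] = $i;
    }
    return $order;
}

$head = END_OF_LIST;
$next = [];
foreach ([4, 9, 17] as $oplineNum) { // made-up opline positions
    appendDelayed($head, $next, $oplineNum);
}
print_r(walkDelayed($head, $next));
```

Because entries are always appended at the tail, the delayed classes are visited in the order they appeared in the file, which is what makes declaration order matter.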

TL;DR

Order doesn't matter for classes which don't extend another class.

Any class that is being extended and that itself extends another class must be defined before the class that extends it (or an autoloader must be used).
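As a rough illustration of the autoloader route, here is a self-contained sketch. It assumes `eval()` on a small source map is an acceptable stand-in for one class per file; a real autoloader would `require` the matching file instead, and the `$sources` array is purely hypothetical.

```php
<?php
// Hypothetical stand-in for one-class-per-file sources.
$sources = [
    'Second' => 'class Second extends Third {}',
    'Third'  => 'class Third {}',
];

spl_autoload_register(function (string $class) use ($sources): void {
    if (isset($sources[$class])) {
        eval($sources[$class]); // a real autoloader would require a file here
    }
});

// Second is not declared when this line is reached, so the lookup goes to
// the autoloader; declaring Second in turn asks the autoloader for Third.
class First extends Second {}

var_dump(get_parent_class('First'), get_parent_class('Second'));
```

With an autoloader in place, the order of the declarations stops mattering, because each missing parent is loaded on demand at the moment it is needed.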

Constructionist answered 14/4, 2015 at 14:28 Comment(4)
Since this seems to be a valid answer, we need to remove the other answers and comments that are not correct. I'd really like a reference for how this works; it would be better if you could add a link on this (late binding in PHP, opcache). Sisyphean
@Constructionist Why are those late bindings used for implementing interfaces and extending classes? What is the advantage? Or is this too much for this SO thread? Solnit
@ircmaxell, Great, wide-open answer, that's what I was looking for. Pallua
@Solnit added a section on that. Constructionist
