PHP formal semantics?
D

5

8

I am tasked with learning PHP, but there are many things I don't understand. For example, the concept of "variable functions" is not one I've seen anywhere else. There are many other examples, but for brevity I'll just point to PHPWTF, which collects many of PHP's idiosyncrasies.

Most other languages I've used have either a formal specification (e.g., Haskell 2010) or at least a research paper on their formal semantics (e.g., this for Javascript). However, I can't find anything comparable for PHP.

There is an official "language reference". However, it is very informal, reads like a wiki, and is missing entire sections (e.g., the section on syntax doesn't define the syntax at all). Confirming what I suspected, this guy tells me that there is no official specification, nor even a defined syntax.

Wikipedia has an article on "PHP syntax and semantics", but it only touches on the syntax, and barely mentions semantics.

One paper I've found on PHP is this paper on its assignment semantics. This is a very small fragment of the language and probably not much use to me without some context. There is also this paper on 'SaferPHP', which presumably has to work with some definition of PHP, though I couldn't see any.

Interpreters/compilers provide a semantics, so I thought to look at these. However, the Zend source is intimidating (though it does provide useful test cases), and HipHop runs to 2.7 million LoC. (I find it amazing that people have poured enormous effort into writing compilers for a language without ever writing something like a specification.)

I thought of looking at type systems for PHP for guidance, much like TypeScript provides some guidance for JavaScript. I found these tantalising slides on Hack, an optional type system for PHP. However, it's just slides, and the project seems to be an internal one at Facebook at this time.

Does anyone know of anything better than these poor man's semantics? Or does everyone just "learn by example"?

Dovelike answered 15/12, 2013 at 14:39 Comment(4)
PHP's semantics are defined AFAIK by the behavior of the official implementation.Gutbucket
"Formal Semantics"? Don't make me laugh. As Tim said, it is defined by what it does. And what it does grew "organically" (a euphemism for "ooh, let's add this cool feature", which as far as I can tell was largely motivated by other-language envy).Volga
@IraBaxter: yes, it seems that way. Though I think the same is true of Javascript, and it has been amenable to some formalization. Perhaps PHP could also be.Dovelike
PHP is best summed up by its creator Rasmus Lerdorf: "I have absolutely no idea how to write a programming language, I just kept adding the next logical step on the way."Jagir
R
2

This answer comes a bit after your initial question, but we now finally have a formal semantics for PHP. Check it out: http://www.phpsemantics.org. A paper about it was recently published in the ECOOP 2014 proceedings; if you are interested, you can find the link on that web page. Regards.

Rust answered 25/7, 2014 at 13:48 Comment(0)
B
3

It seems that you're not after an official standard (which might be useful, for example, to someone writing an independent conforming implementation), but after a presentation of the language that will allow you to make coherent sense of it. Unfortunately there cannot be such a thing, because PHP does not have a coherent formal model behind it. It has grown organically and is now saddled with inconsistencies, most notoriously in function and method naming, but also in little details like what counts as true and false, and in other similarly worrisome ways.
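
To make the true/false point concrete, here is a minimal sketch of PHP's boolean conversion. It is not taken from any of the linked sources; the variable names are made up for illustration, and the two lists are examples rather than a complete catalogue.

```php
<?php
// Values that convert to false when cast to bool.
$falsy = [false, 0, 0.0, '', '0', [], null];

// Values that look similar but convert to true.
$truthy = ['0.0', '00', ' ', 'false', [0], -1];

foreach ($falsy as $v) {
    var_dump((bool) $v);   // bool(false) for every entry
}

foreach ($truthy as $v) {
    var_dump((bool) $v);   // bool(true) for every entry
}
```

Note that the string '0' is false while '0.0' and '00' are true, which is the sort of detail that is hard to derive from a general rule.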

The best one can do to approach PHP, in my opinion, is to get a good feel for the core features and libraries, for the "gotchas" that you need to watch out for, and (in order to read existing code without distraction) for the anti-patterns that are all too common in real-world PHP scripts. My guess is that it's best to learn PHP under the tutelage of people who know how to work with it effectively, but I didn't have that luxury. (Regarding the documentation: It took me forever before I noticed that you can use square brackets to index into strings. The feature may be mentioned somewhere in the documentation, but not, back then at least, anyplace where it belongs.)
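
For readers who have not met the square-bracket feature either, a short sketch follows. The variable names are illustrative only, and the negative offset assumes PHP 7.1 or later.

```php
<?php
$s = 'hello';

// Read a single byte by zero-based offset.
$first = $s[0];    // 'h'

// Negative offsets count from the end (PHP 7.1 or later).
$last = $s[-1];    // 'o'

// Offsets can also be written to, which modifies the string in place.
$s[0] = 'J';       // $s is now 'Jello'
```

The offsets address bytes rather than characters, so on multi-byte encodings such as UTF-8 this does not necessarily return a whole character.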

This article gives a nice tour of the kind of things that make a semantic model of the kind you want impossible. (You may want to skip the opening rant and go straight to the discussion of PHP features.) There are many, many other similar texts. Quote: "PHP was originally designed explicitly for non-programmers (and, reading between the lines, non-programs); it has not well escaped its roots."

Don't get me wrong: I work with PHP, and although it's not my favorite language, I wouldn't say I hate it. I would say that to work effectively with it, one must be aware of its nature and limitations. If you're coming to this from Haskell, you're in for quite a shock.

Breastbeating answered 15/12, 2013 at 14:57 Comment(3)
It definitely seems to have bad/complex parts. I was hoping for something like "Featherweight PHP" (like "Featherweight Java") or "The Essence of PHP" (like "The Essence of JavaScript"): something which strips away the complex features in order to explain the essential semantics rigorously. JavaScript is lampooned in similar ways to PHP, but it turned out to be possible to strip it down to a minimal language for rigorous analysis.Dovelike
I was just about to ask the same question as the OP. (I have some background in PHP and wanted to deepen my understanding of the language.) I think your answer is missing the point. Yes, there may not be any "formal" semantics as such, but there still is a certain way in which certain critical things behave: things like argument passing, iterators, scopes, etc. And although a fair share of PHP programmers make do without that knowledge, I'm still fairly certain it is documented in some way somewhere.Obedient
@Obedient I'm not making the point that there's no documentation, but that the behaviors to be documented are not coherent in the sense that they fall under a unifying model that you can "get". Every aspect is different. Scopes work one way, namespaces another. There are plenty of overviews, but any documentation is going to be a big list of some sort, not a unifying model. I could be wrong about this, but I don't believe I'm missing the point.Breastbeating
H
1

Doesn't directly address your inquiry, but explains some of the magic behind PHP variables.

http://webandphp.com/how-php-manages-variables

Housel answered 15/12, 2013 at 15:12 Comment(1)
For now, I've accepted this as the answer, as it actually provides useful information, while the other answers just dismiss the question (on questionable grounds).Dovelike
C
1

Interesting question. I'd regard the manual as the official language reference; I appreciate it isn't quite "formal reference" in the sense you are seeking, but I don't know how much such a thing would be widely desired as something to learn from.

I'm not familiar with PHPWTF, but I'd guess it is in the same mould as the blog post Fractal Of Bad Design (linked by @alexis earlier). I can't peer into the mind of either author, but it seems to me that they are written from the perspective of wanting PHP to be bad. Religious wars frequently dominate on the internet and in programming — the browser you prefer, the IDE/editor you use, your operating system and your choice of framework have all had the same ferocious, partisan and unyielding treatment. Programming languages are, sadly, no different.

It is certainly true that PHP does have a number of design inconsistencies, in particular in how nulls are treated and in the ordering of parameters in standard functions. However, it is also true that PHP has been hugely successful despite all that. It spent a long time in the reliability doldrums in 5.0 and 5.1; 5.2 was stable but arguably not enterprise-grade; and it's finally coming of age from 5.3 onwards.

Whilst this might be my biases emerging, I sense a consensus amongst users I read on Stack Overflow that all of the popular languages have their place. This is partly a response to the reality that the ones we dislike won't go away, and partly perhaps that learning .NET, Java, Perl, Ruby, PHP, Python, etc. is pretty much always a good thing. Maybe we have also collectively tired of the flame wars over each (Java is bloated, PHP is inconsistent, Microsoft is vendor lock-in, Rails is unstable, and so forth).

I've veered rather off-topic, but I tend to regard this particular viewpoint as worth reading, especially for those who would be traditionally minded to disagree with it in relation to PHP.

To address the purpose of your question, how should you learn? Well, learning by example is an excellent approach - one just needs to know which examples to learn. Searching for "PHP tutorial" and "PHP beginner" will — perhaps as is the case with any language — offer a mix of excellent and dreadful material. One might argue that PHP's low barriers to entry have given rise to a large stock of insecure and badly written "how to" articles, and I've certainly seen quite a few!

I think the solution is to look directly at code from well-engineered projects and to learn from there, such as:

  • Symfony2 (and Components)
  • Zend Framework
  • Guzzle
  • Propel
  • Doctrine

Ah, nearly forgot; this website is also a good place to start.


Post Script: they may be referred to by a different name in other languages, but I expect they all have variable functions. In JavaScript for example, it's object[myFunc]();, where myFunc is a string.
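
For comparison, here is a minimal sketch of what "variable functions" look like in PHP itself. The function shout and the class Greeter are hypothetical names invented for this example; strtoupper and strrev are standard built-ins.

```php
<?php
// Hypothetical helper, defined here only for illustration.
function shout(string $text): string
{
    return strtoupper($text) . '!';
}

// A variable holding a function name can be called like a function.
$fn = 'shout';
$a = $fn('hello');                       // 'HELLO!'

// Built-in functions work the same way.
$fn = 'strrev';
$b = $fn('abc');                         // 'cba'

// Greeter is a made-up class; the method name is looked up at run time.
class Greeter
{
    public function hi(string $name): string
    {
        return "Hi, $name";
    }
}

$method = 'hi';
$c = (new Greeter())->$method('Ada');    // 'Hi, Ada'
```

This does not work with language constructs such as echo, isset or unset, which are not ordinary functions.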

Chess answered 15/12, 2013 at 20:28 Comment(5)
+1 for the link to phptherightway, it looks like a nice resource! That said, the OP was not asking for good documentation but for a "formal semantics" for PHP. Phptherightway highlights the best features of PHP (not the worst, as the link I provided), but I didn't see anything that would qualify as a formal model of the language.Breastbeating
Thanks for your thoughts. I agree that a formal definition was part of the question, but so was 'does everyone just "learn by example"'? I interpreted that, rightly or wrongly, as "what is a good way to learn". FWIW, I think formal/BNF specs are terrible to learn from, but then it probably varies from one individual to another.Chess
I can't imagine learning from the BNF specs, but how about a text description saying "this is an expression, this is a statement, this is an lvalue, this is a reference, this is what you get as you combine them"? The python documentation does a good job with that, for example. But what reasonable formal model can capture the fact that in PHP, func_get_args() cannot be used as a function argument?Breastbeating
I wasn't aware of that limitation, but the manual says that that limitation no longer exists, from 5.3 onwards.Chess
Glad to hear that! My point will no longer be true for any PHP version that fixes the bulk of these quirks (and there are, or were, way too many).Breastbeating
D
0

It's not exactly a formal semantics, but, after all these years, the HHVM project has produced a PHP specification!

Dovelike answered 31/7, 2014 at 8:55 Comment(0)
