Why do functional pseudos such as :not() and :has() allow quoted arguments?

Apparently, as I discovered while commenting on another answer, jQuery (or rather, its underlying selector engine, Sizzle) lets you quote the argument to the :not() selector as well as the :has() selector. To wit:

$('div:not("span")')
$('span:has("span")')

In the Selectors standard, quotes are always representative of a string and never of a selector or a keyword, so quoting the argument to :not() is always invalid. This will not change in Selectors 4.

You can also see that it's non-standard syntax by adding an unsupported CSS selector such as :nth-last-child(1), which causes the selector to fail completely:

$('div:not("span"):nth-last-child(1)')
$('span:has("span"):nth-last-child(1)')

Is there any good reason, technical or otherwise, for allowing quotes here? The only possibilities that come to mind are:

  • Consistency with :contains() which allows both quoted and unquoted arguments, as seen in the old Selectors spec. Except :contains() accepts strings/keywords, not selectors...

  • Consistency with the implementation of custom pseudos using $.expr[':'], which always allows quoted and unquoted arguments.

  • Consistency and ease of porting to their method counterparts .not() and .has() (just remove or split the outer quotes and change colons to periods?).

But I can't find any sources to support or oppose them. In fact, the ability to quote selector arguments itself isn't documented anywhere either, nor does there appear to be any difference between quoting and not quoting the argument:

$('div:not(span)')
$('span:has(span)')
Brod answered 18/9, 2012 at 10:56 Comment(14)
This is likely a quirk of Sizzle, not of jQuery itself.Eire
@Brod Sorry my example was bad, I think sole purpose of quotes is escaping.Halfcaste
It's probably for consistency in the implementation of $.expr[':'].Dyslogia
Not an answer, but jQuery says themselves that they borrow from the CSS selector spec rather than implement it faithfully. Maybe John Resig will stop by with an answer.Valuable
@Explosion Pills: True. In that case, they really should change "CSS3 Compliant" on the home page to "CSS3 Compatible" or something similar as well ;)Brod
Looking at the test suite, I'd say this is nothing that is supposed to work (not a single test for quoted :not or :has) github.com/jquery/sizzle/blob/master/test/unit/selector.jsElburr
@zzzzBov: Could be a side effect of its implementation too, but why do selectors like :eq() and :nth-child() explicitly disallow quotes then? Result of another implementation detail?Brod
@Prinzhorn: No wonder it isn't documented.Brod
Do any jQuery developers have accounts here? I'm pretty certain they're the only ones likely to be able to post an accurate answer, rather than supposition, which is all anyone not involved could post, I think.Johanajohanan
@David Thomas: I know John Resig does, and visits on occasion.Brod
Presumably it would be considered an abuse of your diamond-powers to contact him and request an answer (assuming that his site-registered contact details aren't public)?Johanajohanan
@David Thomas: That is correct. If he's alright with being contacted to answer questions, he'll probably put up his contact details somewhere public and say that it's OK to do so. Then I won't need to have special powers to ask :)Brod
As of jQuery 1.8.2, :eq(...) and the like now accept quotes. From browsing the changelog, this appears to be a side-effect. If curiosity gets the better of me, I may wind up running a git-bisect on Sizzle to find out when exactly this happened.Impugn
Okay, I've bisected Sizzle using a phantomjs script. It does indeed appear that support for :eq("3") was added as a side-effect of a bugfix for a bugfix for :first and :last selectors. Notes added to my answer.Impugn

This isn't specific to the :not(...) and :has(...) selectors. In fact, all pseudos in Sizzle allow for quoted arguments. The pattern for pseudos' arguments is defined as:

pseudos = ":(" + characterEncoding + ")(?:\\((?:(['\"])((?:\\\\.|[^\\\\])*?)\\2|([^()[\\]]*|(?:(?:" + attributes + ")|[^:]|\\\\.)*|.*))\\)|)"

This can be found on line 91 of sizzle.js as of 831c9c48...

Let's add some indentation to that, to make it a bit more readable. Unfortunately, this is still a regexp, so "a bit more readable" still leaves a lot to be desired:

pseudos = (
    ":(" + characterEncoding + ")" +
    "(?:" +
    "\\(" + // literal open-paren
        "(?:" +

                "(['\"])" + // literal open-quote
                    "((?:\\\\.|[^\\\\])*?)" + // handle backslash escaping
                "\\2" + // close-quote

            "|" + // - OR -

                "(" +
                    "[^()[\\]]*" +
                    "|" +
                    "(?:" +
                        "(?:" + attributes + ")" +
                        "|" +
                        "[^:]" +
                        "|" +
                        "\\\\." +
                    ")*" +
                    "|" +
                    ".*" +
                ")" +

        ")" +
    "\\)" + // literal close-paren
    "|" + // ie, 'or nothing'
")"
);

The main take-away from this is: either single or double quotes can be used around the argument to a pseudo. Backslash escaping is properly handled, so any arbitrary string can be passed in as an argument. Note that the "string" part winds up in the same match index as the "selector" part in the above regexp; in short, that is why they are treated equally: the pseudos pattern does not distinguish between the two.

edit: as of jQuery 1.8.2, arguments with and without quotes are more explicitly equivalent. I cannot seem to find this code in the jQuery git repository [help would be appreciated], but the version of 1.8.2 hosted by Google, which has the sha1sum a0f48b6ad5322b35383ffcb6e2fa779b8a5fcffc, has a "PSEUDO": function on line 4206 that explicitly detects the difference between "quoted" and "unquoted" arguments and ensures they both wind up in the same place. This logic does not distinguish between the type of pseudo ("positional" or not) that the argument is for.
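To make the quoted and unquoted branches of the pattern a little more concrete, here is a minimal sketch. It is not Sizzle's code: SIMPLE_PSEUDO is a cut-down stand-in for the pseudos pattern (the name is limited to [\w-]+ and the attributes branch is dropped), and parsePseudo is a hypothetical helper that merges the two capture groups by hand.

```javascript
// Simplified stand-in for the pseudos pattern (an assumption of this sketch,
// not the real Sizzle regexp): name is [\w-]+, unquoted argument is [^()[\]]*.
const SIMPLE_PSEUDO =
  /^:([\w-]+)(?:\((?:(['"])((?:\\.|[^\\])*?)\2|([^()[\]]*))\)|)/;

// Hypothetical helper: returns the pseudo's name and its argument with any
// surrounding quotes removed, or null if the input does not start with a pseudo.
function parsePseudo(selector) {
  const match = SIMPLE_PSEUDO.exec(selector);
  if (!match) {
    return null;
  }
  // Group 3 holds a quoted argument, group 4 an unquoted one; at most one is set.
  const argument = match[3] !== undefined ? match[3] : match[4];
  return { name: match[1], argument: argument };
}

console.log(parsePseudo(':not("span")'));
console.log(parsePseudo(":not(span)"));
```

Both calls produce the same name and the same argument, which is the sense in which a consumer of the match result cannot tell a quoted argument from an unquoted one.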

As Sizzle uses JavaScript strings to kick off the selection process, there is no distinction between "string" and "selector" when arguments are passed in to functions. Making that kind of distinction would be possible, but as far as I am aware, what is actually desired is always easily determined from the most basic of context (i.e. what type of pseudo is being used), so there is no real reason to make the distinction. (Please correct me in the comments if there are any ambiguous situations which I am unaware of. I'd like to know!)

So then, if the lack of distinction between strings and selectors is a mere implementation detail, why do pseudos such as :eq(...) explicitly reject such selections?

The answer is simple: they don't, really. At least, not as of jQuery 1.8.1. [edit: as of jQuery 1.8.2, they don't at all. The arguments of "positional" pseudos can be quoted just like anything else. The notes below regarding the implementation details of 1.8.1 are left as a historical curiosity.]

Functions such as :eq(...) are implemented as:

"eq": function( elements, argument, not ) {
    var elem = elements.splice( +argument, 1 );
    return not ? elements : elem;
}

At the time that :eq(...) receives the argument, it is still in the form of a bare argument (quotes and all). Unlike :not(...), this argument doesn't go through a compile(...) phase. The "rejection" of the invalid argument is actually due to the shortcut-casting via +argument, which will result in NaN for any quoted string (which in turn, never matches anything). This is yet another implementation detail, though in this case a "correctly" behaving one (again, as far as I am aware. Are there situations where non-numeric arguments to such functions should in fact match?)
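The coercion described above is easy to check in isolation. The following is only a sketch of the unary-plus cast on its own; toIndex is a hypothetical helper, not part of Sizzle, and it says nothing about what the surrounding filter does with the result.

```javascript
// Hypothetical helper mirroring the `+argument` shortcut cast on its own.
function toIndex(rawArgument) {
  return +rawArgument;
}

console.log(toIndex("3"));   // 3
console.log(toIndex('"3"')); // NaN, because the quote characters are part of the string
```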

edit: As of jQuery 1.8.2, things have been refactored somewhat, and "positional" pseudos no longer receive the "raw" argument. As a result, quoted arguments are now accepted in :eq(...) and the like. This change appears to have been a side-effect of another bugfix, as there is no mention of support for quoted arguments in the changelog for af8206ff.., which was intended to fix an error in handling :first and :last, jQuery bug #12303. This commit was found using git bisect and a relatively simple phantomjs script. It is notable that after the Sizzle rewrite in e89d06c4.., Sizzle would not merely fail silently for selectors such as :eq("3"); it would actually throw an exception. That should be taken as yet more evidence that :eq("3") support is not intended behaviour.

There are indeed rationales regarding custom filters, whose arguments could in some cases be thought of as strings, and sometimes as selectors, no matter what they superficially look like, depending on the method in which they are evaluated... but that much is approaching the pedantic. It should suffice to say that not having a distinction at the least makes things simpler when calling functions which, no matter what they may represent, expect a string representation.

In short, the whole situation can be thought of as an implementation detail, and is rooted in the fact that selectors are passed around as strings in the first place (how else would you get them into Sizzle?).

Impugn answered 19/9, 2012 at 22:47 Comment(7)
Haven't you heard of raw strings? Or maybe it's just me and Python...Setula
+1 for taking the time to format the regex and try explaining it (which is a hell of a task). The bounty is yours for the taking.Setula
I think I actually need to step through it all again. Your comment prompted me to give it another glance, and I think I may have mis-read, despite all my formatting efforts. I also haven't actually stepped through the tokenization process since jQuery 1.8.2 was released, so I really ought to make sure the reasoning is still accurate.Impugn
@YatharthROCK: I'm not sure what the mention of raw strings is meant to be in reference to. Can you expand on your meaning? If you mean that I should have used an alternative to "\\(", etc, in the above: the intent was to mimic exactly the content of the original regex string. The only change which I made to it was to split and re-indent the components (in this way, it should be easy to find the corresponding point in the original string). Otherwise, I'd probably use PCRE /x syntax to simply show an equivalent regex, rather than the "same one".Impugn
Raw strings are strings that are interpreted exactly as they are typed; there's no need to escape characters again. They come in handy while dealing with regexes. By my comment, I meant that if you had used raw strings, you wouldn't have to escape all the backslashes. It would look a lot better.Setula
@YatharthROCK: JavaScript has no syntax like Python's raw strings or C#'s verbatim strings.Brod
@Brod Ohh... I didn't know. I actually thought the code here was just pseudo-code. Didn't realize the question was about jQuery...Setula
