Default/correct context for HTML href attributes in Sightly

I'm using Sightly and while investigating a bug in my application I noticed a behaviour I didn't expect.

Some of the links would render with ampersands in the query string escaped twice. Example:

<a href="http://www.google.com?a=1&amp;amp;b=2&amp;amp;c=3">
    link with explicit attribute context
</a>

Upon closer inspection, it turned out that our AEM instance was running an org.apache.sling.rewriter.Transformer implementation that escaped special characters in all href attributes.

Coupled with Sightly XSS protection, this resulted in double escapes.
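
To illustrate how two independent escaping passes compound, here is a rough sketch that uses Python's standard html.escape as a stand-in for both the Sightly XSS escaping and our custom transformer. Neither component is actually implemented this way; only the two-pass effect is being shown.

```python
from html import escape

url = "http://www.google.com?a=1&b=2&c=3"

# First pass: stand-in for the template engine escaping the attribute value.
once = escape(url)

# Second pass: stand-in for a rewriter transformer that escapes the
# already-escaped href a second time.
twice = escape(once)

print(once)   # http://www.google.com?a=1&amp;b=2&amp;c=3
print(twice)  # http://www.google.com?a=1&amp;amp;b=2&amp;amp;c=3
```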

While investigating this further, I disabled the transformer and noticed a strange behaviour in Sightly itself.

The attribute context and the default context in href attributes don't match

Given the following three elements, I'd expect them to render the href value in the same way (with the query string escaped, consistent with W3C standards):

<a href="${'http://www.google.com?a=1&b=2&c=3'}">no explicit context, expression used</a>
<a href="http://www.google.com?a=1&b=2&c=3">no explicit context</a>
<a href="${'http://www.google.com?a=1&b=2&c=3' @ context='attribute'}">
    explicit attribute context
</a>

However, only the last one performs the escaping and I get

<a href="http://www.google.com?a=1&b=2&c=3">no explicit context, expression used</a>
<a href="http://www.google.com?a=1&b=2&c=3">no explicit context</a>
<a href="http://www.google.com?a=1&amp;amp;b=2&amp;amp;c=3">
    explicit attribute context
</a>

For some reason, the last one, using context='attribute' (the only one that does anything with the & characters), escapes the ampersands twice, yielding invalid links.

This can be reproduced with arbitrary element and attribute names, so I think I can safely assume this is not some rewriter kicking in.

<stargate data-custom="${'http://www.google.com?a=1&b=2&c=3' @ context='attribute'}">
    attribute context in custom tag
</stargate>

Outputs:

<stargate data-custom="http://www.google.com?a=1&amp;amp;b=2&amp;amp;c=3">
    attribute context in custom tag
</stargate>

Furthermore, the Display Context Specification gave me the impression that the context, when rendering an attribute, would be picked up automatically as attribute:

To protect against cross-site scripting (XSS) vulnerabilities, Sightly automatically recognises the context within which an output string is to be displayed within the final HTML output, and escapes that string appropriately.

Is the observed behaviour here to be expected or am I looking at a potential bug in Sightly?

Which context should I be using here? All contexts apart from attribute ignore the fact that query strings should be escaped in href. attribute on the other hand appears to be doing this twice. What's going on?

I'm using Adobe Granite Sightly Template Engine (compatibility) io.sightly.bundle 1.1.72

The uri context does not escape query strings in the way expected in HTML5 href attributes

I did also try using

<a href="${'http://www.google.com?a=1&b=2&c=3' @ context='uri'}">explicit uri context</a>

But it fails to escape the & chars, resulting in invalid HTML5.

<a href="http://www.google.com?a=1&b=2&c=3">explicit uri context</a>

Result of validation as HTML5:

Error Line 70, Column 35: & did not start a character reference. (& probably should have been escaped as &amp;.)

<a href="http://www.google.com?a=1&b=2&c=3">explicit uri context</a>

The html context correctly renders links with multiple query parameters in href attributes

It seems the only context I could possibly use here at the moment is html (text escapes & twice, just like attribute)

<a href="${'http://www.google.com?a=1&b=2&c=3' @ context='html'}">explicit html context</a>

yields

<a href="http://www.google.com?a=1&amp;b=2&amp;c=3">explicit html context</a>

Changing to this context would allow me to get the right value in the href, as rendered by the browser. However, it doesn't seem to have the correct semantics.

To quote the description of the html context from the Sightly spec:

Use this in case you want to output HTML - Removes markup that may contain XSS risks

House answered 11/3, 2016 at 11:15 Comment(7)
What version of the org.apache.sling.scripting.sightly bundle do you have on your system? I suspect version 1.0.2, but just want to confirm this. - Macrogamete
@RaduCotescu that is correct. We're using 1.0.2, would an upgrade help? - House
No, the behaviour hasn't changed. For src and href attributes Sightly uses the uri XSS escaping context [0]. - Macrogamete
@RaduCotescu I see. I'll use html as a hack to make it work for now but I think this needs a wider discussion. I'll raise a ticket in the Sling Jira as soon as I have a moment to spare. - House
I think there's no need. I'll post an answer to your question since the comment field doesn't allow the full length of what I want to write. - Macrogamete
@RaduCotescu excellent, thanks. - House

For src and href attributes Sightly uses the uri XSS escaping context [1], [2].

Furthermore, the following markup is HTML5 valid using the validator from [3]:

<!DOCTYPE html>
<html>
<head>
    <title>Title</title>
</head>
<body>
    <a href="http://www.google.com?a=1&b=2&c=3">explicit uri context</a>
</body>
</html>
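
As a small illustration of why the unescaped form still resolves to the intended link, here is a sketch that uses Python's standard html.unescape as an approximation of how character references get decoded. This is an assumption made for illustration, not the validator's or a browser's actual code path. Neither "&b=" nor "&c=" starts a named character reference, so the raw ampersands pass through untouched:

```python
from html import unescape

raw = "http://www.google.com?a=1&b=2&c=3"
escaped = "http://www.google.com?a=1&amp;b=2&amp;c=3"

# Both spellings of the href decode to the same URL.
decoded_raw = unescape(raw)
decoded_escaped = unescape(escaped)

print(decoded_raw == decoded_escaped)  # True
print(decoded_raw)                     # http://www.google.com?a=1&b=2&c=3
```

Parameter names that happen to match an entity name (for example one starting with "copy") are a separate matter and are not covered by this sketch.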

Can you please point me to the spec regarding HTML 5 query strings escaping for HTML attributes?

Macrogamete answered 14/3, 2016 at 11:6 Comment(2)
It didn't occur to me to question the validator. I'm using one deployed within our own infrastructure to avoid sending my client's markup outside secure networks. Good catch. I'll investigate further. - House
This was a bug in the validator we were using (we had it updated). Thanks again. - House

The href attribute uses the uri context rather than the attribute context. The attribute context is meant to be used for HTML attributes such as title, id, data-*, etc. Concerning your three examples:

<a href="${'http://www.google.com?a=1&b=2&c=3'}">link without explicit context, expression used</a>
<a href="http://www.google.com?a=1&b=2&c=3">link without explicit context</a>
<a href="${'http://www.google.com?a=1&b=2&c=3' @ context='attribute'}">link with explicit attribute context</a>

The first is using the uri context. The second isn't using Sightly at all. The third is misusing the attribute context.
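
To make the distinction concrete, here is a simplified sketch of the selection rule described above. The function name default_context and the URI_ATTRIBUTES set are hypothetical, and the sketch only covers the cases mentioned in this thread (href and src map to uri, other attributes to attribute); the engine's actual rules are more detailed than this.

```python
# Hypothetical, simplified model of default context selection.
URI_ATTRIBUTES = {"href", "src"}


def default_context(attribute_name: str) -> str:
    """Return the escaping context assumed for an attribute in this sketch."""
    if attribute_name.lower() in URI_ATTRIBUTES:
        return "uri"
    return "attribute"


for name in ("href", "src", "title", "id", "data-custom"):
    print(name, "->", default_context(name))
```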

The unsafe context should be avoided if at all possible.

Sightly doesn't currently escape the ampersand in the uri context as you would like. You should submit an Adobe Daycare ticket or contact the Apache Sling distribution list with your request.

Molybdate answered 11/3, 2016 at 20:36 Comment(0)

You can use the 'unsafe' context when everything else fails.

Aubervilliers answered 11/3, 2016 at 14:35 Comment(3)
I can achieve the desired result with attribute, there's absolutely no need for me to resort to unsafe. I'm asking about the canonical way to render this kind of link and whether the observed default behaviour is correct or erroneous. - House
I don't think there is anything canonical here. If 'uri' doesn't work for you, use whatever works. - Aubervilliers
That's what I did for the time being. But I'd rather discuss this with the people who know Sightly better than me and consider an improvement over this approach and maybe an improvement to Sightly itself. - House
