Today, in a blog post entitled "More options to help websites preview their content on Google Search", Google announced new behaviour for the Google search engine. The part that interests me is that Googlebot will now interpret the HTML attribute data-nosnippet like this:
A new way to help limit which part of a page is eligible to be shown as a snippet is the "data-nosnippet" HTML attribute on span, div, and section elements. With this, you can prevent that part of an HTML page from being shown within the textual snippet on the [Google search engine results page]. For example:
<p><span data-nosnippet>Harry Houdini</span> is undoubtedly the most famous magician ever to live.</p>
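To make the quoted behaviour concrete, here is a minimal sketch of how a crawler could honour the attribute when building snippet text. Google has not published how Googlebot implements this, so everything here is an assumption: the names `SnippetTextExtractor` and `snippet_text` are hypothetical, the markup is assumed to be well-formed, and only the three element types named in the announcement are checked.

```python
from html.parser import HTMLParser

# Elements on which the quoted announcement says data-nosnippet is honoured.
NOSNIPPET_ELEMENTS = {"span", "div", "section"}


class SnippetTextExtractor(HTMLParser):
    """Hypothetical helper: collects text, skipping data-nosnippet elements."""

    def __init__(self):
        super().__init__()
        self.parts = []          # text fragments eligible for the snippet
        self.stack = []          # one flag per open span/div/section
        self.excluded_depth = 0  # number of open data-nosnippet ancestors

    def handle_starttag(self, tag, attrs):
        if tag in NOSNIPPET_ELEMENTS:
            # A valueless attribute arrives as ("data-nosnippet", None).
            excluded = any(name == "data-nosnippet" for name, _ in attrs)
            self.stack.append(excluded)
            if excluded:
                self.excluded_depth += 1

    def handle_endtag(self, tag):
        # Assumes well-formed nesting of span/div/section elements.
        if tag in NOSNIPPET_ELEMENTS and self.stack:
            if self.stack.pop():
                self.excluded_depth -= 1

    def handle_data(self, data):
        if self.excluded_depth == 0:
            self.parts.append(data)


def snippet_text(html):
    """Return the whitespace-normalised text that remains snippet-eligible."""
    parser = SnippetTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


example = ("<p><span data-nosnippet>Harry Houdini</span> is undoubtedly "
           "the most famous magician ever to live.</p>")
print(snippet_text(example))
```

Run on Google's example, the sketch drops "Harry Houdini" and keeps the rest of the sentence. The same attribute on any other element, such as p, is ignored here because the announcement only names span, div, and section.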
I am surprised that they chose to use an attribute beginning with the prefix data-. This is what the HTML Living Standard by WHATWG says about data- attributes (emphasis mine):
A custom data attribute is an attribute in no namespace whose name starts with the string "data-" [...] Custom data attributes are intended to store custom data, state, annotations, and similar, private to the page or application, for which there are no more appropriate attributes or elements.
As a web developer, I always thought that the point of the data- prefix was to give web developers a namespace reserved for their own CSS and scripts to manipulate. A custom HTML attribute without the data- prefix is not future-proof: it may suddenly acquire meaning in future browsers or search engine bots.
It looks like Googlebot is breaking this convention by choosing to look for and interpret the data-nosnippet HTML attribute. As web developers, we can no longer be confident that data- attributes are "private to the page or application". Maybe Google will do this again for another data- attribute in the future!
- Is my interpretation correct?
- Is Googlebot the first to interpret data- attributes this way, or has that ship already sailed, with browsers and bots already interpreting data- attributes?