Marking up a search result list with HTML5 semantics

Making a search result list (like Google's) is not very hard if you just need something that works. Now, however, I want to do it perfectly, using the benefits of HTML5 semantics. The goal is to define the de facto way of marking up a search result list, one that could potentially be used by any future search engine.

For each hit, I want to

  • order them by increasing number
  • display a clickable title
  • show a short summary
  • display additional data like categories, publishing date and file size

My first idea is something like this:

<ol>
  <li>
    <article>
      <header>
        <h1>
          <a href="url-to-the-page.html">
            The Title of the Page
          </a>
        </h1>
      </header>
      <p>A short summary of the page</p>
      <footer>
        <dl>
          <dt>Categories</dt>
          <dd>
            <nav>
               <ul>
                  <li><a href="first-category.html">First category</a></li>
                  <li><a href="second-category.html">Second category</a></li>
                </ul>
            </nav>
          </dd>
          <dt>File size</dt>
          <dd>2 kB</dd>
          <dt>Published</dt>
          <dd>
            <time datetime="2010-07-15T13:15:05-02:00" pubdate>Today</time>
          </dd>
        </dl>
      </footer>
    </article>
  </li>
  <li>
    ...
  </li>
  ...
</ol>

I am not really happy about the <article/> within the <li/>. First, the search result hit is not an article by itself, but just a very short summary of one. Second, I am not even sure you are allowed to put an article within a list.

Maybe the <details/> and <summary/> tags are more suitable than <article/>, but I don't know if I can add a <footer/> inside that?

All suggestions and opinions are welcome! I really want every single detail to be perfect.

Christianly answered 15/7, 2010 at 11:31 Comment(6)
“I really want every single detail to be perfect.” Good on you, but you’re talking about semantics, i.e. meaning. There’s no such thing as perfect meaning. Meaning is just an agreement between people that something represents something else. – Jural
That makes sense. What I want is a perfect template for such an agreement on how to mark up a search result list. It should be perfectly clear between people (or robots) that it is a search result list and nothing else. – Christianly
“It should be perfectly clear between people (or robots) that it is a search result list and nothing else.” As it's not predefined what a search result list should look like, I think many people can just guess it's such a thing, especially if you make it look like one (but that's a CSS issue), but you can never be sure a robot will dissect it as a search result list. It could represent a list of articles on your site as well, no matter which HTML 5 elements you use. – Contraption
But now I am aiming to find the best way of making it as close to a perfect solution as possible. I want to make the solution solid enough to show it to the world, making others willing to adopt it. Eventually, I want it to become the de facto way of marking up a search result list, even known to robot developers. I know I probably won't reach that point, but that is my ambition anyway. – Christianly
That's a big, but noble ambition. BTW, please use @user-name to address people in comments, so they are notified. See How do comment replies work? – Contraption
Adding to Marcel's (and other people's) points, semantic HTML will not be as helpful in your case. You can, however, use other things that are better suited. Microformats and JSON-LD come to mind (fun note: both of those were defined in the same year the question was posted :)). – Fibered

1) I think you should stick with the article element, as

[t]he article element represents a self-contained composition in a document, page, application, or site and that is intended to be independently distributable or reusable [source]

You merely have a list of separate documents, so I think this is fully appropriate. The same is true for the front page of a blog, containing several posts with titles and outlines, each in a separate article element. Besides, if you intend to quote a few sentences of the articles (instead of providing summaries), you could even use blockquote elements, like in the example of a forum post showing the original posts a user is replying to.

2) If you're wondering if it's allowed to include article elements inside a li element, just feed it to the validator. As you can see, it is permitted to do so. Moreover, as the Working Draft says:

Contexts in which this element may be used:

Where flow content is expected.

3) I wouldn't use nav elements for those categories, as those links are not part of the main navigation of the page:

only sections that consist of major navigation blocks are appropriate for the nav element. In particular, it is common for footers to have a short list of links to various pages of a site, such as the terms of service, the home page, and a copyright page. The footer element alone is sufficient for such cases, without a nav element. [source]

4) Do not use the details and/or summary elements, as those are used as part of interactive elements and are not intended for plain documents.

UPDATE: Regarding whether it's a good idea to use an (un)ordered list to present search results:

The ul element represents a list of items, where the order of the items is not important — that is, where changing the order would not materially change the meaning of the document. [source]

As a list of search results actually is a list, I think this is the appropriate element to use; however, as it seems to me that the order is important (I expect the best matching result to be on top of the list), I think that you should use an ordered list (ol) instead:

The ol element represents a list of items, where the items have been intentionally ordered, such that changing the order would change the meaning of the document. [source]

Using CSS you can simply hide the numbers.
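
A minimal sketch of hiding the numbers, assuming a hypothetical class name results on the list (that class is not part of the question's markup and is used here only for illustration):

<style>
  /* Suppress the automatic item numbers; the markup itself stays an ol */
  ol.results {
    list-style: none;
    padding-left: 0; /* drop the indent that was reserved for the markers */
  }
</style>

<ol class="results">
  <li><article>...</article></li>
  <li><article>...</article></li>
</ol>

The list is still an ol in the document, so the ordering remains expressed in the markup even though no numbers are drawn.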

EDIT: Whoops, I just realized you already use an ol (due to my fatigue, I thought you used a ul). I'll leave my ‘update’ as is; after all, it might be useful to someone.

Contraption answered 19/7, 2010 at 14:24 Comment(5)
Thank you for those clueful opinions! From that point of view, <article> is a good choice for the summary. You are right about <nav> as well. I know, however, that <li> may contain more or less any tags according to the doctype. My question is more about whether this is a good way of using lists. – Christianly
Is there any semantic benefit to the <ol> in this situation? Articles are scoped to their parent sectioning element so they're already grouped. Are they assumed to be ordered already, like <p>? If so, the only thing the <ol> offers is the 'start' attribute for paging. – Outdistance
whatwg.org/specs/web-apps/current-work/multipage/… One of the examples is comments on a blog post. Since the ordering of comments could be essential, we could assume that <article>s are already ordered. – Outdistance
@Jaffa: I don't agree: comments are usually a kind of waterfall of posts, sorted by date by nature (if I'm clear enough, my vocabulary is not that great at the moment); a search result list is a list in a specific order, with the best result (#1) at the top of the list. Also see the OP's requirement to “order them by increasing number”. – Contraption
It's unfortunate that <ol> doesn't allow some other tags to be children of it, like <article>. It could then be understood as an ordered list of articles. <li> in this case seems as arbitrary as the pervasive <div> tags HTML5 is supposed to move us away from. – Marianomaribel

I'd mark it up this way (without using any RDFa/microdata vocabularies or microformats, so using only what the plain HTML5 spec gives):

<ol start="1">

  <li id="1">
    <article>
     <h1><a href="url-to-the-page.html" rel="external">The Title of the Page</a></h1>
     <p>A short summary of the page</p>
     <footer>
       <dl>
         <dt>Categories</dt>
         <dd><a href="first-category.html">First category</a></dd>
         <dd><a href="second-category.html">Second category</a></dd>
         <dt>File size</dt>
         <dd>2 <abbr title="kilobyte">kB</abbr></dd>
         <dt>Published</dt>
         <dd><time datetime="2010-07-15T13:15:05-02:00">Today</time></dd>
        </dl>
      </footer>
    </article>
  </li>

  <li id="2">
    <article>
     …
    </article>
  </li>

</ol>

start attribute for ol

If the search engine uses pagination, you should give the start attribute to the ol, so that each li reflects the correct ranking position.
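
As an illustration, assuming ten results per page (a page size picked only for this example), the second page could start like this:

<ol start="11">

  <li id="11">
    <article>
     …
    </article>
  </li>

  <li id="12">
    <article>
     …
    </article>
  </li>

</ol>

The first item on that page is then numbered 11 rather than 1, matching its overall rank.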

id for each li

Each li should get an id attribute, so that you can link to it. The value should be the rank/position.

One could think that the id should be given to the article instead, but I think this would be wrong: the rank/order could change over time. You are not referring to a specific result but to a result position.

Remove the header

It is not needed if it contains only the heading (h1).

Add rel="external" to the link

The link to each search result is an external link (leading to a different website), so it should get the rel value external.

Remove nav

The category links are not navigation in the scope of the article, so remove the nav.

Each category in a dd

You used:

<dt>Categories</dt>
<dd>
 <ul>
  <li><a href="first-category.html">First category</a></li>
  <li><a href="second-category.html">Second category</a></li>
 </ul>
</dd>

Instead, you should list each category in its own dd and remove the ul:

<dt>Categories</dt>
<dd><a href="first-category.html">First category</a></dd>
<dd><a href="second-category.html">Second category</a></dd>

abbr for file size

The unit in "2 kB" should be marked up with abbr:

2 <abbr title="kilobyte">kB</abbr>

Remove pubdate attribute

It's not in the spec anymore.

Other things that could be done

  • give the link an hreflang attribute if the linked result is in a different language than the search engine
  • give the link description and the summary a lang attribute if they are in a different language than the search engine
  • summary: use blockquote (with a cite attribute) instead of p if the search engine does not create the summary itself but uses the meta description or a snippet from the page
  • title/link description: use q (with a cite attribute) if the link description is exactly the title of the linked webpage
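
Put together, these optional additions might look roughly like this. It is only a sketch: it assumes an English-language search engine returning a German page, the example.com URL, title, and snippet are invented placeholders, and wrapping the q inside the link is just one possible placement.

<li id="1">
  <article>
    <h1>
      <a href="https://example.com/seite.html" rel="external" hreflang="de" lang="de">
        <q cite="https://example.com/seite.html">Der Titel der Seite</q>
      </a>
    </h1>
    <blockquote cite="https://example.com/seite.html" lang="de">
      <p>Ein kurzer Ausschnitt aus der Seite</p>
    </blockquote>
  </article>
</li>

Here hreflang describes the language of the link target, while lang describes the language of the quoted title and snippet shown in the result itself.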
Retardment answered 15/8, 2012 at 13:40 Comment(0)

Aiming for a 'perfect' HTML5 template is futile because the spec itself is far from perfect, with most of the prescribed use-cases for the new 'semantic' elements obscure at best. As long as your document is structured in a logical fashion, you won't have any problems with search engines (most of the new tags don't have the slightest impact). Indeed, following the HTML5 spec to the letter - for example, using <h1> tags within each new sectioning element - may make your site less accessible (to screen readers, for example). Don't strive for 'perfect' or close-to, because it doesn't exist - HTML5 is not thought-out well enough for that. Just concentrate on keeping your markup logical and uncluttered.

Hyphenate answered 27/7, 2013 at 5:5 Comment(0)

A good resource for HTML5 that I found is HTML5Doctor. Check the article archive for practical implementations of the new tags. It's not a complete reference, mind you, but nice enough to ease into it :)

As shown by the Footer element page, sections can contain footers :)

Critter answered 19/7, 2010 at 11:31 Comment(1)
I have already scanned through those articles, without finding anything relevant. Well, the discussion about the <article> tag has some points, but not enough to answer my question. I understand there is no simple answer, so that's why I want opinions from experienced web developers and markup fetishists. – Christianly
