Best practices for adding semantics to a website

I am a bit confused about the semantics of websites. I understand that every URI should represent a resource. I assume that all information provided by RDFa inside a webpage describes the resource represented by the URI of that webpage. My question is: what are the best practices for providing semantic data for the subpages of a website?

In my case I want to create a website for a theater group called Magma using RDFa with the schema.org and Open Graph vocabularies. Let's say I have the welcome page (http://magma.com/), a contact page (http://magma.com/contact/), and pages for individual plays (http://magma.com/play/<playid>/).

Now I would think that both the welcome page and the contact page represent the same resource (Magma) while providing different information about that resource. The play pages, however, represent plays that only happen to be performed by Magma. Or is it better to say that the play pages also represent Magma but provide information about plays that will be performed by that group? The third option I stumbled upon is http://schema.org/WebPage. Subtypes like ContactPage seem especially relevant.

When it comes to implementation, where do I put the RDFa?

And finally: how will my choice change the way the website is treated by third parties (Google, Facebook, ...)?

I realize this question is a bit vague. To make it more concrete, I will add an example that you might criticize:

<html vocab="http://schema.org/" typeof="TheaterGroup">
  <head>
    <meta charset="UTF-8"/>
    <title>Magma - Romeo and Juliet</title>

    <!-- Magma semantics from a template file -->
    <meta property="name" content="Magma"/>
    <meta property="logo" content="/static/logo.png"/>
    <link rel="home" property="url" content="http://magma.com/"/>
  </head>

  <body>
    <h1>Romeo and Juliet</h1>

    <!-- semantics of the play -->
    <div typeof="CreativeWork" name="Romeo and Juliet">
      ...
    </div>

    <h2>Shows</h2>

    <!-- semantics of Magma events -->
    <ul property="events">
      <li typeof="Event"><time property="startDate">...</time></li>
      ...
    </ul>
  </body>
</html>
Hamon answered 14/4, 2013 at 14:19 Comment(0)

I understand that every URI should represent a resource. I assume that all information provided by RDFa inside a webpage describes the resource represented by the URI of that webpage.

Well, an HTTP URI could identify the page itself OR the thing the page is about. You can't tell whether a URI identifies the page or the thing simply by looking at it.

Example (in Turtle syntax):

<http://en.wikipedia.org/wiki/The_Lord_of_the_Rings> ex:author "John Doe" .

This could mean that the HTML page with the URI http://en.wikipedia.org/wiki/The_Lord_of_the_Rings is authored by "John Doe". Or it could mean that the thing described by that HTML page (→ the novel) is authored by "John Doe". Of course this is an important difference.

There are various ways to differentiate what a URI represents, and there is some dispute about it. The discussion around this is known as the httpRange-14 issue. See, for example, the Wikipedia article Web resource.

One way is using hash URIs (see also this answer). Example: http://magma.com/play/42 could identify the page about the play, http://magma.com/play/42#play could identify the play.
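
As a rough sketch of the hash-URI approach in RDFa, the fragment below makes one statement about the page and one about the play. The schema.org terms used here (WebPage, CreativeWork, name, about) are chosen for illustration only and are not prescribed by anything above; the #play fragment follows the example URIs in the previous paragraph.

```html
<!-- Illustrative sketch only: the vocabulary terms are assumptions -->
<body vocab="http://schema.org/">
  <!-- Subject: the page itself -->
  <div resource="http://magma.com/play/42" typeof="WebPage">
    <span property="name">Magma - Romeo and Juliet</span>
    <!-- Link the page to the thing it describes -->
    <link property="about" resource="http://magma.com/play/42#play"/>
  </div>

  <!-- Subject: the play, identified by the hash URI -->
  <div resource="http://magma.com/play/42#play" typeof="CreativeWork">
    <span property="name">Romeo and Juliet</span>
  </div>
</body>
```

Because the fragment is never sent to the server, both URIs are served by the same page, yet consumers of the data can still tell the page and the play apart.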

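Another way is using HTTP status code 303. If a request for a URI is answered with 200, the URI identifies a document, i.e. the page itself. If it is answered with 303 See Other, the URI identifies the thing, and the response points to a different URI for a page describing that thing. This method is used by DBpedia.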

See Choosing between 303 and Hash.

Now, when using RDFa, you can make statements about both the page itself and the thing the page describes. Just use the corresponding URI as the subject (e.g., by using the resource attribute).

So let's say http://magma.com/#magma represents the theater group. Now you could use this URI on every page (/contact, /play/, …) to make statements about the group or to refer to the group.

<div resource="http://magma.com/#magma">
  <span property="ex:name">Magma</span>
</div>

<div resource="http://magma.com/">
  <span property="ex:name">Website of Magma</span>
</div>
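
Applied to a play page, this could look roughly like the following sketch. It reuses http://magma.com/#magma for the group; the #play fragment, the choice of schema.org terms (TheaterGroup, TheaterEvent, CreativeWork, performer, workPerformed, startDate), and the date are assumptions made for illustration, not something the question specifies.

```html
<!-- Hypothetical play page; URIs with fragments identify things, not pages -->
<body vocab="http://schema.org/">
  <!-- The group: same URI on every page, so statements from all pages merge -->
  <div resource="http://magma.com/#magma" typeof="TheaterGroup">
    <span property="name">Magma</span>
    <a property="url" href="http://magma.com/">Home</a>
  </div>

  <!-- The play, as a thing distinct from the page about it -->
  <div resource="http://magma.com/play/42#play" typeof="CreativeWork">
    <h1 property="name">Romeo and Juliet</h1>
  </div>

  <!-- One performance, linked to both the play and the group (placeholder date) -->
  <div typeof="TheaterEvent">
    <time property="startDate" datetime="2013-05-23T20:00">May 23, 8 pm</time>
    <link property="workPerformed" resource="http://magma.com/play/42#play"/>
    <link property="performer" resource="http://magma.com/#magma"/>
  </div>
</body>
```

The group block can live in a shared template, since its subject URI does not depend on which page it appears on.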
Mariettamariette answered 15/4, 2013 at 13:59 Comment(2)
Great answer. Could I ask you to expand on how third parties like Google and Facebook will interpret information provided in this way? Which resource will they choose to display? – Hamon
@tobib: This shouldn't have any effect on which URIs they use for their search results etc., as they are usually interested in pages, not the things the pages might represent. However, services might of course interpret the statements you make about the things and do whatever they like with that information. I don't know Facebook, but AFAIK they only use the Open Graph vocabulary. Google probably only uses the documented vocabularies. But I don't know their services well. – Mariettamariette

I suggest that you first look at the schema.org documentation, which is straightforward. This vocabulary is comprehensive enough for your needs and is supported by the major search engines.

Here is an example snippet to get you started; you can include it directly in an HTML page. When you describe a performance of the play on a page, you could use:

<div itemscope itemtype="http://schema.org/TheaterEvent">
  <h1 itemprop="name">Romeo and Juliet</h1>
  <span itemprop="location">Council Bluffs, IA, US</span>
  <meta itemprop="startDate" content="2011-05-23">May 23
  <a href="/offers.html" itemprop="offers">Buy tickets</a>
</div>

On your contact page you could include:

<div itemscope itemtype="http://schema.org/TheaterGroup">
  <span itemprop="name">Magma</span>
  Tel:<span itemprop="telephone">( 33 1) 42 68 53 00 </span>
</div>
Predicable answered 14/4, 2013 at 16:21 Comment(2)
Thanks, but I don't think this answers my question. First, I want to use RDFa, not Microdata. More importantly, I want to know about semantics for websites consisting of several pages. Your examples only show how to mark up single bits of information. – Hamon
The schema.org vocabulary can also be expressed in RDFa: schema.org/docs/datamodel.html. You should add the markup wherever the information of interest appears on a web page, and you can add as many entities as you want to one page; just look at the examples. Traditional web design patterns (like MVC) can help you maintain the content of your HTML pages, but that is outside the scope of this discussion. – Predicable
