Steampunk Semantics

Seth Ladd starts his post "Is Usefulness Inversely Proportional to Specificity?" with the line:

Thinking about microformats and RDF, I’m wondering how microformats might succeed where RDF hasn’t.

Although I think Seth conflates certain separate issues in his post, he has some very good points. There are different measures of success, and the one Seth highlights is that of deployment of metadata on the web. In that context, microformats do seem to be reaching areas that RDF hasn't. But RDF succeeds in offering a relatively simple formal data model that can work in the global web environment, across all subject domains. Semantic Web technologies are gaining considerable deployment, but ironically a lot of this is off-web in life science applications and the like. As Seth suggests, microformats may well be succeeding on the web.

However, there's danger in viewing microformats and RDF as different, competing solutions to the same problem, that of getting (meta)data on the web. Sure, assuming the vocabularies were available, a particular piece of data could be published just the same using microformats or RDF/XML. But that "just the same" is significant.

For a start, both microformats and RDF are essentially about knowledge representation (like most of the computer world, one way or another): how to express things we can know on the machine. When it comes to meaning in the real world, the problem of defining terms for a particular vocabulary is pretty much the same either way. Arguably microformats do this better than RDF in general, because there's a constrained process for deciding which domain vocabularies are expressed this way. This is based on current usage patterns. Microformats formalise conventions based on existing practice, whereas with RDF in general the vocabulary creator is at liberty to invent at will. Success of the knowledge representation in microformats is pretty much predetermined - only successful representations (generally beginning from other formats) may apply. Although the same kind of process can be applied to RDF vocabulary development (and there are also methodologies for developing ontologies where there isn't a pre-existing digital model to work from), a lot of the time it's more like throw stuff against the wall and see what sticks. (Which I'd say is perfectly reasonable, given that it's considerably easier and usually safer in RDF to mess around with different models than in most systems, such as relational DBs or XML schemas.)

Seth points to the vagueness of the prose definition of foaf:knows ("We take a broad view of ‘knows’"), suggesting that might be key to its relative success. I think he's right in the important sense that we can understand the intended concept pretty easily, and because it's broadly defined, apply it in a wide range of circumstances. But along with the human-oriented definition, there's also the machine-oriented definition: foaf:knows is an rdf:Property, the domain and range of which are foaf:Person. Which means that if A foaf:knows B, it can be inferred that both A and B are individuals in the class foaf:Person. Given that, and because the formal side of the FOAF spec declares foaf:Person a subclass of foaf:Agent, it can also be inferred that both A and B are individuals in the class foaf:Agent. This kind of semantics is nothing new, but RDF exploits the naming mechanism of the web (URIs) to allow these definitions to be reused and extended independently. So if X doap:maintainer Y, it's possible to infer (from DOAP's formal definition) that X is a doap:Project and Y a foaf:Person. Again, from FOAF we can infer that Y is also a foaf:Agent. Each of those terms is identified using a URI, and hence data is linked across the web using machine-understandable logic. (This is where I think Seth conflates things - the prose definition may be vague, but the machine definition is completely unambiguous, as far as it goes.)
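To make the inference chain above concrete, here is a minimal sketch in Python. It is not a real RDFS reasoner: the schema facts are hard-coded from the FOAF and DOAP definitions mentioned above, terms are written as prefixed names rather than full URIs, and the function name `infer_types` is made up for illustration.

```python
# Toy illustration of domain, range and subclass inference.
# Assumption: schema facts are hard-coded here; a real system would
# read them from the FOAF and DOAP vocabularies themselves.
DOMAIN = {"foaf:knows": "foaf:Person", "doap:maintainer": "doap:Project"}
RANGE = {"foaf:knows": "foaf:Person", "doap:maintainer": "foaf:Person"}
SUBCLASS_OF = {"foaf:Person": "foaf:Agent"}


def infer_types(triples):
    """Return the set of (resource, class) pairs entailed by the rules above."""
    types = set()
    for subject, predicate, obj in triples:
        # Domain rule: the subject of a property is in the property's domain.
        if predicate in DOMAIN:
            types.add((subject, DOMAIN[predicate]))
        # Range rule: the object of a property is in the property's range.
        if predicate in RANGE:
            types.add((obj, RANGE[predicate]))
    # Subclass rule: repeat until no new memberships appear.
    changed = True
    while changed:
        changed = False
        for resource, cls in list(types):
            parent = SUBCLASS_OF.get(cls)
            if parent and (resource, parent) not in types:
                types.add((resource, parent))
                changed = True
    return types


data = [("A", "foaf:knows", "B"), ("X", "doap:maintainer", "Y")]
inferred = infer_types(data)
```

With the two statements from the text as input, `inferred` contains A, B and Y as both foaf:Person and foaf:Agent, and X as a doap:Project only. None of those class memberships were stated in the data; they all follow from the machine definitions.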

This kind of thing - the application of Semantic Web technologies - has utility in itself as linked data on the web. That utility would be massively increased by the network effect of a Semantic Web. But stepping aside for a moment, how might one get the utility of data on the web without Semantic Web technologies?

The Steampunk subgenre of speculative fiction applies Victorian technology to futuristic themes - a rocket to the moon is fired out of a cannon, the computer is a mechanical instrument powered by steam. The Web is still young and its mechanisms are pretty Victorian; in particular, the main source of power is coal-fired HTML. A link like <a href="http://example.org/B" rel="friend">B</a> in itself may lack the formalism of FOAF, but it can convey the same meaning, and the same goes for things like hCalendar or hCard data. Whatever the future holds, microformats are an innovative solution to the problem of getting data on the web using technologies that are already widely deployed.

Above I said it's "just the same", publishing RDF/XML or publishing microformat data. This might lead to the assumption that it was one approach versus the other. The truth is quite the opposite. Microformat data can be unambiguously defined, and it can be expressed in the RDF model. An HTML link can be backed by formal semantics (with the aid of a link to an HTML meta data profile URI in the doc's head). With the GRDDL mechanism in place, as far as the Semantic Web is concerned, microformat data is RDF.
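As a rough illustration of how a rel-annotated HTML link can be read as an RDF statement, here is a small Python sketch. It is not GRDDL itself (GRDDL points to a transformation, typically XSLT, via the profile); it only shows the idea of the mapping. The rel-to-property table, the class and function names, and the choice of subject URI are all assumptions made for this example.

```python
from html.parser import HTMLParser

# Assumed mapping from rel values to RDF properties, for illustration only.
REL_TO_PROPERTY = {"friend": "http://xmlns.com/foaf/0.1/knows"}


class RelLinkExtractor(HTMLParser):
    """Collects (subject, property, object) triples from <a rel="..."> links."""

    def __init__(self, subject_uri):
        super().__init__()
        # Assumption: the caller supplies a URI standing for the page's author.
        self.subject_uri = subject_uri
        self.triples = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel may hold several space-separated values, or be missing entirely.
        for rel in (attributes.get("rel") or "").split():
            prop = REL_TO_PROPERTY.get(rel)
            if prop:
                self.triples.append((self.subject_uri, prop, href))


def extract_triples(html, subject_uri):
    """Parse an HTML fragment and return the triples found in its links."""
    parser = RelLinkExtractor(subject_uri)
    parser.feed(html)
    parser.close()
    return parser.triples


page = '<p><a href="http://example.org/B" rel="friend">B</a></p>'
triples = extract_triples(page, "http://example.org/A")
```

Here `triples` holds a single statement linking http://example.org/A to http://example.org/B with foaf:knows, which is the kind of data the inference rules described earlier can then work on.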

Danny Ayers
2007-01-08T13:09:23+01:00
