Dave Winer is revisiting his OPML outliner format (a new spec is mentioned) and is asking for use cases for extension mechanisms, with XML namespaces in mind. I'm not exactly what you'd call a fan of OPML, but people are using it for feedlists, and there's been talk of it being used for other purposes, like attention data. If the data's going to be out there, then pragmatically it might as well be usable. The starting point for that would be a clear specification, and if Dave's revisiting this then there's hope. So here are some thoughts on OPML extensions and three use cases.
Ok, there are two things worth noting to start. OPML itself
doesn't have a namespace. That complicates the reuse elsewhere of
anything defined within OPML. The other thing is that OPML already
has an extension point, the
type attribute:
type is a string, it says how the other attributes of the <outline> are interpreted.
This deviates from the usual namespace-based approach to
extensibility, being closer to what the markup greybeards call
architectural forms. There are few semantics defined in OPML,
except for the loose notion of containership it inherits from XML.
I'd say this increases the significance of the
type attribute. OPML (by design) doesn't support mixed
content, and what content there is, is currently placed in
attributes. This is likely to complicate any inclusion of
namespace-based extension elements. So I'd say that without a lot
of work on defining the meaning of containership within the format
(a fairly fundamental restructuring of the simple-inheritance data
model), there is little to be gained from using namespaced
elements. However, it may be feasible and useful to use namespace
extensions on attributes (e.g.
<outline dc:creator="Dave">).
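To make the attribute idea concrete, here is a minimal Python sketch of how a consumer might read a namespaced attribute alongside the ordinary OPML ones. The sample document is invented for illustration, and it assumes the producer declares the dc prefix (bound here to the Dublin Core elements namespace) on the root element, which OPML itself says nothing about.

```python
import xml.etree.ElementTree as ET

# Dublin Core elements namespace; binding it to "dc" is an assumption about the producer.
DC = "http://purl.org/dc/elements/1.1/"

# Invented sample document, not taken from any real feedlist.
doc = """<opml version="1.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <body>
    <outline text="Scripting News" dc:creator="Dave"/>
  </body>
</opml>"""

root = ET.fromstring(doc)
outline = root.find("body/outline")

# ElementTree exposes namespaced attributes in Clark notation: {namespace-uri}localname
creator = outline.get("{%s}creator" % DC)

# Attributes without a namespace are the plain OPML ones.
plain = {k: v for k, v in outline.attrib.items() if not k.startswith("{")}

print(creator, plain)
```

A tool that knows nothing about the extension can simply ignore any attribute whose name starts with "{", so the core OPML attributes stay readable either way.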
But aside from this, the existing type mechanism could be tightened up so that OPML data can be extended while keeping interpretation unambiguous. One way to do this would be to use profiles (in a similar fashion to HTML profiles). So for example, if the "power OPML" guidelines are being followed and a URI is used to declare this, a lot of the ambiguity disappears. Think Postel. The URI could maybe appear as an attribute on the root element, i.e.
<opml
profile="http://feeds.scripting.com/powerOpmlGuidelines">
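On the consuming side, a profile check could be as small as the following Python sketch. The profile attribute is only the proposal made here, not part of any OPML spec, and the KNOWN_PROFILES registry and the rule-set names it returns are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: profile URI -> name of the rule set a consumer would apply.
KNOWN_PROFILES = {
    "http://feeds.scripting.com/powerOpmlGuidelines": "power-opml",
}

def select_rules(opml_text):
    """Return the name of the rule set to use when interpreting this document."""
    root = ET.fromstring(opml_text)
    if root.tag != "opml":
        raise ValueError("not an OPML document")
    profile = root.get("profile")  # None when no profile is declared
    # Unknown or missing profiles fall back to liberal interpretation.
    return KNOWN_PROFILES.get(profile, "liberal")
```

The point of the fallback is the Postel one: a document that declares a profile gets strict, unambiguous handling, while everything else is still accepted on a best-effort basis.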
Dave's asking for use cases. Right, first I have the semi-serious Gopher NG, the idea of reimplementing what the Gopher protocol could do but with XML over HTTP (I believe this is essentially the same as Dave's "World Outline" idea). This wouldn't actually need much in the way of extensions, mostly just a clear spec.
My second use case is absolutely serious: reading the stuff on the Semantic Web, i.e. using RDF tools. I've got some code for this. There are the same basic requirements as for any XML format to be interpreted in such a fashion: (a) the ability to identify OPML data as such, (b) that there is a domain model expressible in the RDF model, (c) that some kind of parsing is available. On the first point (a), ideally OPML would have its own MIME type, but I somehow don't see that happening. But reading the root element name is a reasonable method.
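For point (a), sniffing the root element name needs nothing more than the first start tag. A rough Python sketch, assuming the data arrives as bytes (the function name is made up for illustration):

```python
import io
import xml.etree.ElementTree as ET

def looks_like_opml(data: bytes) -> bool:
    """Guess whether data is OPML by looking only at the root element name."""
    try:
        # The first "start" event is always the root element, so we can stop there.
        for _event, elem in ET.iterparse(io.BytesIO(data), events=("start",)):
            return elem.tag == "opml"
    except ET.ParseError:
        # Not well-formed XML at all.
        return False
    return False
```

Because OPML has no namespace, the bare name "opml" is all there is to go on, which is part of why a MIME type or namespace would be the more robust signal.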
Point (b) is rarely hard in general, and in the case of OPML
it's very straightforward. OPML is a simple tree structure with
some links as named attributes and labelled literal strings. This
maps easily into the more general graph model of RDF, with links as
URI-identified resources and the attributes as literal properties.
The only tricky part is the typing of things with the
type attribute, but with clear specs that'd be
straightforward too.
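The mapping described above can be sketched in a few lines of Python without any RDF library. This is only an illustration of the shape of the model, not the XSLT I actually use: the http://example.org/opml# vocabulary, the child property and the Literal wrapper are all hypothetical, and type is simply passed through as a literal rather than interpreted.

```python
import itertools
import xml.etree.ElementTree as ET
from collections import namedtuple

EX = "http://example.org/opml#"  # hypothetical vocabulary, not a published one
Literal = namedtuple("Literal", "value")  # marks an object as a literal, not a resource

def opml_to_triples(opml_text):
    """Flatten the outline tree into (subject, predicate, object) triples."""
    root = ET.fromstring(opml_text)
    counter = itertools.count()
    triples = []

    def node_for(outline):
        # Outlines with an xmlUrl are identified by that URI; others get a blank node label.
        url = outline.get("xmlUrl")
        return url if url else "_:b%d" % next(counter)

    def walk(parent_node, element):
        for outline in element.findall("outline"):
            node = node_for(outline)
            if parent_node is not None:
                # Containership in the tree becomes an explicit arc in the graph.
                triples.append((parent_node, EX + "child", node))
            for name, value in outline.attrib.items():
                if name != "xmlUrl":
                    triples.append((node, EX + name, Literal(value)))
            walk(node, outline)

    body = root.find("body")
    if body is not None:
        walk(None, body)
    return triples
```

With a clear spec for type, the pass-through literal could be replaced by a proper rdf:type arc chosen per type value.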
On (c), personally I'm sick of writing one-off parsers when there are generic RDF/XML parsers available for every language under the sun, so my approach there has been to use XSLT to transform to RDF/XML. I've done OPML feedlists, both in a simple interpretation (opml2blogroll.xsl) and one which interprets the categories found in many aggregators as SKOS terms (opml2skosroll.xsl). Once the data is available as RDF, the rest of the use case is wide open. I've been using it rather predictably in an aggregator which has SPARQL query facilities (the aggregated view is in fact just SPARQL results XSLT'd). I've also had a go at round-tripping OPML via SPARQL+XSLT. A simple feedlist is pretty easy to output, but I've not had the motivation to try anything deeply nested. OPML is already listed (slightly optimistically ;-) at micromodels.org.
A third use case, although probably not a path I'm likely to go down myself, is consistent translation to XHTML. Many of the OPML tools I've seen actually do their rendering as HTML, and there is good justification for making the data available to other tools in a consistent fashion. There are standard ways of expressing the structures OPML uses in HTML (ol, li, a, dl etc), but it would be good to encourage consistent use of these. The XOXO microformat offers an entirely suitable approach: XHTML-based, encouraging use of standards. I originally posted this bit and the profiles idea as a comment over at Nick Bradbury's, where the discussion was about attention data. In the scenario where people were using both Attention.xml and OPML for attention data, having an OPML mapping to XOXO would automatically enable interop.
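As a rough idea of what such a mapping could look like, here is a Python sketch that turns nested outlines into nested ol/li lists with class="xoxo" on the outermost list, using a for any outline that carries an xmlUrl. The choice of which attributes to carry over (just text and xmlUrl) is an assumption made for brevity; a fuller mapping would presumably put the remaining attributes in a dl.

```python
import xml.etree.ElementTree as ET

def outlines_to_list(parent, source):
    """Append an <ol> to parent mirroring the <outline> children of source."""
    children = source.findall("outline")
    if not children:
        return
    ol = ET.SubElement(parent, "ol")
    for outline in children:
        li = ET.SubElement(ol, "li")
        text = outline.get("text", "")
        url = outline.get("xmlUrl")
        if url:
            # Linked outlines become anchors.
            a = ET.SubElement(li, "a", href=url)
            a.text = text
        else:
            li.text = text
        outlines_to_list(li, outline)  # nested outlines become nested lists

def opml_to_xoxo(opml_text):
    root = ET.fromstring(opml_text)
    holder = ET.Element("div")  # temporary container, not part of the output
    body = root.find("body")
    if body is not None:
        outlines_to_list(holder, body)
    top = holder.find("ol")
    if top is None:
        top = ET.Element("ol")  # empty outline -> empty list
    top.set("class", "xoxo")
    return ET.tostring(top, encoding="unicode")
```

If every OPML tool rendered along these lines, anything that can read XOXO could pick the data back up from the rendered page.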
PS. here's an example of the kind of thing I mean, incorporating Nick Bradbury's attention rank suggestion:
<opml version="1.0"
      profile="http://nick.typepad.com/attention#">
  <head>
    <title>Attention Example</title>
  </head>
  <body>
    <outline text="My Rankings" type="attention">
      <outline text="Paolo's Weblog"
               xmlUrl="http://paolo.evectors.it/rss.xml" rank="43"/>
      <outline text="Lessig Blog"
               xmlUrl="http://www.lessig.org/blog/index.rdf" rank="12"/>
    </outline>
  </body>
</opml>
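A consumer of that example might look something like the Python sketch below. It assumes the profile URI is the signal that rank may be read as an integer attention rank, and it makes no assumption about whether higher or lower ranks mean more attention; the function name is invented.

```python
import xml.etree.ElementTree as ET

ATTENTION_PROFILE = "http://nick.typepad.com/attention#"

def attention_ranks(opml_text):
    """Map feed URL -> rank, but only when the attention profile is declared."""
    root = ET.fromstring(opml_text)
    if root.get("profile") != ATTENTION_PROFILE:
        # Without the profile we cannot be sure what "rank" means, so read nothing.
        return {}
    ranks = {}
    for group in root.iter("outline"):
        if group.get("type") != "attention":
            continue
        for feed in group.findall("outline"):
            url, rank = feed.get("xmlUrl"), feed.get("rank")
            if url and rank is not None and rank.isdigit():
                ranks[url] = int(rank)
    return ranks
```

Tools that don't know the profile still see a perfectly ordinary feedlist and can ignore the extra attribute.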