Grazing the SemWeb

In my last post, I asserted that although I like the idea of "grazing", OPML isn't all it's cracked up to be. I also said:

I’ll set up one or two “Reading Lists” just to demonstrate that simple stuff like this can still be done simply, without having to dumb down your world entirely.

Well I haven't done that yet. But I did have a look at the OPod app ("an AJAX OPML and RSS viewer widget") that James Corbett had mentioned. It's a really nice little online data browser.

Here's some data: TimBL's FOAF file. Can OPod understand it? Well, not really. The FOAF file is RDF data serialised as RDF/XML, whereas OPod is really looking for OPML XML.

Although OPML is a sow's ear, such is the capability of the web (and XML) that folks have been able to use it for things quite silk pursey. The built-in demo for OPod is the Open Ireland Directory OPML, and as the demo shows, that's pretty sweet.

Ok, at this point you're probably thinking, oh, here comes that directory in RDF. Not this time (the data is in DMOZ, so presumably quasi-RDF/XML should already be available, or you could GRDDL the HTML, whatever…).

But I am going to repeat myself: there's nowt wrong with hierarchical views of web data. It's only when you start shoehorning web (or general real-world) data into a tree data structure that I reckon you're on the road to suckiness.

So here's TimBL's FOAF as OPML. Had to throw a lot of data away, but it's just about good enough for OPod. (Actually you might be better trying a FOAF file of your own - there's some caching going on, an old version of the OPML seems to be loitering).

Prefix your stuff like this: http://pragmatron.org/live/outline?uri=http://www.aaronsw.com/about.xrdf

My server's slow and flaky, and the code's the result of an hour or so's clunking, but (when it works) it demonstrates that data can be flattened in this way.

PS. Sounds like it fell over right away. Ah well. I've had a day of distractions, this one was marginally more productive than the others. I'd better make a note of how it should work… (bonus points to anyone who can set this up using only existing online services).

Although processing RDF/XML with XSLT is most assuredly possible, there are advantages in dealing with the data at the model level (not least because that's the setup I've got on my server).

So the RDF/XML gets loaded into an in-memory RDF model (Redland/Python), onto which the following SPARQL is applied:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?predicate ?object ?name WHERE
{
  ?subject ?predicate ?object .
  OPTIONAL {
    ?person rdf:type foaf:Person .
    ?person rdfs:seeAlso ?object .
    ?person foaf:name ?name .
  }
}
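
To make the OPTIONAL a bit more concrete, here's a rough pure-Python emulation of what the query selects. It is not the Redland code from the server, just a sketch over a list of plain string triples; the sample data (Alice, Bob, example.org) is made up, and the real thing distinguishes URIs, literals and bnodes rather than treating everything as a string.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical sample data: (subject, predicate, object) as plain strings.
SAMPLE = [
    ("_:me", RDF + "type", FOAF + "Person"),
    ("_:me", FOAF + "name", "Alice Example"),
    ("_:me", FOAF + "knows", "_:b"),
    ("_:b", RDF + "type", FOAF + "Person"),
    ("_:b", FOAF + "name", "Bob Example"),
    ("_:b", RDFS + "seeAlso", "http://example.org/bob.rdf"),
]


def select_rows(triples):
    """Return (predicate, object, name) rows, roughly as the SELECT would."""
    triple_set = set(triples)
    rows = []
    for _subject, predicate, obj in triples:
        # OPTIONAL block: names of any foaf:Person whose rdfs:seeAlso
        # points at this triple's object.
        names = [
            name
            for person, p, name in triples
            if p == FOAF + "name"
            and (person, RDF + "type", FOAF + "Person") in triple_set
            and (person, RDFS + "seeAlso", obj) in triple_set
        ]
        if names:
            # One row per match, like a left join.
            rows.extend((predicate, obj, name) for name in names)
        else:
            # No match: the triple still yields a row, with ?name unbound.
            rows.append((predicate, obj, None))
    return rows


for row in select_rows(SAMPLE):
    print(row)
```

Every triple in the source produces a row; only the ones whose object is some person's rdfs:seeAlso target pick up a name.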

This is currently hard-coded into outline_demplex.py, which is half the code specific to this job. The XML results of the query are passed through results-to-opml.xsl to give the OPML; that's the other half.

If the OPTIONAL above is matched, a link to the remote FOAF file is constructed using the person's name as the label, with the rdf2opml conversion address prepended to the URI. Any other rdfs:seeAlsos get the same link treatment but without the name label. Literals show as text, bnodes too (OPML hasn't a clue). Any other resources in the object position (i.e. URIs) are also used as links, but directly. Except for the FOAF case, the text label on the OPML element is the predicate URI (a uri2qname XSL snippet would be useful there). You could no doubt build mini-hierarchies with the SPARQL/XSLT around people, places or whatever in the source data, though that's getting to the point where you might as well just use RDF throughout (or even domain-specific XML).
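
The mapping rules above can be sketched in Python instead of XSLT. This is not results-to-opml.xsl, just an approximation of the same decisions using the standard library. Assumptions worth flagging: the row shape (predicate, kind, value, name) is made up for the sketch, links are written as type="link" with a url attribute, and literals/bnodes are put in a child outline (the post doesn't say exactly where they end up). The uri2qname helper is likewise only a guess at what such a snippet might do.

```python
import xml.etree.ElementTree as ET

RDFS_SEEALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
CONVERTER = "http://pragmatron.org/live/outline?uri="  # prefix from the post

PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Hypothetical query results: (predicate, kind, value, name),
# where kind is "uri", "literal" or "bnode".
SAMPLE_ROWS = [
    (PREFIXES["foaf"] + "name", "literal", "Alice Example", None),
    (RDFS_SEEALSO, "uri", "http://example.org/bob.rdf", "Bob Example"),
    (RDFS_SEEALSO, "uri", "http://example.org/more.rdf", None),
    (PREFIXES["foaf"] + "homepage", "uri", "http://example.org/", None),
]


def uri2qname(uri, prefixes=PREFIXES):
    """Shorten a URI to prefix:local if a known namespace matches."""
    for prefix, ns in prefixes.items():
        if uri.startswith(ns) and len(uri) > len(ns):
            return prefix + ":" + uri[len(ns):]
    return uri


def rows_to_opml(rows, title="FOAF as OPML", label=lambda uri: uri):
    opml = ET.Element("opml", version="1.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for predicate, kind, value, name in rows:
        if kind != "uri":
            # Literal or bnode: plain text only (child-node layout is a guess).
            node = ET.SubElement(body, "outline", text=label(predicate))
            ET.SubElement(node, "outline", text=value)
        elif name is not None:
            # FOAF case: person's name as label, link routed via the converter.
            ET.SubElement(body, "outline", text=name,
                          type="link", url=CONVERTER + value)
        elif predicate == RDFS_SEEALSO:
            # Other seeAlso: same link treatment, predicate as the label.
            ET.SubElement(body, "outline", text=label(predicate),
                          type="link", url=CONVERTER + value)
        else:
            # Any other URI object: link to it directly.
            ET.SubElement(body, "outline", text=label(predicate),
                          type="link", url=value)
    return ET.tostring(opml, encoding="unicode")


print(rows_to_opml(SAMPLE_ROWS))
```

Passing label=uri2qname swaps the full predicate URIs for shorter labels like foaf:name, which is the kind of thing the uri2qname remark is after.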

Other caveats: the OPod tool decides that the target files are OPML by the filename extension (*splutter*). I was able to trick it at one point with the suffix on URIs of &dummy=.opml, but I can't remember which escaping version worked… I did briefly try this OPML browser, but I think the query part in the URI confused it. Bit more escaping/encoding needed.
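
For what it's worth, one way to build these URLs with the escaping done up front is to percent-encode the source URI before appending it, so that any query part it carries doesn't leak into the outer URL. This is only a sketch: the function name and the opml_suffix flag are invented here, and whether this particular encoding is the one that satisfied OPod is unknown.

```python
from urllib.parse import quote, urlsplit, parse_qs

SERVICE = "http://pragmatron.org/live/outline"  # service address from the post


def outline_url(source_uri, opml_suffix=False):
    """Build a converter URL for source_uri (hypothetical helper)."""
    # safe="" also encodes ':' '/' '?' and '&' inside the nested URI.
    url = SERVICE + "?uri=" + quote(source_uri, safe="")
    if opml_suffix:
        # The extension trick: make the whole URL end in ".opml".
        url += "&dummy=.opml"
    return url


url = outline_url("http://example.org/foaf?id=1&fmt=rdf", opml_suffix=True)
print(url)

# A standard query parser recovers the nested URI intact.
print(parse_qs(urlsplit(url).query)["uri"])
```

The receiving end has to decode the uri parameter, which standard query-string parsing does anyway.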

See also: About the Tabulator

[Danny]

Danny Ayers

2006-02-06T21:26:45Z
