Auto-generating GRDDL transformation skeletons

Crunch first, background after -

The (very incomplete) generator XSLT, rng2grddl.xsl, takes as its input a RelaxNG schema like this from MusicBrainz, plus an XSLT, config.xsl, which defines a few simple attribute/value pairs for the mapping. The generated XSLT takes MB instance data like this (plus the config.xsl again). As it stands, I've got as far as this output:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

<mo:Artist xmlns:mo="http://purl.org/ontology/mo/" rdf:about="http://purl.org/ontology/mo/individuals/c0b2500e-0cef-4130-869d-732b23ed9df5"/>

</rdf:RDF>

It's crude and doesn't really take deep structure into account, but most mappings I've seen have been more or less 1:1 between a given XML term and the RDF term, with a bit of syntax rearranging for striping etc.
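To make the 1:1 idea concrete, here is a minimal sketch in Python (not the XSLT the generator actually emits). The `CLASS_FOR_ELEMENT` table is a hypothetical stand-in for the attribute/value pairs in config.xsl, `SAMPLE` is a simplified, made-up fragment of MB instance data, and `to_rdf` is a name invented for this illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MO = "http://purl.org/ontology/mo/"

# Hypothetical stand-in for config.xsl: XML element name -> RDF class name.
CLASS_FOR_ELEMENT = {"artist": "Artist"}
# Assumed base for minting resource URIs, taken from the output shown above.
BASE_URI = MO + "individuals/"

# Simplified, made-up instance data (real MB XML carries much more).
SAMPLE = """
<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#">
  <artist id="c0b2500e-0cef-4130-869d-732b23ed9df5">
    <name>Example</name>
  </artist>
</metadata>
"""

def to_rdf(instance_xml):
    """Map each configured element with an id to one typed RDF node."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("mo", MO)
    root = ET.Element("{%s}RDF" % RDF)
    for el in ET.fromstring(instance_xml).iter():
        local = el.tag.split("}")[-1]  # drop the namespace part of the tag
        ident = el.get("id")
        if local in CLASS_FOR_ELEMENT and ident:
            node = ET.SubElement(root, "{%s}%s" % (MO, CLASS_FOR_ELEMENT[local]))
            node.set("{%s}about" % RDF, BASE_URI + ident)
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(to_rdf(SAMPLE))
```

Child elements such as the name are ignored here, which is exactly the "doesn't take deep structure into account" limitation mentioned above.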

There are tools such as Trang which can generate a RelaxNG schema given some instance XML, so I think there's potential here for quite a bit of work-saving.

This morning I was greeted by a mail on the Music Ontology Specification Group list from danbri referring to the MusicBrainz folks' decision to move to a non-RDF XML schema. I had heard about that before; I won't comment except to say I think they've likely got their cost/benefit predictions wrong, but time will tell. Anyhow, although the choice will make integration with other data and extensibility trickier within the context of MusicBrainz itself, from the Semantic Web's point of view it isn't a big deal. The post LazyWeb-requested a GRDDL transform to turn MB XML into some RDF representation based on the draft music ontology. With that the data can be transparently RDF, irrespective of the syntax choice.

Now I've got loads of half-finished, domain-specific xyz2rdf transformations. They're usually relatively straightforward to write, although very time-consuming (hence the half-finished). Half the problem is enumerating and building template outlines for all the elements. In the past I've used Relaxer to provide a head start; it can generate an identity transformation based on a RelaxNG schema. Except it don't seem to work no more.

So I decided to see how difficult it would be to write a utility to take a RelaxNG schema and create a first approximation of an XSLT geared towards RDF/XML output for GRDDL. I've not tried generating XSLT with XSLT before, but Bob DuCharme has done some nice coverage at xml.com. I ran into quite a few snags, but basically got this far - see above.
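The "enumerate the elements, emit a template outline for each" step can be sketched roughly as follows. This is Python rather than XSLT-generating-XSLT, so it is not how rng2grddl.xsl works internally; `skeleton_from_rng` and `SAMPLE_RNG` are names and data invented for the illustration, and the sketch only handles elements declared with a `name` attribute.

```python
import xml.etree.ElementTree as ET

RNG = "http://relaxng.org/ns/structure/1.0"
XSL = "http://www.w3.org/1999/XSL/Transform"

# Simplified, made-up schema fragment (not the real MusicBrainz schema).
SAMPLE_RNG = """
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element name="metadata"><ref name="def_artist"/></element>
  </start>
  <define name="def_artist">
    <element name="artist"><attribute name="id"/><text/></element>
  </define>
</grammar>
"""

def skeleton_from_rng(rng_xml):
    """Emit an XSLT skeleton with one empty-ish template per schema element."""
    ET.register_namespace("xsl", XSL)
    names = []
    for el in ET.fromstring(rng_xml).iter("{%s}element" % RNG):
        name = el.get("name")
        if name and name not in names:  # keep first-seen order, skip repeats
            names.append(name)
    sheet = ET.Element("{%s}stylesheet" % XSL, {"version": "1.0"})
    for name in names:
        # Matching on local-name() sidesteps declaring the instance namespace.
        match = "*[local-name()='%s']" % name
        tmpl = ET.SubElement(sheet, "{%s}template" % XSL, {"match": match})
        tmpl.append(ET.Comment(" TODO: map %s to an RDF term " % name))
        ET.SubElement(tmpl, "{%s}apply-templates" % XSL)
    return ET.tostring(sheet, encoding="unicode")

if __name__ == "__main__":
    print(skeleton_from_rng(SAMPLE_RNG))
```

Each generated template just recurses with `xsl:apply-templates`, leaving a TODO where the RDF term would go; filling those in from a config file is the part that saves the typing.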

As usual, I'd be very grateful if someone could take this off my hands and finish it...

 

 


Danny Ayers

2007-01-06T15:54:39+01:00
