I've just been playing around with the w6 idea a little more, thinking about expressing the data as XHTML outlines to start with, adding a bit of CSS for styling, and passing the result to an XSLT stylesheet to extract RDF, a la GRDDL. I don't use XSLT often enough for it to be second nature, so I always find myself learning from scratch and repeating the same mistakes. But I have stumbled on a nice shortcut for getting started.
For today's recipe you'll need a recent Java runtime, Trang and Relaxer, and a stock cube.
First generate a Relax NG schema from a sample of your XML code - this is the bit of XHTML-oriented w6 I started with.
The command couldn't be much easier:
java -jar trang.jar w6.xml w6.rng
Then copy the w6.rng file into the Relaxer directory (assuming you haven't added Relaxer to your path) and run the following command:
java -jar Relaxer.jar -xslt w6.rng w6.xsl
Now in w6.xsl you have, in theory at least, a stylesheet that will apply an identity transformation to the original XML. I may have made a mistake somewhere, but I had to tweak the namespaces when I tried it earlier to get the stylesheet to behave properly (I actually added ns prefixes).
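For what it's worth, here is a minimal hand-written sketch of what an identity stylesheet with namespace prefixes can look like. It is not Relaxer's actual output, and it may not be the same tweak described above. It assumes the sample document uses the standard XHTML namespace and contains li elements; the h prefix is an arbitrary choice. The general rule in XSLT 1.0 is that an unprefixed name in a match pattern only matches elements in no namespace, so XHTML elements need a prefix bound to the XHTML namespace.

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes the input is XHTML; the "h" prefix is arbitrary -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:h="http://www.w3.org/1999/xhtml">

  <!-- Generic identity rule: copy every attribute and node unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- match="li" would only match an li in no namespace;
       h:li matches the XHTML li. Replace the body of this template
       to start emitting something other than a straight copy. -->
  <xsl:template match="h:li">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

As written, both templates just copy their input, so the output should be the same as the source document; the second template is only there as a hook for the first real change.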
Incidentally, if you don't have another XSLT processor handy, Ant is bundled with a perfectly acceptable one; you just need a simple build.xml file. When I can be bothered to look in the manual I'll stick this whole process in the Ant file.
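A rough sketch of such a build.xml might look like the one below. The lib/ jar locations, the target names and the w6-out.xml output file are all assumptions made for illustration; the Trang and Relaxer arguments are simply the ones from the commands above. The generation targets are kept separate from the transform so that re-running the transform doesn't overwrite a hand-tweaked w6.xsl.

```xml
<?xml version="1.0"?>
<project name="w6" default="transform" basedir=".">

  <!-- Assumed jar locations: point these at your own copies -->
  <property name="trang.jar" location="lib/trang.jar"/>
  <property name="relaxer.jar" location="lib/Relaxer.jar"/>

  <!-- Step 1: infer a Relax NG schema from the sample document -->
  <target name="schema">
    <java jar="${trang.jar}" fork="true" failonerror="true">
      <arg value="w6.xml"/>
      <arg value="w6.rng"/>
    </java>
  </target>

  <!-- Step 2: generate the starter stylesheet from the schema
       (same arguments as the Relaxer command line above) -->
  <target name="stylesheet" depends="schema">
    <java jar="${relaxer.jar}" fork="true" failonerror="true">
      <arg value="-xslt"/>
      <arg value="w6.rng"/>
      <arg value="w6.xsl"/>
    </java>
  </target>

  <!-- Step 3: apply the (possibly hand-edited) stylesheet with
       Ant's built-in xslt task; w6-out.xml is an assumed name -->
  <target name="transform">
    <xslt in="w6.xml" out="w6-out.xml" style="w6.xsl"/>
  </target>

</project>
```

With this in place, running ant stylesheet once would produce the starter w6.xsl, and plain ant would run the transform after each edit.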
If you're anything like me then producing the starter XSLT this way will save you a couple of hours. If you get bored, you could see which orifices the stock cube will fit into.
[Danny Ayers]