I've just been playing around with the w6 idea a little more, thinking about expressing the data as XHTML outlines to start with, adding a bit of CSS for styling, and passing the result to an XSLT stylesheet to extract RDF, a la GRDDL. I don't use XSLT often enough for it to be second nature, so I always find myself learning it from scratch and repeating the same mistakes. But I have stumbled on a nice shortcut for getting started.
For today's recipe you'll need a recent Java runtime, Trang, Relaxer, and a stock cube.
First generate a Relax NG schema from a sample of your XML code - this is the bit of XHTML-oriented w6 I started with.
The command couldn't be much easier:
java -jar trang.jar w6.xml w6.rng
Then copy the w6.rng file into the Relaxer directory (assuming you haven't added Relaxer to your path) and run the following command:
java -jar Relaxer.jar -xslt w6.rng w6.xsl
Now w6.xsl contains, in theory at least, a stylesheet that applies an identity transformation to the original XML. I may have made a mistake somewhere, but when I tried it earlier I had to tweak the namespaces (in practice, add namespace prefixes) to get the stylesheet to behave properly.
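For anyone who hits the same namespace snag, here is a minimal hand-written sketch of what a namespace-aware identity stylesheet looks like. It is not Relaxer's actual output, and the "h" prefix and the h:li template are assumptions chosen purely for illustration. The point is that in XSLT 1.0 an unprefixed name in a match pattern means "no namespace", so XHTML elements only match once the XHTML namespace is bound to a prefix.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the "h" prefix is an arbitrary choice, not something Relaxer emits -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:h="http://www.w3.org/1999/xhtml">

  <!-- Identity rule: copy every attribute and node through unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical override: match="li" would never fire on XHTML input,
       because the elements live in the XHTML namespace; match="h:li" does -->
  <xsl:template match="h:li">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

As written, the second template does the same thing as the identity rule; it is only there to mark the spot where element-specific output (RDF, say) would replace the plain copy.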
Incidentally, if you don't have another XSLT engine handy, Ant is bundled with a perfectly acceptable one; you just need a simple build.xml file. When I can be bothered to look in the manual I'll stick this whole process in the Ant file.
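As a rough idea of what that build.xml might look like, here is an untested sketch using Ant's xslt and java tasks. The project and target names, the output filename w6-out.xml, and the assumption that trang.jar and Relaxer.jar sit in the same directory as the build file are all mine; the arguments simply mirror the two commands above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: names and jar locations are assumptions -->
<project name="w6" default="transform" basedir=".">

  <!-- Regenerate the starter stylesheet: sample XML -> schema -> XSLT -->
  <target name="generate">
    <!-- Same as: java -jar trang.jar w6.xml w6.rng -->
    <java jar="trang.jar" fork="true" failonerror="true">
      <arg value="w6.xml"/>
      <arg value="w6.rng"/>
    </java>
    <!-- Same as: java -jar Relaxer.jar -xslt w6.rng w6.xsl -->
    <java jar="Relaxer.jar" fork="true" failonerror="true">
      <arg value="-xslt"/>
      <arg value="w6.rng"/>
      <arg value="w6.xsl"/>
    </java>
  </target>

  <!-- Apply the stylesheet to the sample with Ant's built-in xslt task -->
  <target name="transform">
    <xslt in="w6.xml" out="w6-out.xml" style="w6.xsl"/>
  </target>

</project>
```

Running "ant generate" would overwrite any hand edits to w6.xsl, which is why it is kept separate from the default "transform" target.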
If you're anything like me then producing the starter XSLT this way will save you a couple of hours. If you get bored, you could see which orifices the stock cube will fit into. [Danny Ayers]