My original plan was to hook Chumpalogica into SparqlSphere (baby's named a bad, bad thing). But a couple of hours ago, while I was struggling to figure out what it pickled, I realised the implications of using Mark Pilgrim's Universal Feed Parser. That's a very handy bit of kit if you want to grab anything remotely like RSS and pull out/display the items in feeds. Unfortunately, it's only really set up to handle core RSS/Atom elements, and one thing I need from the start is the ability to handle RSS with arbitrary extensions (e.g. FOAF Output SKOS). I could maybe have had RDF/XML pass through directly, and another alternative would be to use the tag soup support built into Raptor. But there's a little bit of inference (folksonomy tag equivalence) that may be easier to do before the RDF goes in the store - I really don't know yet. So on balance I thought it would be easier just to pump the raw XMLish stuff through my PSoup SAX-like quasi-XML parser-hyphen-cleaner, followed by Morten's anyfeed-to-RSS 1.0 XSLT.
Turns out PSoup didn't work as well as I remembered (no characters, oops), prolly because I never finished namespace support. But I've got the feed reader/transformer bits in place, and I can live with ignoring invalid XML. Not sure how many times I've been around this bit of the circuit, but gzip/if-modified-since/etags support is starting to feel like boilerplate. Overall, though, I'm more or less back at the position I thought I was at this morning. Ah well.
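For anyone who hasn't been round that circuit yet, here's roughly what the gzip/if-modified-since/etags boilerplate looks like. This is only a sketch using Python's standard library, not the actual Chumpalogica code; the function names (`build_request`, `decode_body`, `fetch_feed`) are made up for illustration, and I'm assuming the caller stores the ETag and Last-Modified values between polls.

```python
import gzip
import urllib.error
import urllib.request


def build_request(url, etag=None, last_modified=None):
    """Build a GET request that asks for gzip and is conditional if we have validators."""
    req = urllib.request.Request(url)
    req.add_header("Accept-Encoding", "gzip")
    if etag:
        # Server answers 304 if its current ETag matches this one.
        req.add_header("If-None-Match", etag)
    if last_modified:
        # Server answers 304 if the feed hasn't changed since this date.
        req.add_header("If-Modified-Since", last_modified)
    return req


def decode_body(raw, content_encoding):
    """Undo gzip compression if the server says it applied it."""
    if (content_encoding or "").lower() == "gzip":
        return gzip.decompress(raw)
    return raw


def fetch_feed(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None when nothing changed."""
    req = build_request(url, etag, last_modified)
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            body = decode_body(resp.read(), resp.headers.get("Content-Encoding"))
            return body, resp.headers.get("ETag"), resp.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not modified: keep the validators we already had.
            return None, etag, last_modified
        raise
```

The idea is to call `fetch_feed` on each poll with whatever validators came back last time, and skip the parse/transform step entirely whenever the body comes back as `None`.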