The pipes are calling

Yahoo! Pipes, that is. It's a real shame they ran into problems on their launch day, but that doesn't negate the fact that they seem to have done the first bit of real innovation in the syndication space for years. I couldn't get anything to run myself, but piecing together coverage in the blogosphere, they seem to have grokked a big concept: material produced by machines on the web can be passed to other machines on the web. The material consumed/processed/produced is data. The current lingua franca is RSS 2.0, which seriously limits the scope of a system like this (correction: it seems it can also read Atom - thanks Shelley - and there's some kind of support for passing URIs & JSON, not sure that changes my point though - what's the data model?). But (as I've argued before) there are a lot of paths to the Semantic Web. There is one which passes milestones like RSS and GData on the way (I'm not sure why the likes of Bosworth and Zawodny don't extrapolate along that line, or for that matter notice the short cut). Anyhow, Yahoo! have done amazing work on the UI, and the pipe metaphor works well.

Mark Nottingham makes the (oh yeah) comparison between Pipes and WS-* style services, seeing Pipes as "a big wake-up call for the “Enterprise” software industry":

To my eye, the difference — and the power — in Pipes is that the data model isn’t an XML Infoset, it’s an Atom or RSS feed (usually; most of the modules push you in this direction). That gives you more structure and semantics to grab onto and use in the modules, building more value into standard components, rather than having to go and re-invent the wheel for each application, because they all have different formats.

(Heh, in comments there Bill deHora says he tagged Pipes "BPML2.0", also mentioning DSLs, which I'll take the liberty of translating to CustomRdfDialects...)

not a pipe

OpenLink Pipology

Coincidentally, yesterday I had a little play with OpenLink's latest demo, following the instructions from Kingsley Idehen (now with screenshots). This also features a rich Ajaxian UI and piping, though the piping isn't as obvious as Yahoo!'s (they have got what looks like a potentially very useful visual query builder). The bit that Kingsley's showing here is taking existing web data, running it through GRDDL and applying a SPARQL query to the result. I believe he's got a hookup to GData in the pipeline [sic]. Oops, they've had GData support since GData appeared; it was GoogleBase support that was in the pipeline.

PS. As I was clicking "Post" my Colloquy icon started bouncing:

[1:31pm] kidehen: danja: here is a Googlebase query URI: <http://www.google.com/base/feeds/snippets?bq=digital+camera&max-results=5>

[1:32pm] kidehen: danja: just pass as Graph URI to <http://demo.openlinksw.com/isparql> and you will see Triples from Googlebase :-)
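
For what it's worth, the query URI Kingsley pasted is just a feed address plus a couple of parameters, so it's easy to generate. Here's a small sketch using only the Python standard library. The `googlebase_query_uri` helper name is made up for illustration; only the base address and the two parameter names visible in the URI above are taken as given.

```python
from urllib.parse import urlencode

# Base address copied from the URI in the chat log above.
BASE = "http://www.google.com/base/feeds/snippets"


def googlebase_query_uri(terms: str, max_results: int) -> str:
    """Hypothetical helper: build a query URI of the shape shown in the chat log."""
    # urlencode turns spaces into '+' and joins the pairs with '&'.
    return BASE + "?" + urlencode({"bq": terms, "max-results": max_results})


uri = googlebase_query_uri("digital camera", 5)
print(uri)
# http://www.google.com/base/feeds/snippets?bq=digital+camera&max-results=5
```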

SemWeb Pipes

(I eagerly await Uldis filling in his placemarker for this). The pipes metaphor is entirely implementable right now for any data thanks to Semantic Web technologies. The piping itself is provided by the web's own protocol, HTTP. The nodes being piped together can broadly be categorised as data sources, data sinks and processors.

Example sources are existing web pages, feeds and databases. The data can be translated into an integrateable form (the RDF model) using mechanisms like GRDDL and the various SQL/RDF bridging tools.

Typical processing is aggregation and filtering, plus things like inferencing and application of business rules (using the capabilities of SemWeb languages directly, probably augmented by custom programmatic processing).

Data sinks will typically be UIs, allowing search/query, consumption (data passed into local/remote apps - PIMs are a good example, and there's also reading stuff in a browser!) and navigation (link following) across the available data. It's worth bearing in mind that what's being navigated isn't just the traditional web or feed content, but a dimension that maps to real-world (and imaginary) things directly - this is nicely visualised in these old slides from timbl (seeAlso Linked Data). The source/sink/processor model is a bit of a simplification, because it's possible for the users of the system to interact with it, adding new data (and potentially new processing). This stuff isn't theory/vision; the OpenLink kit is just one example of a tool that can implement it.
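
To make the source/processor/sink split a bit more concrete, here's a minimal sketch in plain Python. Everything in it is invented for illustration: triples are just tuples of strings, the names `feed_source`, `page_source`, `aggregate`, `filter_by_predicate` and `collect_sink` are hypothetical, and the hard-coded data stands in for whatever GRDDL or an SQL/RDF bridge would extract from real sources. A real setup would wire these together over HTTP with an RDF toolkit rather than with function calls.

```python
from typing import Iterable, Iterator, List, Tuple

# A triple is (subject, predicate, object); plain strings stand in for URIs and literals.
Triple = Tuple[str, str, str]

DC_TITLE = "http://purl.org/dc/elements/1.1/title"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"


def feed_source() -> Iterator[Triple]:
    """Hypothetical source: pretend these triples were extracted from a feed."""
    yield ("http://example.org/post/1", DC_TITLE, "The pipes are calling")
    yield ("http://example.org/post/1", DC_CREATOR, "Danny Ayers")


def page_source() -> Iterator[Triple]:
    """Hypothetical source: pretend these triples came from a web page."""
    yield ("http://example.org/page/2", DC_TITLE, "Pipology")
    # Deliberate overlap with feed_source, to show de-duplication on merge.
    yield ("http://example.org/post/1", DC_TITLE, "The pipes are calling")


def aggregate(*sources: Iterable[Triple]) -> Iterator[Triple]:
    """Processor: merge several sources, dropping duplicate triples."""
    seen = set()
    for source in sources:
        for triple in source:
            if triple not in seen:
                seen.add(triple)
                yield triple


def filter_by_predicate(triples: Iterable[Triple], predicate: str) -> Iterator[Triple]:
    """Processor: keep only the triples that use the given predicate."""
    for triple in triples:
        if triple[1] == predicate:
            yield triple


def collect_sink(triples: Iterable[Triple]) -> List[str]:
    """Sink: a stand-in for a UI; returns the object values, sorted."""
    return sorted(obj for _, _, obj in triples)


# Wire it up: two sources -> aggregate -> filter -> sink.
merged = aggregate(feed_source(), page_source())
titles = collect_sink(filter_by_predicate(merged, DC_TITLE))
print(titles)
```

Because every stage consumes and produces the same shape of data, stages can be reordered or swapped without the others noticing, which is the point of having one shared model.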

not a pip

The key to the linkage is the use of URIs to name resources, which includes the sources, sinks and processors. One of the coolest parts of SPARQL is that a query result can have a URI (these URIs are a bit ugly in their natural form, but where humans have to see them they could easily be aliased to something prettier). This is nice-to-have for SELECT queries, the results of which lend themselves to presentation/interaction (being XML/JSON). But with CONSTRUCT queries, the "output" data (a representation of the resource the URI identifies) is available in a directly machine-readable form (RDF/XML). The CONSTRUCT can in itself filter/remap source data, making it very versatile for reuse as a source endpoint.
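
As a rough illustration of a query result having a URI: under the SPARQL protocol a query can be sent with HTTP GET, with the query text in a `query` parameter and the graph to run it over in `default-graph-uri`. The sketch below only builds such a URI with Python's standard library and makes no network call. The endpoint address and source graph URI are placeholders, the `query_uri` helper is hypothetical, and the CONSTRUCT query is an invented example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://example.org/sparql"        # placeholder endpoint, not a real service
SOURCE_GRAPH = "http://example.org/feed.rdf"  # placeholder source data URI

# Invented example: keep only the titles from the source graph.
CONSTRUCT_QUERY = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
CONSTRUCT { ?item dc:title ?title }
WHERE     { ?item dc:title ?title }
""".strip()


def query_uri(endpoint: str, query: str, graph: str) -> str:
    """Hypothetical helper: the URI that names the result of running query over graph."""
    params = {"query": query, "default-graph-uri": graph}
    return endpoint + "?" + urlencode(params)


uri = query_uri(ENDPOINT, CONSTRUCT_QUERY, SOURCE_GRAPH)
print(uri)  # ugly, as promised, but it is a perfectly good URI

# Decoding the URI gives back exactly the query and graph that went in.
decoded = parse_qs(urlsplit(uri).query)
print(decoded["default-graph-uri"][0])
```

Since a URI like this dereferences to RDF, it can be handed to another processor as its source, which is what makes CONSTRUCT handy for chaining.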

Global MVC

Incidentally, there's another programmer-friendly metaphor nearby, I think Kingsley mentioned this on #swig yesterday - MVC. The Model is that of RDF (with domain-specific models expressed that way), the View being projections of that data (SPARQL is in the frame there, allowing interfacing to tools like browsers) and Controller being the wiring, processing and interaction, typically initiated by HTTP method calls. The big difference from OO MVC is that the Semantic Web is a globally distributed object system, and the connections between components/services are very loosely coupled.

Hmm, there's maybe an interesting psycho-social-tech case study somewhere around Idehen & Mazzocchi. Kingsley has gone deep & practical (award-winning deep & practical) into mainstream approaches to data (SQL, XML etc). Stefano's probably best known for his work on Cocoon, going deep & practical (award-winning deep & practical) into mainstream approaches to data (servlets, XML, and pipelines). While both of them continue to use mainstream tools as well, SemWeb tech seems core to what they're doing nowadays. I wonder if they had Road to Damascus events, or just drifted into this area... if the former, perhaps Bosworth and Zawodny could be induced down that road ;-)

 


Danny Ayers
2007-02-09T13:36:31+01:00
