Where does the reasoner go?

Seth Ladd is asking: what is the Best Reasoner Available? There are some very good issues raised in his post, which more or less starts with: "ok we've got the RDF storage (Oracle 10g) - now what?"

One big question is whether Semantic Web applications need reasoners. I'm sure this will vary from case to case. The first kind of situation that springs to mind is a 'pure' SemWeb app, one which depends heavily on the logic, e.g. an expert system using something like the Wine Ontology. At the opposite extreme there will be cases where no inferencing is needed (or realistic), for instance when a company already has its data in a regular RDBMS, with the business logic being expressed through a combination of SQL and object code. The RDF/OWL expression of the data would then effectively be a view.

But back to Seth's situation, where you do have a triplestore. I honestly have no idea what the answer is when you have a largish amount of data and want full inferencing. Coincidentally I was chatting about this over the weekend with Chris Langreiter. I couldn't think of any largish scale triplestore that had built-in RDFS and/or OWL reasoning, unless Jena's OntModel can be persistent. (Hmm, using the DIG interface to reasoners, presumably you'd have to tell them all the statements, which might be tricky if you're talking millions of triples…)

Around this space there are questions of whether the store should do eager or lazy evaluation of the inferences - that is, whether they get computed when you add triples or when you look for them. Eager would presumably be easiest to implement, but would likely suffer from issues due to the increased number of triples present. I'm not really sure how one would manage lazy evaluation of queries in an efficient way that would still be complete. Can tableaux do this? Or maybe a graph path walking approach, something along the lines of Jos De Roo's Euler, might do the trick.
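To make the eager/lazy distinction a bit more concrete, here's a minimal sketch in Python. It only handles one RDFS rule (instances of a class are also instances of its superclasses), it keeps triples in a plain in-memory set, and the function names and the `ex:` data are made up for illustration. It isn't how any particular store or reasoner does it, and it says nothing about the harder question of completeness for full OWL.

```python
from collections import defaultdict

TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def superclasses(triples, cls):
    """Return every class reachable from cls via rdfs:subClassOf (transitively)."""
    parents = defaultdict(set)
    for s, p, o in triples:
        if p == SUBCLASS:
            parents[s].add(o)
    seen, stack = set(), [cls]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def materialize(triples):
    """Eager: add the inferred rdf:type triples once, at load time."""
    closed = set(triples)
    for s, p, o in triples:
        if p == TYPE:
            closed.update((s, TYPE, sup) for sup in superclasses(triples, o))
    return closed


def instances_lazy(triples, cls):
    """Lazy: keep only asserted triples and walk the hierarchy per query."""
    return {
        s for s, p, o in triples
        if p == TYPE and (o == cls or cls in superclasses(triples, o))
    }


# Hypothetical example data, loosely in the spirit of a wine ontology.
asserted = {
    ("ex:Merlot", SUBCLASS, "ex:RedWine"),
    ("ex:RedWine", SUBCLASS, "ex:Wine"),
    ("ex:bottle1", TYPE, "ex:Merlot"),
}

eager_store = materialize(asserted)
eager_answer = {s for s, p, o in eager_store if p == TYPE and o == "ex:Wine"}
lazy_answer = instances_lazy(asserted, "ex:Wine")
```

Both routes give the same answer for this query. The eager store is bigger (it carries the inferred type triples around) but the query is a plain lookup, while the lazy version stores only what was asserted and pays for the hierarchy walk every time you ask.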

I can say from experience that a non-reasoning RDF store can be very useful, especially used as an agile replacement for a regular relational DB. I've also found limited reasoning very valuable - just using MortenF's IFP smusher with Redland made a huge difference to the capabilities (capabilities that can be considered built-in, rather than hanging off in separate arbitrary code).
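For anyone who hasn't met the idea, smushing on inverse functional properties (IFPs) boils down to this: if two nodes share a value for an IFP such as `foaf:mbox`, they must be the same thing, so their statements can be merged onto one node. The following is a rough Python sketch of that idea over an in-memory set of triples. It is not MortenF's code and doesn't use the Redland API; the `smush` function, the blank node labels and the example.org data are all assumptions for illustration.

```python
def smush(triples, ifps):
    """Merge subjects that share a value for any inverse functional property."""
    canonical = {}  # node -> node it has been merged into

    def find(node):
        # Follow merge links until we reach a node that stands for itself.
        while canonical.get(node, node) != node:
            node = canonical[node]
        return node

    owner = {}  # (property, value) -> first subject seen with that pair
    for s, p, o in sorted(triples):
        if p in ifps:
            first = find(owner.setdefault((p, o), s))
            if find(s) != first:
                canonical[find(s)] = first

    # Rewrite every triple so merged nodes are replaced by their representative.
    return {(find(s), p, find(o)) for s, p, o in triples}


# Hypothetical data: two blank nodes describing the same person.
triples = {
    ("_:a", "foaf:mbox", "mailto:danny@example.org"),
    ("_:a", "foaf:name", "Danny"),
    ("_:b", "foaf:mbox", "mailto:danny@example.org"),
    ("_:b", "foaf:weblog", "http://example.org/blog"),
}

merged = smush(triples, ifps={"foaf:mbox"})
```

After the call, everything that was said about `_:b` hangs off `_:a`, so a query for the weblog of the person named "Danny" finds an answer even though no single source asserted both facts about the same node.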

In the foreseeable future on the Web at large I think it likely that non- or just minimal-inferencing RDF stores will make up the majority, analogously to the way most databases/filesystems behind current sites aren't particularly smart.

Seth also asks more generally about SemWeb app architecture. I guess we need a Patterns book… As it happens I've got a longish post in the pipeline for one such kind of setup (content-oriented).

Anyhow, go read Seth's post (and cc me with comments ;-)

[Danny]

Danny Ayers

2006-01-18T11:58:54Z
