WikiCalc and the Semantic Web

Continuing the SemWeb UI theme, in the interests of eyeballs here's something (reformatted) I just mailed to spreadsheet supremo Dan Bricklin (not to be confused with SemWeb supremo Brickley!):

Hi Dan,



I'm in the large group of people looking forward to WikiCalc's completion. There's general agreement that spreadsheets are one of the (if not *the*) most versatile ways of working with data on the desktop. Wikis are in a similar, if less developed, position for the Web.



But most of the Web is oriented towards human-readable content, so there is a step to overcome before we have a more general Web of Data. On the one hand this will involve exposing more databases more directly to the Web (not intermediated by HTML), on the other providing user interfaces which go beyond simple HTML form-based manipulation of data.



I believe many of the difficult technical issues of both aspects are dealt with by Semantic Web technologies, the cornerstone being the RDF model. Key to this model is the use of URIs as identifiers for not only resources but relationships between those resources. So I was mighty interested to read this remark in a blog post about WikiCalc:

I was talking about this with Pito Salas and saying that every cell could point to a URL. He replied that even more importantly, "every cell would have a unique URL."

I suspect that in these words there's a potential route to spreadsheets on the Semantic Web. For starters, in RDF terms of (subject, predicate, object) statements, a naive interpretation of "every cell would have a unique URL" leads to something like:

<http://mydomain.org/spreadsheetX/row123/columnABC> hasValue {content of cell} .

and a naive interpretation of "every cell could point to a URL" translates to:



<http://mydomain.org/spreadsheetX/row123/columnABC> hasValue <http://target.com> .
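A minimal sketch of these two naive readings, written in Python and emitting N-Triples lines. The `hasValue` property URI under example.org and the function names are assumptions made purely for illustration; nothing here reflects how WikiCalc actually addresses or stores cells.

```python
# Hypothetical property URI; the text above only writes an unqualified "hasValue".
HAS_VALUE = "http://example.org/terms/hasValue"


def cell_uri(sheet_uri, row, column):
    """Build a per-cell URI following the row/column layout used above."""
    return f"{sheet_uri}/row{row}/column{column}"


def escape_literal(text):
    """Escape characters that are special inside an N-Triples string literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def cell_triple(sheet_uri, row, column, content, is_link=False):
    """Return one N-Triples statement describing a single cell.

    is_link=True treats the content as a URL the cell points to;
    otherwise the content is emitted as a plain literal.
    """
    subject = f"<{cell_uri(sheet_uri, row, column)}>"
    obj = f"<{content}>" if is_link else f'"{escape_literal(content)}"'
    return f"{subject} <{HAS_VALUE}> {obj} ."


sheet = "http://mydomain.org/spreadsheetX"
print(cell_triple(sheet, 123, "ABC", "some cell text"))
print(cell_triple(sheet, 123, "ABC", "http://target.com", is_link=True))
```

The only difference between the two readings is whether the object is a literal or a URI reference, which is why a single flag is enough in this sketch.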

But it should be possible to further exploit the role of URIs in RDF. Commonly individual spreadsheets correspond more-or-less to relations. The column names are names of (typed) places in n-tuples. The rows correspond to individual asserted n-ary statements.



These relations can be normalised to the binary relations of RDF, so in effect the row identifier becomes the subject, the column name becomes the predicate, and the contents of the cell become the object. [Seen relationally, each predicate is then itself a relation/table with two attributes/columns, subject & object].



An example might look something like:



<http://mydomain.org/spreadsheetX/row123> <http://xmlns.com/foaf/0.1/name> "Joe Lambda" .

<http://mydomain.org/spreadsheetX/row123> <http://financial-ontology.org/salary> "10000" .
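The row-to-subject, column-to-predicate normalisation can be sketched in a few lines of Python. The lookup table from column names to predicate URIs, the dictionary-shaped sheet and the function names are all assumptions for illustration; how a real spreadsheet would declare which vocabulary term a column corresponds to is left open here.

```python
def escape_literal(text):
    """Escape characters that are special inside an N-Triples string literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def rows_to_triples(sheet_uri, column_predicates, rows):
    """Yield N-Triples lines: row -> subject, column -> predicate, cell -> object.

    column_predicates maps column names to predicate URIs (an assumed mapping);
    rows maps row numbers to {column name: cell text}.
    """
    for row_number, cells in rows.items():
        subject = f"<{sheet_uri}/row{row_number}>"
        for column, value in cells.items():
            predicate = column_predicates.get(column)
            if predicate is None or value == "":
                continue  # skip unmapped columns and empty cells
            yield f'{subject} <{predicate}> "{escape_literal(value)}" .'


# Hypothetical sheet contents matching the example statements above.
columns = {
    "name": "http://xmlns.com/foaf/0.1/name",
    "salary": "http://financial-ontology.org/salary",
}
rows = {123: {"name": "Joe Lambda", "salary": "10000"}}

for line in rows_to_triples("http://mydomain.org/spreadsheetX", columns, rows):
    print(line)
```

Each n-ary row is split into one binary statement per mapped column, all sharing the row URI as subject. Empty cells simply produce no statement.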



There are bound to be practical issues, and there are interesting questions raised - how to express calculated cells, how to manage multiple sub-blocks on a single page, how to import arbitrary RDF... But I'm confident that the requirements of WikiCalc include enough (e.g. persisting a description of the data/formulae) that SemWeb interop will be very close.



In implementation terms, it depends a lot on what's behind the scenes of WikiCalc and how data is persisted. But whether it's going into an RDB or is round-tripping to some kind of serialisation (XML or WikiText), the bases should already be mostly covered over in SemWeb-land.



See also:



A bit of commentary on Google and the Semantic Web (esp. last paragraph) 

Grokking Triples from Spreadsheets



Excel RDFizer



Adding Semantics to Excel with Microformats and GRDDL








Danny Ayers
2006-03-31T12:40:19+02:00
