(PPS. The story has now appeared on ZDNet and InfoWorld; remarkably, it's still not yet slashdotted...)
There are gasps of astonishment echoing around the globe today, as...well, as Ian Davis puts it:
I've just watched Mårten Mickos from MySQL give a 10 minute talk [at the Web 2.0 Summit] on what he terms the "Great Database in the Sky" almost exactly describing the our community's vision of a "web of data" while remaining completely ignorant of the semantic web.
I can only guess at the specifics of the talk, but a (somewhat aghast) Nova Spivack pinged me on this, and has a response from Mårten to his blog post on the subject (as has Ian). Nova mentions wheel reinvention. In the reply, Mårten mentions Adam Bosworth's keynote at the MySQL User Conference in 2005 (IT Conversations have the audio). In it (if I remember correctly) Bosworth suggests database interconnection based on an RSS kind of model, in essence the application of DB tech to the vision he described in an earlier talk (transcript) he did for ICSOC04. Not a bad idea, but unfortunately it completely overlooked the progress that has been made on the problems with putting data on the web since, well, since the web began. A handful of metadata fields attached to a blob? We can do a *lot* better than that.
Another point which may be telling in Mårten's mail to Nova is this:
But I also specifically focused on structured data, and even more specifically on data that is currently stored in relational databases.
Right. Just to take one example, I guess he means like the DBLP bibliography database, which is now exposed as semantically-linked data on the web (using MySQL as the backend, funnily enough). Info on more than 800,000 articles and 400,000 authors; that's 15 million RDF triples. Available for consumption by humans or machines on the web, complete with SPARQL query endpoint. This isn't blue sky stuff, it's in the implementation phase now. (Check Ivan Herman's slides.)
The structured/unstructured data divide is something of an artificial line. Within the confines of a local database an address book may have flat data, but when you link it to the rest of the world, it's not so flat any more.
But why oh why don't the SQL folks know about all this? (Ok, leaving aside the fact that the chair of the new Semantic Web Education and Outreach interest group is from Oracle...)
Ian again:
What a shame and what a failure of the semantic web community if the CEO of MySQL AB cannot see how his vision for an interconnected web of data is the same as ours!
Quite.
But all the same I think there may be a very positive side to this. Let's say MySQL were to become a "Great Database in the Sky" completely independent of Semantic Web developments. Even if this was just a tiny bit more connected to the web than a typical MySQL DB is today, integration with the Semantic Web would be fairly trivial (it's not hard now, as the various RDBMS bridges, like the DBLP thing, demonstrate). Why might you want to do this? Because the Semantic Web is designed as an extension of the current web, one which can support data. A large number of the problems have been solved.
Ok, so how would you go about making a distributed RDBMS that might work on a global scale? Well, you might want to start with keys that will work in such an environment. No need to invent any GUIDs, just reuse the web's ID field, URIs. What about table (relation) structure? There's obviously going to be a problem trying to create a top-down schema that could work in such a diverse environment as the world. So you need to break things down into a minimal form, i.e. binary relations, and allow them to be interconnected. How can you enable interlinking on such a scale? Identify the relations with URIs too. Keep going for 5 minutes and you've got RDF. You'd probably want a query language that worked against it too, and maybe even like it to look like SQL. Go on, call it SPARQL. Deploy these on HTTP (which is also based on URIs) and you've got a Web of Data, the Semantic Web. (See also: An Introduction to RDF (slides that should work for DB folks).)
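To make that a bit more concrete, here's a minimal Python sketch of the breakdown described above: one relational row becomes a set of binary relations (triples), with URIs for both the row key and the column names. The `example.org` URIs, the `row_to_triples` and `match` helpers and the sample row are all made up for illustration. This isn't how any particular RDBMS-to-RDF bridge works, and the toy pattern matcher only stands in for a single SPARQL triple pattern.

```python
# Hypothetical base URI; any URI space you control would do.
BASE = "http://example.org/db/"


def row_to_triples(table, key_column, row):
    """Split one relational row into (subject, predicate, object) triples.

    The primary key becomes the subject URI and each remaining column
    becomes a predicate URI, so every fact is a standalone binary relation.
    """
    subject = f"{BASE}{table}/{row[key_column]}"
    return {
        (subject, f"{BASE}{table}#{column}", value)
        for column, value in row.items()
        if column != key_column and value is not None
    }


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard.

    A toy stand-in for one SPARQL triple pattern, not SPARQL itself.
    """
    return {
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    }


# Illustrative row, as it might come back from an "author" table.
row = {"id": 42, "name": "A. N. Author", "homepage": "http://example.org/~author"}
triples = row_to_triples("author", "id", row)

# "Which names do we know about?" expressed as a single triple pattern.
names = match(triples, predicate=BASE + "author#name")
```

Since both the subject and the predicates are plain URIs, triples produced this way from different tables, or different databases, can simply be thrown into the same set.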
It's great that the MySQL community is getting exposure to the Web of Data vision, and thinking about how it might be achieved. Meanwhile, let's get this implementation going.
PS. Mad Techie Woman (guess who?!) responds and adds a very good point:
One major difference between the relational model and RDF is that the relational model assumes data agreement before mapping; RDF assumes that data agreement will happen sometime, but isn't too terribly worried about it because any data is welcome, and we can use the data we have now while we work things through.
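A rough sketch of that point in Python, treating each source as a plain set of triples. The URIs, the two sources and the linking predicate are all invented for illustration. Because a graph is just a set of statements, two sources that never agreed on a schema can be merged with a set union, and you can query what's there straight away.

```python
# Two hypothetical sources describing the same person with different vocabularies.
PERSON = "http://example.org/people/alice"

source_a = {
    (PERSON, "http://example.org/vocab-a#name", "Alice"),
}
source_b = {
    (PERSON, "http://example.org/vocab-b#fullName", "Alice Example"),
    (PERSON, "http://example.org/vocab-b#email", "alice@example.org"),
}

# No shared schema is needed to combine them: merging is just set union.
merged = source_a | source_b

# Everything known about the person so far, whichever source said it.
about_person = {(p, o) for (s, p, o) in merged if s == PERSON}

# Agreement can come later, as one more statement saying the two name
# properties mean the same thing (this linking predicate is made up too).
merged.add((
    "http://example.org/vocab-a#name",
    "http://example.org/vocab#sameMeaningAs",
    "http://example.org/vocab-b#fullName",
))
```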
Andrew Newman points to a few other recent articles about people discussing the same ideas as the Semantic Web. Andrew deserves a special hat-tip here having just published his Honours thesis on a very relevant topic: Applying the relational model to SPARQL (pdf).