A little follow-up to my post Everyone has a Graph Store. Two main things: looking at those graphs from a different perspective and a little initiative I'm putting forward to try and advance a particular aspect of this stuff. (PS. I've gone on about the first point a lot longer than intended and the dogs need walking, so I'll leave the second thing for another day - in lieu of that check SPARQL Box).
"Graphs" are just Structured Data
Given the response I got on Twitter, G+ etc. there must have been something right about that post, but the most interesting feedback I got relates to what was wrong with it. Specifically from Kingsley Idehen (@kidehen):
do the people we need to engage really care about the facts that they've been using 'Graphs' forever? I don't think so. Why not remind them of the fact that they've been working with structured data forever, but in silos prior to the emergence of the ubiquitous Web.
I was bandwagoning on the graph meme, in the sense of the Social Graph that's been talked about a lot in recent years, along with things like Tim Berners-Lee's description of the WWW as the Giant Global Graph. I also had in mind the concrete notion of the graph as found in RDF. But Kingsley's absolutely right to point out that what we're talking about here is really just structured data and how we use it.
I'll borrow a little from Kingsley's own history to help clarify the point. Go back two decades and you'll find Kingsley starting a company (which became OpenLink) focused on data integration software. Their products were middleware that allowed connections to be made between various kinds of enterprise databases and applications. They were based on industry standards, allowing pluggability between systems (acronym city: SQL, XML, ODBC, JDBC, OLE, ADO...). Kingsley had recognised there was a market for this stuff because, in essence, being able to connect different systems together significantly increased the value and utility of those systems - the whole being greater than the sum of its parts.
Fast-forward to say a decade ago, and a new kind of data integration was becoming feasible - using the Web. Rather than using standards designed for connecting specific enterprise tools together, this exploited open, global standards, notably URLs and HTTP. While XML was (and is) useful for this purpose (and HTML also has its uses), the emerging Resource Description Framework has Web technologies as its foundations, so is ideally suited for integrating data in this environment. Seeing the advantages of using not only Web technologies as middleware but also the Web as a database in its own right, Kingsley ensured his company was an early adopter and they've been at the forefront of the development of linked data ever since.
But there's a lot more to this than enterprise databases.
Local Structured Data
Every time we use a computer we are working with structured data. Even if it's just Word documents on a file system, there are relationships and interactions between the pieces of information we're working with. Take a look at your Start Menu or whatever the OS X Toolbar is called: every one of the applications there uses data in a structured fashion. While there will be some system-wide integration of their data, e.g. in allowing intelligent search, essentially each application operates in its own little isolated world.
Back to the Web again and we see all the different companies, services and applications operating in a similar fashion, commonly referred to as data silos. But the take-home here, as Kingsley puts it, is that we've all been using structured data forever. The challenge for the next generation of software, whether we interact with it on our cell phone, laptop, desktop, domestic appliance or the Web, is genuine integration. The best integration capability we have to date is through Web technologies.
Here I'll quote Kingsley again (from G+). He's talking in the context of linked data advocacy, but the point he makes is a much broader, practical one:
Basically, we should be demonstrating 'Linked Data Inside' effects on existing apps (Access, File Maker, Excel, Google Spreadsheet etc..). Here's the pleasant surprise and one of my eternal Linked Data frustrations: each of the native tools above have natural bindings to Linked Data courtesy of:
1. HTTP GET support -- so each Linked Data Resource URL is a Data Source Name, easily comparable to an ODBC/JDBC Data Source Name
2. CSV output support -- meaning to make 3-tuples or 4-tuples and then save to a Text file that is practically N-Triples.
Let's take this opportunity to collectively fix the broken Linked Data narrative. Fixing that will also enable critical fixes to the broken Semantic Web narrative. Everything is a Remix, but Linked Data (the ultimate remix technology) is described or pitched as the ultimate remix facilitator.
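To make Kingsley's second point a bit more concrete, here's a rough sketch of how little stands between a three-column CSV export and N-Triples. It's only an illustration under some assumptions of mine: the function name `csv_to_ntriples`, the example.org URLs and the sample rows are all made up, every subject and predicate is assumed to be a full URL, and any object that doesn't look like an http(s) URL is treated as a plain string literal.

```python
import csv
import io


def csv_to_ntriples(csv_text):
    """Turn CSV rows of (subject, predicate, object) into N-Triples lines.

    Hypothetical helper for illustration only. Assumes subjects and
    predicates are already full URLs; objects that start with http:// or
    https:// become URL references, everything else a plain literal.
    """
    lines = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        subject, predicate, obj = (cell.strip() for cell in row)
        if obj.startswith(("http://", "https://")):
            obj_term = f"<{obj}>"
        else:
            # Escape the characters N-Triples does not allow raw in literals
            escaped = (
                obj.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r")
            )
            obj_term = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {obj_term} .")
    return "\n".join(lines)


# Made-up sample data, as it might be saved from a spreadsheet
sample = (
    "http://example.org/alice,http://xmlns.com/foaf/0.1/name,Alice\n"
    "http://example.org/alice,http://xmlns.com/foaf/0.1/knows,http://example.org/bob\n"
)

print(csv_to_ntriples(sample))
```

The first point (HTTP GET) fits the same picture: if the CSV lived at a URL rather than in a local file, fetching it with something like `urllib.request.urlopen` and passing the decoded text to the same function would, in principle, be all the "driver" you'd need.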
More generally, in other words, the future is already here (it's just not very evenly distributed). Referring back to my previous blog post, you can legitimately search & replace "Graph" with "Structured Data".