Crystal

Jeez, I've done a lot of waffling over the past few days. A week or two with little opportunity to blog and look what happens. Here's another one.


Les Orchard:

Why don't we have a Xanadu web run on Lisp serving up perfect, crystalline RDF?

His rhetorical point is that successful technologies tend to be a bit dirty. In answer to the tone of Les' question, I'd say "because the real world isn't like that, it's a fractal mess".

It's a nice quote, but a little misleading. For a start there are aspects of the web that look like Xanadu (online aggregators feature pages built from parts drawn from remote sources, trackback is a kind of two-way linkage), some bits of the web run on Lisp, and there's an increasing amount of RDF around. That the web can support these isn't really a reflection of dirtiness; it's more that the web doesn't impose too much up front.

Coincidentally, Dan Connolly mentions a 1997 article of his today, which includes the classic:

[the Web is] the minimum amount of distributed object technology necessary to get the job done

It seems to me that the dirtiness visible in many successful technologies is more an artifact of their success than any kind of causal factor. The technology had features that meant it was fit for some purpose in a particular environment at a particular period in time. The desirability of achieving that purpose outweighed any other considerations; any faults were carried along like an appendix. The technology thrived despite the failings.

But the long-term success of something can be jeopardised by such weaknesses, especially if the environment changes (poor little dodos). The software environment changes relatively rapidly. A technology may be adequate for the task that led to its success, but lack the flexibility to be repurposed. I'm also sure that long term, the alchemy needed to turn fool's gold into gold is a lot more expensive than buying into gold in the first place.

Les was talking about independent developers actively supporting, evangelising even, what are demonstrably badly-designed technologies. He says they've won something from their use. I reckon this is probably very important. One aspect is that the web infrastructure makes it possible to get results, net gain, despite systematic faults in the technologies being used. Another is the developer feedback loop: something works with tech X, they're more likely to use tech X again, and they'll be willing to put up with more coding hardship every time around the loop. There's a proximity thing too: if they need something tech X can't do, they're more likely to look to tech W or Y than A or B. But even in the worst case this isn't going to be a problem for anyone except the individual: it seems pretty clear the web can cope even with large monoculture communities.

I've a feeling the feedback loops at a larger social scale are probably at least as significant as the developer-local ones - what the corporates adopt, dotcomboombust, web2.0memeology.

Hmm. Things still aren't anywhere near that simple. In the human sphere the system can be gamed: advertising and misinformation can tip the balance towards one product over another. Chance plays a part. Oh yeah, and there are a lot of different measures of success. There are a lot of situations where Lisp would be chosen long before PHP.

Whatever, variety is the spice of life. Personally I don't think I could stay sane working with the same stuff all the time, but I still don't really spend enough time playing with other stuff. For the past few years, following any foray outside I've always returned to pretty much the same set of tools: Web, XML, RDF, object-oriented coding. But it is invigorating to play with other languages or whatever, there's always something useful to bring back.

I must confess to strongly preferring Lisp over PHP, although I'm miles behind current practice with the former and find the latter mighty handy for transclusions. But naturally it's the RDF part of Les' line I really want to seize upon.

I find the portrayal of RDF as crystalline pretty funny. My own mental picture of it is quite the opposite, the far side of fractals: lumpy, furry, organic.

[Image: fiskurs the zombie cat]

Sure, there is the comparative purity of the formal logic underneath; actually the DL (description logic) side of things would probably smell crystalline to a synaesthete. But where RDF shines is in its ability to cope with messy stuff. In fact I nabbed the phrase "fractal mess" from a TimBL slide targeting the One Big Ontology myth.

A fairly large proportion of the web is backed by relational databases, which are also (in principle) based on "clean" logic. But the data they contain can't be exposed directly to the web because no matter how perfect they are locally, that perfection can't scale in an imperfect environment. If you started joining a few randomly chosen DBs you'd soon run into not only naming clashes but semantic clashes (both in the human what-I-mean sense and the mathematical sense).

The RDF approach is firstly to use the naming scheme of the web - URIs - something which has been shown to scale. A level of global consistency is necessary for interop, but as uniformity increases so does brittleness: monocultures can collapse. What's more, the world (including the software bits) is far from homogeneous. Uniformity is a feature of standards, but the standards that work best are scoped to just what is absolutely necessary for their role. The layering of specifications (like Atom on XML on Unicode) helps maximise the benefits of uniformity without sacrificing diversity.

RDF is pretty much at the point of the minimum structure needed to express data - the simple three-part statement (subject, predicate, object). The need for complete global consistency above this simple level is avoided by allowing selective "templating" of the data according to local needs (i.e. the use of ontologies and processing logic).
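To make the three-part statement a bit more concrete, here's a rough sketch in plain Python, no RDF library involved. The namespaces, the "title" property and the data are all invented for illustration; the only point is that two sources can both talk about a "title" without colliding once the names are full URIs.

```python
from typing import NamedTuple, Set


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespaces, made up for this example.
PEOPLE = "http://example.org/people/"
LIBRARY = "http://example.com/library/"

# Two independent sources that both use a property called "title",
# one meaning an honorific and the other the name of a book.
people_data = {
    Triple(PEOPLE + "alice", PEOPLE + "title", "Dr"),
}
library_data = {
    Triple(LIBRARY + "book7", LIBRARY + "title", "A Book About Cats"),
}

# Merging is plain set union; no shared schema has to be agreed first.
merged = people_data | library_data


def values(graph: Set[Triple], predicate: str) -> Set[str]:
    """Return every object attached to the given predicate URI."""
    return {t.obj for t in graph if t.predicate == predicate}


# The two kinds of "title" stay distinct because their URIs differ.
print(values(merged, LIBRARY + "title"))
print(values(merged, PEOPLE + "title"))
```

Had both sources used the bare string "title", the merge would have quietly mixed honorifics with book names, which is the kind of clash you'd hit joining randomly chosen databases.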

Probably the most elegant feature of the Web is the 404. By allowing "Not Found" as a valid behaviour, the system as a whole can be robust. I think the open world assumption of the Semantic Web has the same characteristics. Traditional databases offer true (a query solution, a piece of data exists) and false (it doesn't). But on a global scale how can you be sure there isn't the piece of data you're looking for somewhere out there? The open world assumption offers true and unknown.
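The difference between the two assumptions can be sketched in a few lines of Python. This is a toy, not how any real triplestore or reasoner works: the function names `ask_closed` and `ask_open`, the URIs and the data are all hypothetical, and `None` is just standing in for "unknown".

```python
from typing import Optional, Set, Tuple

Statement = Tuple[str, str, str]

# A small local collection of statements (made-up URIs and values).
local_data: Set[Statement] = {
    ("http://example.org/post/crystal",
     "http://example.org/terms/author",
     "Danny"),
}


def ask_closed(graph: Set[Statement], statement: Statement) -> bool:
    """Closed world: anything not in the local data is treated as false."""
    return statement in graph


def ask_open(graph: Set[Statement], statement: Statement) -> Optional[bool]:
    """Open world: absence only means 'unknown', represented here as None."""
    return True if statement in graph else None


missing = ("http://example.org/post/crystal",
           "http://example.org/terms/license",
           "CC-BY")

print(ask_closed(local_data, missing))  # False: the local store says no
print(ask_open(local_data, missing))    # None: might exist somewhere else
```

Like the 404, the "unknown" answer lets the caller carry on and look elsewhere rather than treating the gap as a contradiction.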

(Just as you probably want a book expressed in hypertext to be complete, you can make the closed world assumption locally to manage, query and reason over a specific bunch of RDF data, e.g. by storing it in a relational database.)


Danny Ayers
2006-03-19T21:16:23+01:00
