"Why don't we have a Xanadu web run on Lisp serving up perfect, crystalline RDF?"
Because outside a controlled lab environment all computations and actions are time and resource constrained. That favours a certain style of agent/process over another.
Not appreciating this is the fundamental mistake of classical AI, and it sucks to some extent to see the children of that movement make the same mistakes all over again for the semantic web. They assume infinite time and resources and scope to get it done right. Except you don't get the chance to do it right; you get killed or obsoleted waiting around for circumstances instead of adapting to them. When economists do this it's called "assume we have a can opener". When software types do it, it's called "worse is better".
"Probably the most elegant feature of the Web is the 404. "
Calling it elegant is revisionist. 404 was considered a nasty hack by the kind of people who build things in labs and whine when their creations are clearly correct, but get ignored in the field. Ted couldn't contemplate 404 - game over for Xanadu.
"The open world assumption offers true and unknown."
I think this could be a semweb 404 option - to heck with the open world assumption. And decidability and syntax neutrality as well. Unknown = false is a really good optimisation for local cases (where all the inference is done anyway) and for time/resource constrained decision making. Your ancestors would have been eaten if all they could assume was "unknown". Going around wondering "what if" is no way to get work done.
In the semweb case, if new information arrives and contradicts the false assumption, well, that sucks from a formal viewpoint, but there are klugey things we can do to fix up the datasets. I had a great book on how to do this once for rule bases, damned if I can find it.
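Here's roughly what I mean by unknown = false plus a klugey fix-up, as a minimal Python sketch. The TripleStore class, its method names and the ex: identifiers are all made up for illustration; this is not any real RDF library's API, and the "revision" step is about the crudest one possible (drop the stale assumption and tell the caller it happened).

```python
from enum import Enum


class Truth(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


class TripleStore:
    """Toy in-memory store (hypothetical, for illustration only)."""

    def __init__(self):
        self.facts = set()          # triples we have actually been told
        self.assumed_false = set()  # triples we answered "false" by default

    def ask_open(self, s, p, o):
        # Open world: absence of a triple tells us nothing.
        return Truth.TRUE if (s, p, o) in self.facts else Truth.UNKNOWN

    def ask_closed(self, s, p, o):
        # Closed world: treat "not stated locally" as false, but remember
        # that we did so, in case the world disagrees later.
        triple = (s, p, o)
        if triple in self.facts:
            return Truth.TRUE
        self.assumed_false.add(triple)
        return Truth.FALSE

    def add(self, s, p, o):
        # Returns True when the new fact contradicts an earlier
        # "false by default" answer, so the caller can redo that decision.
        triple = (s, p, o)
        self.facts.add(triple)
        contradicted = triple in self.assumed_false
        self.assumed_false.discard(triple)
        return contradicted


store = TripleStore()
store.add("ex:doc", "ex:author", "ex:ted")

open_answer = store.ask_open("ex:doc", "ex:license", "ex:cc")      # Truth.UNKNOWN
closed_answer = store.ask_closed("ex:doc", "ex:license", "ex:cc")  # Truth.FALSE

# New information arrives and contradicts the earlier assumption.
needs_fixup = store.add("ex:doc", "ex:license", "ex:cc")           # True
```

The closed-world answer is something you can act on right away; the price is bookkeeping for the cases where you guessed wrong. The open-world answer never has to be retracted, but it never lets you decide anything either.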
"But on a global scale how can you be sure there isn't the piece of data you're looking for somewhere out there? "
Why care about this? In practical terms open world means information is not going to be rendered as usable now, because things might change. How does that help anyone make decisions?
"The model itself is conceptually simple"
Yeah, but if what you said was true, we'd be awash in RDF-backed systems. We're not. Crap like RSS and microformats can come along and wipe the floor with RDF in terms of deployments. RDF is not getting adopted en masse, and RDF is as old as XML, as near as makes no difference. Something's fundamentally wrong.
Perhaps what gets adopted reflects a different notion of simple, one RDF doesn't have. Certainly I see nothing coming down the line that will result in an explosion.