Seconds after I twittered the last post, @LeeFeigenbaum responded.
Ok, so I have two patterns, and I want to find the statements that match either pattern but don't match both. The solution is a rather flexible little idiom for this kind of negation. The specific patterns are:
?set dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
and
?set a yagoc:ProgrammingLanguage106898352 .
(I'm running this against DBpedia.)
Lee's solution is:
PREFIX dbpp: <http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>
SELECT count(?set) where {
  {
    ?set dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
    OPTIONAL {
      ?set a ?marker .
      FILTER(?marker = yagoc:ProgrammingLanguage106898352)
    }
    FILTER(!bound(?marker))
  } UNION {
    ?set a yagoc:ProgrammingLanguage106898352 .
    OPTIONAL {
      ?set dbpp:wikiPageUsesTemplate ?marker .
      FILTER(?marker = <http://dbpedia.org/resource/Template:Infobox_programming_language>)
    }
    FILTER(!bound(?marker))
  }
}
(Note that COUNT isn't (yet) standard SPARQL, but seeing the size of the result sets was handy here).
It looks convoluted, but each half of the UNION is kind-of the converse of the other (and will give interesting results independently). I was a little surprised that it worked, as variables are scoped to the whole query and ?marker looked troublesome. But FILTERs are scoped to the local group, and that's where it matters here (it would produce the same results if you used a different variable for each half of the UNION).
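To make the idiom a little more concrete, here is a rough sketch in plain Python rather than SPARQL. The two sets and the helper name `unmatched` are made up for illustration and have nothing to do with the real DBpedia data; the point is only that each half of the UNION keeps the rows where the OPTIONAL part failed to bind ?marker.

```python
# Hypothetical toy data: which resources match each pattern.
uses_template = {"Python", "Haskell", "Lisp"}
typed_as_language = {"Haskell", "Lisp", "Prolog"}


def unmatched(required, optional):
    """Mimic: required pattern, OPTIONAL { other pattern }, FILTER(!bound(?marker))."""
    rows = []
    for item in sorted(required):
        # OPTIONAL: marker is bound only when the optional pattern also matches.
        marker = item if item in optional else None
        rows.append((item, marker))
    # FILTER(!bound(?marker)): keep only rows where the optional part did not match.
    return [item for item, marker in rows if marker is None]


# Each half is evaluated on its own, so reusing the name "marker" is harmless.
left = unmatched(uses_template, typed_as_language)
right = unmatched(typed_as_language, uses_template)

# UNION just concatenates the solutions from the two halves.
result = left + right
print(result)
```

Because each half already excludes anything that matches the other pattern, the two halves can never overlap, so no DISTINCT is needed on the combined result.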
There is something slightly odd happening in this particular case (or I'm missing something obvious). The figures I got before were 762 matches for the UNION of the two patterns, 178 for the intersection, so I'd have expected 762 - 178 = 584 results, but this gives 406. So there's a bit of sloppy QED around here. I was missing something obvious.
Lee again, via Twitter: "the numbers look perfect to me - the 762 double-counts the 178 in the intersection. 406+178=584"
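Lee's arithmetic is easy to check. The sketch below is plain Python, not SPARQL; the three counts are the ones quoted above, and the tiny sets `a` and `b` are made-up stand-ins used only to show the same relationship on a small scale.

```python
# Counts taken from the queries discussed in the post.
union_with_dupes = 762  # UNION without DISTINCT counts |A| + |B|
intersection = 178      # resources matching both patterns

# Everything in the intersection appears twice in the plain UNION,
# so the distinct union drops one copy of each.
distinct_union = union_with_dupes - intersection

# "Either but not both" drops the intersection entirely.
either_not_both = distinct_union - intersection

print(distinct_union, either_not_both)

# The same relationship on a hypothetical toy example.
a = {1, 2, 3}
b = {3, 4}
assert len(list(a) + list(b)) - len(a & b) == len(a | b)
assert len(a | b) - len(a & b) == len(a ^ b)
```

So the 584 I expected is the size of the de-duplicated union, and the 406 is what is left once the 178 shared matches are removed from it.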
As @glenn_mcdonald and Lee have pointed out, a DISTINCT would fix my original UNION query to exclude the dupes. Glenn also offers a more concise version taking advantage of a Virtuoso feature:
PREFIX dbpp: <http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>
SELECT count(?set1) where {
  {
    ?set1 dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
    FILTER NOT EXISTS {?set1 a yagoc:ProgrammingLanguage106898352}
  } UNION {
    ?set1 a yagoc:ProgrammingLanguage106898352 .
    FILTER NOT EXISTS {?set1 dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language>}
  }
}