SPLUCKY FSMs

The other day I suggested Lucky SPARQL (now SPLUCKY, thanks Kingsley), a little convention in which a SPARQL query with an additional parameter, if its results were a single URI, would redirect to that URI. One detail I'd overlooked is that it would also be desirable to specify the Accept: type passed to the ultimate URI, but that could be given easily enough in the query parameters.

Anyhow, insomnia just brought this thing back to mind, and it occurred to me that this form of query -

http://example.org/sparql?query=SELECT+DISTINCT+... &action=redirect

- could also appear as resources in the dataset being queried and hence appear as the result of a SPLUCKY query. So one SPLUCKY query could pull out another, which would do the auto-GET thing for another and so on, hopping from URI to URI. Rather like a finite state machine.

Ok, observant readers may have noticed that this is just a glorified redirect chain/loop, not quite an FSM. But it's not far off! [said with the enthusiasm of someone who's successfully built a nearly perpetual motion machine]. In fact the shape of it (combined with insomnia) brought to mind the Turing Machine. But as you need to write to the tape, and idempotent GETs won't allow that, implementing that with SPLUCKY seems a non-starter. Although it did occur to me that if you flip it over, the URI itself could correspond to the tape, which can be modified. Whatever, these kinds of machines are very similar to the way R/W Web activity happens, the similarity being more evident when shuffling resources/statements as with SPARQL 1.1. (A nice insomniscient thought is that the Web is just a fancy Turing Machine tape, and humans are the rules tables...).
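To make the chain/loop behaviour concrete, here's a rough Python sketch. It doesn't touch a real endpoint: the `lucky_result` dictionary is a made-up stand-in that maps each SPLUCKY query URI to the single URI its results would redirect to, and `follow_splucky` and `max_hops` are names invented for the illustration.

```python
def follow_splucky(start, lucky_result, max_hops=10):
    """Follow a chain of SPLUCKY-style redirects, stopping on a loop.

    lucky_result is a hypothetical stand-in for an endpoint: it maps a
    query URI to the single URI its results would redirect to.
    """
    path = [start]
    seen = {start}
    current = start
    while current in lucky_result and len(path) <= max_hops:
        current = lucky_result[current]
        path.append(current)
        if current in seen:
            # Back at a URI already visited: the chain is a loop
            return path, "loop"
        seen.add(current)
    if current in lucky_result:
        # Gave up before reaching a URI with no further redirect
        return path, "max hops"
    return path, "halted"
```

A chain that ends at an ordinary document comes back as "halted"; two queries that point at each other come back as "loop", which is about as far as the FSM analogy stretches without a writable tape.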

Incidentally, although SPARQL has a media type "application/sparql-query", I couldn't see any ref in the specs to using the URI of a query in place of the query itself in the query parameters (!), i.e. something like:

endpoint/query?queryURI=http://example.org/abc.rq

Thought I'd seen that set up on an endpoint somewhere, is it specified anywhere?

Comments to G+ please


danja
2012-04-13T04:55:00+01:00
fsm turing sparql machine rdf splucky

Lucky SPARQL

tl;dr : how to give SPARQL endpoints an "I'm Feeling Lucky" option and hence support things like WebFinger

Take a query like:

SELECT DISTINCT ?blog WHERE {
   ?person foaf:name "James Snell" .
   ?person foaf:weblog ?blog .
}
LIMIT 1

If I'm asking something like that, then what I'm probably trying to achieve is to get to James' blog. But if I use that on an endpoint, what I'll get back is a bunch of XML (or JSON), from which I'll have to parse out the URI, then fire off another GET. So what about having the endpoint server support an additional parameter, something like:

http://example.org/sparql?query=SELECT+DISTINCT+... &action=redirect

which would tell the server to pull out the URI in the results, and return:

HTTP/1.1 302 Found
Location: http://chmod777self.blogspot.com

- thus taking me straight to my actual target.
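Server-side, the decision could look something like the Python sketch below. It assumes the results are already available in the standard SPARQL JSON results format; `lucky_response` is a name made up for the illustration, and answering 404 for anything other than exactly one URI is just one possible choice.

```python
def lucky_response(results):
    """Return (status, headers) for SPARQL JSON results with action=redirect."""
    bindings = results.get("results", {}).get("bindings", [])
    # Collect every distinct URI value bound anywhere in the result rows
    uris = {
        term["value"]
        for row in bindings
        for term in row.values()
        if term.get("type") == "uri"
    }
    if len(uris) == 1:
        return 302, {"Location": uris.pop()}
    # No URI, or an ambiguous result: nothing sensible to redirect to
    return 404, {}
```

With the query above returning one ?blog binding, this gives back the 302 and Location header shown; an empty or ambiguous result falls through to the 404.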

WebFingering

I've had James Snell's proposal for simplifying WebFinger simmering away in the back of my mind. I'm unconvinced by the architectural style of what he suggests (Gopher?), but he does get bonus points for creativity. (See also James' response on that). In the query above I've used foaf:name which is likely to give ambiguous results. But if it was foaf:mbox_sha1sum instead, you've got a mechanism for WebFinger with James' optimization. Ok, the request URI is a bit cumbersome, but templating a short version for special cases like WebFinger would be easy enough.

PS. A better name might be "Optimistic SPARQL" (and probably return a 404 if the query doesn't return a suitable pattern).

Comments to G+ please


danja
2012-03-29T15:51:36+01:00
sparql semweb rdf gopher webfinger

Debunking the 27 Club with SPARQL

This morning I stumbled across a Fortean Times piece about the "27 Club". The story goes that an awful lot of popular musicians have died at the age of 27. A recent new member of the club is Amy Winehouse, and there was a notable cluster with Brian Jones, Jim Morrison, Jimi Hendrix and Janis Joplin all joining around 1970. The idea of the 27 Club appears to have started soon after Kurt Cobain's death in 1994. Rock mythology being what it is, the origin of the 27 Club is now taken as being bluesman Robert Johnson's pact with the Devil (at the crossroads).

So, is there any truth in this? According to a rock star biographer quoted in Wikipedia "there is a statistical spike for musicians who die at 27", but also there's been a British Medical Journal study that showed no such spike. So, contradictory evidence, the jury's still out... But as it happens Wikipedia also has a good collection of data about musicians and that data is available in processing-friendly linked data from dbPedia. So I thought I'd look into this myself.

Long story short, does the highlighted column here look like a spike?

[Chart: lifespan]

I started by finding the Wikipedia page for Kurt Cobain: https://en.wikipedia.org/wiki/Kurt_Cobain. Given that, it's easy to get dbPedia's identifier for the man: http://dbpedia.org/resource/Kurt_Cobain. Opening Kurt's URI in a browser results in a redirect to a page about him (following the 303 convention): http://dbpedia.org/page/Kurt_Cobain. It displays the pieces of data dbPedia knows about him, the properties and their values. From that I was able to see how the relevant facts were expressed, and translate them to the following triples in Turtle notation:

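PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ont: <http://dbpedia.org/ontology/>
PREFIX db: <http://dbpedia.org/resource/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

db:Kurt_Cobain a ont:MusicalArtist .
db:Kurt_Cobain foaf:name "Kurt Cobain" .
db:Kurt_Cobain ont:birthDate "1967-02-20"^^xsd:date .
db:Kurt_Cobain ont:deathDate "1994-04-05"^^xsd:date .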

This is enough to use as a template for a SPARQL query, putting a variable in place of Kurt's identifier. Given the Robert Johnson story it seems reasonable to filter out any musicians born before the 20th century.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ont: <http://dbpedia.org/ontology/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?name ?birth ?death WHERE {
  ?m a ont:MusicalArtist ;
     foaf:name ?name ;
     ont:birthDate ?birth ;
     ont:deathDate ?death .
  FILTER (?birth > "1900-01-01"^^xsd:date)
}

Running that query produces 4099 results, which seemed small enough to handle in a spreadsheet. Had it been a few more I'd probably have opted for the JSON representation of the results and done the processing with a little script. Had the querying been more complex (likely to cause timeouts on dbPedia) I'd probably have had to do some CONSTRUCT queries to extract the chunks of dbPedia of interest in RDF and put those in a local store, running queries against that. But it wasn't and it wasn't, so I ran the query directly, choosing the XML+XSLT stylesheet option to give me results in HTML. These I simply copied from the browser and pasted into a LibreOffice spreadsheet.

The spreadsheet automatically figured out the date format so I was able to get the musicians' ages with a trivial calculation. Sorting on this column revealed that the first 33 entries were duff data, mostly invalid format. Neither Wikipedia nor dbPedia is perfect. But 4066 values, even allowing for a few errors along the way, should be a big enough sample size to test the theory.
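For anyone who'd rather script it than spreadsheet it, the age calculation and the weeding-out of duff dates might look roughly like this in Python. It's a sketch, not what I actually ran: the function names are made up, the input is assumed to be (birth, death) pairs of ISO date strings, and the spreadsheet formula may well have rounded differently.

```python
from collections import Counter
from datetime import date

def age_at_death(birth, death):
    """Whole years between two ISO dates, or None if the data is unusable."""
    try:
        b, d = date.fromisoformat(birth), date.fromisoformat(death)
    except (TypeError, ValueError):
        return None
    if d < b:
        return None
    # Subtract one year if the final birthday had not been reached
    return d.year - b.year - ((d.month, d.day) < (b.month, b.day))

def age_histogram(rows):
    """Count deaths per age, dropping rows with invalid dates."""
    ages = (age_at_death(b, d) for b, d in rows)
    return Counter(a for a in ages if a is not None)

# Kurt Cobain's dates from the triples above
print(age_at_death("1967-02-20", "1994-04-05"))  # 27
```

Feeding the full result set through age_histogram would give the per-age counts behind the chart.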

You can see here another problem with the data - Kurt has two entries. I guess something like Google Refine could be used to tidy this up, but I went with the assumption that such problems would be reasonably evenly distributed. PS. a DISTINCT qualifier in the SELECT clause in the query would be an improvement here, and the ?name bit would be better dropped (it's not needed and introduces duplicates). I used the SNORQL endpoint of dbPedia.

Here's an online Google Spreadsheet derived from my LibreOffice original.

So, results. I'll leave the statistical significance measuring to someone else, but to my eyes at least there doesn't seem to be a spike at 27, with only 40 deaths (there's a bigger version of the chart here). If anything, there may be a spike at the top value, 95 deaths at age 74. There may well be a 27 Club of accursed musicians, but the 74 Club is more popular. I don't have the figures for normal humans, but the BMJ found that "musicians in their 20s and 30s were two to three times more likely to die prematurely than the general UK population".

Keith Richards is 68.

Comments to G+ please.


danja
2012-03-07T16:21:58+01:00
linkeddata 27club sparql rdf data linked dbpedia journalism

Consolidation

A little follow-up to my post Everyone has a Graph Store. Two main things: looking at those graphs from a different perspective and a little initiative I'm putting forward to try and advance a particular aspect of this stuff. (PS. I've gone on about the first point a lot longer than intended and the dogs need walking, so I'll leave the second thing for another day - in lieu of that check SPARQL Box).

"Graphs" are just Structured Data

Given the response I got on twitter, G+ etc. there must have been something right about that post, but the most interesting feedback I got relates to what was wrong with it. Specifically from Kingsley Idehen (@kidehen) :

do the people we need to engage really care about the facts that they've been using 'Graphs' forever? I don't think so. Why not remind them of the fact that they've been working with structured data forever, but in silos prior to the emergence of the ubiquitous Web.

I was bandwaggoning the graph meme, in the sense of the Social Graph that's been talked about a lot in recent years, along with things like Tim Berners-Lee's description of the WWW as the Giant Global Graph. I also had in mind the concrete notion of the graph as found in RDF. But Kingsley's absolutely right to point out that what we're talking about here is really just structured data and how we use it.

I'll borrow a little from Kingsley's own history to help clarify the point. Go back two decades and you'll find Kingsley starting a company (which became OpenLink) focused on data integration software. Their products were middleware that allowed connections to be made between various kinds of enterprise databases and applications. They were based on industry standards, allowing pluggability between systems (acronym city: SQL, XML, ODBC, JDBC, OLE, ADO...). Kingsley had recognised there was a market for this stuff because, in essence, being able to connect different systems together significantly increased the value and utility of those systems - the whole being greater than the sum of parts. Fast-forward to say a decade ago, and a new kind of data integration was becoming feasible - using the Web. Rather than using standards designed for connecting specific enterprise tools together, this exploited open, global standards, notably URLs and HTTP. While XML was (and is) useful for this purpose (and HTML also has its uses), the emerging Resource Description Framework has Web technologies as its foundations, so is ideally suited for integrating data in this environment. Seeing the advantages of using not only Web technologies as middleware but also the Web as a database in its own right, Kingsley ensured his company was an early adopter and they've been at the forefront of the development of linked data ever since.

But there's a lot more to this than enterprise databases.

Local Structured Data

Every time we use a computer we are working with structured data. Even if it's just Word documents on a file system, there are relationships and interactions between the pieces of information we're working with. Take a look at your Start Menu or whatever the OS X Toolbar is called: every one of the applications there uses data in a structured fashion. While there will be some system-wide integration of their data, e.g. in allowing intelligent search, essentially each application operates in its own little isolated world.

Back to the Web again and we see all the different companies, services and applications operating in a similar fashion, commonly referred to as data silos. But the take home here, as Kingsley puts it, is that we've all been using structured data forever. The challenge for the next generation of software, whether we interact with it on our cell phone, laptop, desktop, domestic appliance or the Web is genuine integration. The best integration capability we have to date is through Web technologies.

Here I'll quote Kingsley again (from G+). He's talking in the context of linked data advocacy, but the point he makes is a much broader, practical one:

Basically, we should be demonstrating 'Linked Data Inside' effects on existing apps (Access, File Maker, Excel, Google Spreadsheet etc..). Here's the pleasant surprise and one of my eternal Linked Data frustrations: each of the native tools above have natural bindings to Linked Data courtesy of:

1. HTTP GET support -- so each Linked Data Resource URL is a Data Source Name, easily comparable to an ODBC/JDBC Data Source Name

2. CSV output support -- meaning to make 3-tuples or 4-tuples and then save to a Text file that practically N-Triples .

Let's take this opportunity to collectively fix the broken Linked Data narrative. Fixing that will also enable critical fixes to the broken Semantic Web narrative. Everything is a Remix, but Linked Data (the ultimate remix technology) is described or pitched as the ultimate remix facilitator.

More generally, in other words, the future is already here (it's just not very evenly distributed). Referring back to my previous blog post, you can legitimately search & replace "Graph" with "Structured Data".
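Kingsley's CSV point is easy to try out. The Python sketch below takes CSV rows of subject, predicate, object and writes them out as N-Triples lines. The function name and the rule "objects starting with http are URIs, everything else is a plain literal" are assumptions made for the illustration; real data would need more careful escaping (newlines in literals, for instance) and datatype handling.

```python
import csv
import io

def csv_to_ntriples(text):
    """Turn CSV rows of subject, predicate, object into N-Triples lines."""
    lines = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        s, p, o = (cell.strip() for cell in row)
        if o.startswith(("http://", "https://")):
            obj = f"<{o}>"
        else:
            # Escape backslashes and double quotes inside the literal
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)

print(csv_to_ntriples(
    "http://example.org/alice,http://xmlns.com/foaf/0.1/name,Alice\n"
))
```

Three columns in a spreadsheet, saved as CSV, really are most of the way to a triple store's input format.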


danja
2012-03-01T14:52:17+01:00
box kidehen sparql rdf data linked

Introducing dork

That's Descriptions of Runtime Klasses. Some simple Java for getting RDF out of code trees.

The RDF can be used to generate class diagrams, like this:

[Image: class tree]

An interesting aspect of the Web Beep project is processor pipelines. To optimize things I needed to play with parameters easily so wound up building a system interface covering the processors and pipelines. As it stands in the source now, the configuration is set up from Java structures. But to see what the configuration is, a recursive toString() on the Java structures yields a fairly structured text description of the configuration (there's an example on the How It Works page).

This led me to think that if such descriptions could be used to describe existing configurations, they could also be used to set up those configurations. The format's ad hoc, so first it made sense to look at using something standard. The processor pipelines are essentially graphs (with annotations) so RDF was naturally the hammer I chose. The general processors/pipelines model is encoded (better word?) in the Java class structure, so if I could get that in RDF it'd be a good start. It's general-purpose stuff so I've split it off as a separate project at github and given it a silly name.

This kind of thing's been done before, in fact I'm hoping to incorporate David Huynh's doclet (for use with Javadoc to generate RDF) as well in the near future. But that approach gets its data 'statically' from the source, whereas the parameters at runtime are important for Web Beep's processors etc. I've made a start on the write-up with the code (ermm, Javadoc's todo :), but one key thing is just using a describe() method in the kind of places you might use a toString(). It should return a snippet of Turtle-syntax RDF describing the object in which it appears. I've also made a start on some easy-to-use utility methods that use reflection to extract a description of objects which doesn't rely on them having a describe() method, bit of a lighter touch.
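The reflection idea isn't tied to Java. As a loose analogue only (this is not dork's code or API), here's the same trick sketched in Python: walk an object's runtime attributes and emit a Turtle snippet. The `describe` function, the `http://example.org/` base and the `Lowpass` class are all invented for the illustration, and only simple string and numeric values are handled.

```python
def describe(obj, base="http://example.org/"):
    """Return a Turtle snippet describing obj's runtime attributes."""
    cls = type(obj).__name__
    # id(obj) gives a per-run identifier, enough to name the instance
    subject = f"<{base}{cls}/{id(obj)}>"
    lines = [f"{subject} a <{base}{cls}>"]
    for name, value in sorted(vars(obj).items()):
        if isinstance(value, bool):
            literal = "true" if value else "false"
        elif isinstance(value, (int, float)):
            literal = repr(value)
        else:
            text = str(value).replace("\\", "\\\\").replace('"', '\\"')
            literal = f'"{text}"'
        lines.append(f"    <{base}{name}> {literal}")
    return " ;\n".join(lines) + " ."

class Lowpass:
    """Hypothetical processor with a couple of runtime parameters."""
    def __init__(self):
        self.cutoff = 440
        self.label = "lp"

print(describe(Lowpass()))
```

Because the values are read at runtime, the description reflects whatever the parameters happen to be at that moment, which is the part a static doclet can't see.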

As a sanity check on the generated RDF I made a (pretty trivial) SPARQL query with XSLT transform to GraphViz dot format, the result of which can be used (with straightforward command-line tools) to generate images like the one above. [I remembered half way through that Redland's rapper utility can output dot format, but that's RDFy (see screenshot) and I'm after something much more app-specific.] There's a little script which shows how the image was arrived at.


danja
2012-01-19T22:12:46+01:00
java dork dot diagrams class sparql rdf

Introducing JEdwards

JEdwards is a little sub-project I've just been putting together in Java. Screenshot.

It's so named for two reasons:

  1. it's roughly a contraction of "towards a Javascript editor"
  2. it's something you probably want to ignore (like twincest :)

Having said that, it does have a couple of features that may be of interest to sane developers:

  1. a Java terminal emulator (bash shell)
  2. syntax highlighting for SPARQL/Turtle

Neither is entirely finished, but both are usable/reusable (Apache 2 license, or somesuch).

[Image: evil jedward]

I've been using Eclipse for most of my dev stuff for years now. When I was doing things in Node.js I wound up configuring it to have a file explorer pane, a text editor pane (for Javascript, HTML, Turtle or SPARQL) and three terminal panes all connected to the local shell. Eclipse was basically a (slow) sledgehammer to crack a nut. I did spend a while looking for a way of setting these things up using separate apps, but was beaten by the problem of pinning the windows to the workspace. I believe it should be possible using Devil's Pie or similar, but I had no joy. But as it happened I wanted a terminal emulator in Java anyhow and had played with syntax highlighting before.

In Scute I'd put together some basic highlighting for Turtle, except when I came to look at it again it was a bit too hardcoded to reuse, and Javascript is quite complicated... Looking around I came across jsyntaxpane, which is a pluggable highlighter which takes its config from a JFlex lexer. It'd got the necessary for Javascript, so I decided to use that instead of my hacky code. I found a SPARQL/Flex file on the Web that someone had prepared for IntelliJ IDEA which, although it was geared to do other things, saved me a bit of time writing out the SPARQL patterns. Here's sparql.flex.

For the terminal emulator I started with the JConsole UI from BeanShell, to which I've added the bits which talk to the bash shell. It works ok on this Ubuntu machine, I've no idea what would be needed to set it up for a different OS. The source for that is here.

I started Scute, a desktop RDF toolkit, just over a year ago. I did get some bits working fairly well - I was using the SPARQL bits for real - but then I got distracted and left it largely unusable... This JEdwards bit of coding has got me back into it, and tightened up how I was thinking about the dev process. I must write this up properly. The main idea is, while it should be built from reusable components, the way it's set up as a whole will be optimized for how I want to work. Somewhat inspired by woodcarving, where a lot of the time what's best isn't a general purpose tool (wood router or software IDE) but a highly focused tool (1/4" No.4 fishtail gouge or JEdwards). If the resulting code is useful for other people, great, but the motivation isn't to create a product, just to help my own personal workflow. Horse before cart dogfood.

The reusable components part comes from testing. I'm lazy about tests at the best of times, and Scute is all about GUI so is a bit tricky to test. But I reckon component-level functional tests make a fair substitute for unit tests. Anyhow, more about this another day.


danja
2012-01-18T19:12:14+01:00
scute terminal emulator jedwards sparql turtle syntax highlighter rdf

Hixie's Furniture

Too long; read later - here's a demo : SPARQL Sliders Test

+Ian Hickson posted a lovely semweb use case:

"I'd like a search tool for furniture that works like Google's Flight Search does for flights. That is, with sliders so I can say what type of furniture (table), what range of widths (1-2m), lengths (2-5m), and heights (1-2m), what material (wood), what thickness, what price range, etc, I'd like, with the list of available products updating in real time."

As it happens I wanted a slider thingy ages ago, so this was a good prompt to make a demo of the front end part which takes the values from slider components and uses them in a SPARQL query.

For convenience/lack of available data the demo runs against dbPedia via the SNORQL SPARQL Explorer. As furniture and its dimensions weren't available it uses cities and their populations and elevations.
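The gist of the front end is just filling a query template from the slider values. Here's that step sketched in Python rather than the demo's own code. The function name is made up, and the class and property names (ont:City, ont:populationTotal, ont:elevation) are my assumption about suitable dbPedia ontology terms, not necessarily what the demo uses.

```python
def slider_query(min_pop, max_pop, min_elev, max_elev, limit=20):
    """Build a SPARQL query from two slider ranges (numbers only)."""
    # Coerce to int so nothing but numbers can reach the query text
    values = [int(v) for v in (min_pop, max_pop, min_elev, max_elev, limit)]
    return """PREFIX ont: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?city ?population ?elevation WHERE {
  ?city a ont:City ;
        ont:populationTotal ?population ;
        ont:elevation ?elevation .
  FILTER (?population >= %d && ?population <= %d)
  FILTER (?elevation >= %d && ?elevation <= %d)
}
LIMIT %d""" % tuple(values)

print(slider_query(100000, 500000, 0, 1000))
```

Each slider movement would regenerate the query and send it off to the endpoint; swapping cities for furniture is a matter of changing the template.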

So how would you get real data?

First of all, furniture vendors could either provide dumps of their data or, more Webby, mark up their sites with RDFa and/or HTML5 microdata using e.g. the GoodRelations e-commerce vocabulary.

Ultimately, for a front end like these sliders to work, the data would need to go in a store with a SPARQL endpoint. But, triplestores shouldn't be thought of as just a wacky alternative to a SQL database. A triplestore is just a cache of a little chunk of the Linked Data Web. The question of where the store resides and how the data is collected is entirely open. Following the more traditional DB model, a service might aggregate the data published by known furniture suppliers and provide the endpoint online.

But alternately, a local user agent (I think Chris Bizer had a little Java example, can't find the link...there are others) could crawl the Web to answer the query just-in-time. The advantage of this approach is that it's more thorough and the only real option for totally arbitrary queries, the downside being that its answer will probably take longer than milliseconds. But remember triplestores are caches, not every little bit of information would have to be discovered and read from every page. There are vocabs for dataset and vocab discovery (remind me of the acronyms please :) Note too that you're not limiting your client agent to a single datastore. Traditional backends (SQL or NoSQL) are effectively isolated silos, triplestores are integrated with the links of the Web.

Incidentally, this is something that might be nice to express as a Web Intent, along the lines of "make me a query from this template with these parameters and apply it to this endpoint, putting the results into this widget" (that's a bit verbose for a general-purpose intent, but you get the gist). c.f. RDFAffordances.




danja
2012-01-11T15:01:56+01:00
sparql demo goodrelations rdf hixie furniture

Another SPARQL solution

Bravo! A solution to the latest SPARQL puzzle.

@glenn_mcdonald found a way of getting the non-Roman-god solar system bodies:

PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wn:    <http://www.w3.org/2006/03/wn/wn20/schema/>
PREFIX id:    <http://wordnet.rkbexplorer.com/id/>

SELECT DISTINCT ?planet WHERE {
  ?s1 wn:memberMeronymOf id:synset-solar_system-noun-1 .
  ?s1 rdfs:label ?planet .
  OPTIONAL {
    ?s1 wn:containsWordSense ?ws1 .
    ?ws1 wn:word ?w .
    ?ws2 wn:word ?w .
    ?s2 wn:containsWordSense ?ws2 .
    ?s2 wn:hyponymOf id:synset-Roman_deity-noun-1 .
  }
  FILTER (!bound(?s2))
}

Isolating just the planets looks to be out of reach using the WordNet endpoint alone, but I guess that can be left as a challenge for federated query e.g. CONSTRUCTs from different datasets into a local store before SELECTing.

Update

From RobVesse -

Here's an even simpler query for yesterdays puzzle - still doesn't isolate real planets though

PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wn:  <http://www.w3.org/2006/03/wn/wn20/schema/>

SELECT DISTINCT ?label WHERE 
{
 ?s1 wn:memberMeronymOf <http://wordnet.rkbexplorer.com/id/synset-solar_system-noun-1> .
 ?s1 rdfs:label ?label.
 OPTIONAL
 {
  ?s2 wn:hyponymOf <http://wordnet.rkbexplorer.com/id/synset-Roman_deity-noun-1> .
  ?s2 rdfs:label ?label.
 }
 FILTER (!BOUND(?s2))
}

...plus...

Here's a soln using wordnet and dbpedia to show only planets not named after roman gods, requires a SPARQL 1.1 engine to run

PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wn:  <http://www.w3.org/2006/03/wn/wn20/schema/>

SELECT DISTINCT ?label WHERE 
{
 SERVICE <http://wordnet.rkbexplorer.com/sparql/>
 {
   ?s1 wn:memberMeronymOf <http://wordnet.rkbexplorer.com/id/synset-solar_system-noun-1> .
   ?s1 rdfs:label ?label.
 }
 MINUS
 {
  SERVICE <http://wordnet.rkbexplorer.com/sparql/>
  {
    ?s2 wn:hyponymOf <http://wordnet.rkbexplorer.com/id/synset-Roman_deity-noun-1> .
    ?s2 rdfs:label ?label.
  }
 }
 BIND(URI(CONCAT("http://dbpedia.org/resource/", ?label)) AS ?dbpResource)

Here's a suitable engine: Leviathan (a demo of the SPARQL Engine used in dotNetRDF).


danja
2011-04-12T20:19:30+01:00
sparql puzzle rdf

Another SPARQL puzzle

Using the WordNet endpoint at http://wordnet.rkbexplorer.com/sparql/ I can get the names of the solar system bodies that are named after Roman gods with :

PREFIX rdfs:		<http://www.w3.org/2000/01/rdf-schema#>
PREFIX wn:	<http://www.w3.org/2006/03/wn/wn20/schema/>

SELECT DISTINCT ?label WHERE {
?s1 wn:memberMeronymOf <http://wordnet.rkbexplorer.com/id/synset-solar_system-noun-1> .
?s1 rdfs:label ?label.
?s2 wn:hyponymOf <http://wordnet.rkbexplorer.com/id/synset-Roman_deity-noun-1> .
?s2 rdfs:label ?label.
}

The challenge is to get the names of the solar system bodies that aren't named after Roman gods. (Ideally I'd like planets in the solar system... rather than ...bodies, but I can't see a suitable class).


danja
2011-04-11T20:42:00+01:00
sparql puzzle rdf

Pattern exclusion in SPARQL

Seconds after I twittered the last post, @LeeFeigenbaum responded.

Ok, so I have two patterns, and I want to find the statements that match either pattern but don't match both. The solution is rather a flexible little idiom for this kind of negation. The specific patterns are:

?set dbpp:wikiPageUsesTemplate  <http://dbpedia.org/resource/Template:Infobox_programming_language> .

and

?set a yagoc:ProgrammingLanguage106898352 .

(I'm running this against dbPedia)

Lee's solution is:

PREFIX dbpp:		<http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>

SELECT count(?set) where {
{
?set dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
OPTIONAL {
?set a ?marker .
FILTER(?marker = yagoc:ProgrammingLanguage106898352)
}
FILTER(!bound(?marker))
   } UNION {

?set a yagoc:ProgrammingLanguage106898352 .
OPTIONAL {
?set dbpp:wikiPageUsesTemplate ?marker .
FILTER(?marker = <http://dbpedia.org/resource/Template:Infobox_programming_language>)
}
FILTER(!bound(?marker))
}
}

(Note that COUNT isn't (yet) standard SPARQL, but seeing the size of the result sets was handy here).

It looks convoluted, but each half of the UNION is kind-of the converse of the other (and will give interesting results independently). I was a little surprised it did work as variables are scoped to the whole query and ?marker looked troublesome. But FILTERs are scoped to the local group, and that's where it matters here (it will produce the same results if you had a different variable for each half of the UNION).

There is something slightly odd happening in this particular case (or I'm missing something obvious). The figures I got before were 762 matches for the UNION of the two patterns, 178 for the intersection, so I'd have expected 762 - 178 = 584 results, but this gives 406. So there's a bit of sloppy QED around here. I was missing something obvious.

Lee again via twitter: the numbers look perfect to me - the 762 double-counts the 178 in the intersection. 406+178=584
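Lee's arithmetic, spelled out in a few lines of Python. The only inputs are the counts reported in the Long multiplication post (426 for the template pattern, 336 for the YAGO class, 178 for both); the variable names are just for the illustration.

```python
template_count = 426   # resources using the infobox template
yago_count = 336       # members of the YAGO programming language class
both = 178             # resources matching both patterns

# UNION without DISTINCT counts the shared resources once per branch
union_with_dupes = template_count + yago_count
# Counting each shared resource only once gives the true union
distinct_union = union_with_dupes - both
# Dropping the shared resources altogether leaves "either but not both"
either_not_both = distinct_union - both

print(union_with_dupes, distinct_union, either_not_both)  # 762 584 406
```

So 584 is the size of the de-duplicated union, and 406 is what the exclusion query should (and does) return.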

As @glenn_mcdonald and Lee have pointed out, a DISTINCT would fix my original UNION query to exclude the dupes. Glenn also offers a more concise version taking advantage of a Virtuoso feature:

PREFIX dbpp:		<http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>

SELECT count(?set1) where {
{
?set1 dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
FILTER NOT EXISTS {?set1 a yagoc:ProgrammingLanguage106898352}
} UNION {
?set1 a yagoc:ProgrammingLanguage106898352 .
FILTER NOT EXISTS {?set1 dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language>}
}
}


danja
2011-03-28T19:35:15+01:00
negation sparql rdf

Long multiplication

Querying http://dbpedia.org/sparql


PREFIX yagoc:		<http://dbpedia.org/class/yago/>
SELECT COUNT(?set1) where {
?set1 a yagoc:ProgrammingLanguage106898352 .
}

result = 336

PREFIX dbpp:		<http://dbpedia.org/property/>

SELECT COUNT(?set2) where {
?set2 dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language>
}

result = 426

disjunction, language is in set1 OR set2

PREFIX dbpp:		<http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>
SELECT count(?s) where {
{
?s dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
} UNION {
?s a yagoc:ProgrammingLanguage106898352 .
}
}

result = 762

conjunction, language is in set1 AND set2

PREFIX dbpp:		<http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>
SELECT COUNT(?set) where {
?set a yagoc:ProgrammingLanguage106898352 .
?set dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
}

result = 178

PREFIX dbpp:		<http://dbpedia.org/property/>
PREFIX yagoc: <http://dbpedia.org/class/yago/>

SELECT count(?and) where {
{
?or dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
} UNION {
?or a yagoc:ProgrammingLanguage106898352 .
}
?and dbpp:wikiPageUsesTemplate <http://dbpedia.org/resource/Template:Infobox_programming_language> .
?and a yagoc:ProgrammingLanguage106898352 .

FILTER(?and = ?or)
}

result = 356!?

Took me a long while to realise what that number represents, definitely time for a break...
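One reading of that number, offered as a guess: 356 is 2 x 178. The UNION keeps duplicates, so every resource in the intersection turns up twice as ?or, and each copy pairs with the matching ?and. The Python sketch below mimics that with tiny made-up sets; `count_join` is a hypothetical helper, not anything the endpoint exposes.

```python
from itertools import product

def count_join(template, yago):
    """Count solutions of the UNION-joined-with-intersection query."""
    # UNION keeps duplicates: one ?or solution per branch match
    or_solutions = list(template) + list(yago)
    # The two ?and triple patterns together match the intersection
    and_solutions = [s for s in template if s in yago]
    # FILTER(?and = ?or) keeps only the pairs whose values agree
    return sum(1 for a, o in product(and_solutions, or_solutions) if a == o)

# Two resources are in both sets, so the count comes out as 2 * 2
print(count_join({"A", "B", "C"}, {"B", "C", "D"}))  # 4
```

Scale the overlap up to 178 and the same doubling gives 356.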

What I'm trying to find (if it's possible) are queries to look at the difference between the sets above, the 762 - 178 = 584 part. I'm hoping something along the lines of Finding Resources that don't have a certain property might work. If anyone knows an idiom that'll work (or knows that it isn't possible) please ping me.


danja
2011-03-28T13:59:45+01:00
sparql puzzle rdf

SPARQL Results and HTML

A thought in passing. When I've needed to display SPARQL results in a browser I've generally either used some kind of programmatic templating (as in this blog) or XSLT on XML results - which can get clunky, but when the transformation is done, it's done. Results XML is straightforward (and I'm still rather fond of XML) but the choice of syntax is pretty arbitrary. The RDF that comes back from a CONSTRUCT is grand, that's a really nice kind of query, the data is immediately ready for reuse (it might be an obsessive-compulsive thing, but DESCRIBE still feels a bit messy). I've not got around to playing with JSON results, presumably that lends itself to speedy application in most languages.

But I can't help thinking it'd be neat if SPARQL results came back directly as RDFa so by default you had something that made sense both in a browser and to an RDF agent. Is there anything you can do with a SELECT that a CONSTRUCT-to-HTML couldn't do? Is there any way the stuff could be structured to simplify templating? There's at least one results XML to HTML XSLT around somewhere, I guess that could be tweaked for experimentation.
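As a starting point for that kind of experiment, here's a Python sketch of the programmatic-templating route: it takes SELECT results in the SPARQL JSON results format and renders a plain HTML table, linking any URIs. It's an illustration only (the function name is made up) and it produces ordinary HTML, not RDFa; adding the RDFa attributes is the interesting part left open above.

```python
from html import escape

def results_to_html(results):
    """Render SPARQL JSON SELECT results as a plain HTML table."""
    names = results["head"]["vars"]
    rows = ["<tr>" + "".join(f"<th>{escape(n)}</th>" for n in names) + "</tr>"]
    for binding in results["results"]["bindings"]:
        cells = []
        for n in names:
            term = binding.get(n)
            if term is None:
                cells.append("<td></td>")  # unbound variable
            elif term["type"] == "uri":
                href = escape(term["value"], quote=True)
                cells.append(f'<td><a href="{href}">{href}</a></td>')
            else:
                cells.append(f"<td>{escape(term['value'])}</td>")
        rows.append("<tr>" + "".join(cells) + "</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"
```

The header row comes from head.vars and each binding becomes one table row, so the output mirrors the result set's shape directly.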


danja
2010-12-14T07:54:28+01:00
rdfa html sparql rdf